areusch commented on a change in pull request #65:
URL: https://github.com/apache/tvm-rfcs/pull/65#discussion_r839120089
##########
File path: rfcs/0009_Unified_Static_Memory_Planning.md
##########
@@ -515,4 +663,6 @@ NOTE : to support tir.constants generally, we'll be enhancing the bound relay.co
 
 # Drawbacks
 
-* The relay "main" function that describes the call order to operator PrimFuncs has to be described in TIR to be able to integrate the USMP into the respective executor codegen. However, we dont view this as a major problem as the relay "main" function could easily be lowered to TIR.
\ No newline at end of file
+* The relay "main" function that describes the call order to operator PrimFuncs has to be described in TIR to be able to integrate the USMP into the respective executor codegen. However, we dont view this as a major problem as the relay "main" function could easily be lowered to TIR.
+
+* The U4 usecase will only be supported with [Embedded C Runtime Interface](https://discuss.tvm.apache.org/t/rfc-utvm-embedded-c-runtime-interface/9951/14). This is mainly because the nature of the requirement is associated with embedded usecases. However, the USMP changes here should be complimentary to support other runtime interfaces such as Module-based Model Runtime Interface's set_input and set_output in future.

Review comment:
       If we did implement this for e.g. the C++ runtime, or for a scenario where a `DLTensor` were passed (perhaps we have dynamic shapes), would we allocate the `DLTensor` instances in whatever section of the workspace we reserve for them? Conceptually, we could just treat those `DLTensor`s as companion buffers which need to be live at the same time as the `data`.

##########
File path: rfcs/0009_Unified_Static_Memory_Planning.md
##########
@@ -349,6 +349,108 @@ tvmc compile my_model.tflite --executor=aot --output-format=mlf --target=c
     TVMExecute(&my_model, &inputs, &outputs, &context);
   }
 ```
+
+## U4 : User wants to write/read directly to the workspace buffer
+
+This usecase allows the space used by I/O tensors to be re-used by the inference.
+
+### TVMC
+```
+ tvmc compile my_model.tflite
+     --executor=aot
+     --target=c
+     --workspace-pools=sram
+     --pass-config tir.usmp.enable=1
+     --pass-config tir.usmp.use_workspace_io=1
+```
+### Codegen'd Artifacts
+```
+ //Codegen'd artifacts in metadata.c (lib0.c)
+
+ int32_t tvmgen_my_model_run(
+     tvmgen_my_model_workspace_pools* workspace_pools,
+ ){
+     return my_model_main(workspace_pools.sram);
+ }
+
+ // Returns a handle pointing to space inside the
+ // workspace pool where input should be stored
+
+ tvmgen_my_model_inputs tvmgen_my_model_map_inputs(
+     tvmgen_my_model_workspace_pools* workspace_pools
+ ) {
+     tvmgen_my_model_inputs = {
+         .input0 = &workspace_pools->sram[<INPUT0_OFFSET>],
+     };
+     return tvmgen_my_model_inputs;
+ }
+
+ // Returns a handle pointing to space inside the
+ // workspace pool where output is stored
+
+ tvmgen_my_model_outputs tvmgen_my_model_map_outputs(
+     tvmgen_my_model_workspace_pools* workspace_pools
+ ) {
+     tvmgen_my_model_outputs = {
+         .output0 = &workspace_pools->sram[<OUTPUT0_OFFSET>],
+     };
+     return tvmgen_my_model_outputs;
+ }
+```
+```
+// metadata.h
+
+ #define TVM_MY_MODEL_SRAM_WORKSPACE_BUFFER_SIZE xxxx
+
+ typedef struct {
+     uint8_t* sram;
+ } tvmgen_my_model_workspace_pools;
+
+ typedef struct {
+     uint8_t* input0;
+ } tvmgen_my_model_inputs;
+
+ typedef struct {
+     uint8_t* output0;
+ } tvmgen_my_model_outputs;
+
+ tvmgen_my_model_inputs tvmgen_my_model_map_inputs(
+     tvmgen_my_model_workspace_pools* workspace_pools
+ );
+
+ tvmgen_my_model_outputs tvmgen_my_model_map_outputs(
+     tvmgen_my_model_workspace_pools* workspace_pools
+ );
+```
+### User Application
+```
+ // The User Application model;
+ __attribute__((section( "SRAM" ), aligned( 16 ))) static uint8_t workspace_buffer_sram[TVM_MY_MODEL_SRAM_WORKSPACE_BUFFER_SIZE];
+
+ int main(...) {
+     ...
+     tvmgen_my_model_workspace_pools workspaces = {
+         .sram = &workspace_buffer_sram,
+     };
+     tvmgen_my_model_inputs inputs =
+         tvmgen_my_model_map_inputs(&workspaces);
+     tvmgen_my_model_outputs outputs =
+         tvmgen_my_model_map_outputs(&workspaces);
+
+     // Generate input tensor by passing the handle
+     // E.g. this could be a driver writing directly to
+     // the workspace buffer
+     GenerateInput(inputs.input0)
+
+     tvmgen_my_model_run(&workspaces);

Review comment:
       Right now we mark input nodes as "precious" (figuratively — this isn't a literal thing), and I don't think we re-use their memory; in other words, this line should be idempotent. I think this RFC seeks to change that, which is a perfectly reasonable thing to do, but I like that there is a PassOption to support it explicitly.

       Should this in fact be a PassOption? Another way to do this is to annotate the Relay program. The benefit of that is that if we ever started compiling multiple programs in a single tvm.relay.build call, we wouldn't have a singleton global PassOption which could apply differently to different programs or parameters. Also, if a user didn't particularly care about one input but did care about another, it might be more helpful to mark this at the I/O tensor level. What are your thoughts?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
