manupa-arm commented on a change in pull request #65:
URL: https://github.com/apache/tvm-rfcs/pull/65#discussion_r839354482
##########
File path: rfcs/0009_Unified_Static_Memory_Planning.md
##########
@@ -349,6 +349,108 @@ tvmc compile my_model.tflite --executor=aot
--output-format=mlf --target=c
TVMExecute(&my_model, &inputs, &outputs, &context);
}
```
+
+## U4 : User wants to write/read directly to the workspace buffer
+
+This use case allows the space used by I/O tensors to be re-used by the inference.
+
+### TVMC
+```
+ tvmc compile my_model.tflite
+ --executor=aot
+ --target=c
+ --workspace-pools=sram
+ --pass-config tir.usmp.enable=1
+ --pass-config tir.usmp.use_workspace_io=1
+
+```
+### Codegen'd Artifacts
+```
+ //Codegen'd artifacts in metadata.c (lib0.c)
+
+ int32_t tvmgen_my_model_run(
+    tvmgen_my_model_workspace_pools* workspace_pools
+ ){
+   return my_model_main(workspace_pools->sram);
+ }
+
+ // Returns a handle pointing to space inside the
+ // workspace pool where input should be stored
+
+ tvmgen_my_model_inputs tvmgen_my_model_map_inputs(
+ tvmgen_my_model_workspace_pools* workspace_pools
+ ) {
+   tvmgen_my_model_inputs inputs = {
+     .input0 = &workspace_pools->sram[<INPUT0_OFFSET>],
+   };
+   return inputs;
+ }
+
+ // Returns a handle pointing to space inside the
+ // workspace pool where output is stored
+
+ tvmgen_my_model_outputs tvmgen_my_model_map_outputs(
+ tvmgen_my_model_workspace_pools* workspace_pools
+ ) {
+   tvmgen_my_model_outputs outputs = {
+     .output0 = &workspace_pools->sram[<OUTPUT0_OFFSET>],
+   };
+   return outputs;
+ }
+```
+```
+// metadata.h
+
+ #define TVM_MY_MODEL_SRAM_WORKSPACE_BUFFER_SIZE xxxx
+
+ typedef struct {
+ uint8_t* sram;
+ } tvmgen_my_model_workspace_pools;
+
+ typedef struct {
+ uint8_t* input0;
+ } tvmgen_my_model_inputs;
+
+ typedef struct {
+ uint8_t* output0;
+ } tvmgen_my_model_outputs;
+
+ tvmgen_my_model_inputs tvmgen_my_model_map_inputs(
+ tvmgen_my_model_workspace_pools* workspace_pools
+ );
+
+ tvmgen_my_model_outputs tvmgen_my_model_map_outputs(
+ tvmgen_my_model_workspace_pools* workspace_pools
+ );
+```
+### User Application
+```
+ // The User Application
+ __attribute__((section( "SRAM" ), aligned( 16 ))) static uint8_t workspace_buffer_sram[TVM_MY_MODEL_SRAM_WORKSPACE_BUFFER_SIZE];
+
+ int main(...) {
+ ...
+ tvmgen_my_model_workspace_pools workspaces = {
+      .sram = workspace_buffer_sram,
+ };
+ tvmgen_my_model_inputs inputs =
+ tvmgen_my_model_map_inputs(&workspaces);
+ tvmgen_my_model_outputs outputs =
+ tvmgen_my_model_map_outputs(&workspaces);
+
+ // Generate input tensor by passing the handle
+ // E.g. this could be a driver writing directly to
+ // the workspace buffer
+    GenerateInput(inputs.input0);
+
+ tvmgen_my_model_run(&workspaces);
Review comment:
Good points @areusch!
One thing I would like to make clear: the proposal here is to extend TVM's current status. Therefore, this change does not consider the following two scenarios:

A1. using tvm.relay.build(List[IRModule])
A2. selective re-use of the space of some I/O tensors.

Thus, I would consider them out of scope for the extension proposed here. However, I'd like to highlight how this work is unrelated to, and does not block, such extensions in the future.
For A1,
If and when we want to do this, we should (hopefully in that RFC) discuss why we would ever want to do this; as of now, I feel this is over-stretching that API. Even if we wanted to do it, there is a larger issue to be solved: the PassConfig would need to be set separately for the different programs coupled in the build call, e.g. we might not want to vectorize one program while vectorizing the other.
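To illustrate the per-program PassConfig problem, here is a minimal sketch. It is hypothetical: relay.build does not accept a list of IRModules today, and the `Program` record and `build_many` helper are made-up names, not TVM API.

```python
# Hypothetical sketch: if tvm.relay.build ever accepted List[IRModule],
# each program would need its own pass configuration on top of any
# global defaults. Program and build_many are illustrative only.
from dataclasses import dataclass, field


@dataclass
class Program:
    name: str
    pass_config: dict = field(default_factory=dict)  # per-program overrides


def build_many(programs, global_config):
    """Resolve the effective config per program: global defaults,
    overridden by each program's own settings."""
    return {p.name: {**global_config, **p.pass_config} for p in programs}


configs = build_many(
    [Program("net_a", {"tir.vectorize": False}),  # do not vectorize net_a
     Program("net_b")],                           # net_b keeps the defaults
    {"tir.vectorize": True, "tir.usmp.enable": True},
)
print(configs["net_a"]["tir.vectorize"], configs["net_b"]["tir.vectorize"])
# prints: False True
```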
For A2,
This is a perfectly valid extension of this work. Right now, we have kept the change self-contained within the USMP pass, hence the use of PassConfig. I would still consider this route valuable because otherwise the user would need to manually annotate all the I/O tensors, and the PassConfig route should account for most usage of this feature.
One thing I find challenging is using these annotations before relay.build(...), given the agreement reached here: https://github.com/apache/tvm-rfcs/blob/main/rfcs/0029-migrating-to-irmodule-attributes.md, i.e. the annotations are done inside relay.build(...).
Let's say we could overcome that hurdle; then, to support this, all that is needed is to use the annotations in the CreateAllocateIO pass to filter which I/O tensors should be selected. Thus, one could simply extend the feature proposed here to support that derivative advanced feature.
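The annotation-driven selection could look roughly like the following sketch. This is an assumption, not existing TVM code: the `TensorInfo` record and the `"reuse_io"` annotation key are invented for illustration, and in reality the filtering would happen inside the CreateAllocateIO pass.

```python
# Hypothetical sketch of per-tensor selection: only I/O tensors that
# carry a "reuse_io" annotation would have their space re-used via the
# workspace pool; the rest would keep their own buffers. TensorInfo and
# the "reuse_io" key are made up for illustration.
from dataclasses import dataclass, field


@dataclass
class TensorInfo:
    name: str
    annotations: dict = field(default_factory=dict)


def select_reusable_io(tensors):
    """Filter the I/O tensors that opted in to workspace re-use."""
    return [t.name for t in tensors if t.annotations.get("reuse_io", False)]


io_tensors = [
    TensorInfo("input0", {"reuse_io": True}),
    TensorInfo("input1"),                      # keeps its own buffer
    TensorInfo("output0", {"reuse_io": True}),
]
print(select_reusable_io(io_tensors))  # prints: ['input0', 'output0']
```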
I can add some text as future considerations, if you like. lmk.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]