shtinsa commented on PR #11800:
URL: https://github.com/apache/tvm/pull/11800#issuecomment-1178861828

   Hello @SebastianBoblest, sorry for the delayed answer. I would like to
clarify the issue from the sample above. According to the C codegen output, the
new functionality produces the following code:
   ```
   TVM_DLL int32_t tvmgen_default_fused_squeeze_concatenate_1(float* placeholder, float* placeholder1, float* placeholder2, float* concatenate_ext, uint8_t* global_const_workspace_14_var, uint8_t* global_workspace_15_var) {
     void* T_squeeze_let = (&(global_workspace_15_var[32]));
     for (int32_t ax0_ax1_fused = 0; ax0_ax1_fused < 3; ++ax0_ax1_fused) {
       ((float*)T_squeeze_let)[ax0_ax1_fused] = placeholder1[ax0_ax1_fused];
     }
     for (int32_t j = 0; j < 3; ++j) {
       concatenate_ext[j] = ((float*)T_squeeze_let)[j];
     }
     for (int32_t j1 = 0; j1 < 2; ++j1) {
       concatenate_ext[(j1 + 3)] = placeholder2[j1];
     }
     return 0;
   }
   ```
   Here the `squeeze` operation is not fused with the `concatenate` code: it is
first copied into `T_squeeze_let` and only then into `concatenate_ext`. I
suppose you would like to see something like this instead:
   ```
   TVM_DLL int32_t tvmgen_default_fused_squeeze_concatenate_1(float* placeholder, float* placeholder1, float* placeholder2, float* concatenate_ext, uint8_t* global_const_workspace_14_var, uint8_t* global_workspace_15_var) {
     for (int32_t j = 0; j < 3; ++j) {
       concatenate_ext[j] = placeholder1[j];
     }
     for (int32_t j1 = 0; j1 < 2; ++j1) {
       concatenate_ext[(j1 + 3)] = placeholder2[j1];
     }
     return 0;
   }
   ```
   where the first and second loops are fused and the `T_squeeze_let` buffer is
removed from the global workspace?
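
To make the comparison concrete, here is a minimal standalone sketch (not TVM output) of the two variants side by side. The names `squeeze_concat_unfused`, `squeeze_concat_fused` and `variants_agree` are hypothetical, the signatures are simplified (the unused `placeholder` and the constant workspace argument are dropped), and the 64-byte workspace size is an assumption chosen only so that offset 32 is valid. It is meant to show that the intermediate copy does not change the result, so removing it should be safe for this shape.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Unfused variant: squeeze is materialized in the workspace first.
 * Offset 32 mirrors the generated code above; the workspace is assumed
 * to be large enough and suitably aligned for float at that offset. */
static int32_t squeeze_concat_unfused(const float* placeholder1,
                                      const float* placeholder2,
                                      float* concatenate_ext,
                                      uint8_t* workspace) {
  float* T_squeeze_let = (float*)(&workspace[32]);
  for (int32_t i = 0; i < 3; ++i) {
    T_squeeze_let[i] = placeholder1[i];      /* squeeze: plain copy */
  }
  for (int32_t j = 0; j < 3; ++j) {
    concatenate_ext[j] = T_squeeze_let[j];   /* first concat input */
  }
  for (int32_t j1 = 0; j1 < 2; ++j1) {
    concatenate_ext[j1 + 3] = placeholder2[j1];  /* second concat input */
  }
  return 0;
}

/* Fused variant: the squeeze copy is folded into the concatenate loop,
 * so no workspace buffer is needed. */
static int32_t squeeze_concat_fused(const float* placeholder1,
                                    const float* placeholder2,
                                    float* concatenate_ext) {
  for (int32_t j = 0; j < 3; ++j) {
    concatenate_ext[j] = placeholder1[j];
  }
  for (int32_t j1 = 0; j1 < 2; ++j1) {
    concatenate_ext[j1 + 3] = placeholder2[j1];
  }
  return 0;
}

/* Returns 1 when both variants write identical output for one sample
 * input, 0 otherwise (including allocation failure). */
static int variants_agree(void) {
  const float p1[3] = {1.0f, 2.0f, 3.0f};
  const float p2[2] = {4.0f, 5.0f};
  float out_unfused[5] = {0};
  float out_fused[5] = {0};

  /* Assumed workspace size: 64 bytes, enough for 3 floats at offset 32. */
  uint8_t* workspace = (uint8_t*)malloc(64);
  if (workspace == NULL) {
    return 0;
  }
  squeeze_concat_unfused(p1, p2, out_unfused, workspace);
  free(workspace);

  squeeze_concat_fused(p1, p2, out_fused);
  return memcmp(out_unfused, out_fused, sizeof(out_fused)) == 0;
}
```

Calling `variants_agree()` from a small `main` checks one sample input only; it illustrates the equivalence for this 3 + 2 element case and is not a proof for other shapes.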
    
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
