zhanghaohit commented on pull request #8274:
URL: https://github.com/apache/tvm/pull/8274#issuecomment-863764072


   In my opinion, even though this revert makes `tests/python/frontend/onnx/test_forward.py:test_loop` pass, there may still be a hidden bug in the VM runtime (or other parts of the implementation): if the condition `if (constant_size > 0 && constant_size * nbytes < runtime::kMaxStackAlloca)` is not true, the generated IR is the same as on the current main.
   
   I admit that the removal of the following optimization:
   ```c++
       if (device_type_.defined()) {
         if (const auto* dev_type = device_type_.as<IntImmNode>()) {
           if (dev_type->value == kDLCPU) {
             int32_t constant_size = op->constant_allocation_size();
             if (constant_size > 0 && constant_size * nbytes < runtime::kMaxStackAlloca) {
               return stmt;
             }
           }
         }
       }
   ```
   may cause some performance regressions. But that optimization raises LLVM function signature errors when there are multiple targets (so far I have not found an alternative fix that keeps it working).
   
   So there may be two bugs in the current codebase:
   - the VM runtime (and possibly other parts) cannot handle the IR generated by the current main, e.g., when `constant_size * nbytes` >= `runtime::kMaxStackAlloca` (see the sketch after this list)
   - when there are multiple targets, the IR / LLVM code generated with this PR is not correct.
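   
   To make the first point concrete, here is a minimal sketch (plain Python, not TVM code) of the check in the removed branch quoted above, assuming `runtime::kMaxStackAlloca` is 1024 bytes (the value I see in the runtime headers):
   ```python
   # Plain-Python sketch of the removed branch's check; not TVM code.
   # Assumes runtime::kMaxStackAlloca == 1024 bytes.
   K_MAX_STACK_ALLOCA = 1024  # bytes


   def stays_on_stack(constant_size: int, nbytes: int) -> bool:
       """True when the removed branch would return the original stmt,
       i.e. the constant-size CPU allocation is small enough for the stack."""
       return constant_size > 0 and constant_size * nbytes < K_MAX_STACK_ALLOCA


   # Example: 1024 float32 elements -> 1024 * 4 = 4096 bytes >= 1024, so the
   # early return is skipped and the lowered IR is the same as on current main.
   print(stays_on_stack(16, 4))    # True: 64 bytes fits on the stack
   print(stays_on_stack(1024, 4))  # False: same path as current main
   ```
   Any allocation that falls into the `False` case already bypasses the optimization even with the branch in place, which is why I suspect the VM-runtime issue is independent of this revert.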
   
   If this problem blocks something urgent to merge, I think we can do a quick fix first and open a bug issue for a proper fix later:
   - to fix the VTA unittest, we can remove the multiple-targets test in `deploy_classification.py` by removing the "sim" in these three lines: [deploy_classification.py#L194](https://github.com/apache/tvm/blob/1fac10b359dec1bd6ad45ce36541a882aaba586b/vta/tutorials/frontend/deploy_classification.py#L194), [deploy_classification.py#L206](https://github.com/apache/tvm/blob/1fac10b359dec1bd6ad45ce36541a882aaba586b/vta/tutorials/frontend/deploy_classification.py#L206), and [deploy_classification.py#L224](https://github.com/apache/tvm/blob/1fac10b359dec1bd6ad45ce36541a882aaba586b/vta/tutorials/frontend/deploy_classification.py#L224)
   - to quickly fix the test `tests/python/frontend/onnx/test_forward.py:test_loop`, remove the check `assert out[i] == inputs[j][i], "Dims mismatch in the inputs of concatenate."` in [_concatenate_shape_func](https://github.com/apache/tvm/blob/1fac10b359dec1bd6ad45ce36541a882aaba586b/python/tvm/relay/op/_transform.py#L345) (see the sketch after this list).
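   
   For reference, here is a rough, hedged sketch of what `_concatenate_shape_func` could look like with that check dropped. It is paraphrased from the linked `_transform.py`, so the surrounding code may differ from the actual file; treat it as an illustration of the quick fix, not a drop-in patch:
   ```python
   # Rough paraphrase of _concatenate_shape_func (python/tvm/relay/op/_transform.py)
   # with the dims-mismatch assert and the loop that only ran it removed.
   # Details may differ from the linked commit.
   from tvm.te.hybrid import script


   @script
   def _concatenate_shape_func(inputs, axis):
       ndim = inputs[0].shape[0]
       out = output_tensor((ndim,), "int64")
       for i in const_range(ndim):
           if i != axis:
               # Previously: a loop over the remaining inputs asserting
               # out[i] == inputs[j][i] ("Dims mismatch in the inputs of
               # concatenate."). The quick fix simply drops that check.
               out[i] = inputs[0][i]
           else:
               out[i] = int64(0)
               for j in const_range(len(inputs)):
                   out[i] += inputs[j][i]
       return out
   ```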
   
   For the two hidden bugs, I think it will take some time to find and fix them properly.
   
   What do you think?
   

