[I] [Bug][meta_schedule] Tutorial `e2e_opt_model.py` fails [tvm]

via GitHub Mon, 26 May 2025 10:15:58 -0700


vacu9708 opened a new issue, #18018:
URL: https://github.com/apache/tvm/issues/18018


   ### Expected behavior
   
   The tutorial script `e2e_opt_model.py` should run without errors.
   
   ### Actual behavior
   
   Running the tutorial fails with the following error:
   ```
     File "/home/ysy/Documents/open_source/tvm/source/src/tir/transforms/inject_software_pipeline.cc", line 1143, in tvm::tir::software_pipeline::PipelineInjector::VisitStmt_(tvm::tir::ForNode const*)
   InternalError: Check failed: pipeline_stages.size() == original_order.size() (3 vs. 4) : PrimFunc "main" has original order ["", "", "", ""], but pipeline annotation is [0, 0, 3] with different size
   ```
   
   ### Environment
   
   - Ubuntu 22.04, Intel i7-13650HX, RTX 4060
   - **TVM commit:** 2d964b4133aac2f92e4185b3f095df4eb3bf3a90 (0.21.0 dev)
   
   ### Steps to reproduce
   
   Run the tutorial script `e2e_opt_model.py`.
   
   ### My analysis
   **Tracing back the bug**
   The failing check is at `inject_software_pipeline.cc:1133`, reached while the `VerifyGPUCode` postprocessor runs:
   ```cpp
   auto pipeline_stages =
       Downcast<Array<Integer>>(op->annotations.at(attr::software_pipeline_stage));
   CHECK_EQ(pipeline_stages.size(), original_order.size())
   ```
   As the error message states, `pipeline_stages.size()` is 3 while `original_order.size()` is 4: the loop contains 4 blocks, but the `software_pipeline_stage` annotation has only 3 elements.
   `RewriteReduction` runs before `VerifyGPUCode` and adds an extra block named `init`, which changes `original_order.size()` from 3 to 4 while the annotation still has 3 elements.
   This appears to be the cause of the failure.
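
The suspected mismatch can be illustrated with a small plain-Python model. This is only a sketch and does not use any TVM API: `PipelinedLoop`, `check_pipeline`, `split_init_block`, and the block names are hypothetical, and the choice to give the new `init` block the same stage as the block it initializes is an assumption made for illustration, not what TVM does or should do.

```python
from dataclasses import dataclass


@dataclass
class PipelinedLoop:
    """Toy stand-in for a loop annotated for software pipelining (not a TVM class)."""
    blocks: list[str]   # child blocks of the loop, in original order
    stages: list[int]   # models the `software_pipeline_stage` annotation


def check_pipeline(loop: PipelinedLoop) -> None:
    """Mirror the size check quoted above: one stage entry per block."""
    if len(loop.stages) != len(loop.blocks):
        raise ValueError(
            f"{len(loop.stages)} stage entries vs. {len(loop.blocks)} blocks"
        )


def split_init_block(loop: PipelinedLoop, index: int,
                     update_annotation: bool) -> PipelinedLoop:
    """Hypothetical rewrite: insert an `init` block before blocks[index].

    With update_annotation=False only the block list grows, which models the
    suspected behaviour. With True, the new block is assumed to share the
    stage of the block it initializes (an illustrative choice only).
    """
    blocks = loop.blocks[:index] + ["init"] + loop.blocks[index:]
    stages = list(loop.stages)
    if update_annotation:
        stages.insert(index, loop.stages[index])
    return PipelinedLoop(blocks, stages)


# Three blocks with the annotation from the error message.
loop = PipelinedLoop(["load_a", "load_b", "compute"], [0, 0, 3])
check_pipeline(loop)  # sizes match, no error

broken = split_init_block(loop, 2, update_annotation=False)
try:
    check_pipeline(broken)
except ValueError as err:
    print("check failed:", err)

fixed = split_init_block(loop, 2, update_annotation=True)
check_pipeline(fixed)  # sizes match again
```

In this model the check only passes after the rewrite if the annotation is extended together with the block list, which is the behaviour questioned under "Potential solution".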
   
   **Potential solution**
   In my opinion, `VerifyGPUCode` is only validating a state that should already be consistent, and the problem lies in `RewriteReduction`.
   Shouldn't `RewriteReduction` update the size of the annotations after adding the extra `init` block?
   I tried to make this change but did not succeed because of the complexity of the optimization algorithm.
   Could someone familiar with this code take a look? I would appreciate any help resolving this bug.
   
   ### Triage
   
   * tune:meta_schedule
   

