MasterJH5574 opened a new pull request, #15340:
URL: https://github.com/apache/tvm/pull/15340

   This PR fixes a bug in the DecodeGeMV dlight rule that occurs when the 
innermost tensor dimension is spatial with `unroll_factor` (for example, the 
grouping used in group quantization).
   
   Prior to this PR, a reduction loop bound to threadIdx was reordered to 
reside outside a split spatial loop, which prevented the TIR 
LowerCrossThreadReduction pass from applying because one of its safety-guard 
requirements was not met.
   
   This PR fixes the issue by no longer reordering the split spatial loop after 
the reduction loop, so that the pass can be applied. This is safe because the 
order of thread-binding loops does not matter.
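
   To illustrate why the loop order can be changed freely, here is a toy sketch 
in plain Python. It does not use any TVM API; the sizes, the names `UNROLL` and 
`THREADS`, and both function names are hypothetical stand-ins chosen for this 
example. The first function mirrors the ordering described as problematic 
(thread-bound reduction loop outside the split spatial loop), the second keeps 
the split spatial loop outside. Both visit the same iteration space and give 
the same result.

```python
# Toy model only: plain Python loops stand in for TIR loops.
# All sizes and names below are assumptions made for illustration.
S, R = 8, 16      # spatial and reduction extents
UNROLL = 4        # stands in for `unroll_factor` on the innermost spatial dim
THREADS = 4       # stands in for the threadIdx extent of the reduction loop
CHUNK = R // THREADS

# Integer data, so the sums are exact regardless of accumulation order.
A = [[(3 * s + 5 * r) % 7 for r in range(R)] for s in range(S)]
x = [(2 * r + 1) % 5 for r in range(R)]


def gemv_reduction_outside():
    """Thread-bound reduction loop sits outside the split spatial loop."""
    y = [0] * S
    for s_outer in range(S // UNROLL):
        for r_thread in range(THREADS):        # would be bound to threadIdx
            for s_inner in range(UNROLL):      # split spatial loop
                s = s_outer * UNROLL + s_inner
                for r_inner in range(CHUNK):
                    r = r_thread * CHUNK + r_inner
                    y[s] += A[s][r] * x[r]
    return y


def gemv_spatial_outside():
    """Split spatial loop stays outside the thread-bound reduction loop."""
    y = [0] * S
    for s_outer in range(S // UNROLL):
        for s_inner in range(UNROLL):          # split spatial loop
            s = s_outer * UNROLL + s_inner
            for r_thread in range(THREADS):    # would be bound to threadIdx
                for r_inner in range(CHUNK):
                    r = r_thread * CHUNK + r_inner
                    y[s] += A[s][r] * x[r]
    return y


# Straightforward reference GEMV for comparison.
reference = [sum(A[s][r] * x[r] for r in range(R)) for s in range(S)]
```

   On a GPU the thread-bound loop is not iterated sequentially at all: each 
`r_thread` value is a separate thread, so its position in the nest only affects 
how the IR is structured, not what is computed.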

