MasterJH5574 opened a new pull request, #15340: URL: https://github.com/apache/tvm/pull/15340
This PR fixes a bug in the DecodeGeMV dlight rule that occurs when the innermost tensor dimension is spatial and `unroll_factor` is in use (for example, the grouping used in group quantization). Prior to this PR, a reduction loop bound to threadIdx was reordered to sit outside a split spatial loop, which prevented the TIR LowerCrossThreadReduction pass from being applied because of one of the pass's safety-guard requirements. This PR fixes the issue by no longer reordering the split spatial loop after the reduction loop, so that the pass can be applied. This is safe because the order of thread-binding loops does not matter.
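As a rough illustration of why the order of thread-binding loops does not matter, here is a minimal pure-Python sketch. It does not use TVM or the dlight API; the function `gemv_loop_nest` and its `order` parameter are hypothetical names introduced only for this example. It simulates a GeMV loop nest with one spatial and one reduction loop, and treats both as if they were bound to thread indices: each (spatial, reduction) index pair stands for one thread's work, so swapping which loop is outermost only changes the sequential visiting order, not the set of pairs or the result.

```python
from itertools import product


def gemv_loop_nest(weights, vec, order):
    """Compute out[s] = sum over r of weights[s][r] * vec[r].

    `order` is ("spatial", "reduction") or ("reduction", "spatial") and
    selects which loop is outermost in this sequential simulation.
    Hypothetical helper for illustration only, not part of TVM.
    """
    extents = {"spatial": len(weights), "reduction": len(vec)}
    out = [0] * extents["spatial"]
    # Each index pair models the work of one (spatial, reduction) thread.
    for idx in product(range(extents[order[0]]), range(extents[order[1]])):
        point = dict(zip(order, idx))
        s, r = point["spatial"], point["reduction"]
        out[s] += weights[s][r] * vec[r]
    return out


weights = [[1, 2, 3], [4, 5, 6]]
vec = [7, 8, 9]

# Same computation with the two loop orders swapped.
spatial_outer = gemv_loop_nest(weights, vec, ("spatial", "reduction"))
reduction_outer = gemv_loop_nest(weights, vec, ("reduction", "spatial"))
print(spatial_outer, reduction_outer)
```

Both orders cover the same iteration space and produce the same output here. This only models the ordering argument; it says nothing about the structural checks that LowerCrossThreadReduction performs on the actual TIR loop nest.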
