Lunderberg commented on issue #8540:
URL: https://github.com/apache/tvm/issues/8540#issuecomment-888536554


   > cc @Lunderberg if this is related to your recent investigation.
   
   Hmm, this looks a bit different from the Vulkan vectorization issues I've been investigating. There, the TIR looked reasonable but the generated code did not. Here, the TIR asks for an improper alignment, and the codegen follows those instructions.
   
   > I agree, CUDA Codegen should check alignment before generating vectorized 
loads
   
   I think that would make sense, though as far as I can tell, the [only check currently present](https://github.com/apache/tvm/blob/main/src/target/source/codegen_c.cc#L711) is whether the array stride is 1. I don't see any calls into CUDA-specific virtual functions that add further checks.
   
   I've been discussing with @tqchen ([discuss link](https://discuss.tvm.apache.org/t/pre-rfc-vectorized-tir-buffers/10615)) where best to handle this. I think this is another case that could be handled in a hypothetical `LegalizeVectorization` pass, which would either convert the `ramp(base,1,lanes)` into an access into a vectorized array or convert it into a loop with scalar offsets, depending on the alignment of `base`.
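
   To make the idea concrete, here is a rough sketch of the decision such a pass might make. It is plain Python, not TVM code: `Ramp`, `base_is_aligned`, and `legalize` are hypothetical names invented for illustration, and `base` is assumed to have the simple form `coeff * i + offset` so that alignment can be checked arithmetically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ramp:
    """Toy stand-in for ramp(base, stride, lanes), with base = coeff * i + offset.

    Hypothetical model for illustration only; this is not a TVM class.
    """
    coeff: int
    offset: int
    stride: int
    lanes: int


def base_is_aligned(ramp: Ramp) -> bool:
    # coeff * i + offset is a multiple of lanes for every integer i
    # exactly when both coeff and offset are multiples of lanes.
    return ramp.coeff % ramp.lanes == 0 and ramp.offset % ramp.lanes == 0


def legalize(ramp: Ramp):
    """Pick a vectorized access when provably aligned, else a scalar loop."""
    if ramp.stride == 1 and base_is_aligned(ramp):
        # Index into the buffer viewed as an array of lanes-wide vectors,
        # returned as the (coeff, offset) pair of the vector index.
        return ("vector_access", (ramp.coeff // ramp.lanes, ramp.offset // ramp.lanes))
    # Fall back to one scalar access per lane, each as a (coeff, offset) pair.
    scalar_indices = [
        (ramp.coeff, ramp.offset + lane * ramp.stride) for lane in range(ramp.lanes)
    ]
    return ("scalar_loop", scalar_indices)
```

   In this sketch, `legalize(Ramp(coeff=4, offset=0, stride=1, lanes=4))` takes the vectorized path, while changing `offset` to 2 passes the stride check but fails the alignment check and falls back to scalar accesses.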


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


