Lunderberg commented on issue #8540: URL: https://github.com/apache/tvm/issues/8540#issuecomment-888536554
> cc @Lunderberg if this is related to your recent investigation.

Hmm, looks a bit different from the Vulkan vectorization issues I've been investigating. There, the TIR looked reasonable but the generated code did not. Here, the TIR asks for an improper alignment, and the codegen follows those instructions.

> I agree, CUDA Codegen should check alignment before generating vectorized loads

I think that would make sense, though as far as I can tell, the [only check currently present](https://github.com/apache/tvm/blob/main/src/target/source/codegen_c.cc#L711) is whether the array stride is 1. I don't see any calls into CUDA-specific virtual functions that add further checks.

I've been having some conversation with @tqchen ([discuss link](https://discuss.tvm.apache.org/t/pre-rfc-vectorized-tir-buffers/10615)) about where best to handle this. I think this would be another instance that could be handled in a hypothetical `LegalizeVectorization` pass, which would either convert the `ramp(base,1,lanes)` into an access into a vectorized array, or convert it into a loop with scalar offsets, depending on the alignment of `base`.
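To make the alignment-dependent choice concrete, here is a minimal toy sketch in plain Python. It is not TVM code and does not use any TVM API: `legalize_ramp` and its return format are hypothetical names invented for illustration. It assumes `base` is a known constant element offset and that the buffer itself starts on a vector-width boundary, so that `base % lanes == 0` is enough to call the access aligned. A real pass would presumably have to prove that divisibility for a symbolic `base`.

```python
from typing import List, Tuple, Union

# Hypothetical result shapes, for illustration only:
#   ("vector", i)        -> one load of element i from the buffer viewed as a
#                           vectorized array (each element holds `lanes` scalars)
#   ("scalar", offsets)  -> one scalar load per lane at the listed offsets
Legalized = Union[Tuple[str, int], Tuple[str, List[int]]]


def legalize_ramp(base: int, lanes: int) -> Legalized:
    """Toy model of legalizing an access at ramp(base, 1, lanes).

    Assumes a constant `base` measured in scalar elements and a buffer
    whose start is aligned to the vector width.
    """
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    if base % lanes == 0:
        # Aligned: the ramp covers exactly one element of the vectorized view.
        return ("vector", base // lanes)
    # Misaligned: fall back to a loop over scalar offsets.
    return ("scalar", [base + lane for lane in range(lanes)])


if __name__ == "__main__":
    print(legalize_ramp(8, 4))  # aligned case
    print(legalize_ramp(6, 4))  # misaligned case
```

With 4 lanes, a base of 8 maps to a single vector access, while a base of 6 straddles two vector elements and so falls back to four scalar accesses.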
