nikic wrote:

On x86, we actually end up combining those into unaligned i64 loads 
(see https://godbolt.org/z/P5z674x4r), which is probably the best outcome 
if unaligned loads are supported. I assume AMDGPU does not support unaligned 
loads, and that's why you want single-element loads that get inserted into a 
vector, followed by sub-vector extracts on it?
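
As a rough illustration of the kind of combining I mean, here is a minimal C sketch. It is not the code from the godbolt link or from the PR. The names pair32, load_narrow and load_wide are hypothetical, and it assumes the struct has no padding. It contrasts two adjacent 32-bit loads with a single 64-bit load from an address that is only known to be 4-byte aligned:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: two adjacent 32-bit elements (assumed unpadded). */
typedef struct { uint32_t a, b; } pair32;

/* Two separate element loads; p only needs 4-byte alignment. */
pair32 load_narrow(const uint32_t *p) {
    pair32 r = { p[0], p[1] };
    return r;
}

/* One 8-byte copy from the same address. memcpy makes no alignment
 * assumption, so a target that supports unaligned accesses can lower
 * this to a single 64-bit load. */
pair32 load_wide(const uint32_t *p) {
    pair32 r;
    memcpy(&r, p, sizeof r);
    return r;
}
```

Both functions return the same values. Whether the second form is actually cheaper depends on whether the target handles an unaligned 64-bit access efficiently, which is the question for AMDGPU here.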

https://github.com/llvm/llvm-project/pull/133301
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
