bruno added a comment.

> Thanks for your comment. We have a pass to undo the vec4 to vec3. I wondered 
> why clang generates the vec4 for vec3 load/store. As you can see from the 
> comment in clang's code, they are generated for better performance. I had a 
> question at this point: should clang consider vector load/store aligned by 4 
> regardless of target?

I believe the assumption is more practical: most upstream LLVM targets only 
support vectors with an even number of lanes. In those cases you would have to 
expand to a 4x vector and leave the 4th element as undef anyway, so it was done 
in the front end to get rid of it right away. GPU targets probably do some 
special tricks here during legalization.
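
For illustration only, the widening described above can be modeled in plain C. 
This is a sketch under stated assumptions, not what clang actually emits: the 
types `vec3f` and `vec4f` and all helper names are hypothetical, and the padding 
lane is zeroed to keep the model deterministic, whereas in IR it would be undef.

```c
#include <string.h>

/* Hypothetical stand-ins for 3-lane and 4-lane float vectors. */
typedef struct { float lane[3]; } vec3f;
typedef struct { float lane[4]; } vec4f;

/* Widen a vec3 to a vec4. The 4th lane carries no meaning; it is zeroed
 * here only so the model is deterministic (in IR it would be undef). */
static vec4f widen_vec3(vec3f v) {
    vec4f w;
    w.lane[0] = v.lane[0];
    w.lane[1] = v.lane[1];
    w.lane[2] = v.lane[2];
    w.lane[3] = 0.0f;
    return w;
}

/* Narrow a vec4 back to a vec3: the equivalent of a shuffle that
 * selects lanes 0, 1, 2 and drops lane 3. */
static vec3f narrow_vec4(vec4f w) {
    vec3f v;
    v.lane[0] = w.lane[0];
    v.lane[1] = w.lane[1];
    v.lane[2] = w.lane[2];
    return v;
}

/* Store a vec3 through a 4-lane memory slot (one full-width store). */
static void store_vec3(float slot[4], vec3f v) {
    vec4f w = widen_vec3(v);
    memcpy(slot, w.lane, sizeof w.lane);
}

/* Load a vec3 from a 4-lane memory slot (one full-width load, then narrow). */
static vec3f load_vec3(const float slot[4]) {
    vec4f w;
    memcpy(w.lane, slot, sizeof w.lane);
    return narrow_vec4(w);
}
```

A round trip through `store_vec3` and `load_vec3` preserves the three 
meaningful lanes; the fourth lane of the slot is never read back into the vec3.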

> LLVM's codegen could handle vec3 according to each target's vector load/store 
> and alignment. I agree the flag looks odd, but I was concerned that some LLVM 
> targets depend on the vec4, so I suggested adding the flag. If I missed 
> something, please let me know.

My guess here is that targets that do care are looking through the vector 
shuffle and customizing it to whatever seems best to them. If you want to 
change the default behavior, you need some data to show that your model solves 
a real issue and actually brings benefits. Do you have any real examples of 
where that happens, or of why GPU targets haven't yet tried to change this? 
Maybe other custom front ends based on clang do? Finding the historical reason 
(if any) would be a good starting point.


https://reviews.llvm.org/D30810



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
