Should code that uses the builtins directly (like
__builtin_ia32_pblendw256) be optimized too? If so, wouldn't it be
better to, for example, leave _mm256_blend_epi16 as is, remove
__builtin_ia32_pblendw256 from BuiltinsX86.def, and make it a #define
in terms of shufflevector?
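
To make the suggestion concrete, here is a rough sketch of how the
blend immediate could map onto shuffle indices. It is a portable scalar
model only, not the actual patch or header code; the names
blend_shuffle_index and blend_epi16_model are hypothetical, and it
assumes the usual two-input shuffle convention where indices 0..15
pick from the first vector and 16..31 from the second.

```c
#include <assert.h>
#include <stdint.h>

enum { LANES = 16 }; /* 16 x 16-bit lanes in a 256-bit vector */

/* Hypothetical helper: shuffle index for one lane of a two-input
   shuffle over (a, b). Indices 0..15 select from a, 16..31 from b.
   The 8-bit immediate is reused for both 128-bit halves, hence the
   (lane % 8). */
static int blend_shuffle_index(unsigned imm8, int lane) {
    assert(lane >= 0 && lane < LANES);
    return ((imm8 >> (lane % 8)) & 1u) ? lane + LANES : lane;
}

/* Scalar model of a generic two-input shuffle driven by those
   indices; a real #define would pass constant indices to the
   compiler's shuffle builtin instead of looping. */
static void blend_epi16_model(const int16_t a[LANES],
                              const int16_t b[LANES],
                              unsigned imm8, int16_t out[LANES]) {
    for (int i = 0; i < LANES; ++i) {
        int idx = blend_shuffle_index(imm8, i);
        out[i] = (idx < LANES) ? a[idx] : b[idx - LANES];
    }
}
```

With a constant immediate, every index is known at compile time, which
is what would let such a macro lower to a plain shuffle instead of a
target-specific builtin call.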

http://reviews.llvm.org/D3601



_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
