Hello,
I was wondering whether the new #pragma target in *mmintrin.h makes this
approach more acceptable for 4.10?
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html
On Sun, 7 Apr 2013, Marc Glisse wrote:
Hello,
the attached patch is very incomplete (although it passes bootstrap+testsuite
on x86_64-linux-gnu), and it raises a number of questions that I'd like to
settle before continuing.
* Is there any chance of a patch in this direction being accepted?
* May I remove the builtins (from i386.c and the doc) when they become
unused?
* Do we want to keep the casts even when they don't seem strictly necessary?
For instance, for _mm_add_ps we can write:
    return __A + __B;
or:
    return (__m128) ((__v4sf)__A + (__v4sf)__B);
Note that for _mm_add_epi8, for example, we do need the casts.
* For integer operations like _mm_add_epi16, should I use the unsigned
typedefs to make it clear that overflow is well defined? (The patch still
uses the signed versions.)
* Any better name than __v4su for the unsigned version of __v4si?
* Other comments?
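
To make the cast and signedness questions above concrete, here is a minimal
standalone sketch of the vector-extension style. It is not the code from the
patch: the *_t typedef names and the vector_ext_demo helper are hypothetical
stand-ins, chosen so the example compiles without the real headers. It assumes
a compiler that supports GCC's vector_size attribute and vector subscripting.

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-ins for the header types; the *_t names are
   hypothetical and chosen to avoid clashing with the real headers.  */
typedef float          m128_t  __attribute__ ((vector_size (16)));
typedef long long      m128i_t __attribute__ ((vector_size (16)));
typedef unsigned char  v16qu_t __attribute__ ((vector_size (16)));
typedef unsigned short v8hu_t  __attribute__ ((vector_size (16)));

/* Float case: the operand type already has float elements, so the
   casts are optional.  */
static inline m128_t
add_ps (m128_t a, m128_t b)
{
  return a + b;
}

/* Integer case: m128i_t has long long elements, so a plain a + b would
   be a 64-bit add.  The casts select the element width, and the
   unsigned element type makes wrap-around well defined.  */
static inline m128i_t
add_epi8 (m128i_t a, m128i_t b)
{
  return (m128i_t) ((v16qu_t) a + (v16qu_t) b);
}

static inline m128i_t
add_epi16 (m128i_t a, m128i_t b)
{
  return (m128i_t) ((v8hu_t) a + (v8hu_t) b);
}

/* Small self-check; returns 0 on success.  */
static int
vector_ext_demo (void)
{
  m128_t x = { 1.0f, 2.0f, 3.0f, 4.0f };
  m128_t y = { 0.5f, 0.5f, 0.5f, 0.5f };
  m128_t s = add_ps (x, y);
  assert (s[0] == 1.5f && s[3] == 4.5f);

  unsigned char bytes[16], ones[16], out8[16];
  unsigned short out16[8];
  m128i_t a, b, r;
  memset (bytes, 0xff, sizeof bytes);
  memset (ones, 0x01, sizeof ones);
  memcpy (&a, bytes, sizeof a);
  memcpy (&b, ones, sizeof b);

  /* Per byte: 0xff + 0x01 wraps to 0x00.  */
  r = add_epi8 (a, b);
  memcpy (out8, &r, sizeof out8);
  for (int i = 0; i < 16; i++)
    assert (out8[i] == 0x00);

  /* Per 16-bit lane: 0xffff + 0x0101 wraps to 0x0100.  */
  r = add_epi16 (a, b);
  memcpy (out16, &r, sizeof out16);
  for (int i = 0; i < 8; i++)
    assert (out16[i] == 0x0100);

  return 0;
}
```

The same two inputs give different results for the 8-bit and 16-bit adds,
which is why the integer intrinsics cannot drop the casts the way
_mm_add_ps can.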
2013-04-07 Marc Glisse <marc.gli...@inria.fr>
* emmintrin.h (__v2du, __v4su, __v8hu): New typedefs.
(_mm_add_pd, _mm_sub_pd, _mm_mul_pd, _mm_div_pd,
_mm_cmpeq_pd, _mm_cmplt_pd, _mm_cmple_pd, _mm_cmpgt_pd, _mm_cmpge_pd,
_mm_cmpneq_pd, _mm_add_epi8, _mm_add_epi16, _mm_add_epi32,
_mm_add_epi64, _mm_slli_epi16, _mm_slli_epi32, _mm_slli_epi64,
_mm_srai_epi16, _mm_srai_epi32, _mm_srli_epi16, _mm_srli_epi32,
_mm_srli_epi64): Replace builtins with vector extensions.
* xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps,
_mm_cmpeq_ps, _mm_cmplt_ps, _mm_cmple_ps, _mm_cmpgt_ps, _mm_cmpge_ps,
_mm_cmpneq_ps): Likewise.
--
Marc Glisse