Hello,

I was wondering whether the new #pragma target in *mmintrin.h makes this approach more acceptable for 4.10?

http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00374.html

On Sun, 7 Apr 2013, Marc Glisse wrote:

Hello,

the attached patch is very incomplete (it passes bootstrap+testsuite on x86_64-linux-gnu), but it raises a number of questions that I'd like to settle before continuing.

* Is there any chance of a patch in this direction being accepted?

* May I remove the builtins (from i386.c and the doc) when they become unused?

* Do we want to keep the casts even when they don't seem strictly necessary? For instance for _mm_add_ps, we can write:
        return __A + __B;
or:
        return (__m128) ((__v4sf)__A + (__v4sf)__B);
Note that for _mm_add_epi8 for instance we do need the casts.

* For integer operations like _mm_add_epi16 I should probably use the unsigned typedefs to make it clear overflow is well defined? (the patch still has the signed version)

* Any better name than __v4su for the unsigned version of __v4si?

* Other comments?
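
To make the question about casts concrete, here is a minimal standalone sketch; it is not the patch itself. The typedef and function names (v4sf, v2di, v16qu, add_ps, add_epi8, add_lanes64 and the helpers) are hypothetical stand-ins for the header definitions, chosen so the file compiles on its own with GCC or Clang without including any *mmintrin.h header. It assumes, as in the real headers, that the float type has float lanes and the integer type has long long lanes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the header typedefs (illustrative names only). */
typedef float         v4sf  __attribute__ ((vector_size (16)));
typedef long long     v2di  __attribute__ ((vector_size (16)));
typedef unsigned char v16qu __attribute__ ((vector_size (16)));

/* Float case: the public type already has float lanes, so a plain '+'
   performs the intended lane-wise addition and no cast is needed.  */
static inline v4sf add_ps (v4sf a, v4sf b)
{
  return a + b;
}

/* Byte case: the public integer type has 64-bit lanes, so the operands
   are reinterpreted as 16 bytes first.  Unsigned lanes are used here so
   that wraparound is well defined, as for scalar unsigned arithmetic.  */
static inline v2di add_epi8 (v2di a, v2di b)
{
  return (v2di) ((v16qu) a + (v16qu) b);
}

/* The same addition without casts: this adds two 64-bit lanes, so a
   carry out of one byte propagates into the next byte.  */
static inline v2di add_lanes64 (v2di a, v2di b)
{
  return a + b;
}

/* Small helpers so the checks do not depend on vector subscripting.  */
static inline v4sf splat_ps (float x)
{
  v4sf r = { x, x, x, x };
  return r;
}

static inline v2di splat_bytes (unsigned char x)
{
  v2di r;
  memset (&r, x, sizeof r);
  return r;
}

static inline float lane_ps (v4sf v, int i)
{
  float f[4];
  memcpy (f, &v, sizeof f);
  return f[i];
}

static inline int all_bytes_equal (v2di v, unsigned char x)
{
  unsigned char b[16];
  int i;
  memcpy (b, &v, sizeof b);
  for (i = 0; i < 16; i++)
    if (b[i] != x)
      return 0;
  return 1;
}

/* Sanity checks for the three variants above.  */
static inline void demo_checks (void)
{
  assert (lane_ps (add_ps (splat_ps (1.5f), splat_ps (2.0f)), 3) == 3.5f);
  /* 0xFF + 0x01 wraps to 0x00 independently in every byte.  */
  assert (all_bytes_equal (add_epi8 (splat_bytes (0xFF), splat_bytes (0x01)), 0x00));
  /* Without the casts the carries cross byte boundaries.  */
  assert (!all_bytes_equal (add_lanes64 (splat_bytes (0xFF), splat_bytes (0x01)), 0x00));
}
```

In this sketch the casts in add_epi8 only change how the 16 bytes are divided into lanes; whether to keep the redundant casts in the float case is the style question asked above.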


2013-04-07  Marc Glisse  <marc.gli...@inria.fr>

        * emmintrin.h (__v2du, __v4su, __v8hu): New typedefs.
        (_mm_add_pd, _mm_sub_pd, _mm_mul_pd, _mm_div_pd,
        _mm_cmpeq_pd, _mm_cmplt_pd, _mm_cmple_pd, _mm_cmpgt_pd, _mm_cmpge_pd,
        _mm_cmpneq_pd, _mm_add_epi8, _mm_add_epi16, _mm_add_epi32,
        _mm_add_epi64, _mm_slli_epi16, _mm_slli_epi32, _mm_slli_epi64,
        _mm_srai_epi16, _mm_srai_epi32, _mm_srli_epi16, _mm_srli_epi32,
        _mm_srli_epi64): Replace builtins with vector extensions.
        * xmmintrin.h (_mm_add_ps, _mm_sub_ps, _mm_mul_ps, _mm_div_ps,
        _mm_cmpeq_ps, _mm_cmplt_ps, _mm_cmple_ps, _mm_cmpgt_ps, _mm_cmpge_ps,
        _mm_cmpneq_ps): Likewise.

--
Marc Glisse
