avxintrin.h is now fixed in r187716.
On Sat, Aug 3, 2013 at 10:45 AM, Craig Topper <[email protected]>wrote: > Commited in r187694. And I'll fix the avxintrin header in just a moment. > > > On Sat, Aug 3, 2013 at 1:08 AM, Craig Topper <[email protected]>wrote: > >> Option (2) directly matches the capabilities of the shufflevector >> instruction in the LLVM IR. I have attached a patch that will allow -1 to >> become undef in the IR. >> >> So >> >> __builtin_shufflevector( x, y, 0, 4, -1, 5 ); >> >> becomes >> >> shufflevector <4 x float> %x, <4 x float> %y, <4 x i32> <i32 0, i32 4, >> i32 undef, i32 5> >> >> >> On Fri, Aug 2, 2013 at 6:15 PM, Katya Romanova < >> [email protected]> wrote: >> >>> >>> >>> Craig Topper <craig.topper@...> writes: >>> >>> > >>> > >>> > Ok so -1 isn't valid for indices, and i have even more questions about >>> __builtin_shufflevector the more i look at it. See my message in cfe-dev. >>> > >>> > >>> > On Thu, Jul 18, 2013 at 6:12 PM, Chandler Carruth >>> <[email protected]> wrote: >>> > >>> > On Thu, Jul 18, 2013 at 6:11 PM, Craig Topper >>> <[email protected]> wrote: >>> > >>> > >>> > >>> > >>> > >>> > >>> > Would __builtin_shufflevector(__a, __a, 0, 1, -1, -1) work? >>> > >>> > >>> > >>> > >>> > >>> > Personally, I would prefer a defined way to produce an undef input in >>> general... but if folks are worried about exposing such an interface, >>> then >>> sure, we could just allow the shuffle builtin itself to designate an >>> "undef" >>> input with goofy indices. >>> > >>> > >>> > >>> > >>> > >>> > >>> > On Thu, Jul 18, 2013 at 5:42 PM, Chandler Carruth >>> <[email protected]> wrote: >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > On Thu, Jul 18, 2013 at 5:32 PM, Katya Romanova >>> <[email protected]> wrote:- >>> __m128d __zero = _mm_setzero_pd(); >>> > - return __builtin_shufflevector(__a, __zero, 0, 1, 2, 2); >>> > + return (__m256d)__builtin_ia32_pd256_pd((__v2df)__a); >>> > >>> > >>> > I think this is the wrong approach. >>> > >>> > Rather than switching these to use an x86-specific builtin, instead it >>> would be better to provide some generic form to produce an undef input >>> to a >>> shufflevector. That is a generally useful and completely target >>> independent >>> concept. >>> > >>> > >>> > >>> > >>> > >>> > _______________________________________________ >>> > cfe-commits mailing listcfe-commits <at> >>> cs.uiuc.eduhttp://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >>> > >>> > >>> > -- ~Craig >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > >>> > -- ~Craig >>> > >>> > >>> > >>> > _______________________________________________ >>> > cfe-commits mailing list >>> > cfe-commits@... >>> > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >>> > >>> >>> >>> >>> I agree with Chandler that it's better to use a shuffle with undef input >>> (which is target independent), even though we generate code for AVX >>> intrinsics. The reason I initially ended up using a x86-specific builtin >>> is >>> because there I couldn't find a generic way to create "undef" input for a >>> shuffle. >>> >>> I tried the following, but I didn't like it, because the compiler gives a >>> warning when compiling avxintrin.h >>> >>> static __inline __m256d __attribute__((__always_inline__, __nodebug__)) >>> _mm256_castpd128_pd256(__m128d in) >>> { >>> __m128d undef; >>> return __builtin_shufflevector(in, undef, 0, 1, 2, 2); >>> } >>> >>> I tried this as well and I didn't like it either: >>> >>> static __inline __m256d __attribute__((__always_inline__, __nodebug__)) >>> _mm256_castpd128_pd256(__m128d in) >>> { >>> __v2df __in = (__v2df) in; >>> __v4df ret; >>> ret[0]=in[0]; >>> ret[1]=in[1]; >>> return (__m256d)ret; >>> } >>> >>> So, I ended up introducing a x86_64 builtin and lowered it later to a >>> shuffle with undef (not a target-independent solution). >>> >>> static __inline __m256d __attribute__((__always_inline__, __nodebug__)) >>> _mm256_castpd128_pd256(__m128d __a) { >>> return (__m256d)__builtin_ia32_pd256_pd((__v2df)__a); >>> } >>> >>> >>> I've read Craig's proposal about using shuffle builtin with negative >>> indeces >>> (-1) to indicate shuffle with undef. This solution looks good. However, >>> "-1" >>> shuffle index is presently considered invalid. We need to discuss >>> extending >>> shuffle syntax/semantics and then implement this extension before I could >>> use a shuffle with negative indices for AVX typecast builtins. It looks >>> like >>> it will take some time... >>> >>> I was wondering if it's possible to check in my current fix that is using >>> x86_86 builtins (instead of a shuffle) for AVX typecast intrinsics for >>> now. >>> When shuffle learns to understand negative indices, I could easily >>> replaces >>> my changes with something like that: >>> >>> __builtin_shufflevector(__a, __a, 0, 1, -1, -1) >>> >>> If this interim solution doesn’t sound inappropriate, we should start a >>> discussion about extending shuffle builtin functionality to understand >>> negative indexes. >>> >>> Here are several ideas: >>> >>> We could use "unary" form of __builtin_shufflevector when negative >>> indices >>> are used. >>> A "binary" form could be used with negive indexes as well, but semantic >>> analysis should ensure that the first and the second parameter is >>> actually >>> the same vector. Here is the reason for this limitation: >>> >>> If negative indices specify "undef" and a binary form of >>> __builtin_shufflevector is used with different first and second >>> parameter, >>> e.g. __builtin_shufflevector(a, b, 0, 1, 7, -1) >>> then, in fact, we will be shuffling 3 vectors (a, b and undef). I don’t >>> think that it’s a good idea to extend __builtin_shufflevector semantic >>> to do >>> that. >>> >>> >>> Which solution is preferred? >>> (1) Support negative indices for unary form of __builtin_shufflevector >>> only. >>> (2) Support negative indices for binary form of __builtin_shufflevector >>> only >>> and ensure that the first and the second parameter is the same vector. >>> (3) Support both (1) and (2). >>> (4) Another possible (though very different from proposed above) solution >>> that allows to use "undef" in shuffles would be adding a >>> target-independent >>> builtin (e.g __builtin_undef(vector a)), which creates an “undef” vector >>> with the same type and the same number of elements as its vector >>> argument. >>> With this "undef" builtin, I could implement AVX typecast builtins like >>> that: >>> >>> static __inline __m256d __attribute__((__always_inline__, __nodebug__)) >>> _mm256_castpd128_pd256(__m128d in) >>> { >>> __m128d undef = __builtin_undef(in); >>> return __builtin_shufflevector(in, undef, 0, 1, 2, 2); >>> } >>> >>> Thoughts? >>> >>> >>> Thank you! >>> Katya. >>> >>> >>> >>> _______________________________________________ >>> cfe-commits mailing list >>> [email protected] >>> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits >>> >> >> >> >> -- >> ~Craig >> > > > > -- > ~Craig > -- ~Craig
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
