Committed in r187694. And I'll fix the avxintrin header in just a moment.
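As a rough illustration of what a -1 index makes possible, the sketch below widens a two-lane vector to four lanes and leaves the upper lanes unspecified. It is not the avxintrin.h change itself. The types v2sf and v4sf and the function widen_v2sf are hypothetical stand-ins, narrower than __m128d and __m256d so that the sketch builds without AVX enabled, and the element-wise fallback is only an assumption for compilers that lack __builtin_shufflevector.

```c
/* Hypothetical stand-in vector types (not the avxintrin.h types). */
typedef float v2sf __attribute__((vector_size(8)));
typedef float v4sf __attribute__((vector_size(16)));

/* Compilers without __has_builtin take the fallback path below. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* Widen a 2-lane vector to 4 lanes; lanes 2 and 3 carry no defined value. */
static inline v4sf widen_v2sf(v2sf a)
{
#if __has_builtin(__builtin_shufflevector)
    /* Lanes 0 and 1 are taken from a. Each -1 index marks a lane whose
       value the compiler is free to choose (undef in the LLVM IR). */
    return __builtin_shufflevector(a, a, 0, 1, -1, -1);
#else
    /* Fallback: element-wise copy. The upper lanes happen to be zero here,
       but callers must not rely on that. */
    v4sf r = {0};
    r[0] = a[0];
    r[1] = a[1];
    return r;
#endif
}
```

Only lanes 0 and 1 of the result are meaningful. Lanes 2 and 3 may hold anything, which is what lets the compiler turn the cast into a no-op.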
On Sat, Aug 3, 2013 at 1:08 AM, Craig Topper wrote:

> Option (2) directly matches the capabilities of the shufflevector
> instruction in the LLVM IR. I have attached a patch that will allow -1 to
> become undef in the IR.
>
> So
>
>   __builtin_shufflevector( x, y, 0, 4, -1, 5 );
>
> becomes
>
>   shufflevector <4 x float> %x, <4 x float> %y, <4 x i32> <i32 0, i32 4, i32 undef, i32 5>
>
> On Fri, Aug 2, 2013 at 6:15 PM, Katya Romanova wrote:
>
>> Craig Topper <craig.topper@...> writes:
>>
>> > Ok so -1 isn't valid for indices, and I have even more questions about
>> > __builtin_shufflevector the more I look at it. See my message in cfe-dev.
>> >
>> > On Thu, Jul 18, 2013 at 6:12 PM, Chandler Carruth wrote:
>> >
>> > On Thu, Jul 18, 2013 at 6:11 PM, Craig Topper wrote:
>> >
>> > Would __builtin_shufflevector(__a, __a, 0, 1, -1, -1) work?
>> >
>> > Personally, I would prefer a defined way to produce an undef input in
>> > general... but if folks are worried about exposing such an interface,
>> > then sure, we could just allow the shuffle builtin itself to designate
>> > an "undef" input with goofy indices.
>> >
>> > On Thu, Jul 18, 2013 at 5:42 PM, Chandler Carruth wrote:
>> >
>> > On Thu, Jul 18, 2013 at 5:32 PM, Katya Romanova wrote:
>> >
>> > -  __m128d __zero = _mm_setzero_pd();
>> > -  return __builtin_shufflevector(__a, __zero, 0, 1, 2, 2);
>> > +  return (__m256d)__builtin_ia32_pd256_pd((__v2df)__a);
>> >
>> > I think this is the wrong approach.
>> >
>> > Rather than switching these to use an x86-specific builtin, instead it
>> > would be better to provide some generic form to produce an undef input
>> > to a shufflevector. That is a generally useful and completely target
>> > independent concept.
>> >
>> > --
>> > ~Craig
>>
>> I agree with Chandler that it's better to use a shuffle with undef input
>> (which is target independent), even though we generate code for AVX
>> intrinsics. The reason I initially ended up using an x86-specific builtin
>> is that I couldn't find a generic way to create an "undef" input for a
>> shuffle.
>>
>> I tried the following, but I didn't like it, because the compiler gives a
>> warning when compiling avxintrin.h:
>>
>> static __inline __m256d __attribute__((__always_inline__, __nodebug__))
>> _mm256_castpd128_pd256(__m128d in)
>> {
>>   __m128d undef;
>>   return __builtin_shufflevector(in, undef, 0, 1, 2, 2);
>> }
>>
>> I tried this as well and I didn't like it either:
>>
>> static __inline __m256d __attribute__((__always_inline__, __nodebug__))
>> _mm256_castpd128_pd256(__m128d in)
>> {
>>   __v2df __in = (__v2df) in;
>>   __v4df ret;
>>   ret[0] = __in[0];
>>   ret[1] = __in[1];
>>   return (__m256d)ret;
>> }
>>
>> So, I ended up introducing an x86_64 builtin and lowered it later to a
>> shuffle with undef (not a target-independent solution).
>>
>> static __inline __m256d __attribute__((__always_inline__, __nodebug__))
>> _mm256_castpd128_pd256(__m128d __a) {
>>   return (__m256d)__builtin_ia32_pd256_pd((__v2df)__a);
>> }
>>
>> I've read Craig's proposal about using the shuffle builtin with negative
>> indices (-1) to indicate a shuffle with undef. This solution looks good.
>> However, a "-1" shuffle index is presently considered invalid. We need to
>> discuss extending the shuffle syntax/semantics and then implement this
>> extension before I could use a shuffle with negative indices for the AVX
>> typecast builtins. It looks like it will take some time...
>>
>> I was wondering if it's possible to check in my current fix, which uses
>> x86_64 builtins (instead of a shuffle) for the AVX typecast intrinsics,
>> for now. When shuffle learns to understand negative indices, I could
>> easily replace my changes with something like this:
>>
>> __builtin_shufflevector(__a, __a, 0, 1, -1, -1)
>>
>> If this interim solution doesn't sound inappropriate, we should start a
>> discussion about extending the shuffle builtin functionality to
>> understand negative indices.
>>
>> Here are several ideas:
>>
>> We could use a "unary" form of __builtin_shufflevector when negative
>> indices are used.
>> A "binary" form could be used with negative indices as well, but semantic
>> analysis should ensure that the first and the second parameters are
>> actually the same vector. Here is the reason for this limitation:
>>
>> If negative indices specify "undef" and a binary form of
>> __builtin_shufflevector is used with different first and second
>> parameters, e.g. __builtin_shufflevector(a, b, 0, 1, 7, -1),
>> then, in fact, we will be shuffling 3 vectors (a, b and undef). I don't
>> think that it's a good idea to extend __builtin_shufflevector semantics
>> to do that.
>>
>> Which solution is preferred?
>> (1) Support negative indices for the unary form of
>>     __builtin_shufflevector only.
>> (2) Support negative indices for the binary form of
>>     __builtin_shufflevector only and ensure that the first and the second
>>     parameters are the same vector.
>> (3) Support both (1) and (2).
>> (4) Another possible (though very different from those proposed above)
>>     solution that allows using "undef" in shuffles would be adding a
>>     target-independent builtin (e.g., __builtin_undef(vector a)), which
>>     creates an "undef" vector with the same type and the same number of
>>     elements as its vector argument. With this "undef" builtin, I could
>>     implement the AVX typecast builtins like this:
>>
>> static __inline __m256d __attribute__((__always_inline__, __nodebug__))
>> _mm256_castpd128_pd256(__m128d in)
>> {
>>   __m128d undef = __builtin_undef(in);
>>   return __builtin_shufflevector(in, undef, 0, 1, 2, 2);
>> }
>>
>> Thoughts?
>>
>> Thank you!
>> Katya.
>
> --
> ~Craig

--
~Craig
_______________________________________________
cfe-commits mailing list
cfe-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
