Sorry wrong list. Need more coffee!
Quentin

On Oct 31, 2012, at 6:27 PM, Quentin Colombet <[email protected]> wrote:

> Hi,
>
> When working with ARM NEON intrinsics, I have encountered a hole in how vext
> is mapped into the assembly.
> Basically, something like this:
>
> vext a, b, imm
>
> is perfectly lowered, whereas something like this:
>
> vext a, a, imm
>
> generates more or less bad code (depending on the type of a, 128 bits or
> 64 bits) and definitely not the expected vext instruction.
>
> The short story (the long story is at the very end of the mail for those
> interested) is that the attached patch fixes that, and you can find the new
> test cases in the patch.
>
> Cheers,
>
> Quentin
>
> ----------
> The long story
>
> ARM doc: the vext instruction extracts elements from the bottom end of the
> second operand vector and the top end of the first, concatenates them and
> places the result in the destination vector.
>
> Now, the problem when writing something like:
>
> vext a, a, imm
>
> clang translates that into a shufflevector instruction with a sequence of
> integers (I simplify a bit) representing the order in which the elements of
> both operands should appear in the result vector.
> Assuming 'a' has 8 elements, they would be numbered from 0 to 7 for the first
> operand and from 8 to 15 for the second operand.
> For this kind of vector, the sequence of integers has the following pattern:
> the (i+1)th element equals the ith element plus 1 (e.g. 2, 3, 4, 5).
> This is the pattern that is matched in ARMISelLowering.
>
> However, when both operands are the same, an instruction combine optimization
> (visitShuffleVectorInst) during the late emit pass breaks this pattern.
> It transforms a, a into a, undef and updates the sequence of integers
> accordingly, i.e. all integers point to the first operand (e.g. 2, 3, 4, 5 =>
> 2, 3, 0, 1).
> This pattern was not recognized as VEXT.
>
> Note that vrev and vext (with an undef argument) are equivalent for some
> patterns.
> Thus, I placed the new pattern matching after the vrev matching instead
> of directly after vext with 2 arguments, so as not to change existing output,
> in particular in the vrev.ll tests.
>
> <ARMVextLowering.patch>
> _______________________________________________
> cfe-commits mailing list
> [email protected]
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
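
The two mask shapes described in the long story can be sketched as a pair of small checks. This is only an illustration and not the code from the attached patch: the function names are hypothetical, the masks are assumed to contain no undef entries, and the real matching in ARMISelLowering handles more cases.

```cpp
#include <vector>

// Hypothetical helpers for illustration only; not taken from the patch.
// Masks are assumed to contain no undef (negative) entries.

// Two distinct operands of N elements each: indices 0..N-1 select from the
// first operand and N..2N-1 from the second. A vext mask is a run of
// consecutive indices. Returns the first index (the immediate), or -1.
int matchTwoOperandVext(const std::vector<int> &mask) {
  const int n = static_cast<int>(mask.size());
  if (n == 0 || mask[0] < 0 || mask[0] > n)
    return -1;
  for (int i = 1; i < n; ++i)
    if (mask[i] != mask[0] + i)
      return -1;
  return mask[0];
}

// Same operand twice, rewritten as (a, undef): every index points into the
// first operand, so the consecutive run wraps around modulo N.
// Returns the first index (the immediate), or -1.
int matchSingleOperandVext(const std::vector<int> &mask) {
  const int n = static_cast<int>(mask.size());
  if (n == 0 || mask[0] < 0 || mask[0] >= n)
    return -1;
  for (int i = 1; i < n; ++i)
    if (mask[i] != (mask[0] + i) % n)
      return -1;
  return mask[0];
}
```

With the masks from the message, 2, 3, 4, 5 matches the two-operand form with immediate 2, while 2, 3, 0, 1 matches only the single-operand form, also with immediate 2.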
