Dorit Nuzman wrote:
Indeed on altivec we implement the 'mask_for_load(addr)' builtin using 'lvsr(neg(addr))', that feeds the 'realign_load' (which is a 'vperm' on altivec). I'm not too familiar with the ARM WMMX ISA, but couldn't you use a similar trick - i.e instead of using the low bits of the address for the shift amount that feeds the realign_load, use shift=(VECSIZE - neg(addr))? I think this should give shift amount VECSIZE for the aligned case (and hopefully the correct shift amounts for the unaligned cases).
On Altivec, which is apparently big endian on all targets, you would think you would want to shift elements left (toward lower addresses, more significant) in order to align them. Instead we shift right (toward higher addresses, less significant) by the negated amount to get the behavior the hook wants:

    0  --> 0   (get more significant vector)
    1  --> 15
    2  --> 14
    ...
    15 --> 1

This works because Altivec can shift either way arbitrarily. But on WMMX, which is little endian only, we only have an instruction to shift towards lower addresses. This is of course the behavior you would expect at first glance; to obtain an aligned vector you do:

    and     r_floor, r, #-8
    wldrd   wr0, [r_floor]
    wldrd   wr1, [r_floor, #8]
    /* The "r" in the mnemonic is for "register" */
    walignr wr2, wr0, wr1, r

There is no align going the other way, because it would be strange, and (to the architects, I guess) unnecessary if you are only ever little endian.

Indeed, in your paper (grin) "Multi-platform Auto-vectorization" you define the functionality of realign load in terms of mis, the misalignment of the address (i.e., address & (VS-1)), as follows:

    The last VS-mis bytes of vector vec1 are concatenated to the first mis bytes of the vector vec2.

This is what the walign instruction does, but it's not quite what we ended up with in GCC. In the case that mis is 0, the GCC hook wants to end up with vec2, not vec1.

So for architectures that can align both ways, the current method is fine, but if the architecture is designed for one endianness only, we are going to have trouble exploiting the alignment feature.

Thanks,
Erich

-- 
Why are ``tolerant'' people so intolerant of intolerant people?
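
For reference, here is a minimal C sketch of the two realign behaviors compared above. It assumes VS = 16 and models vectors as plain byte arrays in memory order. The function names are hypothetical and exist only for illustration: `realign_concat` follows the definition quoted from the paper (the walign-style behavior), while `realign_hook` models a `vperm` fed by an `lvsr(neg(addr))` mask. It is a model of the semantics as described in this message, not GCC's or any target's actual implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VS 16 /* assumed vector size in bytes */

/* Paper-style realign: the last VS-mis bytes of v1 followed by the
   first mis bytes of v2.  With mis == 0 this yields v1. */
static void realign_concat(const uint8_t *v1, const uint8_t *v2,
                           unsigned mis, uint8_t *out)
{
    uint8_t cat[2 * VS];

    assert(mis < VS);
    memcpy(cat, v1, VS);
    memcpy(cat + VS, v2, VS);
    memcpy(out, cat + mis, VS);
}

/* Hook-style realign: model of vperm(v1, v2, lvsr(neg(addr))).
   lvsr with shift sh produces the byte indices (VS - sh) .. (2*VS - 1 - sh)
   into the concatenation v1 || v2.  With a misalignment of 0 the shift
   is 0, so the indices are VS .. 2*VS-1 and the result is v2. */
static void realign_hook(const uint8_t *v1, const uint8_t *v2,
                         uintptr_t addr, uint8_t *out)
{
    uint8_t cat[2 * VS];
    unsigned sh = (unsigned)((0 - addr) & (VS - 1)); /* low bits of neg(addr) */

    memcpy(cat, v1, VS);
    memcpy(cat + VS, v2, VS);
    memcpy(out, cat + (VS - sh), VS);
}

/* Returns 1 if both variants produce the same vector for this
   misalignment, 0 otherwise. */
static int variants_agree(const uint8_t *v1, const uint8_t *v2, unsigned mis)
{
    uint8_t a[VS], b[VS];

    realign_concat(v1, v2, mis, a);
    realign_hook(v1, v2, (uintptr_t)mis, b);
    return memcmp(a, b, VS) == 0;
}
```

Under these assumptions the two variants agree for every misalignment from 1 to 15 and differ only when mis is 0, where the concatenation form returns vec1 and the hook form returns vec2.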