https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100866

--- Comment #8 from luoxhu at gcc dot gnu.org ---
(In reply to Jens Seifert from comment #7)
> Regarding vec_revb for vector unsigned int. I agree that
> revb:
> .LFB0:
>         .cfi_startproc
>         vspltish %v1,8
>         vspltisw %v0,-16
>         vrlh %v2,%v2,%v1
>         vrlw %v2,%v2,%v0
>         blr
> 
> works. But in this case, I would prefer the vperm approach assuming that the
> loaded constant for the permute vector can be re-used multiple times.
> But please get rid of the xxlnor 32,32,32. That does not make sense after
> loading a constant. Change the constant that need to be loaded.

xxlnor is LE specific requirement(not existed if build with -mbig), we need to
turn the index {0,1,2,3} to {31, 30,29,28} for vperm usage, it is required
otherwise produces incorrect result:

 6|    0x0000000010000630 <+16>:    lvx     v0,0,r9
 7+>   0x0000000010000634 <+20>:    xxlnor  vs32,vs32,vs32
 8|    0x0000000010000638 <+24>:    vperm   v2,v2,v2,v0
 9|    0x000000001000063c <+28>:    blr

(gdb)
0x0000000010000634 in revb ()
2: /x $vs34.uint128 = 0x42345678323456782234567812345678
5: /x $vs32.uint128 = 0xc0d0e0f08090a0b0405060700010203
(gdb) si
0x0000000010000638 in revb ()
2: /x $vs34.uint128 = 0x42345678323456782234567812345678
5: /x $vs32.uint128 = 0xf3f2f1f0f7f6f5f4fbfaf9f8fffefdfc
(gdb) si
0x000000001000063c in revb ()
2: /x $vs34.uint128 = 0x78563442785634327856342278563412
5: /x $vs32.uint128 = 0xf3f2f1f0f7f6f5f4fbfaf9f8fffefdfc



Quoted from the ISA:

vperm VRT,VRA,VRB,VRC

vsrc.qword[0] ← VSR[VRA+32]
vsrc.qword[1] ← VSR[VRB+32]
do i = 0 to 15
index ← VSR[VRC+32].byte[i].bit[3:7]
VSR[VRT+32].byte[i] ← src.byte[index]
end

Let the source vector be the concatenation of the
contents of VSR[VRA+32] followed by the contents of
VSR[VRB+32].
For each integer value i from 0 to 15, do the following.
Let index be the value specified by bits 3:7 of byte
element i of VSR[VRC+32].
The contents of byte element index of src are
placed into byte element i of VSR[VRT+32].

Reply via email to