On Wed, Jun 3, 2015 at 6:42 AM, Siarhei Siamashka <siarhei.siamas...@gmail.com> wrote:
>> +                AVV (endian_xor.c[1]),0);
>> +    perm = vec_xor (perm,(vector unsigned char) AVV (
>> +        0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04,
>> +        0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C));
>> +    return vec_perm (pix, pix, perm);
>>  }
>
> For this part, both the original and the patched code resulted in
> identical instruction sequences:
>
> 0000000000000000 <.vmx_splat_alpha>:
>    0:   3d 22 00 00     addis   r9,r2,0
>    4:   39 29 00 00     addi    r9,r9,0
>    8:   7c 00 48 ce     lvx     v0,0,r9
>    c:   10 42 10 2b     vperm   v2,v2,v2,v0
>   10:   4e 80 00 20     blr
>
> This is actually good. I was afraid that the compiler might screw up
> it a bit and do something stupid like adding an extra VXOR instruction
> here (for the 'vec_xor' intrinsic).
>

Actually, I get a different disassembly with the patch applied:

0000000000007b10 <vmx_splat_alpha>:
    7b10:   00 00 4c 3c     addis   r2,r12,0
    7b14:   00 00 42 38     addi    r2,r2,0
    7b18:   00 00 22 3d     addis   r9,r2,0
    7b1c:   0c 03 23 10     vspltisb v1,3
    7b20:   00 00 29 39     addi    r9,r9,0
    7b24:   99 4e 00 7c     lxvd2x  vs32,0,r9
    7b28:   57 02 00 f0     xxswapd vs32,vs32
    7b2c:   d7 04 01 f0     xxlxor  vs32,vs33,vs32
    7b30:   17 05 00 f0     xxlnor  vs32,vs32,vs32
    7b34:   2b 10 42 10     vperm   v2,v2,v2,v0
    7b38:   20 00 80 4e     blr

And without the patch, I get this:

0000000000007930 <vmx_splat_alpha>:
    7930:   00 00 4c 3c     addis   r2,r12,0
    7934:   00 00 42 38     addi    r2,r2,0
    7938:   00 00 22 3d     addis   r9,r2,0
    793c:   00 00 29 39     addi    r9,r9,0
    7940:   98 4e 00 7c     lxvd2x  vs0,0,r9
    7944:   50 02 00 f0     xxswapd vs0,vs0
    7948:   11 05 00 f0     xxlnor  vs32,vs0,vs0
    794c:   2b 10 42 10     vperm   v2,v2,v2,v0
    7950:   20 00 80 4e     blr

So the patched build has two extra instructions: vspltisb and xxlxor.

I used the default configure + make. Maybe I need to pass some special
flag to the compiler?

This is my gcc version:

gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-9)

I'm running RHEL 7.1 ppc64le on a POWER8 machine.

        Oded
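
For reference, here is a scalar C sketch of the mask arithmetic that the
vec_xor in the quoted hunk seems to express. It is only an illustration
under assumptions: perm16 and make_splat_alpha_mask are hypothetical
helpers, not pixman functions; the alpha byte index 3 is taken from the
"vspltisb v1,3" in the patched disassembly; and the little-endian index
complement (which the xxlnor appears to perform) is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for vec_perm (a, a, perm):
 * each output byte is the input byte selected by the low 4 bits
 * of the corresponding mask byte. */
static void perm16(const uint8_t in[16], const uint8_t perm[16],
                   uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = in[perm[i] & 0x0F];
}

/* Build the permute mask as splat(alpha_byte) XOR per-pixel base
 * offsets. The base table is the constant from the quoted patch.
 * alpha_byte is the byte index of alpha inside one 32-bit pixel
 * (assumed to be 3 here). Because the base offsets are multiples
 * of 4 and alpha_byte < 4, the XOR acts like an addition. */
static void make_splat_alpha_mask(uint8_t alpha_byte, uint8_t perm[16])
{
    static const uint8_t base[16] = {
        0x00, 0x00, 0x00, 0x00, 0x04, 0x04, 0x04, 0x04,
        0x08, 0x08, 0x08, 0x08, 0x0C, 0x0C, 0x0C, 0x0C
    };

    assert(alpha_byte < 4);
    for (int i = 0; i < 16; i++)
        perm[i] = (uint8_t)(alpha_byte ^ base[i]);
}
```

Both inputs to the XOR are compile-time constants, so in principle the
whole mask could be folded into a single constant load, as in Siarhei's
listing. The vspltisb + xxlxor pair in my build suggests that this gcc
did not fold it here.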