On Mon, Feb 13, 2012 at 4:21 PM, Søren Sandmann <[email protected]> wrote:
> [email protected] (Søren Sandmann) writes:
>
>> Matt Turner <[email protected]> writes:
>>
>>> Although not part of the original MMX instruction set, both SSE and
>>> AMD's Extended 3DNow! both provide the pshufw instruction.
>>>
>>> ARM iwMMXt also has an equivalent instruction, as do the Loongson
>>> Multimedia Instructions.
>>>
>>> We can simplify the expand_alpha, expand_alpha_rev, and invert_colors
>>> functions down to this single instruction.
>>>
>>> The SSE intrinsics provide _mm_shuffle_pi16, but there aren't 3DNow!
>>> intrinsics (to my knowledge). This will require a bit of work to
>>> configure.ac, which I haven't done yet.
>>
>>> I'm interested in hearing some opinions on using Extended MMX
>>> instructions.
>>
>> It looks like we already require the "MMX_EXTENSIONS" flag in
>> pixman-cpu.c in order to use the MMX implementation, so I can't see any
>> reason to not just use these instructions without any ifdefs etc
>
> Actually, I remember an issue with these instructions. The problem is
> that to get gcc to accept them on x86, pixman-mmx.c would have to be
> compiled with -msse. Unfortunately, this caused gcc to generate
> SSE-but-not-3DNow! instructions that then caused the original OLPC to
> SIGILL.

I'll check into that. I have someone who is going to test the patch
(as-is) on an XO-1 (3DNow! but no SSE), so we'll see if this is still
the case.

I grepped through the disassembly of pixman-mmx.o and didn't see any
SSE/3DNow! instructions with or without the patch, with the exception
of the 95 pshufw instructions that appear once the patch is applied.

> It may be that we can get around this problem by using -m3dnow instead
> and hope that this won't cause gcc to generate the floating point
> instructions that were also part of 3DNow!, but not available for SSE.
>
> If it *does* generate such instructions, maybe we should just skip MMX
> for regular PCs. It's not like there are a lot of Pentium IIIs around
> anymore.

Even PIIIs have SSE.

I'm 99% sure that the pshufw instruction is identical whether it comes
from SSE or 3DNow!.

If we care, we could add a configure flag that enables the use of MMX
Extension instructions at build time. With the flag off, CPUs with MMX
but without 3DNow!/SSE could still use the MMX fast paths. But like you
say, there aren't many of these CPUs left.

Matt
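
For reference, here is a rough, portable C model of what pshufw does, to
show why the three helpers can each collapse to a single shuffle. It is
only a sketch: the function names, the lane layout (lane 3 = A, lane 2 = R,
lane 1 = G, lane 0 = B, one 16-bit lane per channel) and the immediates
chosen for each helper are assumptions made for illustration, not code
taken from the patch. The macro mirrors the encoding of _MM_SHUFFLE.

```c
#include <assert.h>
#include <stdint.h>

/* Same encoding as _MM_SHUFFLE: four 2-bit source-lane selectors,
 * listed from destination lane 3 down to destination lane 0. */
#define SHUFFLE_IMM(l3, l2, l1, l0) \
    (((l3) << 6) | ((l2) << 4) | ((l1) << 2) | (l0))

/* Model of pshufw on a 64-bit value holding four 16-bit lanes:
 * destination lane i is source lane ((imm >> (2 * i)) & 3). */
static uint64_t
pshufw_model (uint64_t src, unsigned imm)
{
    uint64_t dst = 0;
    int lane;

    for (lane = 0; lane < 4; lane++)
    {
        unsigned sel = (imm >> (2 * lane)) & 3;
        uint64_t word = (src >> (16 * sel)) & 0xffff;

        dst |= word << (16 * lane);
    }

    return dst;
}

/* Hypothetical helpers; assumed layout is lane 3 = A ... lane 0 = B. */

/* Broadcast lane 3 (alpha) into all four lanes. */
static uint64_t
expand_alpha_model (uint64_t pixel)
{
    return pshufw_model (pixel, SHUFFLE_IMM (3, 3, 3, 3));
}

/* Broadcast lane 0 into all four lanes (alpha stored in the low lane). */
static uint64_t
expand_alpha_rev_model (uint64_t pixel)
{
    return pshufw_model (pixel, SHUFFLE_IMM (0, 0, 0, 0));
}

/* Swap lanes 0 and 2 (B and R), leaving G and A in place. */
static uint64_t
invert_colors_model (uint64_t pixel)
{
    return pshufw_model (pixel, SHUFFLE_IMM (3, 0, 1, 2));
}

/* Small self-check on one unpacked pixel: A=0x44 R=0x33 G=0x22 B=0x11. */
void
pshufw_model_self_check (void)
{
    const uint64_t p = 0x0044003300220011ULL;

    assert (expand_alpha_model (p)     == 0x0044004400440044ULL);
    assert (expand_alpha_rev_model (p) == 0x0011001100110011ULL);
    assert (invert_colors_model (p)    == 0x0044001100220033ULL);
}
```

With SSE enabled, the model call would presumably be replaced by
_mm_shuffle_pi16 on an __m64 with the same immediate; the model is just a
way to check the lane selectors without needing -msse.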
