Re: [Qemu-devel] SSE 'maxps' instruction bug?
On Tuesday 13 March 2007 10:21 pm, Julian Seward wrote:
> > 0.9.0, or that the compiler/host combination used to build the qemu
> > binary Julian is running generated bad code for the float compares.
>
> I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make install')
> on a 64-bit machine.
>
> If it is qemu generating bad code due to variations in gcc behaviour,
> that's another argument in favour of scrapping the gcc 3.X based backend
> and using a self contained, handwritten insn selector and register
> allocator.

Are you referring to https://nowt.dyndns.org/ or something else?

Rob
--
Vista: Windows Millenium Second Edition

___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
Re: [Qemu-devel] SSE 'maxps' instruction bug?
On Friday 16 March 2007 18:07, Rob Landley wrote:
> On Tuesday 13 March 2007 10:21 pm, Julian Seward wrote:
> > > 0.9.0, or that the compiler/host combination used to build the qemu
> > > binary Julian is running generated bad code for the float compares.
> >
> > I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make
> > install') on a 64-bit machine.
> >
> > If it is qemu generating bad code due to variations in gcc behaviour,
> > that's another argument in favour of scrapping the gcc 3.X based
> > backend and using a self contained, handwritten insn selector and
> > register allocator.
>
> Are you referring to https://nowt.dyndns.org/ or something else?

I was referring to an idea, of which the nowt thing is an implementation.
I'm not aware of any other such backends to qemu.

J
Re: [Qemu-devel] SSE 'maxps' instruction bug?
On Friday 16 March 2007 2:10 pm, Julian Seward wrote:
> On Friday 16 March 2007 18:07, Rob Landley wrote:
> > On Tuesday 13 March 2007 10:21 pm, Julian Seward wrote:
> > > > 0.9.0, or that the compiler/host combination used to build the
> > > > qemu binary Julian is running generated bad code for the float
> > > > compares.
> > >
> > > I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make
> > > install') on a 64-bit machine.
> > >
> > > If it is qemu generating bad code due to variations in gcc
> > > behaviour, that's another argument in favour of scrapping the gcc
> > > 3.X based backend and using a self contained, handwritten insn
> > > selector and register allocator.
> >
> > Are you referring to https://nowt.dyndns.org/ or something else?
>
> I was referring to an idea, of which the nowt thing is an implementation.
> I'm not aware of any other such backends to qemu.

That's Paul Brook's QOPS thing that gets discussed here from time to time.
There are vague plans to switch over to that soonish.

Rob
--
Vista: Windows Millenium Second Edition
Re: [Qemu-devel] SSE 'maxps' instruction bug?
> QEMU and Core 2 Duo disagree on the handling of NaNs it seems.
>
> http://courses.ece.uiuc.edu/ece390/books/labmanual/inst-ref-simd.html
> - this implies that MAXPS should leave the NaNs alone, no idea how
> normative that is though (and no IA32 manual at hand)

Having looked at an IA32 manual I'd say the inst-ref-simd.html description
agrees with it, so the Core 2 behaviour is what qemu should do.

J
Re: [Qemu-devel] SSE 'maxps' instruction bug?
On Mar 12, 2007, at 11:27 AM, malc wrote:

> QEMU and Core 2 Duo disagree on the handling of NaNs it seems.
>
> http://courses.ece.uiuc.edu/ece390/books/labmanual/inst-ref-simd.html
> - this implies that MAXPS should leave the NaNs alone, no idea how
> normative that is though (and no IA32 manual at hand)

I compiled and ran the code that Julian supplied on an AMD processor with
SSE, and on qemu-i386 version 0.8.2 built with that system, and both
agreed with the Intel Core 2 results that Julian supplied. That means
that either qemu changed in this area between v 0.8.2 and 0.9.0, or that
the compiler/host combination used to build the qemu binary Julian is
running generated bad code for the float compares.

The MAXPS instruction is defined to operate on NaNs in such a way that it
can be used as a direct replacement for an iterated scalar max operation
coded in C like:

    a = (a > b) ? a : b;

Which is exactly how it is coded in qemu (at least in v0.8.2). This
relies upon the fact that the greater-than comparison returns false
anytime there is an unordered operand (NaN), for either operand -- in
which case the result is the second argument.

-- Tim Olson
Re: [Qemu-devel] SSE 'maxps' instruction bug?
> 0.9.0, or that the compiler/host combination used to build the qemu
> binary Julian is running generated bad code for the float compares.

I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make install')
on a 64-bit machine.

If it is qemu generating bad code due to variations in gcc behaviour,
that's another argument in favour of scrapping the gcc 3.X based backend
and using a self contained, handwritten insn selector and register
allocator.

J
[Qemu-devel] SSE 'maxps' instruction bug?
The program below tests the 'maxps' instruction. When run on
qemu-0.9.0, host amd64, guest x86, guest OS redhat8, it prints:

   f9a511d1 8d37d67f b34825b8 e2f40739

scp the binary to a Core 2 (real) machine and run:

   f9a511d1 22dcb9b9 b34825b8 e2f40739

Second 32-bit word is completely different.

This is 0.9.0 compiled from source using gcc-3.4.6, host openSuSE 10.2
on a Core 2 Duo in 64-bit mode.

Any ideas? I grepped the 0.9.0 sources for maxps but couldn't figure
out where/how it is handled.

J


#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <malloc.h>
#include <string.h>

typedef unsigned char V128[16];
typedef signed int Int;

static void showV128 ( V128* v )
{
   Int i;
   for (i = 0; i < 16; i++) {
      printf("%02x", (Int)(*v)[i]);
      if (i > 0 && (i % 4) == 3) printf(" ");
   }
}

static V128 arg1 = { 0x28,0x9b,0x57,0xf7,0x22,0xdc,0xb9,0xb9,
                     0x0a,0xb3,0x8a,0xcf,0x73,0xbb,0xe4,0x0b };
static V128 arg2 = { 0xf9,0xa5,0x11,0xd1,0x8d,0x37,0xd6,0x7f,
                     0xb3,0x48,0x25,0xb8,0xe2,0xf4,0x07,0x39 };
static V128 res;

int main ( int argc, char** argv )
{
   __asm__ __volatile__(
      "movups (%0),%%xmm6\n\t"
      "movups (%1),%%xmm7\n\t"
      "maxps %%xmm6,%%xmm7\n\t"
      "movups %%xmm7, (%2)\n\t"
      : : "r"(arg1), "r"(arg2), "r"(res) : "xmm6", "xmm7"
   );
   showV128( &res );
   printf("\n");
   return 0;
}

/*
Output on qemu-0.9.0, host amd64, guest x86, guest OS redhat8:

   f9a511d1 8d37d67f b34825b8 e2f40739

Run same binary on a Core 2:

   f9a511d1 22dcb9b9 b34825b8 e2f40739

Second 32-bit word is completely different.
*/
Re: [Qemu-devel] SSE 'maxps' instruction bug?
On Mon, 12 Mar 2007, Julian Seward wrote:

> The program below tests the 'maxps' instruction. When run on
> qemu-0.9.0, host amd64, guest x86, guest OS redhat8, it prints:
>
>    f9a511d1 8d37d67f b34825b8 e2f40739
>
> scp the binary to a Core 2 (real) machine and run:
>
>    f9a511d1 22dcb9b9 b34825b8 e2f40739
>
> Second 32-bit word is completely different.
>
> This is 0.9.0 compiled from source using gcc-3.4.6, host openSuSE 10.2
> on a Core 2 Duo in 64-bit mode.
>
> Any ideas? I grepped the 0.9.0 sources for maxps but couldn't figure
> out where/how it is handled.

[..snip..]

ops_sse.h lines 711 and 670

QEMU and Core 2 Duo disagree on the handling of NaNs it seems.

http://courses.ece.uiuc.edu/ece390/books/labmanual/inst-ref-simd.html
- this implies that MAXPS should leave the NaNs alone, no idea how
normative that is though (and no IA32 manual at hand)

--
vale