Re: [Qemu-devel] qemu host-utils.c
On 10/24/07, Thiemo Seufer [EMAIL PROTECTED] wrote: - SPARC and Alpha look like they will break on 32bit hosts, they should do multiplications the same way as the other 64bit targets. I can't see how Sparc would break: smul and umul perform 32x32-64 bit multiplications, Sparc64 mulx does 64x64-64 bit multiplications. There is no overflow detection.
Re: [Qemu-devel] qemu host-utils.c
Blue Swirl wrote: On 10/24/07, Thiemo Seufer [EMAIL PROTECTED] wrote: - SPARC and Alpha look like they will break on 32bit hosts, they should do multiplications the same way as the other 64bit targets. I can't see how Sparc would break: smul and umul perform 32x32-64 bit multiplications, Sparc64 mulx does 64x64-64 bit multiplications. There is no overflow detection. Ah, then it is ok. Thanks for the explanation. Thiemo
Re: [Qemu-devel] qemu host-utils.c
J. Mayer wrote: On Wed, 2007-10-24 at 18:37 +0100, Thiemo Seufer wrote: J. Mayer wrote: On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote: I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file. I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never conviced myself that the host_utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression. Like this? Questions/Comments I have: [...] - The x86-64 assembler is untested for this version, could you check it works for you? I did a small test program, comparing the result of the Fabrice implementation and the x86_64 optimized implementation results in signed and unsigned case. I used the code from the CVS from host-utils.c for the optimized case and from target-ppc/op_helper.c for the C code case. For my tests vectors, I first used a walking-one like pattern generation algorithm (including the 0 argument cases) then purely random numbers. I did more than 2^32 tests with no differences between the two implementations. Thanks. What I suggest, to be safe: - do not change the current host-utils API and keep the x86_64 optimised case as it is. This way, we are sure not to break anything. If with API you mean the change in argument order, I reverted to Fabrice's original x86-64 API definition, which happens to be the same as used on ppc. I'm reasonably confident that mips64 is correct. - just merge Fabrice's code to replace the non-x86_64 code. As using this API could lead to more optimisations in the PowerPC implementation code, I can wait for you to commit this part and remove the private helpers as soon as you'll have commited. I left out the ppc changes from my commit. I will then also sanitize the Alpha case, which seems broken, even when running on 64 bits hosts. I don't know much for Sparc, then I won't change it. That comment was meant to point out a bug, not to urge you to do all the work. :-) Thiemo
Re: [Qemu-devel] qemu host-utils.c
I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file. Regards, Fabrice. Thiemo Seufer wrote: CVSROOT:/sources/qemu Module name:qemu Changes by: Thiemo Seufer ths 07/10/23 23:22:54 Modified files: . : host-utils.c Log message: Fix overflow when multiplying two large positive numbers. CVSWeb URLs: http://cvs.savannah.gnu.org/viewcvs/qemu/host-utils.c?cvsroot=qemur1=1.1r2=1.2
Re: [Qemu-devel] qemu host-utils.c
On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote: I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file. I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never conviced myself that the host_utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression. Thiemo Seufer wrote: CVSROOT:/sources/qemu Module name:qemu Changes by: Thiemo Seufer ths 07/10/23 23:22:54 Modified files: . : host-utils.c Log message: Fix overflow when multiplying two large positive numbers. CVSWeb URLs: http://cvs.savannah.gnu.org/viewcvs/qemu/host-utils.c?cvsroot=qemur1=1.1r2=1.2 -- J. Mayer [EMAIL PROTECTED] Never organized
Re: [Qemu-devel] qemu host-utils.c
J. Mayer wrote: On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote: I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file. I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never conviced myself that the host_utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression. Like this? Questions/Comments I have: - Is the BSD-style copyright still ok for this version? - The x86-64 assembler is untested for this version, could you check it works for you? - SPARC and Alpha look like they will break on 32bit hosts, they should do multiplications the same way as the other 64bit targets. Thiemo Index: qemu-cvs/host-utils.c === --- qemu-cvs.orig/host-utils.c +++ qemu-cvs/host-utils.c @@ -1,6 +1,8 @@ /* * Utility compute operations used by translated code. * + * Copyright (c) 2003 Fabrice Bellard + * Copyright (c) 2003-2007 Jocelyn Mayer * Copyright (c) 2007 Aurelien Jarno * * Permission is hereby granted, free of charge, to any person obtaining a copy @@ -24,54 +26,90 @@ #include vl.h -/* Signed 64x64 - 128 multiplication */ +#define DEBUG_MULDEV -void muls64(int64_t *phigh, int64_t *plow, int64_t a, int64_t b) +/* Long integer helpers */ +static void add128 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b) { -#if defined(__x86_64__) -__asm__ (imul %0\n\t - : =d (*phigh), =a (*plow) - : a (a), 0 (b) - ); -#else -int64_t ph; -uint64_t pm1, pm2, pl; +*plow += a; +/* carry test */ +if (*plow a) +(*phigh)++; +*phigh += b; +} + +static void neg128 (uint64_t *plow, uint64_t *phigh) +{ +*plow = ~*plow; +*phigh = ~*phigh; +add128(plow, phigh, 1, 0); +} -pl = (uint64_t)((uint32_t)a) * (uint64_t)((uint32_t)b); -pm1 = (a 32) * (uint32_t)b; -pm2 = (uint32_t)a * (b 32); -ph = (a 32) * (b 32); - -ph += (int64_t)pm1 32; -ph += (int64_t)pm2 32; -pm1 = (uint64_t)((uint32_t)pm1) + (uint64_t)((uint32_t)pm2) + (pl 32); +static void mul64 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b) +{ +uint32_t a0, a1, b0, b1; +uint64_t v; -*phigh = ph + ((int64_t)pm1 32); -*plow = (pm1 32) + (uint32_t)pl; -#endif +a0 = a; +a1 = a 32; + +b0 = b; +b1 = b 32; + +v = (uint64_t)a0 * (uint64_t)b0; +*plow = v; +*phigh = 0; + +v = (uint64_t)a0 * (uint64_t)b1; +add128(plow, phigh, v 32, v 32); + +v = (uint64_t)a1 * (uint64_t)b0; +add128(plow, phigh, v 32, v 32); + +v = (uint64_t)a1 * (uint64_t)b1; +*phigh += v; } + /* Unsigned 64x64 - 128 multiplication */ -void mulu64(uint64_t *phigh, uint64_t *plow, uint64_t a, uint64_t b) +void mulu64 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b) { #if defined(__x86_64__) __asm__ (mul %0\n\t : =d (*phigh), =a (*plow) - : a (a), 0 (b) -); + : a (a), 0 (b)); #else -uint64_t ph, pm1, pm2, pl; +mul64(plow, phigh, a, b); +#endif +#if defined(DEBUG_MULDIV) +printf(mulu64: 0x%016llx * 0x%016llx = 0x%016llx%016llx\n, + a, b, *phigh, *plow); +#endif +} -pl = (uint64_t)((uint32_t)a) * (uint64_t)((uint32_t)b); -pm1 = (a 32) * (uint32_t)b; -pm2 = (uint32_t)a * (b 32); -ph = (a 32) * (b 32); - -ph += pm1 32; -ph += pm2 32; -pm1 = (uint64_t)((uint32_t)pm1) + (uint64_t)((uint32_t)pm2) + (pl 32); +/* Signed 64x64 - 128 multiplication */ +void muls64 (uint64_t *plow, uint64_t *phigh, int64_t a, int64_t b) +{ +#if defined(__x86_64__) +__asm__ (imul %0\n\t + : =d (*phigh), =a (*plow) + : a (a), 0 (b)); +#else +int sa, sb; -*phigh = ph + (pm1 32); -*plow = (pm1 32) + (uint32_t)pl; +sa = (a 0); +if (sa) +a = -a; +sb = (b 0); +if (sb) +b = -b; +mul64(plow, phigh, a, b); +if (sa ^ sb) { +neg128(plow, phigh); +} +#endif +#if defined(DEBUG_MULDIV) +printf(muls64: 0x%016llx * 0x%016llx = 0x%016llx%016llx\n, + a, b, *phigh, *plow); #endif } Index: qemu-cvs/target-ppc/op_helper.c === --- qemu-cvs.orig/target-ppc/op_helper.c +++ qemu-cvs/target-ppc/op_helper.c @@ -217,75 +217,14 @@ /*/ /* Fixed point operations helpers */ #if defined(TARGET_PPC64) -static void add128 (uint64_t *plow, uint64_t *phigh, uint64_t a, uint64_t b) -{ -*plow += a; -/* carry test */ -if (*plow a) -(*phigh)++; -*phigh += b; -} - -static void neg128 (uint64_t *plow, uint64_t *phigh) -{ -
Re: [Qemu-devel] qemu host-utils.c
On Wed, 2007-10-24 at 18:37 +0100, Thiemo Seufer wrote: J. Mayer wrote: On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote: I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file. I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never conviced myself that the host_utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression. Like this? Questions/Comments I have: - Is the BSD-style copyright still ok for this version? This I cannot tell. Fabrice should say how he feels about it. 1 detail: I just copied Fabrice code from i386 target, then I don't own any copyright on it... - The x86-64 assembler is untested for this version, could you check it works for you? I could check this, as I got an amd64 host. As the optimized version may lead to emit only one or a few host instructions, it may be great to have them be static inline to make gcc able to fully optimize the code. One other point: you may prefer not to change the host-utils API to avoid changes i386 and Mips. It may also be safer, to keep the x86_64 optimized code unchanged. I don't care about the argument order, I can adapt and optimize the code in the PowerPC target for this later. - SPARC and Alpha look like they will break on 32bit hosts, they should do multiplications the same way as the other 64bit targets. I don't think Alpha would not work on 32 bits hosts but I fully agree it should use the same helpers. Especially because it's obvious that umulh is bugged ! -- Jocelyn Mayer [EMAIL PROTECTED]
Re: [Qemu-devel] qemu host-utils.c
Jocelyn Mayer wrote: On Wed, 2007-10-24 at 18:37 +0100, Thiemo Seufer wrote: J. Mayer wrote: On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote: I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file. I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never conviced myself that the host_utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression. Like this? Questions/Comments I have: - Is the BSD-style copyright still ok for this version? This I cannot tell. Fabrice should say how he feels about it. 1 detail: I just copied Fabrice code from i386 target, then I don't own any copyright on it... BSD-style license is OK. [...] Fabrice.
Re: [Qemu-devel] qemu host-utils.c
On Wed, 2007-10-24 at 18:37 +0100, Thiemo Seufer wrote: J. Mayer wrote: On Wed, 2007-10-24 at 12:20 +0200, Fabrice Bellard wrote: I strongly suggest to reuse my code which was in target-i386/helper.c revision 1.80 which was far easier to validate. Moreover, integer divisions from target-i386/helper.c should be put in the same file. I fully agree with this. I still use the same code in the PowerPC op_helper.c file because I never conviced myself that the host_utils version was bug-free. I would likely switch to the common version if I could be sure it cannot lead to any regression. Like this? Questions/Comments I have: [...] - The x86-64 assembler is untested for this version, could you check it works for you? I did a small test program, comparing the result of the Fabrice implementation and the x86_64 optimized implementation results in signed and unsigned case. I used the code from the CVS from host-utils.c for the optimized case and from target-ppc/op_helper.c for the C code case. For my tests vectors, I first used a walking-one like pattern generation algorithm (including the 0 argument cases) then purely random numbers. I did more than 2^32 tests with no differences between the two implementations. What I suggest, to be safe: - do not change the current host-utils API and keep the x86_64 optimised case as it is. This way, we are sure not to break anything. - just merge Fabrice's code to replace the non-x86_64 code. As using this API could lead to more optimisations in the PowerPC implementation code, I can wait for you to commit this part and remove the private helpers as soon as you'll have commited. I will then also sanitize the Alpha case, which seems broken, even when running on 64 bits hosts. I don't know much for Sparc, then I won't change it. [...] -- J. Mayer [EMAIL PROTECTED] Never organized