Re: [Qemu-devel] What does code_copy_enabled do?
: Any news on the possible cvs-svn migration?
:
: To be perfectly honest, IMO there is little point moving an existing
: project from CVS to SVN.
:
: I disagree. CVS has several fairly fundamental flaws (no global revision
: IDs, unable to move files, and more subtle problems with branches/tags).
: SVN fixes these, and in most cases works as a direct drop-in replacement
: for CVS. FreeBSD is moving from CVS to SVN for these reasons.

Just to second M. Warner Losh: we moved Valgrind from CVS to SVN about
3.5 years ago and it was an excellent thing to do. It is not true to say
there is no advantage over CVS -- the global revision IDs, the ability to
rename files, and a simpler branching/tagging model are all big advantages.
And the fact that it is more-or-less conceptually a drop-in replacement
makes it easy for people to make the migration.

Sure, Valgrind is a tiny project compared to FreeBSD. But we gain those
advantages nonetheless.

J
Re: [Qemu-devel] WE NEED GCC 4 please
> As it is, Fabrice's code generator will most likely be something similar
> to Paul's qops, which means that you have to invent a primitive C in which
> to write the miniops, and you will have to write a backend for _each_ and
> _every_ host CPU you support.

It's not a terribly big deal. Writing backends is a lot easier than writing
front ends, since the back end can just emit some small convenient subset of
target instructions, whereas the front ends have to deal with every stupid,
obscure, weird-ass instruction that ever shows up.

QEMU is not the first project to post-process gcc's output. The Glasgow
Haskell Compiler (http://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler)
did that for many years and it was always an immense amount of hassle
tracking the changes to gcc's code generation. Having a
completely-independent-of-everything, standalone code generator is
definitely a lot easier in the end.

> Given the unwillingness of Fabrice to rely on some external project,
> though, I gave up even before I had something even rudimentary.

Perhaps Fabrice could commit this code generator on a branch, even if it is
not perfect yet. That would at least provide something real to assess; so
far all we have is rumour and speculation.

J
Re: [Qemu-devel] qemu hw/ppc_oldworld.c target-ppc/cpu.h target-...
> Well, I admit I've invented the term ppc32, but there are dozens of 32-bit
> PowerPC chips. I'd be amazed if they do 64-bit computations or have 64-bit
> GPRs.

Indeed not. Valgrind implements the 32-bit PPC user-space instruction set
quite adequately using 32-bit computations throughout. No need for 64-bit
computations.

J
[Qemu-devel] emu errors for creqv,crnand,crnor,crorc ?
Hi Jocelyn

I ran valgrind's ppc32 insn set tests and got the impression that the
above insns are not correctly implemented. It seems like 7 bits of CR
are set to 1 and one is set to 0, when it should be the other way around.

Below is a simple test case. On QEMU it prints

  result is 000fc000

and on a real 7447

  result is 00004000

Similar behaviour for creqv, crnand, crnor. But cror, crand, crxor work OK.
So maybe it is related to the inverted-result-sense? But the strange thing
is, ~0xFC != 0x04.

J

#include <stdio.h>

typedef unsigned int UInt;  /* Valgrind's UInt; defined here so the test builds standalone */

void do_crorc_17_14_15 ( void )
{
   UInt res = 0x00000000;
   __asm__ __volatile__(
      "li 9,0\n"
      "\tmtcr 9\n"
      "\tmtxer 9\n"
      "\tcrorc 17,14,15\n"
      "\tmfcr %0"
      : /*out*/"=b"(res) : /*in*/ : /*trash*/"9" );
   printf("result is %08x\n", res );
}

int main ( void )
{
   do_crorc_17_14_15();
   return 0;
}
[Qemu-devel] Re: emu errors for creqv,crnand,crnor,crorc ?
> > way around. Below is a simple test case. On QEMU it prints
> > result is 000fc000 and on a real 7447 result is 00004000
>
> What is strange is that 0xFC + 0x04... I will have to check all the CR
> ops, I guess...

Another strange thing is that 000fc000 does not have 'fc' byte-aligned
inside CR, if you see what I mean. If it was fc00 or 00fc, some
byte-inversion mistake would seem likely.

This isn't a 74xx specific result. I'm sure any ppc should produce
00004000. The test is very simple: make CR=0 then do crorc 17,14,15. So
only 1 bit in CR will then be set - all others are zero.

J
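
To make the expected value concrete, here is a minimal C model of crorc
acting on a 32-bit CR image. It assumes the usual PowerPC numbering in which
CR bit 0 is the most significant bit. The helper names are made up for
illustration; this is not QEMU or Valgrind code, just a sketch of the
arithmetic described above.

```c
#include <stdint.h>

/* Mask for CR bit n, where bit 0 is the most significant bit
   (PowerPC bit numbering). */
uint32_t cr_mask(unsigned n)
{
    return UINT32_C(1) << (31u - n);
}

/* Hypothetical model of "crorc bt,ba,bb": CR[bt] = CR[ba] | ~CR[bb]. */
uint32_t model_crorc(uint32_t cr, unsigned bt, unsigned ba, unsigned bb)
{
    unsigned a = (cr & cr_mask(ba)) != 0;
    unsigned b = (cr & cr_mask(bb)) != 0;
    unsigned r = a | !b;

    cr &= ~cr_mask(bt);        /* clear the target bit ... */
    if (r)
        cr |= cr_mask(bt);     /* ... and set it if the result is 1 */
    return cr;
}
```

With CR = 0, model_crorc(0, 17, 14, 15) sets only bit 17, which is
0x00004000, the value reported for the real 7447.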
Re: [Qemu-devel] Kqemu on x86_64 host with x86_64 guest
On Saturday 13 October 2007 16:24, Werner Dittmann wrote:
> Bruno Cornec wrote:
> > On Sat, Oct 13, 2007 at 01:53:37PM +0200, Bruno Cornec wrote:
> > > However, mandriva 2008.0 x86_64 doesn't exhibit this error on the
> > > same host.
> >
> > I stand corrected. It also crashed but later during the install
> > process, where the others were at the start. Back to -no-kqemu.
> >
> > Bruno.
>
> Even when using -no-kqemu it somehow fails/hangs during setup of Grub
> when I try to install a openSuse 10.2 or 10.3 . These problems are known
> for quite some time - but no solution yet.

Yes. I also observed that with openSUSE 10.{1,2,3}. After some
experimentation I successfully installed 10.1 by asking the installer to
use LILO instead of Grub. However, even then, some user space code does not
work properly - running the YaST online update inside the
successfully-installed 10.1 fails.

I wondered if there is some problem in the x86_64 instruction set
emulation. I ran some tests from Valgrind, and it appears that some FP-int
conversion instructions do not take care of the rounding mode. I did not
detect any other errors. See
http://lists.gnu.org/archive/html/qemu-devel/2007-10/msg00233.html

I tried to build x86_64-softmmu using softfloat.c rather than
softfloat-native.c since it looks like softfloat.c emulates these corner
cases (rounding mode, etc) more completely. So far I got a lot of
compilation errors and did not make much progress. I get the impression
x86_64-softmmu and i386-softmmu are intended only to be built with
softfloat-native.c.

It might be worth installing SuSE 10.1 and finding some small program which
fails to work properly. Then we might have a hope of determining what the
problem is.

J
[Qemu-devel] FP emulation bugs for x86_64-softmmu
Some x86_64 SSE2 instructions that convert floats to ints appear to ignore
the rounding mode (in mxcsr), and so produce wrong results if mxcsr is set
to anything other than default rounding. For example cvtsd2si et al.

I'm looking at softfloat-native.c and softfloat.c and wondering how to fix
it. A couple of questions:

* is softfloat-native.c intended to handle such corner cases as accurately
  as softfloat.c ?

* is it possible to build x86_64-softmmu to use softfloat.c rather than
  softfloat-native.c?

I hacked ./configure to use CONFIG_SOFTFLOAT for x86_64 (added x86_64 as a
softfloat cpu in test at line 1095), but the build then dies like this:

target-i386/exec.h:296: warning: conflicting types for built-in function 'sinl'
target-i386/exec.h:297: warning: conflicting types for built-in function 'cosl'
target-i386/exec.h:298: warning: conflicting types for built-in function 'sqrtl'
target-i386/exec.h:299: warning: conflicting types for built-in function 'powl'
target-i386/exec.h:300: warning: conflicting types for built-in function 'logl'
target-i386/exec.h:301: warning: conflicting types for built-in function 'tanl'
target-i386/exec.h:302: warning: conflicting types for built-in function 'atan2l'
target-i386/exec.h:303: warning: conflicting types for built-in function 'floorl'
target-i386/exec.h:304: warning: conflicting types for built-in function 'ceill'
target-i386/exec.h: In function `helper_fldt':
target-i386/exec.h:440: error: incompatible types in return
target-i386/exec.h: In function `helper_fstt':
target-i386/exec.h:447: error: incompatible types in assignment
(many more errors like this follow)

Is this some minor compile bug, or is it the case that x86_64-softmmu (and
i386-softmmu) is not intended to use softfloat.c?

J
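
For reference, the behaviour being asked for can be sketched in plain C.
The model below is an illustration only, not QEMU or softfloat code: the
function name is hypothetical, and it handles just the 32-bit-result form
of cvtsd2si. It reads the rounding-control field from MXCSR bits 13-14 and
returns the "integer indefinite" value 0x80000000 for NaN and out-of-range
inputs.

```c
#include <stdint.h>

/* MXCSR rounding-control (RC) field values, held in bits 13-14. */
enum { RC_NEAREST = 0, RC_DOWN = 1, RC_UP = 2, RC_ZERO = 3 };

/* Hypothetical model of cvtsd2si (32-bit result) that honours MXCSR.RC. */
int32_t model_cvtsd2si(double x, uint32_t mxcsr)
{
    unsigned rc = (mxcsr >> 13) & 3u;
    int64_t t, r;
    double frac;

    /* NaN fails both comparisons; so do values far outside int32 range. */
    if (!(x > -4294967296.0 && x < 4294967296.0))
        return INT32_MIN;      /* integer indefinite */

    t = (int64_t)x;            /* C conversion truncates toward zero */
    frac = x - (double)t;      /* exact; in (-1, 1), same sign as x */
    r = t;

    switch (rc) {
    case RC_DOWN:
        if (frac < 0.0) r = t - 1;
        break;
    case RC_UP:
        if (frac > 0.0) r = t + 1;
        break;
    case RC_ZERO:
        break;
    default:                   /* RC_NEAREST: ties go to the even integer */
        if (frac > 0.5 || (frac == 0.5 && (t & 1)))
            r = t + 1;
        else if (frac < -0.5 || (frac == -0.5 && (t & 1)))
            r = t - 1;
        break;
    }

    if (r < INT32_MIN || r > INT32_MAX)
        return INT32_MIN;      /* rounded value does not fit */
    return (int32_t)r;
}
```

An emulator that always converts with one fixed rounding would agree with
this model only for one value of the RC field, which matches the symptom
described above.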
Re: [Qemu-devel] [PATCH, RFC] More than 2G of memory on 64-bit hosts
> > Unfortunately C99 relaxed this requirement, and allowed abominations
> > like the win64 ABI. This means you have a choice: Write standard
> > conforming code (long) that works on all known systems except win64, or
> > use features that don't exist on many systems. IIRC C99 types like
> > intptr_t are not supported on several fairly common unix systems.
>
> In that case I'll vote for unsigned long. I'd pass the issue to those
> doing a win64 port, if ever that happens.

In Valgrind-world we use an alternative approach, which is to typedef a set
of new integral types and use those exclusively, and not use the native
'int', 'long' etc. The new types have a single fixed meaning regardless of
the host or guest and it is up to the configure script to set up suitable
typedefs. At startup Valgrind checks the size and signedness of these types
is as expected, so any configuration errors are caught. This has proved
very helpful in porting to a number of platforms.

J
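
A rough sketch of that scheme, for illustration only. The type names below
are modelled on the UInt used in Valgrind test cases, but the exact set and
the check function are assumptions, not Valgrind's actual headers. In a real
build the configure script would choose the underlying native types per
platform.

```c
#include <limits.h>

/* Fixed-meaning integer types. The native types chosen here are an
   assumption that holds on common 32- and 64-bit hosts. */
typedef unsigned char      UChar;   /*  8-bit unsigned */
typedef unsigned short     UShort;  /* 16-bit unsigned */
typedef unsigned int       UInt;    /* 32-bit unsigned */
typedef signed int         Int;     /* 32-bit signed   */
typedef unsigned long long ULong;   /* 64-bit unsigned */
typedef signed long long   Long;    /* 64-bit signed   */

/* Startup check (hypothetical): returns 1 if every type has the expected
   width and signedness, 0 if the configuration is wrong. */
int check_basic_types(void)
{
    int ok = 1;

    ok &= (sizeof(UChar)  * CHAR_BIT ==  8);
    ok &= (sizeof(UShort) * CHAR_BIT == 16);
    ok &= (sizeof(UInt)   * CHAR_BIT == 32);
    ok &= (sizeof(Int)    * CHAR_BIT == 32);
    ok &= (sizeof(ULong)  * CHAR_BIT == 64);
    ok &= (sizeof(Long)   * CHAR_BIT == 64);

    /* Signedness: -1 converts to a large value for unsigned types
       and stays negative for signed ones. */
    ok &= ((UChar)-1  > 0);
    ok &= ((UShort)-1 > 0);
    ok &= ((UInt)-1   > 0);
    ok &= ((ULong)-1  > 0);
    ok &= ((Int)-1    < 0);
    ok &= ((Long)-1   < 0);

    return ok;
}
```

Calling check_basic_types() once at startup and aborting on 0 turns a wrong
typedef into an immediate, obvious failure rather than a subtle bug later.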
Re: [Qemu-devel] Patch: ltr for x86_64 should check the upper descriptor type
Does this fix some specific bug you encountered?

J

On Monday 26 March 2007 14:53, Bernhard Kauer wrote:
> The Intel manual states for LTR and 64-Bit Exceptions:
>
>   #GP(selector) If the descriptor type of the upper 8-byte of the 16-byte
>   descriptor is non-zero.
>
> Qemu currently does not check this. The attached patch fixes the bug.
>
> Bernhard Kauer
Re: [Qemu-devel] 0.9.0 and svn don't build with -march=pentium2 etc.; was: Latest SVN fails to build on Fedora Core 6 (same with 0.9.0)
> As far as X86 is concerned i386/i486/i586 are very different from later
> generation processors. I am wondering whether another host and target
> architecture could be created called i686 that makes use of something like
> MMX or other registers in Intel Pentium II/III/4 and AMD Athlon to negate
> the lack of general purpose registers.

I don't see how. MMX/SSE is suitable for SIMD processing of media data and
to some extent for floating point, but is largely unusable for ad-hoc
integer computation, especially anything that involves address
calculations.

> The fact that QEMU works and can be optimised on x86_64 is the only saving
> grace for the architecture, that is still suffering from a lack of
> registers compared to any other architecture.

The lack of registers isn't ideal, but it's not a big deal, and in the grand
scheme of things x86_64 has a lot going for it. The most important of which
are that (from the software side) all the hard-won knowledge of how to
compile good code for x86 carries across more or less directly to x86_64,
and (from the hardware side) hardware people already know how to make fast,
cheap x86s, so it's easy to move to making fast, cheap x86_64s.

The problems of the gcc backend to qemu have already been discussed
extensively on this list. Stealing 3+ registers from gcc on x86 really is
asking for trouble, and I believe it is generally understood that the best
long term solution is to move to a self-contained back end that does not use
gcc for dynamic code generation.

J

___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel
Re: [Qemu-devel] RFC: This project needs a stable branch
On Thursday 22 March 2007 23:27, Paul Brook wrote:
> > Do you mean you're asking me to break up Paul Brook's QOPS tree at
> > https://nowt.dyndns.org and submit it to mainline? I can do this thing,
> > if you really think it would help.
>
> If you implement all the missing bits in the process it'll help ;-)

What bits would they be then?

FWIW, I snarfed the patch last Sunday and tested it on amd64 host / x86
guest, and successfully booted a couple of linux distros. So it's not
obviously broken, at least for my mundane host/guest choice.

It also seemed marginally slower on a big compile in the guest - 395.4 host
cpu seconds for mainline vs 422.9 with qops. Is this expected?

J
Re: [Qemu-devel] RFC: This project needs a stable branch
On Thursday 15 March 2007 14:53, Paul Brook wrote:
> > Subsequent releases of the branch would contain no functionality
> > enhancements, but just bug fixes, with the eventual aim of achieving
> > 'it just works' status for any x86/x86_64 guest I try to install/run.
> > I know that's a tall order, and that 0.9.0 may not be able to supply
> > that for all guests. But it is an important goal to strive for.
>
> While I agree stability is a desirable goal, and obviously users want a
> stable product, I'm not sure qemu is mature enough to make a stable
> branch worthwhile. Especially considering the very limited technical
> resources we have available.

Limited effort is always a problem, granted.

So here's a broader question, which I'm surprised nobody has asked before
(afaik). Think forward to a hypothetical QEMU 1.0 release. What criteria are
required for such a release?

J
Re: [Qemu-devel] [PATCH] softfloat missing functions
> Note that float64_to_uint64 functions are not correct, as they won't
> return results between INT64_MAX and UINT64_MAX. Hope someone may know
> the proper solution for this.

How about this?

J

uint64_t float64_to_uint64 (float64 a STATUS_PARAM)
{
   uint64_t res;
   int64_t v;
   if (isinf(a) || isnan(a)) {
      return special value ( maybe 1<<63 ?)
   } else if (a < 0.0 || a > (float64)UINT64_MAX) {
      return out-of-range value, whatever that is
   } else {
      a += (float64) INT64_MIN;  // move a downwards
      v = llrint(a);             // convert
      v -= INT64_MIN;            // move v back up
      return v;
   }
}
Re: [Qemu-devel] [PATCH] softfloat missing functions
Thinking about this more, you ask is this correct, but that is only
meaningful if you say what the specification is. Correct relative to what?

> Yes, it seems to be the correct way, but thinking more about the problem,
> it appeared to me that the implementation could be even easier than
> yours. It seems to me that this may be sufficient:
>
> uint64_t float64_to_uint64 (float64 a STATUS_PARAM)
> {
>    int64_t v;
>    v = llrint(a + (float64)INT64_MIN);
>    return v - INT64_MIN;
> }

If a is NaN then so is the argument to llrint. 'man llrint' says:

   If x is infinite or NaN, or if the rounded value is outside the
   range of the return type, the numeric result is unspecified.

So then float64_to_uint64 produces an unspecified result. It seems to me
much safer to test and handle NaN, Inf and out-of-range values specially.
However, even that does not help unless you say what the specification is.

J
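
As an illustration of the "test and handle specially" approach, here is a
minimal sketch that assumes float64 is a native double (as in
softfloat-native.c). It deliberately does not use the INT64_MIN shift:
adding (double)INT64_MIN to a small value such as 3.0 rounds to exactly
-2^63, so the shift would lose small inputs. Instead it range-checks and
then relies on C's double-to-unsigned conversion, which truncates toward
zero rather than honouring a rounding mode. The function name and status
codes are hypothetical, since the thread leaves the specification open.

```c
#include <math.h>     /* isnan, isinf */
#include <stdint.h>

/* Hypothetical status codes; the real ones depend on the specification. */
enum { CONV_OK = 0, CONV_INVALID = 1, CONV_OUT_OF_RANGE = 2 };

/* Convert a to uint64_t, reporting NaN/Inf and out-of-range inputs to the
   caller instead of returning an unspecified value. */
int double_to_uint64_checked(double a, uint64_t *out)
{
    const double two_pow_64 = 18446744073709551616.0;  /* exactly 2^64 */

    if (isnan(a) || isinf(a))
        return CONV_INVALID;
    if (a <= -1.0 || a >= two_pow_64)
        return CONV_OUT_OF_RANGE;

    /* The truncated value is now in [0, 2^64), so the conversion is
       well defined in C. */
    *out = (uint64_t)a;
    return CONV_OK;
}
```

What the caller stores for the two failure codes is exactly the open
question above; the sketch only makes sure those cases are detected.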
[Qemu-devel] [PATCH] Fix guest x86/amd64 helper_fprem/helper_fprem1
The helpers for x86/amd64 fprem and fprem1 in target-i386/helper.c are
significantly borked and, for example, cause konqueror in RedHat8 (x86
guest) to go into an infinite loop when displaying http://news.bbc.co.uk.

helper_fprem has the following borkage:

- various Inf/Nan/zero inputs not handled correctly
- incorrect rounding when converting negative 'dblq' to 'q'
- incorrect order of assignment to C bits (0,3,1 not 0,1,3)

helper_fprem1 has those problems and is also incorrect about the points at
which its rounding needs to differ from that of helper_fprem.

Patch below fixes all these. It brings the fprem and fprem1 behaviour very
much closer to the hardware -- not identical, but close. Some +0.0 results
should really be -0.0 and there may still be other differences. Anyway
konqueror no longer loops with the patch applied.

J

--- ../Orig/qemu-0.9.0/target-i386/helper.c 2007-02-05 23:01:54.0 +
+++ target-i386/helper.c 2007-03-17 17:21:02.0 +
@@ -3097,30 +3097,51 @@
     CPU86_LDouble dblq, fpsrcop, fptemp;
     CPU86_LDoubleU fpsrcop1, fptemp1;
     int expdif;
-    int q;
+    signed long long int q;
+
+    if (isinf(ST0) || isnan(ST0) || isnan(ST1) || (ST1 == 0.0)) {
+        ST0 = 0.0 / 0.0; /* NaN */
+        env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+        return;
+    }
 
     fpsrcop = ST0;
     fptemp = ST1;
     fpsrcop1.d = fpsrcop;
     fptemp1.d = fptemp;
     expdif = EXPD(fpsrcop1) - EXPD(fptemp1);
+
+    if (expdif < 0) {
+        /* optimisation? taken from the AMD docs */
+        env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+        /* ST0 is unchanged */
+        return;
+    }
+
     if (expdif < 53) {
         dblq = fpsrcop / fptemp;
-        dblq = (dblq < 0.0)? ceil(dblq): floor(dblq);
+        /* round dblq towards nearest integer */
+        dblq = rint(dblq);
         ST0 = fpsrcop - fptemp*dblq;
-        q = (int)dblq; /* cutting off top bits is assumed here */
+
+        /* convert dblq to q by truncating towards zero */
+        if (dblq < 0.0)
+            q = (signed long long int)(-dblq);
+        else
+            q = (signed long long int)dblq;
+
         env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
-        /* (C0,C1,C3) <-- (q2,q1,q0) */
-        env->fpus |= (q&0x4) << 6; /* (C0) <-- q2 */
-        env->fpus |= (q&0x2) << 8; /* (C1) <-- q1 */
-        env->fpus |= (q&0x1) << 14; /* (C3) <-- q0 */
+        /* (C0,C3,C1) <-- (q2,q1,q0) */
+        env->fpus |= (q&0x4) << (8-2); /* (C0) <-- q2 */
+        env->fpus |= (q&0x2) << (14-1); /* (C3) <-- q1 */
+        env->fpus |= (q&0x1) << (9-0); /* (C1) <-- q0 */
     } else {
         env->fpus |= 0x400; /* C2 <-- 1 */
         fptemp = pow(2.0, expdif-50);
         fpsrcop = (ST0 / ST1) / fptemp;
-        /* fpsrcop = integer obtained by rounding to the nearest */
-        fpsrcop = (fpsrcop-floor(fpsrcop) < ceil(fpsrcop)-fpsrcop)?
-            floor(fpsrcop): ceil(fpsrcop);
+        /* fpsrcop = integer obtained by chopping */
+        fpsrcop = (fpsrcop < 0.0)?
+            -(floor(fabs(fpsrcop))): floor(fpsrcop);
         ST0 -= (ST1 * fpsrcop * fptemp);
     }
 }
@@ -3130,26 +3151,48 @@
     CPU86_LDouble dblq, fpsrcop, fptemp;
     CPU86_LDoubleU fpsrcop1, fptemp1;
     int expdif;
-    int q;
-
-    fpsrcop = ST0;
-    fptemp = ST1;
+    signed long long int q;
+
+    if (isinf(ST0) || isnan(ST0) || isnan(ST1) || (ST1 == 0.0)) {
+        ST0 = 0.0 / 0.0; /* NaN */
+        env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+        return;
+    }
+
+    fpsrcop = (CPU86_LDouble)ST0;
+    fptemp = (CPU86_LDouble)ST1;
     fpsrcop1.d = fpsrcop;
     fptemp1.d = fptemp;
     expdif = EXPD(fpsrcop1) - EXPD(fptemp1);
+
+    if (expdif < 0) {
+        /* optimisation? taken from the AMD docs */
+        env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+        /* ST0 is unchanged */
+        return;
+    }
+
     if ( expdif < 53 ) {
-        dblq = fpsrcop / fptemp;
+        dblq = fpsrcop/*ST0*/ / fptemp/*ST1*/;
+        /* round dblq towards zero */
         dblq = (dblq < 0.0)? ceil(dblq): floor(dblq);
-        ST0 = fpsrcop - fptemp*dblq;
-        q = (int)dblq; /* cutting off top bits is assumed here */
+        ST0 = fpsrcop/*ST0*/ - fptemp*dblq;
+
+        /* convert dblq to q by truncating towards zero */
+        if (dblq < 0.0)
+            q = (signed long long int)(-dblq);
+        else
+            q = (signed long long int)dblq;
+
         env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
-        /* (C0,C1,C3) <-- (q2,q1,q0) */
-        env->fpus |= (q&0x4) << 6; /* (C0) <-- q2 */
-        env->fpus |= (q&0x2) << 8; /* (C1) <-- q1 */
-        env->fpus |= (q&0x1) << 14; /* (C3) <-- q0 */
+        /* (C0,C3,C1) <-- (q2,q1,q0) */
+        env->fpus |= (q&0x4) << (8-2); /* (C0) <-- q2 */
+        env->fpus |= (q&0x2) << (14-1); /* (C3) <-- q1 */
+        env->fpus |= (q&0x1) << (9-0); /* (C1) <-- q0 */
     } else {
+        int N = 32 + (expdif % 32); /* as per AMD docs */
         env->fpus |= 0x400; /* C2 <-- 1 */
-        fptemp = pow(2.0,
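
The C-bit ordering fix in the patch can be checked in isolation. The
following is a small standalone sketch, not part of the patch: the helper
name is hypothetical, and it assumes the standard x87 status word layout
(C0 = bit 8, C1 = bit 9, C2 = bit 10, C3 = bit 14).

```c
#include <stdint.h>

/* x87 status-word condition-code bits. */
enum {
    FPUS_C0 = 1 << 8,
    FPUS_C1 = 1 << 9,
    FPUS_C2 = 1 << 10,
    FPUS_C3 = 1 << 14
};

/* Hypothetical helper: store the low three quotient bits the way the
   patched code does, (C0,C3,C1) <-- (q2,q1,q0), with C2 cleared to
   signal that the reduction is complete. */
uint16_t fpus_set_quotient_bits(uint16_t fpus, long long q)
{
    unsigned f = fpus & ~0x4700u;   /* clear C3, C2, C1, C0 */

    if (q & 0x4) f |= FPUS_C0;      /* C0 <-- q2 */
    if (q & 0x2) f |= FPUS_C3;      /* C3 <-- q1 */
    if (q & 0x1) f |= FPUS_C1;      /* C1 <-- q0 */
    return (uint16_t)f;
}
```

For q = 2 this sets only C3 (0x4000). The pre-patch shifts, (q&0x2) << 8
and (q&0x1) << 14, put q1 in C1 and q0 in C3 instead, which is the 0,1,3
versus 0,3,1 mix-up mentioned in the description.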
[Qemu-devel] Redundant repz prefixes in generated amd64 code
I'm seeing redundant repz (0xF3) prefixes in generated code, typically just
before jumps:

code_gen_buffer+415: repz mov $0xe07f,%eax
code_gen_buffer+421: mov %eax,0x20(%rbp)
code_gen_buffer+424: lea -25168302(%rip),%ebx # 0xaf0420 tbs+96
code_gen_buffer+430: retq
code_gen_buffer+431: mov -25168245(%rip),%eax # 0xaf0460 tbs+160
code_gen_buffer+437: jmpq *%rax
code_gen_buffer+439: repz mov $0xe092,%eax
code_gen_buffer+445: mov %eax,0x20(%rbp)
code_gen_buffer+448: lea -25168325(%rip),%ebx # 0xaf0421 tbs+97
code_gen_buffer+454: retq

I assume these are something to do with translation chaining/unchaining but
have been unable to figure out where they come from. I know they get
executed and so are not data - valgrind barfs on them.

This is on a 64-bit host (Core 2) with qemu-0.9.0 compiled from source by
gcc-3.4.6, running an x86 (32-bit) guest.

At a guess I'd say the mov $imm,%eax is (created by? to do with?)
gen_jmp_im in target-i386/translate.c, but I don't see how the F3 got in on
the act. Grepping the source for 0xF3 turns up nothing plausible.

Any ideas where it comes from and how to get rid of it?

J
Re: [Qemu-devel] Redundant repz prefixes in generated amd64 code
On Friday 16 March 2007 14:28, Paul Brook wrote:
> On Friday 16 March 2007 14:15, Julian Seward wrote:
> > I'm seeing redundant repz (0xF3) prefixes in generated code, typically
> > just before jumps:
> >
> > code_gen_buffer+415: repz mov $0xe07f,%eax
> > code_gen_buffer+421: mov %eax,0x20(%rbp)
> > code_gen_buffer+424: lea -25168302(%rip),%ebx # 0xaf0420 tbs+96
> > code_gen_buffer+430: retq
> > code_gen_buffer+431: mov -25168245(%rip),%eax # 0xaf0460 tbs+160
> > code_gen_buffer+437: jmpq *%rax
> > code_gen_buffer+439: repz mov $0xe092,%eax
> > code_gen_buffer+445: mov %eax,0x20(%rbp)
> > code_gen_buffer+448: lea -25168325(%rip),%ebx # 0xaf0421 tbs+97
> > code_gen_buffer+454: retq
> >
> > I assume these are something to do with translation chaining/unchaining
> > but have been unable to figure out where they come from.
>
> 8b50 op_goto_tb1:
>   8b50: 8b 05 00 00 00 00   mov 0(%rip),%eax
>         8b52: R_X86_64_PC32 __op_param1+0x3c
>   8b56: ff e0               jmpq *%rax
>   8b58: f3 c3               repz retq
>
> qemu only strips the final ret off. The prefixed ret is to avoid prefetch
> stalls on amd cpus.

So the implication of this is that the generated code just happens to work
only because the dangling F3 never ends up in front of some other
instruction which it would change the meaning of?

J
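
To illustrate what "only strips the final ret" means for the copied bytes,
here is a small hypothetical helper. It is not dyngen code, only a sketch of
the idea: it drops a trailing "ret" (c3) and also the "repz ret" form
(f3 c3) from a compiled micro-op. It is byte-pattern based and knows
nothing about instruction boundaries, so an f3 that is really the last
byte of an earlier instruction would be misread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the bytes of one compiled micro-op, return
   how many bytes to copy so that the trailing return is left out.
   Handles both a plain "ret" (c3) and the prefixed "repz ret" (f3 c3). */
size_t op_body_length(const uint8_t *code, size_t len)
{
    if (len >= 2 && code[len - 2] == 0xF3 && code[len - 1] == 0xC3)
        return len - 2;   /* drop "repz ret" as a unit */
    if (len >= 1 && code[len - 1] == 0xC3)
        return len - 1;   /* drop plain "ret" */
    return len;           /* no trailing ret found; copy everything */
}
```

Stripping only the c3 from the op_goto_tb1 bytes quoted above would leave
the f3 as the last copied byte, so it lands in front of whatever is emitted
next, which is the pattern seen in the disassembly.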
Re: [Qemu-devel] SSE 'maxps' instruction bug?
On Friday 16 March 2007 18:07, Rob Landley wrote:
> On Tuesday 13 March 2007 10:21 pm, Julian Seward wrote:
> > > 0.9.0, or that the compiler/host combination used to build the qemu
> > > binary Julian is running generated bad code for the float compares.
> >
> > I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make install')
> > on a 64-bit machine. If it is qemu generating bad code due to variations
> > in gcc behaviour, that's another argument in favour of scrapping the gcc
> > 3.X based backend and using a self contained, handwritten insn selector
> > and register allocator.
>
> Are you referring to https://nowt.dyndns.org/ or something else?

I was referring to an idea, of which the nowt thing is an implementation.
I'm not aware of any other such backends to qemu.

J
Re: [Qemu-devel] QCOW(2) image corruption under QEMU 0.9.0 reproducible
I ran QEMU on Valgrind for several hours last night, including a couple of
boot-shutdown cycles of RedHat8, and lots of file copying/deletion in the
guest to get the qcow2 image up to 8GB and generally cause a lot of disk
IO.

I got no memory errors whatsoever from Valgrind and got no filesystem
corruption, so I guess I didn't trigger the bug. Really the first thing to
do is establish a reliable way to reproduce it.

J

On Friday 16 March 2007 17:00, Ben Taylor wrote:
> J M Cerqueira Esteves [EMAIL PROTECTED] wrote:
> > herbie hancock wrote:
> > > Hello,
> > > i had also a reproducible disk crash:
> > > info of the last good image, size is about 3,5GB
> > >
> > > I never experienced such a bad problem with qemu before, maybe it is
> > > a problem with qcow2 format ?
> >
> > After the problems with qcow2 images which I reported here a few weeks
> > ago, I've only been using qcow images (under QEMU 0.9.0), without such
> > surprises. So it seems qemu has some bug related to qcow2 images, maybe
> > manifesting itself only after they get larger than 4GB...
>
> I suspect I saw problems with qcow2 images as well. I was able to suspend
> a Solaris Nevada B58 install and use savevm about 30% into the install
> and restart it later. As the image completed, the file system went all to
> hell with corruption that was impossible to fix.
>
> At the time, I attributed it to the Solaris install (thinking it might
> have something to do with the cmpxchg8b bug that was later fixed), but I
> suspect with the multiple reports I've seen, I'm now thinking I saw the
> same thing.
>
> I'm testing conversion of a qcow image to a qcow2 image. We'll see how
> that goes
>
> Ben
Re: [Qemu-devel] [PATCH] Reducing X communication bandwidth, take 2
On Friday 16 March 2007 18:40, Anthony Liguori wrote:
> Hi Julian,
>
> Julian Seward wrote:
> > Here is a somewhat revised version of a patch I first made nearly
> > three years ago. The original thread is
> >
> > http://lists.gnu.org/archive/html/qemu-devel/2004-07/msg00263.html
> >
> > It still uses a shadow frame buffer. Fabrice mentioned this is not
> > necessary
>
> I thought about this a little and here's what I came up with. I think we
> could change vga_draw_line* so that as part of the drawing process, it
> compared the newly generated pixel with the previous pixel value and
> returned back the min, max x-coordinate that changed. Since we tend to
> only extend the vertical dirty range by a couple pixels, this should be a
> relatively cheap way of reducing the size of the update region.

Sounds plausible - having said that, I have no familiarity with the VGA
code. Also sounds like a cleaner solution than mine.

Is there something which guarantees that the vertical dirty range only
overshoots by some small number of pixels? (thinking more about it .. it
doesn't matter - finding min/max that changed for each line will also make
it possible to identify the vertical limits of the change).

Will this work also for the CL542x adaptor? (Does that fall in the category
of vga?) My current hack works both with and without -std-vga and I think
that's because it lives underneath both, in the connection to SDL.

J
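
For concreteness, the per-line min/max idea might look roughly like the
sketch below. This is not the actual vga_draw_line* code; the function name
and the 32-bit pixel format are assumptions made for the example. It
compares a freshly drawn scanline with the previous contents, updates the
stored line, and reports the horizontal range that changed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy a freshly drawn scanline `cur` over the
   stored one `prev` and report the inclusive pixel range
   [*xmin, *xmax] that changed. Returns 1 if anything changed, else 0
   (in which case *xmin and *xmax are left untouched). */
int update_line_32(uint32_t *prev, const uint32_t *cur, size_t width,
                   size_t *xmin, size_t *xmax)
{
    int changed = 0;
    size_t x;

    for (x = 0; x < width; x++) {
        if (prev[x] != cur[x]) {
            if (!changed) {
                *xmin = x;     /* first differing pixel on this line */
                changed = 1;
            }
            *xmax = x;         /* last differing pixel seen so far */
            prev[x] = cur[x];
        }
    }
    return changed;
}
```

Lines for which this returns 0 can be dropped from the update entirely,
which is how the per-line result also gives the vertical limits of the
change.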
Re: [Qemu-devel] Redundant repz prefixes in generated amd64 code
> ifeq ($(ARCH),x86_64)
> +OP_CFLAGS+= -mtune=nocona -W -Wall -O4
> BASE_LDFLAGS+=-Wl,-T,$(SRC_PATH)/$(ARCH).ld
> endif

That works. Thanks.

J
Re: [Qemu-devel] QCOW(2) image corruption under QEMU 0.9.0 reproducible
Something similar happened to me. At first I thought it was a hardware
(host) problem and so do not have good details - this is from memory.

- 0.9.0, binary build from qemu.org
- i386 openSUSE 10.2 host
- RedHat 8 guest
- .qcow2 image, max size 8GB
- using the Accelerator but not -kernel-kqemu
- first sign of trouble was the ext3 driver in RedHat8 complaining of
  something about disk geometry (something % 4 was not zero when it should
  have been) - this is when the disk image was about 3G in size
- shortly thereafter the disk image was so corrupted that e2fsck could not
  fix it
- IIRC, ls -l on the corrupted image showed some implausibly huge size
  ( 100GB ?) leading me to believe the file had some huge block of zeroes
  in the middle

J

On Wednesday 14 March 2007 23:20, herbie hancock wrote:
> Hello,
> i had also a reproducible disk crash:
> info of the last good image, size is about 3,5GB
>
> image: debian4_0.dsk
> file format: qcow2
> virtual size: 16G (16777216000 bytes)
> disk size: 3.3G
> cluster_size: 4096
>
> as soon as the image increase to a size of about 7,9 GB the emulator
> locks up, and after a restart the the image is not readable any more.
> the start of the image is filled with zeros, the signature of the file at
> the start (QFI ) is overwritten. I tried it two times, started with a
> intact image with the size above and in both times the image was
> corrupted.
>
> Host: WIN2K
> Guest: Debian 4.0 Etch.
> qemu: 0.90 (build date 2007-02-19, the version that comes with
> http://www.davereyn.co.uk/qem/setupqemuk40.exe)
>
> I tested the image above with virtualbox (installed backup of the qemu
> disk with acronis trueimage and bart pe boot cd) , started with the above
> image, and the problems are gone, image is now filled with more than 9GB,
> no problem so far.
>
> I never experienced such a bad problem with qemu before, maybe it is a
> problem with qcow2 format ?
>
> Bye
> HR
[Qemu-devel] RFC: This project needs a stable branch
I am a great fan of QEMU, and have used it more or less continuously for
the past 2+ years. Over that time I've installed and operated various Linux
and Windows guests with varying degrees of success.

The recently released 0.9.0 seems a big step forward in the
stability/usability department, which is excellent. But there are still
residual worries -- for example, qcow2 images corrupted for no obvious
reason -- which, whilst a boring problem, is important for folks like me
who want to run VMs 24x7 with the hope of complete reliability.

Pretty much all mature projects which have achieved widespread usage have
one or more stable branches along with the main development branch (trunk).
Think GCC, the kernel, KDE, ... the list is endless. Maintaining a stable
branch is extra hassle and overhead, but it is the standard way to operate,
for reasons which are obvious: the majority of users care more about
stability, reliability and usability than they do about the latest new
features, and delivering stability from a branch used for bleeding-edge
development work is pretty much impossible. That is not, of course, a
criticism of the bleeding edge developers, since it is they who ultimately
drive the project along.

I am writing to propose that a stable branch be made from the 0.9.0 release
point. The aim would be to maximise stability for (IMO) the subset of
functionality that has the largest potential user base: i386-softmmu +
Accelerator and x86_64-softmmu + Accelerator, but excluding -kernel-kqemu
due to limitations described in http://qemu.org/kqemu-doc.html#SEC7.

Subsequent releases of the branch would contain no functionality
enhancements, but just bug fixes, with the eventual aim of achieving 'it
just works' status for any x86/x86_64 guest I try to install/run. I know
that's a tall order, and that 0.9.0 may not be able to supply that for all
guests. But it is an important goal to strive for.

My impression is that (at least as I perceive it) the lack of emphasis on
maximising stability on a stable branch, and the lack of a bug tracker, is
artificially restricting QEMU's user base, and therefore indirectly its
long term prospects. This is a shame, because QEMU is a very remarkable and
useful project, which should be used (and usable) by everybody and anybody.

J
Re: [Qemu-devel] RFC: This project needs a stable branch
On Thursday 15 March 2007 13:48, Anthony Liguori wrote:
> I'm not necessarily sure I agree that a stable branch is the best thing
> to have (versus aiming for never introducing regressions).

Aiming for no regressions is a worthy aim, but I believe unachievable in a
project of any size. For sure it's impossible if there is ever a need to
make large-scale infrastructural changes, which inevitably is occasionally
the case if the project is to live a long time.

For example, if the dyngen/gcc-based backend is replaced by a
self-contained handwritten one, I would be amazed if there were not a few
obscure regressions whilst the new backend is brought up to the same level
of stability as the current one. At least, that is what I know from my own
code generator hacking.

J
Re: [Qemu-devel] [PATCH] Reducing X communication bandwidth, take 2
On Wednesday 14 March 2007 04:57, Mark Williamson wrote:
> > > Here is a somewhat revised version of a patch I first made nearly
> > > three years ago. The original thread is
> >
> > The idea here is quite similar to what the VNC server does now. If this
> > is desirable for SDL too, then perhaps we should find a way to fold
> > this into common code? Although, is there a compelling reason to use
> > SDL over X instead of VNC?
>
> I sometimes do this sort of thing because it Just Works with no manual
> configuring of port forwarding etc. I don't necessarily like to do it for
> extended usage but it is very convenient.

Yes. VNC is all very nice (I use it a lot) but is hassle to set up, what
with making holes in firewalls and/or port forwarding etc. This patch has
the it just works property.

In fact I obtained (by far) the best remote X performance by using both
this patch and making the remote X connection with
ssh -XC -o CompressionLevel=1. The patch knocks out the majority of the
data, and the ssh compression squashed what remained by more than a factor
of 10. Doing this it was hard to tell that QEMU was not running locally.

J
Re: [Qemu-devel] SSE 'maxps' instruction bug?
QEMU and Core 2 Duo disagree on the handling of NaNs it seems. http://courses.ece.uiuc.edu/ece390/books/labmanual/inst-ref-simd.html - this implies that MAXPS should leave the NaNs alone, no idea how normative that is though (and no IA32 manual at hand) Having looked at an IA32 manual I'd say the inst-ref-simd.html description agrees with it, so the Core 2 behaviour is what qemu should do. J
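For anyone following along, here is a minimal sketch of the per-lane rule as I read it in the IA32 manual: each lane of the result is the destination lane if it compares strictly greater than the source lane, otherwise the source lane. Since an ordered compare is false whenever either side is a NaN, a NaN in the destination operand gets replaced by the source value. The function names are made up for illustration and this is not QEMU's code; the lanes are handled as raw 32-bit patterns so that no NaN payload is altered on the way through.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing any bits. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* One lane of MAXPS: dst = (dst > src) ? dst : src.
 * The compare is false if either value is a NaN (and for +0 vs -0),
 * so in those cases the source lane is what ends up in the result. */
uint32_t maxps_lane(uint32_t dst, uint32_t src)
{
    return bits_to_float(dst) > bits_to_float(src) ? dst : src;
}

/* All four lanes; dst plays the role of the destination register. */
void maxps(uint32_t dst[4], const uint32_t src[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = maxps_lane(dst[i], src[i]);
}
```

Feeding it the operands from the original report (arg2 as destination, arg1 as source, each group of four bytes read little-endian) gives 0xb9b9dc22 in the second lane, which is the 22dcb9b9 the Core 2 printed.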
[Qemu-devel] [PATCH] Reducing X communication bandwidth, take 2
Here is a somewhat revised version of a patch I first made nearly three years ago. The original thread is http://lists.gnu.org/archive/html/qemu-devel/2004-07/msg00263.html

The patch makes QEMU's graphics emulation much more usable over remote X connections, by reducing the amount of data sent to the X server. This is particularly noticeable for small display updates, most importantly mouse cursor movements, which become faster and so generally make the guest's GUI more pleasant to use.

Compared to the original patch, this one:

- is relative to 0.9.0

- handles screen->format->BytesPerPixel values of both 2 and 4, so handles most guest depths - I tested 8, 16, 24bpp.

- has unrolled comparison/copy loops for the depth 2 case. Most of the comparisons are short (<= 64 bytes) so I don't see much point in taking the overhead of a call to memcmp/memcpy.

- most importantly, is optional and disabled by default, so that default performance is unchanged. To use it you need the new -remote-x11 flag (perhaps -low-bandwidth-x11 would be a better name).

It still uses a shadow frame buffer. Fabrice mentioned this is not necessary http://lists.gnu.org/archive/html/qemu-devel/2004-07/msg00279.html but I can't see how to get rid of it and still check for redundant updates in NxN pixel blocks (N=32 by default). The point of checking NxN squares is that mouse pointer pixmaps are square and so the most common display updates - mouse pointer movements - are often reduced to transmission of a single 32x32 block using this strategy. The shadow framebuffer is only allocated when -remote-x11 is present, so the patch has no effect on default memory use.

I measured the bandwidth saving roughly by resuming a vm snapshot containing a web browser showing a page with a lot of links. I moved the pointer slowly over the links (so they change colour) and scrolled up and down a bit; about 1/2 minute of activity in total. I tried to do the same with and without -remote-x11.
Without -remote-x11, 163Mbyte was transmitted to the X server; with it, 20.6Mbyte was, about an 8:1 reduction.

J

diff -u -r Orig/qemu-0.9.0/sdl.c qemu-0.9.0/sdl.c
--- Orig/qemu-0.9.0/sdl.c	2007-02-05 23:01:54.000000000 +0000
+++ qemu-0.9.0/sdl.c	2007-03-13 22:16:40.000000000 +0000
@@ -29,6 +29,8 @@
 #include <signal.h>
 #endif
 
+#include <assert.h>
+
 static SDL_Surface *screen;
 static int gui_grab; /* if true, all keyboard/mouse events are grabbed */
 static int last_vm_running;
@@ -44,17 +46,232 @@
 static SDL_Cursor *sdl_cursor_hidden;
 static int absolute_enabled = 0;
 
+/* Mechanism to reduce the total amount of data transmitted to the X
+   server, often quite dramatically.  Keep a shadow copy of video
+   memory in alt_pixels, and when asked to update a rectangle, use
+   the shadow copy to establish areas which are the same, and so do
+   not need updating.
+*/
+
+static void* alt_pixels = NULL;
+
+#define THRESH 32
+
+/* Return 1 if the area [x .. x+w-1, y .. y+w-1] is different from
+   the old version and so needs updating. */
+static int cmpArea_16bit ( int x, int y, int w, int h )
+{
+   int             i, j;
+   unsigned int    sll;
+   unsigned short* p1base = (unsigned short*)screen->pixels;
+   unsigned short* p2base = (unsigned short*)alt_pixels;
+   assert(screen->format->BytesPerPixel == 2);
+   if (w == 0 || h == 0)
+      return 0;
+   assert(w > 0 && h > 0);
+   sll = ((unsigned int)screen->pitch) >> 1;
+   for (j = y; j < y+h; j++) {
+      unsigned short* p1p = &p1base[j * sll + x];
+      unsigned short* p2p = &p2base[j * sll + x];
+      for (i = 0; i < w-5; i += 5) {
+         if (p1p[i+0] != p2p[i+0]) return 1;
+         if (p1p[i+1] != p2p[i+1]) return 1;
+         if (p1p[i+2] != p2p[i+2]) return 1;
+         if (p1p[i+3] != p2p[i+3]) return 1;
+         if (p1p[i+4] != p2p[i+4]) return 1;
+      }
+      for (/*fixup*/; i < w; i++) {
+         if (p1p[i+0] != p2p[i+0]) return 1;
+      }
+   }
+   return 0;
+}
+static void copyArea_16bit ( int x, int y, int w, int h )
+{
+   int             i, j;
+   unsigned int    sll;
+   unsigned short* p1base = (unsigned short*)screen->pixels;
+   unsigned short* p2base = (unsigned short*)alt_pixels;
+   assert(screen->format->BytesPerPixel == 2);
+   sll = ((unsigned int)screen->pitch) >> 1;
+   if (w == 0 || h == 0)
+      return;
+   assert(w > 0 && h > 0);
+   for (j = y; j < y+h; j++) {
+      unsigned short* p1p = &p1base[j * sll + x];
+      unsigned short* p2p = &p2base[j * sll + x];
+      for (i = 0; i < w-5; i += 5) {
+         p2p[i+0] = p1p[i+0];
+         p2p[i+1] = p1p[i+1];
+         p2p[i+2] = p1p[i+2];
+         p2p[i+3] = p1p[i+3];
+         p2p[i+4] = p1p[i+4];
+      }
+      for (/*fixup*/; i < w; i++) {
+         p2p[i+0] = p1p[i+0];
+      }
+   }
+}
+
+static int cmpArea_32bit ( int x, int y, int w, int h )
+{
+   int           i, j;
+   unsigned int  sll;
+   unsigned int* p1base = (unsigned int*)screen->pixels;
+   unsigned int* p2base = (unsigned int*)alt_pixels;
+   assert(screen->format->BytesPerPixel == 4);
+   sll = ((unsigned int)screen->pitch)
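The shadow-framebuffer idea described in the patch message can be sketched on its own, away from SDL. The following is only an illustration under stated assumptions: fb_t, rect_t and find_dirty_tiles are hypothetical names that appear in neither the patch nor QEMU, the tile size of 32 is the N from the message, and plain memcmp/memcpy stand in for the patch's unrolled loops. An update rectangle is walked in 32x32 tiles, each tile is compared with the shadow copy, and only tiles that differ are reported (and refreshed in the shadow) so that only those need be sent to the X server.

```c
#include <stdint.h>
#include <string.h>

#define TILE 32   /* N in the description above */

/* Hypothetical framebuffer description, not an SDL or QEMU type. */
typedef struct {
    uint8_t *pixels;    /* what the guest display looks like now */
    uint8_t *shadow;    /* what was last sent to the X server */
    int width, height;  /* in pixels */
    int bpp;            /* bytes per pixel */
    int pitch;          /* bytes per scanline */
} fb_t;

typedef struct { int x, y, w, h; } rect_t;

static int min_int(int a, int b) { return a < b ? a : b; }

/* Compare one tile with the shadow copy, refreshing the shadow for
 * every row that differs.  Returns 1 if any row differed. */
static int tile_changed(fb_t *fb, rect_t t)
{
    int changed = 0;
    size_t len = (size_t)t.w * (size_t)fb->bpp;
    for (int row = t.y; row < t.y + t.h; row++) {
        size_t off = (size_t)row * (size_t)fb->pitch
                   + (size_t)t.x * (size_t)fb->bpp;
        if (memcmp(fb->pixels + off, fb->shadow + off, len) != 0) {
            memcpy(fb->shadow + off, fb->pixels + off, len);
            changed = 1;
        }
    }
    return changed;
}

/* Walk the update rectangle (x, y, w, h; x and y assumed >= 0) in
 * TILE-aligned tiles and record those that really changed.  Returns
 * the number of tiles written to out[]; stops early if out[] is full,
 * leaving the remaining tiles stale so a later call still finds them. */
int find_dirty_tiles(fb_t *fb, int x, int y, int w, int h,
                     rect_t *out, int max_out)
{
    int n = 0;
    int x_end = min_int(x + w, fb->width);
    int y_end = min_int(y + h, fb->height);
    for (int ty = y - y % TILE; ty < y_end; ty += TILE) {
        for (int tx = x - x % TILE; tx < x_end; tx += TILE) {
            rect_t t = { tx, ty,
                         min_int(TILE, fb->width - tx),
                         min_int(TILE, fb->height - ty) };
            if (n == max_out)
                return n;
            if (tile_changed(fb, t))
                out[n++] = t;
        }
    }
    return n;
}
```

With a 64x64 16-bit framebuffer and a single changed pixel at (40, 10), a full-screen update request boils down to the one tile at (32, 0), which is the effect the message describes for mouse pointer movements.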
Re: [Qemu-devel] SSE 'maxps' instruction bug?
0.9.0, or that the compiler/host combination used to build the qemu binary Julian is running generated bad code for the float compares. I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make install') on a 64-bit machine. If it is qemu generating bad code due to variations in gcc behaviour, that's another argument in favour of scrapping the gcc 3.X based backend and using a self contained, handwritten insn selector and register allocator. J
[Qemu-devel] SSE 'maxps' instruction bug?
The program below tests the 'maxps' instruction. When run on qemu-0.9.0, host amd64, guest x86, guest OS redhat8, it prints:

   f9a511d1 8d37d67f b34825b8 e2f40739

scp the binary to a Core 2 (real) machine and run:

   f9a511d1 22dcb9b9 b34825b8 e2f40739

Second 32-bit word is completely different. This is 0.9.0 compiled from source using gcc-3.4.6, host openSuSE 10.2 on a Core 2 Duo in 64-bit mode. Any ideas? I grepped the 0.9.0 sources for maxps but couldn't figure out where/how it is handled.

J

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <malloc.h>
#include <string.h>

typedef unsigned char V128[16];
typedef signed int Int;

static void showV128 ( V128* v )
{
   Int i;
   for (i = 0; i < 16; i++) {
      printf("%02x", (Int)(*v)[i]);
      if (i > 0 && (i % 4) == 3)
         printf(" ");
   }
}

static V128 arg1 = { 0x28,0x9b,0x57,0xf7,0x22,0xdc,0xb9,0xb9,
                     0x0a,0xb3,0x8a,0xcf,0x73,0xbb,0xe4,0x0b };
static V128 arg2 = { 0xf9,0xa5,0x11,0xd1,0x8d,0x37,0xd6,0x7f,
                     0xb3,0x48,0x25,0xb8,0xe2,0xf4,0x07,0x39 };
static V128 res;

int main ( int argc, char** argv )
{
   __asm__ __volatile__(
      "movups (%0),%%xmm6\n\t"
      "movups (%1),%%xmm7\n\t"
      "maxps %%xmm6,%%xmm7\n\t"
      "movups %%xmm7, (%2)\n\t"
      : : "r"(&arg1), "r"(&arg2), "r"(&res) : "xmm6", "xmm7"
   );
   showV128( &res );
   printf("\n");
   return 0;
}

/* Output on qemu-0.9.0, host amd64, guest x86, guest OS redhat8:

      f9a511d1 8d37d67f b34825b8 e2f40739

   Run same binary on a Core 2:

      f9a511d1 22dcb9b9 b34825b8 e2f40739

   Second 32-bit word is completely different.
*/
Re: [Qemu-devel] How to get 1280x1024 display from guest running Xorg?
Thanks for the feedback. Since I do not wish to be involved in a great battle (as you so nicely put it) I'll stick with VMware (sigh). J

On Wednesday 21 February 2007 15:05, Robin Atwood wrote: On Wednesday 21 Feb 2007, Julian Seward wrote: (replying off list) So you have Solaris 10 (x86 ?) running on qemu-0.9 ? Is it stable? Does it work? I have it running on vmware-5.5.3 but would prefer to move to running it on qemu if possible; however I've had mixed results with qemu in the past and don't want to spend loads of time on failed attempts to get it to work. Hence the question. It was a great battle to install but now it is stable. Do the following things:

1. Install from the DVD image.
2. Use the text console install.
3. At the end of the install, back up the image file *before* the first reboot.
4. If during the first boot of the image you get a segfault, restore and try again until you get to a prompt. Ignore any service failures. (The filesystem seems prone to corruption at the first boot.)
5. If you have problems caused by damaged files, re-install choosing the Update option: this will restore the damaged files.

After that, I was able to boot reliably into X. However, the filesystem seems very fragile if not shut down cleanly, so take regular backups! HTH -Robin.
Re: [Qemu-devel] [PATCH 1/2] Escape filenames in monitor
On Saturday 16 December 2006 21:11, Anthony Liguori wrote: info block is impossible to parse reliably because there is no escaping done on the filename. Don't you also need to convert \ to \\ ? Else any \ which was in the original string will confuse the parser of the escaped output. J
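To make the point about backslashes concrete, here is a small sketch of an escaper and the matching unescaper. It is not the code from the patch under discussion: the function names are invented, and the set of characters treated as special (backslash, double quote, space) is purely an assumption for illustration. What matters is that the escape character itself is in the set, so that the output can be decoded unambiguously.

```c
#include <stddef.h>
#include <string.h>

/* Assumed set of characters that must be escaped.  The backslash has
 * to be included, otherwise a literal backslash in the input cannot be
 * told apart from an escape introduced by us. */
static int needs_escape(char c)
{
    return c == '\\' || c == '"' || c == ' ';
}

/* Write an escaped copy of src into dst (dstlen bytes, including the
 * terminating NUL).  Returns the escaped length, or -1 if it won't fit. */
int escape_name(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;
    if (dstlen == 0)
        return -1;
    for (; *src; src++) {
        if (needs_escape(*src)) {
            if (n + 1 >= dstlen)
                return -1;
            dst[n++] = '\\';
        }
        if (n + 1 >= dstlen)
            return -1;
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return (int)n;
}

/* Inverse of escape_name.  Returns the decoded length, or -1 on a
 * dangling backslash or if dst is too small. */
int unescape_name(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;
    if (dstlen == 0)
        return -1;
    for (; *src; src++) {
        if (*src == '\\') {
            src++;
            if (*src == '\0')
                return -1;
        }
        if (n + 1 >= dstlen)
            return -1;
        dst[n++] = *src;
    }
    dst[n] = '\0';
    return (int)n;
}
```

If the backslash were left out of needs_escape, a filename such as `a\ b` would escape to `a\\ b`, and the parser would read that back as `a\` followed by an unescaped space.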
Re: [Qemu-devel] Make -std-vga the default?
Really? My win2k install couldn't do anything useful with -std-vga. It would only do the very basic 640x480x4 mode. I'm fairly sure win9x can't do anything useful with straight VGA either. Same here. Also std-vga seemed to be slower than cirrus when I tried it recently on my linux guests, although I haven't actually measured anything. My mistake; Win2K doesn't like -std-vga. I confused 2K and XP. Overall it seems to work much better than the default 5446 Julian, in what way is std-vga better than the cirrus emulation? I can go above 1024x768, which is realistically something I need in order to use QEMU as a viable replacement for VMware. With SuSE 10.1 guest I can't even get 1024x768 with Cirrus. SuSE claims it's doing 1024x768 but what I get is 1024x600. In my experience the Cirrus emulation just works, and is supported by pretty much every OS out of the box. AFAIK Windows earlier than XP needs additional 3rd party drivers to support anonymous VESA hardware. I agree that avoiding additional drivers is good. However it seems that both cirrus and std-vga have their shortcomings and neither is an ideal out-of-the-box solution right now. J
Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour
It appears that cvttps2dq is indeed the only exception in the range, combined patch that fixes both movd?q2d?q and cvttps2dq is attached. I don't have any kind of SSE on this machine so would appreciate if someone would run tests/test-i386 with the patch attached. That works for me. Thanks. Valgrind's integer/x87/MMX/SSE/SSE2 tests now all pass on i386-softmmu. I didn't try tests/test-i386 though. Fabrice, can you commit this? J
[Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour
The SSE2 instructions cvttps2dq, movdq2q, movq2dq do not behave correctly, as shown by the attached program. It should print

   cvttps2dq_1 ... ok
   cvttps2dq_2 ... ok
   movdq2q_1 ... ok
   movq2dq_1 ... ok

but instead produces

   cvttps2dq_1 ... ok
   cvttps2dq_2 ... not ok
     result0.sd[0] = 12 (expected 12)
     result0.sd[1] = 3 (expected 56)
     result0.sd[2] = -2147483648 (expected 43)
     result0.sd[3] = 3 (expected 87)
   movdq2q_1 ... not ok
     result0.uq[0] = 1302123111658042420 (expected 5124095577148911)
   movq2dq_1 ... not ok
     result0.uq[0] = 1302123111658042420 (expected 5124095577148911)
     result0.uq[1] = 6221254864647256184 (expected 0)

I looked at QEMU's instruction decoders for these, and compared them to Valgrind's, but could not see what the problem was. The decode logic looks OK. Maybe the problem is elsewhere.

J

---

#include <math.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

typedef union {
  char sb[1];
  unsigned char ub[1];
} reg8_t;

typedef union {
  char sb[2];
  unsigned char ub[2];
  short sw[1];
  unsigned short uw[1];
} reg16_t;

typedef union {
  char sb[4];
  unsigned char ub[4];
  short sw[2];
  unsigned short uw[2];
  long int sd[1];
  unsigned long int ud[1];
  float ps[1];
} reg32_t;

typedef union {
  char sb[8];
  unsigned char ub[8];
  short sw[4];
  unsigned short uw[4];
  long int sd[2];
  unsigned long int ud[2];
  long long int sq[1];
  unsigned long long int uq[1];
  float ps[2];
  double pd[1];
} reg64_t __attribute__ ((aligned (8)));

typedef union {
  char sb[16];
  unsigned char ub[16];
  short sw[8];
  unsigned short uw[8];
  long int sd[4];
  unsigned long int ud[4];
  long long int sq[2];
  unsigned long long int uq[2];
  float ps[4];
  double pd[2];
} reg128_t __attribute__ ((aligned (16)));

static sigjmp_buf catchpoint;

static void handle_sigill(int signum)
{
   siglongjmp(catchpoint, 1);
}

__attribute__((unused))
static int eq_float(float f1, float f2)
{
   return f1 == f2 || fabsf(f1 - f2) < fabsf(f1) * 1.5 * pow(2,-12);
}

__attribute__((unused))
static int eq_double(double d1, double d2)
{
   return d1 == d2 || fabs(d1 - d2) < fabs(d1) * 1.5 * pow(2,-12);
}

static void cvttps2dq_1(void)
{
   reg128_t arg0 = { .ps = { 12.34F, 56.78F, 43.21F, 87.65F } };
   reg128_t arg1 = { .sd = { 1L, 2L, 3L, 4L } };
   reg128_t result0;
   char state[108];

   if (sigsetjmp(catchpoint, 1) == 0)
   {
      asm(
         "fsave %3\n"
         "movlps 0%0, %%xmm4\n"
         "movhps 8%0, %%xmm4\n"
         "movlps 0%1, %%xmm5\n"
         "movhps 8%1, %%xmm5\n"
         "cvttps2dq %%xmm4, %%xmm5\n"
         "movlps %%xmm5, 0%2\n"
         "movhps %%xmm5, 8%2\n"
         "frstor %3\n"
         :
         : "m" (arg0), "m" (arg1), "m" (result0), "m" (state[0])
         : "xmm4", "xmm5"
      );

      if (result0.sd[0] == 12L && result0.sd[1] == 56L && result0.sd[2] == 43L && result0.sd[3] == 87L )
      {
         printf("cvttps2dq_1 ... ok\n");
      }
      else
      {
         printf("cvttps2dq_1 ... not ok\n");
         printf("  result0.sd[0] = %ld (expected %ld)\n", result0.sd[0], 12L);
         printf("  result0.sd[1] = %ld (expected %ld)\n", result0.sd[1], 56L);
         printf("  result0.sd[2] = %ld (expected %ld)\n", result0.sd[2], 43L);
         printf("  result0.sd[3] = %ld (expected %ld)\n", result0.sd[3], 87L);
      }
   }
   else
   {
      printf("cvttps2dq_1 ... failed\n");
   }

   return;
}

static void cvttps2dq_2(void)
{
   reg128_t arg0 = { .ps = { 12.34F, 56.78F, 43.21F, 87.65F } };
   reg128_t arg1 = { .sd = { 1L, 2L, 3L, 4L } };
   reg128_t result0;
   char state[108];

   if (sigsetjmp(catchpoint, 1) == 0)
   {
      asm(
         "fsave %3\n"
         "movlps 0%1, %%xmm5\n"
         "movhps 8%1, %%xmm5\n"
         "cvttps2dq %0, %%xmm5\n"
         "movlps %%xmm5, 0%2\n"
         "movhps %%xmm5, 8%2\n"
         "frstor %3\n"
         :
         : "m" (arg0), "m" (arg1), "m" (result0), "m" (state[0])
         : "xmm4", "xmm5"
      );

      if (result0.sd[0] == 12L && result0.sd[1] == 56L && result0.sd[2] == 43L && result0.sd[3] == 87L )
      {
         printf("cvttps2dq_2 ... ok\n");
      }
      else
      {
         printf("cvttps2dq_2 ... not ok\n");
         printf("  result0.sd[0] = %ld (expected %ld)\n", result0.sd[0], 12L);
         printf("  result0.sd[1] = %ld (expected %ld)\n", result0.sd[1], 56L);
         printf("  result0.sd[2] = %ld (expected %ld)\n", result0.sd[2], 43L);
         printf("  result0.sd[3] = %ld (expected %ld)\n", result0.sd[3], 87L);
      }
   }
   else
   {
      printf("cvttps2dq_2 ... failed\n");
   }

   return;
}

static void movdq2q_1(void)
{
   reg128_t arg0 = { .uq = { 0x012345678abcdefULL, 0xfedcba9876543210ULL } };
   reg64_t arg1 = { .uq = { 0x1212121234343434ULL } };
   reg64_t result0;
   char state[108];

   if (sigsetjmp(catchpoint, 1) == 0)
   {
      asm(
         "fsave %3\n"
         "movlps 0%0, %%xmm4\n"
         "movhps 8%0, %%xmm4\n"
         "movq %1, %%mm6\n"
         "movdq2q
Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour
On Tuesday 20 June 2006 12:29, malc wrote: The signature of movdq2q is Pq, VRq and for movq2dq - Vo, PRq it appears that translate.c gets it backwards, attached patch should deal with it. Cool. As for cvttps2dq i ran it with interpreter which uses outdated(i.e. non soft-float) conversion routines and it passed, so my guess would be that this is float32_to_int32_round_to_zero vs (int32_t) cast issue. I had a feeling this is a garbage-in-memory (or regs, or somewhere) problem. Reason is that the wrong results kept changing as I cut the full test program down to just the small one I posted. Can you try on a vanilla build of i386-softmmu from cvs? J
Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour
[EMAIL PROTECTED] qemu]$ gcc -msse2 sse2test.c -o sse2test [EMAIL PROTECTED] qemu]$ ./sse2test cvttps2dq_1 ... failed cvttps2dq_2 ... failed movdq2q_1 ... failed movq2dq_1 ... failed what am i doing wrong here ? Running it on a CPU without SSE2, if i'm allowed to venture a guess. Yup. Try 'strace ./sse2test' and see if it gets SIGILLs thrown at it. J
Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour
Malc, your sse-movq.patch works for me. Thanks. soft-float was a red herring, translate.c is at fault here (interpreter does not use it, hence behaved correctly) translate.c:3009

    if (b1 >= 2 && ((b >= 0x50 && b <= 0x5f) || b == 0xc2)) {
        /* specific case for SSE single instructions */
        if (b1 == 2) {
            /* 32 bit access */
            gen_op_ld_T0_A0[OT_LONG + s->mem_index]();
            gen_op_movl_env_T0(offsetof(CPUX86State,xmm_t0.XMM_L(0)));
        } else {
            /* 64 bit access */
            gen_ldq_env_A0[s->mem_index >> 2](offsetof(CPUX86State,xmm_t0.XMM_D(0)));
        }
    } else {
        gen_ldo_env_A0[s->mem_index >> 2](op2_offset);
    }

cvttps2dq is 0x5b(b=0x5b) with repn prefix (b1=2) the above code is optimized a bit more than it should have been, as it loads only 4 bytes into xmm_t0 instead of 16. Uh, fine, but I don't understand how/what to fix. Can you advise? J
[Qemu-devel] [PATCH] Bug in target-i386/helper.c:helper_fxam_ST0
I've been doing some instruction set testing on i386-softmmu, with the aim of seeing if I can find any anomalies which might be the cause of the Win2K SP4 installation failure.

helper_fxam_ST0 doesn't correctly distinguish infinities from nans, and thereby causes programs that use the x86 'fxam' instruction to occasionally produce incorrect results. That instruction is quite often used as part of transcendentals, for example pow, exp, log. On a Linux guest, it for example causes the libc call pow(0.6, inf) to produce inf when it should produce zero, and causes about 20 cases in the FP correctness suite I'm using to fail.

The test case below shows the problem. It should produce

   0x4000: 0.000000
   0x4200: -0.000000
   0x0500: inf
   0x0700: -inf
   0x0300: nan
   0x0100: nan
   0x0400: 0.000000
   0x0600: -0.000000
   0x0400: 1.230000
   0x0600: -1.230000

but instead produces (omitting the correct cases)

   0x0100: inf
   0x0300: -inf

What's strange is the logic in helper_fxam_ST0 looks correct. The distinguish-nans-from-infinities part is

    if (expdif == MAXEXPD) {
        if (MANTD(temp) == 0)
            env->fpus |= 0x500 /*Infinity*/;
        else
            env->fpus |= 0x100 /*NaN*/;
    }

I suspect the check is correct for 52-bit mantissas (64-bit floats) but not for 64-bit mantissas (80-bit floats), as per these notes:

/* 80 and 64-bit floating point formats:

   80-bit:

    S  0       0-------0      zero
    S  0       0X------X      denormals
    S  1-7FFE  1X------X      normals (all normals have leading 1)
    S  7FFF    10------0      infinity
    S  7FFF    10X-----X      snan
    S  7FFF    11X-----X      qnan

   S is the sign bit.  For runs X----X, at least one of the Xs must be
   nonzero.  Exponent is 15 bits, fractional part is 63 bits, and
   there is an explicitly represented leading 1, and a sign bit,
   giving 80 in total.

   64-bit avoids the confusion of an explicitly represented leading 1
   and so is simpler:

    S  0      0------0   zero
    S  0      X------X   denormals
    S  1-7FE  any        normals
    S  7FF    0------0   infinity
    S  7FF    0X-----X   snan
    S  7FF    1X-----X   qnan

   Exponent is 11 bits, fractional part is 52 bits, and there is a
   sign bit, giving 64 in total.
*/

For 52-bit mantissas, the mantissa zero-vs-nonzero check is correct. But for 64-bit mantissas, the check needs to be

    if (MANTD(temp) == 0x8000000000000000ULL)

and indeed setting it to that makes the test program run correctly. Patch and testcase follow.

I'm still seeing cases where x87-based computation on qemu winds up with a NaN when it shouldn't. I think that's a separate problem. Will investigate.

J

-

Index: target-i386/helper.c
===================================================================
RCS file: /sources/qemu/qemu/target-i386/helper.c,v
retrieving revision 1.65
diff -r1.65 helper.c
2952a2953,2955
> # ifdef USE_X86LDOUBLE
>         if (MANTD(temp) == 0x8000000000000000ULL)
> # else
2953a2957
> # endif

-

#include <stdio.h>
#include <math.h>

/* FPU flag masks */
#define X86G_FC_SHIFT_C3   14
#define X86G_FC_SHIFT_C2   10
#define X86G_FC_SHIFT_C1   9
#define X86G_FC_SHIFT_C0   8

#define X86G_FC_MASK_C3    (1 << X86G_FC_SHIFT_C3)
#define X86G_FC_MASK_C2    (1 << X86G_FC_SHIFT_C2)
#define X86G_FC_MASK_C1    (1 << X86G_FC_SHIFT_C1)
#define X86G_FC_MASK_C0    (1 << X86G_FC_SHIFT_C0)

#define MASK_C3210 (X86G_FC_MASK_C3 | X86G_FC_MASK_C2 | X86G_FC_MASK_C1 | X86G_FC_MASK_C0)

double d;
int i;

extern void do_fxam ( void );
asm(
"\n"
"do_fxam:\n"
"\txorl %eax,%eax\n"
"\tfldl d\n"
"\tfxam\n"
"\tfnstsw %ax\n"
"\tffree %st(0)\n"
"\tmovl %eax, i\n"
"\tret\n"
);

double inf ( void ) { return 1.0 / 0.0; }
double nAn ( void ) { return 0.0 / 0.0; }
double den ( void ) { return 9.1e-220 / 1e100; }
double nor ( void ) { return 1.23; }

/* Try positive and negative variants of: zero, infinity,
   nAn, denorm and normal */

int main ( void )
{
   d =  0.0;   do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -0.0;   do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d =  inf(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -inf(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d =  nAn(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -nAn(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d =  den(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -den(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d =  nor(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -nor(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   return 0;
}
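The classification rule for the 80-bit format can be written out as a standalone function, which makes the infinity-vs-NaN distinction easy to check by hand. This is a sketch under assumptions, not helper_fxam_ST0: the function name and the split of the value into sign, 15-bit exponent and 64-bit mantissa (explicit leading 1 included) are invented here, and the "unsupported" and "empty" classes that real hardware also reports are ignored. The returned bits use the same C3/C2/C1/C0 positions as the status word values shown in the message.

```c
#include <stdint.h>

/* Condition-code bit positions within the x87 status word. */
#define FC_C0 0x0100u
#define FC_C1 0x0200u
#define FC_C2 0x0400u
#define FC_C3 0x4000u

/* Classify an 80-bit extended value given as sign, 15-bit exponent and
 * 64-bit mantissa including the explicit integer bit.  Simplified:
 * encodings with exponent 0x7FFF and a clear integer bit, which the
 * hardware treats as unsupported, are lumped in with the NaNs here. */
unsigned fxam_class80(int sign, unsigned exp15, uint64_t mant64)
{
    unsigned cc = sign ? FC_C1 : 0;   /* C1 carries the sign */

    if (exp15 == 0x7FFF) {
        if (mant64 == 0x8000000000000000ULL)
            cc |= FC_C2 | FC_C0;      /* infinity: only the leading 1 set */
        else
            cc |= FC_C0;              /* NaN */
    } else if (exp15 == 0) {
        if (mant64 == 0)
            cc |= FC_C3;              /* zero */
        else
            cc |= FC_C3 | FC_C2;      /* denormal */
    } else {
        cc |= FC_C2;                  /* normal */
    }
    return cc;
}
```

Comparing the mantissa against 0 instead of 0x8000000000000000 in the first branch sends every infinity down the NaN path, which gives exactly the 0x0100/0x0300 results reported above.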
Re: [Qemu-devel] Re: invisible wall patch
On Saturday 17 June 2006 18:03, Rick Vernam wrote: On Saturday 17 June 2006 11:32, Alex wrote: This patch has been around for a while but never committed to the mainstream. Huh? Fabrice committed it some time around Tuesday. I've been using it 8+ hours/day since then and it seems fine to me. J
Re: [Qemu-devel] VMware Player
On Thursday 15 June 2006 14:18, WaxDragon wrote: On 6/15/06, kadil [EMAIL PROTECTED] wrote: On Wed, 2006-06-14 at 18:10 +0200, Oliver Gerlich wrote: Real world, gui's are just so easy desirable, especially if the gui is consistent across os's, and part of the original distro. I think take-up would be huge (well huge-er, current takeup is huge) Kim Some of us appreciate the fact that qemu has no GUI per se. ;0) Sure. But to 'sell' the project to a wider audience, which may be helpful for its longer term development, a GUI is necessary. Usability engineering isn't as much fun as hacking the JIT, or whatever, but in the end usability counts. A lot. J
[Qemu-devel] invisible wall patch
Could somebody please commit, or at least consider committing, Anthony Liguori's invisible wall patch, shown at http://lists.gnu.org/archive/html/qemu-devel/2006-05/msg00112.html ? Without it, QEMU is essentially unusable on my SuSE 10 host; with it, the mouse stuff works perfectly. A couple of other people on that thread had similar experiences with it. J
Re: [Qemu-devel] getting the 5446 in 1152x864 mode
On Wednesday 07 June 2006 12:49, Julian Seward wrote: On Wednesday 07 June 2006 02:31, Ben Taylor wrote: I've been able to get 1152x864 out of the Solaris Xorg gdm5446 driver so there must be something else that's causing you problems. I think I've gotten win98se to do it as well. Thanks for the confirmation. So, I re-tried (extensively) to get 1152x864. 1152x864 doesn't work in WinXP either, and I am also still getting 'invisible wall' mouse problems. I wonder if my build is broken; but I'm not sure how. It's a clean build of cvs from two days ago, using gcc 3.4.3 on SuSE 10 (x86). The only change I made is to increase the translation cache size.

J

$ cvs diff -rHEAD
Index: exec-all.h
===================================================================
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.47
diff -r1.47 exec-all.h
140c140
< #define CODE_GEN_BUFFER_SIZE (16 * 1024 * 1024)
---
> #define CODE_GEN_BUFFER_SIZE (64 * 1024 * 1024)
149c149
< #define CODE_GEN_AVG_BLOCK_SIZE 128
---
> #define CODE_GEN_AVG_BLOCK_SIZE 256 //128
Re: [Qemu-devel] getting the 5446 in 1152x864 mode
On Wednesday 07 June 2006 02:31, Ben Taylor wrote: I've been able to get 1152x864 out of the Solaris Xorg gdm5446 driver so there must be something else that's causing you problems. I think I've gotten win98se to do it as well. Thanks for the confirmation. So, I re-tried (extensively) to get 1152x864. That resolution is listed by Windows as possible at 70Hz and 75Hz (monitor), so I set the monitor refresh rates to those values in the Windows display settings, but still no success, even after rebooting following the settings change. I also still sometimes get the 'invisible wall' mouse pointer problem that was discussed on this list a few weeks ago. J
Re: [Qemu-devel] PATCH: fix bgr color mapping on qemu on Solaris/SPARC
autodetect what color format to use. Also putting if inside the inner loop of the low-level conversion routines is a bad idea. While that's per-se true, maybe it's not such a big deal. The branch is going to be perfectly predictable since the condition stays the same for the entire run, so I'd be surprised if you even lost one host cycle per iteration overall. Basically the hardware will fold it out. J
Re: [Qemu-devel] [PATCH] Fix overflow conditions for MIPS add / subtract
-if ((T0 >> 31) ^ (T1 >> 31) ^ (tmp >> 31)) {
+if (((tmp ^ T1 ^ (-1)) & (T0 ^ T1)) >> 31) {
+    /* operands of same sign, result different sign */
     CALL_FROM_TB1(do_raise_exception_direct, EXCP_OVERFLOW);
 }

I see this went in, but - huh? The math doesn't make sense. T0 ^ T1 - operands of different sign tmp ^ T1 ^ (-1) - result has same sign as T1 The definitive reference for all this bit twiddling magic and much more besides is an excellent book, Hacker's Delight, by Hank Warren. It has loads of stuff about integer overflow and whatnot. J
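The usual sign-bit test for 32-bit signed overflow can be tried out in isolation. The sketch below uses my own names (a, b, sum, diff) rather than the T0/T1/tmp of the quoted hunk; my reading, which the hunk itself doesn't show, is that there tmp holds the saved first operand and T0 already holds the result, in which case the committed expression is the same test written in terms of the other operand. Addition overflows exactly when both operands have the same sign and the result's sign differs from it; subtraction overflows when the operands have different signs and the result's sign differs from the first operand.

```c
#include <stdint.h>

/* 1 if a + b overflows as a signed 32-bit addition, else 0.
 * ~(a ^ b) has its top bit set when the operands have the same sign;
 * (a ^ sum) has its top bit set when the result's sign differs from a. */
int add_overflows(uint32_t a, uint32_t b)
{
    uint32_t sum = a + b;   /* unsigned arithmetic wraps, no UB */
    return (int)((~(a ^ b) & (a ^ sum)) >> 31);
}

/* 1 if a - b overflows as a signed 32-bit subtraction, else 0.
 * (a ^ b) has its top bit set when the operands have different signs. */
int sub_overflows(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;
    return (int)(((a ^ b) & (a ^ diff)) >> 31);
}
```

For example, 0x7fffffff + 1 is flagged (two non-negative operands, negative result), while 1 + 0xffffffff, i.e. 1 + (-1), is not, since operands of different sign can never overflow on addition.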
[Qemu-devel] Emulation differences, qemu-system-x86_64 vs Athlon64
Recently I've been playing with CVS qemu-system (softmmu) on amd64 and had some stability problems. I decided to run Valgrind's amd64 instruction-set tests (derived from qemu's) to see if they picked up anything. Resulting diffs are attached. There are a bunch of differences for the C flag for rotates (rol/ror) by multiples of the word size. I don't think these are significant, but who knows. Perhaps more worrying are the 20 or so lines at the bottom of the diff. These I believe are for double-to-int/short conversions for a value which is out of range for an int/short; the hardware produces 0x80000000/0x8000 respectively, which is the integer indefinite; QEMU produces zero. I can imagine some obscure routine somewhere checking for integer indefinite after conversion and being confused as a result. J diffs-qemu-vs-Athlon64.txt.bz2 Description: BZip2 compressed data
Re: [Qemu-devel] Emulation differences, qemu-system-x86_64 vs Athlon64
I guess the problem comes from the usage of lrintl() on x86_64 in fpu/softfloat-native.c, but I cannot test it yet. It might be that you have to pass in an extra value into those float -> int conversion routines, which describes what to do if the conversion is going to overflow. That's because the behaviour is different depending on the guest architecture. x86/amd64 always give 0x80000000, whereas ppc gives either 0x80000000 or 0x7FFFFFFF depending on the sign of the argument (IIRC). J
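The "extra value" suggestion might look roughly like this. It is a sketch only: the enum, its member names and the function are invented for illustration and do not correspond to anything in fpu/softfloat-native.c, and the saturating policy simply encodes the recollection about ppc given above (hedged there with an IIRC, so equally hedged here). The conversion truncates toward zero, as cvttsd2si-style instructions do.

```c
#include <stdint.h>

/* Hypothetical per-guest policy for out-of-range conversions. */
typedef enum {
    OVF_INDEFINITE,  /* always 0x80000000, the x86 "integer indefinite" */
    OVF_SATURATE     /* 0x7FFFFFFF or 0x80000000 depending on the sign  */
} ovf_policy_t;

/* Convert a double to int32 with truncation, letting the caller say
 * what an unrepresentable input (too large, too small, or NaN) turns
 * into. */
int32_t double_to_int32_trunc(double d, ovf_policy_t policy)
{
    /* Values strictly between -2^31 - 1 and 2^31 truncate to something
     * representable.  A NaN fails both comparisons and falls through. */
    if (d > -2147483649.0 && d < 2147483648.0)
        return (int32_t)d;

    if (policy == OVF_SATURATE && d > 0)
        return INT32_MAX;
    return INT32_MIN;    /* bit pattern 0x80000000 */
}
```

A guest front end would then pass the policy that matches its architecture, rather than relying on whatever the host's lrintl() or cast happens to return for out-of-range inputs.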
[Qemu-devel] Translation cache sizes
Using qemu from cvs simulating x86-softmmu (no kqemu) on x86, booting SuSE 9.1 and getting to the xdm (kdm?) graphical login screen, requires making about 1088000 translations, and the translation cache is flushed 17 times. Booting is not too bad, but once user-mode starts to run the translation cache is pretty much hammered. I made 2 changes: * increase CODE_GEN_BUFFER_SIZE from 16*1024*1024 to 64*1024*1024, * observe that CODE_GEN_AVG_BLOCK_SIZE of 128 for the softmmu case is too low; my measurements put it at about 247. So I changed it to 256. With those changes in place, the same boot-to-kdm process requires only about 57 translations to be made, and 2 cache flushes to happen. Of course the cost is an extra 48M of memory use. J
Re: [Qemu-devel] vmware puts up specs for it's disk format
Basically, if u want split images to be supported in qemu, speak up now. ;) I speak! Me too! I've always used split images with vmware; they are easier to manage than files tens of gigabytes long. J
[Qemu-devel] Experiences with qemu-system-ppc
What are the prospects for qemu-system-ppc being improved to the point where I can do a vanilla install of popular ppc32-linux distros? I've tried a vanilla install of Debian Sarge (3.1). That went smoothly, except for the stage right at the end, where the disk image is made bootable. The resulting disk image is, despite much messing around, not bootable (using QEMU from cvs). Vesselin Peev made a ppc32 Debian Sarge image with a specially modified external kernel to work around this problem (http://free.oszoo.org/ftp/images/debian_sarge_ppc.tar.torrent) and that does work. Unfortunately QEMU consumes 100% host CPU even when the virtual machine is idle, which means it's not useful for a long-running virtual server. And the reliance on a specially modified kernel means normal updating/upgrading of the virtual machine isn't possible. My goal is to run a farm of QEMU virtual ppc32 machines to do overnight builds/tests of Valgrind on different ppc32 distros. J
Re: [Qemu-devel] [PATCH] SSE3 emulation
it, if someone could test this patch and/or give me links to programs (Linux/Win98) that use SSE3 instructions (and preferably also prove Valgrind (current svn trunk) has some pretty extensive SSE/SSE2 tests; you could use those. J
Re: [Qemu-devel] patch for qemu with newer gcc-3.4.x (support repz retq optimization for amd processors correctly)
The use of gcc to generate the back end in QEMU's early days was a clever way to get the project up and running quickly. But surely now it would be better to transition to a handwritten backend, so as to be independent of future changes in gcc, and generally more robust? J

On Wednesday 09 November 2005 19:51, Igor Kovalenko wrote: Paul Brook wrote: Notice the 'repz mov' sequence, which seems to be undocumented instruction. It seems to work somehow but chokes valgrind decoder. The following patch (against current CVS) fixes this problem, This patch is incorrect. It could match any number of other instructions that happen to end in 0xf3. eg

   0: c7 45 00 00 00 00 f3    movl $0xf3000000,0x0(%ebp)
   7: c3                      ret

IIRC the rep; ret sequence is to avoid a pipeline stall on Athlon CPUs. Try tuning for a different CPU. Paul

Index: dyngen.c
===================================================================
RCS file: /cvsroot/qemu/qemu/dyngen.c,v
retrieving revision 1.40
diff -u -r1.40 dyngen.c
--- dyngen.c	27 Apr 2005 19:55:58 -0000	1.40
+++ dyngen.c	9 Nov 2005 19:12:38 -0000
@@ -1387,6 +1387,12 @@
         error("empty code for %s", name);
     if (p_end[-1] == 0xc3) {
         len--;
+        /* This can be 'rep ; ret' optimized return sequence,
+         * need to check further and strip the 'rep' prefix
+         */
+        if (len != 0 && p_end[-2] == 0xf3) {
+            len--;
+        }
     } else {
         error("ret or jmp expected at the end of %s", name);
     }

OK I missed that... Then a discussion about gcc-4 turns into something much more interesting :)
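Paul's objection is easy to reproduce with a few bytes. The function below is a made-up stand-in for the kind of backwards scan the patch does, not dyngen's actual code: it looks only at the last two bytes of a function body and reports how many of them it would strip as a return sequence.

```c
#include <stddef.h>

/* Naive check, looking backwards from the end of the code: returns 2
 * for what looks like 'rep ; ret' (f3 c3), 1 for a plain 'ret' (c3),
 * and 0 if the code doesn't end in c3 at all. */
size_t naive_ret_len(const unsigned char *code, size_t len)
{
    if (len == 0 || code[len - 1] != 0xc3)
        return 0;
    if (len >= 2 && code[len - 2] == 0xf3)
        return 2;
    return 1;
}
```

Run on the example above (c7 45 00 00 00 00 f3 c3) it answers 2, although the f3 there is the last byte of the movl immediate and only one byte should be stripped. Without decoding the instruction stream forwards, the final two bytes look identical to a genuine rep ; ret.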