Re: [Qemu-devel] What does code_copy_enabled do?

2008-02-12 Thread Julian Seward

 :   Any news on the possible cvs-svn migration?
 : 
 :  To be perfectly honest, IMO there is little point moving an existing
 :  project from CVS to SVN.
 :
 : I disagree.  CVS has several fairly fundamental flaws (no global revision
 : IDs, unable to move files, and more subtle problems with branches/tags).
 : SVN fixes these, and in most cases works as a direct drop-in replacement
 : for CVS.

 FreeBSD is moving from CVS to SVN for these reasons.

Just to second M. Warner Losh: we moved Valgrind from CVS to SVN about
3.5 years ago and it was an excellent thing to do.  It is not true to say
there is no advantage over CVS -- the global revision IDs, the ability to
rename files, and a simpler branching/tagging model are all big advantages.
And the fact that it is more-or-less conceptually a drop-in replacement makes
it easy for people to make the migration.

Sure, Valgrind is a tiny project compared to FreeBSD.  But we gain those
advantages nonetheless.

J




Re: [Qemu-devel] WE NEED GCC 4 please

2008-01-21 Thread Julian Seward

   As it is, Fabrice's code generator will most likely be something
   similar to Paul's qops, which means that you have to invent a
   primitive C in which to write the miniops, and you will have to
   write a backend for _each_ and _every_ host CPU you support.

It's not a terribly big deal.  Writing backends is a lot easier than
writing front ends, since the back end can just emit some small convenient
subset of target instructions, whereas the front ends have to deal
with every stupid, obscure, weird-ass instruction that ever shows up.

QEMU is not the first project to post-process gcc's output.  The
Glasgow Haskell Compiler
(http://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler)
did that for many years and it was always an immense amount of
hassle tracking the changes to gcc's code generation.  Having a
completely-independent-of-everything, standalone code generator is
definitely a lot easier in the end.

 Given the unwillingness of Fabrice to rely on some external project,
 though, I gave up even before I had something even rudimentary.

Perhaps Fabrice could commit this code generator on a branch, even if
it is not perfect yet.  That would at least provide something real
to assess; so far all we have is rumour and speculation.

J




Re: [Qemu-devel] qemu hw/ppc_oldworld.c target-ppc/cpu.h target-...

2007-11-23 Thread Julian Seward

 Well, I admit I've invented the term ppc32, but there are dozens of
 32-bit PowerPC chips. I'd be amazed if they do 64-bit computations or have
 64-bit GPRs.

Indeed not.  Valgrind implements the 32-bit PPC user-space instruction set
quite adequately using 32-bit computations throughout.  No need for 64-bit
computations.

J




[Qemu-devel] emu errors for creqv,crnand,crnor,crorc ?

2007-10-31 Thread Julian Seward

Hi Jocelyn

I ran valgrind's ppc32 insn set tests and got the impression that
the above insns are not correctly implemented.  It seems like 7 bits
of CR are set to 1 and one is set to 0, when it should be the other
way around.  Below is a simple test case.  On QEMU it prints

  result is 000fc000

and on a real 7447

  result is 00004000

Similar behaviour for creqv, crnand, crnor.  But cror, crand, crxor work
OK.  So maybe it is related to the inverted-result-sense?  But the strange
thing is, ~0xFC != 0x04.

J

#include <stdio.h>

typedef unsigned int UInt;

void do_crorc_17_14_15 ( void )
{
  UInt res = 0x00000000;  /* overwritten by mfcr below */
  __asm__ __volatile__(
    "li 9,0\n"
    "\tmtcr 9\n"
    "\tmtxer 9\n"
    "\tcrorc 17,14,15\n"
    "\tmfcr %0"
    : /*out*/"=b"(res) : /*in*/ : /*trash*/ "9"
  );
  printf("result is %08x\n", res );
}

int main ( void )
{
  do_crorc_17_14_15();
  return 0;
}




[Qemu-devel] Re: emu errors for creqv,crnand,crnor,crorc ?

2007-10-31 Thread Julian Seward

  way around.  Below is a simple test case.  On QEMU it prints
 
result is 000fc000
 
  and on a real 7447
 
result is 00004000
 
  What is strange is that 0xFC + 0x04... I will have to check all the CR
 ops, I guess...

Another strange thing is that 000fc000 does not have 'fc' byte-aligned
inside CR, if you see what I mean.  If it was fc00 or 00fc, some
byte-inversion mistake would seem likely.

This isn't a 74xx specific result.  I'm sure any ppc should produce
4000.  The test is very simple: make CR=0 then do crorc 17,14,15.
So only 1 bit in CR will then be set - all others are zero.
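
The expected value can be cross-checked on any host with a small model of
the CR logical instructions.  This is only a sketch: it assumes the PowerPC
convention that CR bit 0 is the most significant bit of the 32-bit register,
and the helper names (cr_get, cr_set, model_*) are made up for illustration;
they are not QEMU or Valgrind code.

```c
#include <stdint.h>

/* Read CR bit n, where bit 0 is the most significant bit. */
static unsigned cr_get(uint32_t cr, unsigned n)
{
    return (cr >> (31 - n)) & 1u;
}

/* Return cr with bit n forced to 'bit' (0 or 1). */
static uint32_t cr_set(uint32_t cr, unsigned n, unsigned bit)
{
    uint32_t mask = (uint32_t)1 << (31 - n);
    return bit ? (cr | mask) : (cr & ~mask);
}

/* crorc bt,ba,bb : CR[bt] = CR[ba] | ~CR[bb] */
uint32_t model_crorc(uint32_t cr, unsigned bt, unsigned ba, unsigned bb)
{
    return cr_set(cr, bt, cr_get(cr, ba) | (cr_get(cr, bb) ^ 1u));
}

/* creqv bt,ba,bb : CR[bt] = ~(CR[ba] ^ CR[bb]) */
uint32_t model_creqv(uint32_t cr, unsigned bt, unsigned ba, unsigned bb)
{
    return cr_set(cr, bt, (cr_get(cr, ba) ^ cr_get(cr, bb)) ^ 1u);
}

/* crnand bt,ba,bb : CR[bt] = ~(CR[ba] & CR[bb]) */
uint32_t model_crnand(uint32_t cr, unsigned bt, unsigned ba, unsigned bb)
{
    return cr_set(cr, bt, (cr_get(cr, ba) & cr_get(cr, bb)) ^ 1u);
}

/* crnor bt,ba,bb : CR[bt] = ~(CR[ba] | CR[bb]) */
uint32_t model_crnor(uint32_t cr, unsigned bt, unsigned ba, unsigned bb)
{
    return cr_set(cr, bt, (cr_get(cr, ba) | cr_get(cr, bb)) ^ 1u);
}
```

With CR=0, all four inverted-sense ops applied to bits 17,14,15 set only
bit 17 in this model, i.e. 1 << (31-17) = 0x00004000.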

J




Re: [Qemu-devel] Kqemu on x86_64 host with x86_64 guest

2007-10-13 Thread Julian Seward
On Saturday 13 October 2007 16:24, Werner Dittmann wrote:
 Bruno Cornec wrote:
  On Sat, Oct 13, 2007 at 01:53:37PM +0200, Bruno Cornec wrote:
  However, mandriva 2008.0 x86_64 doesn't exhibit this error on the same
  host.
 
  I stand corrected. It also crashed but later during the install process,
  where the other were at the start. Back to -no-kqemu.
 
  Bruno.

 Even when using -no-kqemu it somehow fails/hangs during setup of Grub
 when I try to install openSUSE 10.2 or 10.3.  These problems have been
 known for quite some time - but no solution yet.

Yes.  I also observed that with openSUSE 10.{1,2,3}.  After some
experimentation I successfully installed 10.1 by asking the installer
to use LILO instead of Grub.  However, even then, some user space
code does not work properly - running the YaST online update inside the
successfully-installed 10.1 fails.

I wondered if there is some problem in the x86_64 instruction set emulation.
I ran some tests from Valgrind, and it appears that some FP-to-int conversion
instructions do not take care of the rounding mode.  I did not detect
any other errors.  See
http://lists.gnu.org/archive/html/qemu-devel/2007-10/msg00233.html

I tried to build x86_64-softmmu using softfloat.c rather than
softfloat-native.c since it looks like softfloat.c emulates these
corner cases (rounding mode, etc) more completely.  So far I got 
a lot of compilation errors and did not make much progress.  I get
the impression x86_64-softmmu and i386-softmmu are intended only to
be built with softfloat-native.c.

It might be worth installing SuSE 10.1 and finding some small program
which fails to work properly.  Then we might have a hope of determining
what the problem is.

J




[Qemu-devel] FP emulation bugs for x86_64-softmmu

2007-10-10 Thread Julian Seward

Some x86_64 SSE2 instructions that convert floats to ints appear
to ignore the rounding mode (in mxcsr), and so produce wrong results
if mxcsr is set to anything other than default rounding.  For example
cvtsd2si et al.

I'm looking at softfloat-native.c and softfloat.c and wondering how
to fix it.  A couple of questions:

* is softfloat-native.c intended to handle such corner cases as 
  accurately as softfloat.c ?

* is it possible to build x86_64-softmmu to use softfloat.c 
  rather than softfloat-native.c?

  I hacked ./configure to use CONFIG_SOFTFLOAT for x86_64 (added
  x86_64 as a softfloat cpu in test at line 1095), but the build
  then dies like this:

  target-i386/exec.h:296: warning: conflicting types for built-in
  function 'sinl'
  target-i386/exec.h:297: warning: conflicting types for built-in  
  function 'cosl'
  target-i386/exec.h:298: warning: conflicting types for built-in
  function 'sqrtl'
  target-i386/exec.h:299: warning: conflicting types for built-in
  function 'powl'
  target-i386/exec.h:300: warning: conflicting types for built-in
  function 'logl'
  target-i386/exec.h:301: warning: conflicting types for built-in
  function 'tanl'
  target-i386/exec.h:302: warning: conflicting types for built-in
  function 'atan2l'
  target-i386/exec.h:303: warning: conflicting types for built-in
  function 'floorl'
  target-i386/exec.h:304: warning: conflicting types for built-in
  function 'ceill'
  target-i386/exec.h: In function `helper_fldt':
  target-i386/exec.h:440: error: incompatible types in return
  target-i386/exec.h: In function `helper_fstt':
  target-i386/exec.h:447: error: incompatible types in assignment
  (many more errors like this follow)

  Is this some minor compile bug, or is it the case that x86_64-softmmu
  (and i386-softmmu) is not intended to use softfloat.c?

J




Re: [Qemu-devel] [PATCH, RFC] More than 2G of memory on 64-bit hosts

2007-06-27 Thread Julian Seward

  Unfortunately C99 relaxed this requirement, and allowed abominations like
  the win64 ABI.
 
  This means you have a choice: Write standard conforming code (long) that
  works on all known systems except win64, or use features that don't exist
  on many systems. IIRC C99 types like intptr_t are not supported on
  several fairly common unix systems.

 In that case I'll vote for unsigned long. I'd pass the issue to those
 doing a win64 port, if ever that happens.

In Valgrind-world we use an alternative approach, which is to typedef
a set of new integral types and use those exclusively, and not use the
native 'int', 'long' etc.  The new types have a single fixed meaning
regardless of the host or guest and it is up to the configure script
to set up suitable typedefs.  At startup Valgrind checks the size and
signedness of these types is as expected, so any configuration errors
are caught.  This has proved very helpful in porting to a number of 
platforms.
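
A minimal sketch of that approach, for illustration only.  The type names
follow the style Valgrind uses, but the particular typedefs below are simply
hard-coded for a typical host rather than produced by a configure script,
and check_basic_types is a hypothetical name.

```c
/* Project-wide integral types with one fixed meaning each.  In a real
   build these typedefs would be selected per platform by configure. */
typedef unsigned char       UChar;
typedef unsigned short      UShort;
typedef unsigned int        UInt;
typedef signed int          Int;
typedef unsigned long long  ULong;
typedef signed long long    Long;

/* Startup sanity check: returns the number of failed checks, so 0 means
   every type has the size and signedness the rest of the code assumes. */
int check_basic_types(void)
{
    int bad = 0;

    bad += (sizeof(UChar)  != 1);
    bad += (sizeof(UShort) != 2);
    bad += (sizeof(UInt)   != 4);
    bad += (sizeof(Int)    != 4);
    bad += (sizeof(ULong)  != 8);
    bad += (sizeof(Long)   != 8);

    bad += !((UChar)-1 > 0);   /* unsigned types wrap to a positive value */
    bad += !((UInt)-1  > 0);
    bad += !((ULong)-1 > 0);
    bad += !((Int)-1   < 0);   /* signed types stay negative */
    bad += !((Long)-1  < 0);

    return bad;
}
```

Calling check_basic_types() once at startup and aborting on a non-zero
result turns a misconfigured port into an immediate, obvious failure.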

J




Re: [Qemu-devel] Patch: ltr for x86_64 should check the upper descriptor type

2007-03-26 Thread Julian Seward

Does this fix some specific bug you encountered?

J

On Monday 26 March 2007 14:53, Bernhard Kauer wrote:
 The Intel manual states for LTR and 64-Bit Exceptions:

 #GP(selector)
If the descriptor type of the upper 8-byte of the 16-byte descriptor
is non-zero.

 Qemu currently does not check this. The attached patch fixes the bug.


 Bernhard Kauer




Re: [Qemu-devel] 0.9.0 and svn don't build with -march=pentium2 etc.; was: Latest SVN fails to build on Fedora Core 6 (same with 0.9.0)

2007-03-24 Thread Julian Seward

 As far as X86 is concerned i386/i486/i586 are very different from later
 generation
 processors. I am wondering whether another host and target architecture
 could be
 created called i686 that makes use of something like MMX or other
 registers in Intel
 Pentium II/III/4 and AMD Athlon to negate the lack of general purpose
 registers.

I don't see how.  MMX/SSE is suitable for SIMD processing of media data
and to some extent for floating point, but is largely unusable for ad-hoc
integer computation, especially anything that involves address calculations.

 The fact that QEMU works and can be optimised on x86_64 is the only
 saving grace
 for the architecture, that is still suffering from a lack of registers
 compared to any
 other architecture.

The lack of registers isn't ideal, but it's not a big deal, and in the
grand scheme of things x86_64 has a lot going for it.  The most 
important of which are that (from the software side) all the hard-won 
knowledge of how to compile good code for x86 carries across more or less
directly to x86_64, and (from the hardware side) hardware people already
know how to make fast, cheap x86s, so it's easy to move to making fast,
cheap x86_64s.

The problems of the gcc backend to qemu have already been discussed
extensively on this list.  Stealing 3+ registers from gcc on x86 really
is asking for trouble, and I believe it is generally understood that the
best long term solution is to move to a self-contained back end that 
does not use gcc for dynamic code generation.

J


___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel


Re: [Qemu-devel] RFC: This project needs a stable branch

2007-03-22 Thread Julian Seward
On Thursday 22 March 2007 23:27, Paul Brook wrote:
  Do you mean you're asking me to break up Paul Brook's QOPS tree at
  https://nowt.dyndns.org and submit it to mainline?  I can do this thing,
  if you really think it would help.

 If you implement all the missing bits in the process it'll help ;-)

What bits would they be then?

FWIW, I snarfed the patch last Sunday and tested it on amd64 host / 
x86 guest, and successfully booted a couple of linux distros.  So it's
not obviously broken, at least for my mundane host/guest choice.
It also seemed marginally slower on a big compile in the guest - 
395.4 host cpu seconds for mainline vs 422.9 with qops.  Is this 
expected?

J




Re: [Qemu-devel] RFC: This project needs a stable branch

2007-03-20 Thread Julian Seward
On Thursday 15 March 2007 14:53, Paul Brook wrote:
  Subsequent releases of the branch would contain no functionality
  enhancements, but just bug fixes, with the eventual aim of achieving
  'it just works' status for any x86/x86_64 guest I try to install/run.
  I know that's a tall order, and that 0.9.0 may not be able to supply
  that for all guests.  But it is an important goal to strive for.

 While I agree stability is a desirable goal, and obviously users
 want a stable product, I'm not sure qemu is mature enough to make a
 stable branch worthwhile.  Especially considering the very limited
 technical resources we have available.

Limited effort is always a problem, granted.

So here's a broader question, which I'm surprised nobody has asked
before (afaik).  Think forward to a hypothetical QEMU 1.0 release.
What criteria are required for such a release?

J




Re: [Qemu-devel] [PATCH] softfloat missing functions

2007-03-19 Thread Julian Seward

 Note that float64_to_uint64 functions are not correct, as they won't
 return results between INT64_MAX and UINT64_MAX. Hope someone may know
 the proper solution for this.

How about this?

J

uint64_t float64_to_uint64 (float64 a STATUS_PARAM)
{
uint64_t res;
int64_t v;

if (isinf(a) || isnan(a)) {
   return special value (  maybe 1<<63 ?)
}
else
if (a < 0.0 || a > (float64)UINT64_MAX) {
   return out-of-range value, whatever that is
} else {

   a += (float64) INT64_MIN;  // move a downwards 
   v = llrint(a); // convert
   v -= INT64_MIN;// move v back up

   return v;
}
}




Re: [Qemu-devel] [PATCH] softfloat missing functions

2007-03-19 Thread Julian Seward

Thinking about this more, you ask is this correct, but that
is only meaningful if you say what the specification is.  
Correct relative to what?

 Yes, it seems to be the correct way, but thinking more about the
 problem, it appeared to me that the implementation could be even easier
 than yours. It seems to me that this may be sufficient:
 uint64_t float64_to_uint64 (float64 a STATUS_PARAM)
 {
 int64_t v;

 v = llrint(a + (float64)INT64_MIN);

 return v - INT64_MIN;
 }

If a is NaN then so is the argument to llrint.  'man llrint' says:

  If x is infinite or NaN, or if the rounded value is
  outside  the  range  of  the  return type, the numeric result
  is unspecified. 

So then float64_to_uint64 produces an unspecified result.

It seems to me much safer to test and handle NaN, Inf and
out-of-range values specially.  However, even that does not help
unless you say what the specification is.

J




[Qemu-devel] [PATCH] Fix guest x86/amd64 helper_fprem/helper_fprem1

2007-03-17 Thread Julian Seward

The helpers for x86/amd64 fprem and fprem1 in target-i386/helper.c are
significantly borked and, for example, cause konqueror in RedHat8 (x86
guest) to go into an infinite loop when displaying http://news.bbc.co.uk.

helper_fprem has the following borkage:
- various Inf/Nan/zero inputs not handled correctly
- incorrect rounding when converting negative 'dblq' to 'q'
- incorrect order of assignment to C bits (0,3,1 not 0,1,3)

helper_fprem1 has those problems and is also incorrect about the points
at which its rounding needs to differ from that of helper_fprem.

Patch below fixes all these.  It brings the fprem and fprem1 behaviour 
very much closer to the hardware -- not identical, but close.  Some
+0.0 results should really be -0.0 and there may still be other differences.

Anyway konqueror no longer loops with the patch applied.
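
The condition-code fix can be stated on its own.  In the x87 status word C0
is bit 8, C1 is bit 9, C2 is bit 10 and C3 is bit 14, and after a completed
reduction the low three quotient bits go to (C0,C3,C1) = (q2,q1,q0).  The
following standalone sketch shows just that mapping; the function and macro
names are made up for illustration.

```c
#include <stdint.h>

/* x87 status-word condition-code bits. */
#define FPUS_C0 (1u << 8)
#define FPUS_C1 (1u << 9)
#define FPUS_C2 (1u << 10)
#define FPUS_C3 (1u << 14)

/* Status word after a completed fprem/fprem1 with quotient q:
   clear C3..C0 (mask 0x4700), then C0 <- q2, C3 <- q1, C1 <- q0.
   C2 stays clear, meaning the reduction is complete. */
uint16_t fpus_after_fprem(uint16_t fpus, long long q)
{
    fpus &= (uint16_t)~(FPUS_C0 | FPUS_C1 | FPUS_C2 | FPUS_C3);
    if (q & 4) fpus |= FPUS_C0;
    if (q & 2) fpus |= FPUS_C3;
    if (q & 1) fpus |= FPUS_C1;
    return fpus;
}
```

For instance q=2 must set C3 (0x4000); the old code set C1 (0x0200) instead.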

J
--- ../Orig/qemu-0.9.0/target-i386/helper.c	2007-02-05 23:01:54.0 +
+++ target-i386/helper.c	2007-03-17 17:21:02.0 +
@@ -3097,30 +3097,51 @@
 CPU86_LDouble dblq, fpsrcop, fptemp;
 CPU86_LDoubleU fpsrcop1, fptemp1;
 int expdif;
-int q;
+signed long long int q;
+
+if (isinf(ST0) || isnan(ST0) || isnan(ST1) || (ST1 == 0.0)) {
+   ST0 = 0.0 / 0.0; /* NaN */
+   env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+   return;
+}
 
 fpsrcop = ST0;
 fptemp = ST1;
 fpsrcop1.d = fpsrcop;
 fptemp1.d = fptemp;
 expdif = EXPD(fpsrcop1) - EXPD(fptemp1);
+
+if (expdif < 0) {
+/* optimisation? taken from the AMD docs */
+env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+/* ST0 is unchanged */
+	return;
+}
+
 if (expdif < 53) {
 dblq = fpsrcop / fptemp;
-dblq = (dblq < 0.0)? ceil(dblq): floor(dblq);
+	/* round dblq towards nearest integer */
+dblq = rint(dblq);
 ST0 = fpsrcop - fptemp*dblq;
-q = (int)dblq; /* cutting off top bits is assumed here */
+
+	/* convert dblq to q by truncating towards zero */
+	if (dblq < 0.0)
+   q = (signed long long int)(-dblq);
+else
+   q = (signed long long int)dblq;
+
 env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
-/* (C0,C1,C3) <-- (q2,q1,q0) */
-env->fpus |= (q&0x4) << 6; /* (C0) <-- q2 */
-env->fpus |= (q&0x2) << 8; /* (C1) <-- q1 */
-env->fpus |= (q&0x1) << 14; /* (C3) <-- q0 */
+/* (C0,C3,C1) <-- (q2,q1,q0) */
+env->fpus |= (q&0x4) << (8-2);  /* (C0) <-- q2 */
+env->fpus |= (q&0x2) << (14-1); /* (C3) <-- q1 */
+env->fpus |= (q&0x1) << (9-0);  /* (C1) <-- q0 */
 } else {
 env->fpus |= 0x400;  /* C2 <-- 1 */
 fptemp = pow(2.0, expdif-50);
 fpsrcop = (ST0 / ST1) / fptemp;
-/* fpsrcop = integer obtained by rounding to the nearest */
-fpsrcop = (fpsrcop-floor(fpsrcop) < ceil(fpsrcop)-fpsrcop)?
-floor(fpsrcop): ceil(fpsrcop);
+/* fpsrcop = integer obtained by chopping */
+fpsrcop = (fpsrcop < 0.0)?
+-(floor(fabs(fpsrcop))): floor(fpsrcop);
 ST0 -= (ST1 * fpsrcop * fptemp);
 }
 }
@@ -3130,26 +3151,48 @@
 CPU86_LDouble dblq, fpsrcop, fptemp;
 CPU86_LDoubleU fpsrcop1, fptemp1;
 int expdif;
-int q;
-
-fpsrcop = ST0;
-fptemp = ST1;
+signed long long int q;
+
+if (isinf(ST0) || isnan(ST0) || isnan(ST1) || (ST1 == 0.0)) {
+   ST0 = 0.0 / 0.0; /* NaN */
+   env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+   return;
+}
+
+fpsrcop = (CPU86_LDouble)ST0;
+fptemp = (CPU86_LDouble)ST1;
 fpsrcop1.d = fpsrcop;
 fptemp1.d = fptemp;
 expdif = EXPD(fpsrcop1) - EXPD(fptemp1);
+
+if (expdif < 0) {
+/* optimisation? taken from the AMD docs */
+env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
+	/* ST0 is unchanged */
+return;
+}
+
 if ( expdif < 53 ) {
-dblq = fpsrcop / fptemp;
+dblq = fpsrcop/*ST0*/ / fptemp/*ST1*/;
+	/* round dblq towards zero */
 dblq = (dblq < 0.0)? ceil(dblq): floor(dblq);
-ST0 = fpsrcop - fptemp*dblq;
-q = (int)dblq; /* cutting off top bits is assumed here */
+ST0 = fpsrcop/*ST0*/ - fptemp*dblq;
+
+	/* convert dblq to q by truncating towards zero */
+	if (dblq < 0.0)
+   q = (signed long long int)(-dblq);
+else
+   q = (signed long long int)dblq;
+
 env->fpus &= (~0x4700); /* (C3,C2,C1,C0) <-- 0000 */
-/* (C0,C1,C3) <-- (q2,q1,q0) */
-env->fpus |= (q&0x4) << 6; /* (C0) <-- q2 */
-env->fpus |= (q&0x2) << 8; /* (C1) <-- q1 */
-env->fpus |= (q&0x1) << 14; /* (C3) <-- q0 */
+/* (C0,C3,C1) <-- (q2,q1,q0) */
+env->fpus |= (q&0x4) << (8-2);  /* (C0) <-- q2 */
+env->fpus |= (q&0x2) << (14-1); /* (C3) <-- q1 */
+env->fpus |= (q&0x1) << (9-0);  /* (C1) <-- q0 */
 } else {
+int N = 32 + (expdif % 32); /* as per AMD docs */
 env->fpus |= 0x400;  /* C2 <-- 1 */
-fptemp = pow(2.0, 

[Qemu-devel] Redundant repz prefixes in generated amd64 code

2007-03-16 Thread Julian Seward

I'm seeing redundant repz (0xF3) prefixes in generated code, typically
just before jumps:

code_gen_buffer+415:  repz mov $0xe07f,%eax
code_gen_buffer+421:  mov    %eax,0x20(%rbp)
code_gen_buffer+424:  lea    -25168302(%rip),%ebx  # 0xaf0420 <tbs+96>
code_gen_buffer+430:  retq
code_gen_buffer+431:  mov    -25168245(%rip),%eax  # 0xaf0460 <tbs+160>
code_gen_buffer+437:  jmpq   *%rax
code_gen_buffer+439:  repz mov $0xe092,%eax
code_gen_buffer+445:  mov    %eax,0x20(%rbp)
code_gen_buffer+448:  lea    -25168325(%rip),%ebx  # 0xaf0421 <tbs+97>
code_gen_buffer+454:  retq

I assume these are something to do with translation chaining/unchaining but
have been unable to figure out where they come from.  I know they get executed
and so are not data - valgrind barfs on them.

This is on a 64-bit host (Core 2) with qemu-0.9.0 compiled from source by
gcc-3.4.6, running an x86 (32-bit) guest.

At a guess I'd say the mov $imm,%eax is (created by? to do with?) 
gen_jmp_im in target-i386/translate.c, but I don't see how the F3 
got in on the act.  Grepping the source for 0xF3 turns up nothing 
plausible.  Any ideas where it comes from and how to get rid of it?

J




Re: [Qemu-devel] Redundant repz prefixes in generated amd64 code

2007-03-16 Thread Julian Seward
On Friday 16 March 2007 14:28, Paul Brook wrote:
 On Friday 16 March 2007 14:15, Julian Seward wrote:
  I'm seeing redundant repz (0xF3) prefixes in generated code, typically
  just before jumps:
 
  code_gen_buffer+415:  repz mov $0xe07f,%eax
  code_gen_buffer+421:  mov    %eax,0x20(%rbp)
  code_gen_buffer+424:  lea    -25168302(%rip),%ebx  # 0xaf0420 <tbs+96>
  code_gen_buffer+430:  retq
  code_gen_buffer+431:  mov    -25168245(%rip),%eax  # 0xaf0460 <tbs+160>
  code_gen_buffer+437:  jmpq   *%rax
  code_gen_buffer+439:  repz mov $0xe092,%eax
  code_gen_buffer+445:  mov    %eax,0x20(%rbp)
  code_gen_buffer+448:  lea    -25168325(%rip),%ebx  # 0xaf0421 <tbs+97>
  code_gen_buffer+454:  retq
 
  I assume these are something to do with translation chaining/unchaining
  but have been unable to figure out where they come from.

 8b50 <op_goto_tb1>:
 8b50:   8b 05 00 00 00 00   mov    0(%rip),%eax
 8b52: R_X86_64_PC32 __op_param1+0x3c
 8b56:   ff e0   jmpq   *%rax
 8b58:   f3 c3   repz retq

 qemu only strips the final ret off.
 The prefixed ret is to avoid prefetch stalls on amd cpus.

So the implication of this is that the generated code just happens to
work only because the dangling F3 never ends up in front of some other
instruction which it would change the meaning of?
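
The effect is easy to reproduce on the op bytes quoted above.  The two
functions below are hypothetical stand-ins, not the actual dyngen code: the
first models "strip only the final ret", the second also drops an F3 byte
immediately in front of it.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of the behaviour described above: drop only a trailing C3 (ret).
   Returns the number of bytes that would be copied into the code buffer. */
size_t strip_ret_naive(const uint8_t *code, size_t len)
{
    if (len >= 1 && code[len - 1] == 0xC3)
        len -= 1;
    return len;
}

/* Variant that also drops an F3 (repz) byte directly before the ret.
   This is only a heuristic: it assumes the F3 really is a prefix of the
   ret and not, say, the last byte of a preceding immediate. */
size_t strip_ret_with_prefix(const uint8_t *code, size_t len)
{
    if (len >= 1 && code[len - 1] == 0xC3) {
        len -= 1;
        if (len >= 1 && code[len - 1] == 0xF3)
            len -= 1;
    }
    return len;
}
```

For the ten bytes of op_goto_tb1 (8b 05 00 00 00 00 ff e0 f3 c3) the naive
version keeps nine bytes, ending in the dangling f3, while the second keeps
eight.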

J




Re: [Qemu-devel] SSE 'maxps' instruction bug?

2007-03-16 Thread Julian Seward
On Friday 16 March 2007 18:07, Rob Landley wrote:
 On Tuesday 13 March 2007 10:21 pm, Julian Seward wrote:
   0.9.0, or that the compiler/host combination used to build the qemu
   binary Julian is running generated bad code for the float compares.
 
  I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make install')
  on a 64-bit machine.  If it is qemu generating bad code due to variations
  in gcc behaviour, that's another argument in favour of scrapping the gcc
  3.X based backend and using a self contained, handwritten insn selector
  and register allocator.

 Are you referring to https://nowt.dyndns.org/ or something else?

I was referring to an idea, of which the nowt thing is an implementation.
I'm not aware of any other such backends to qemu.

J




Re: [Qemu-devel] QCOW(2) image corruption under QEMU 0.9.0 reproducible

2007-03-16 Thread Julian Seward

I ran QEMU on Valgrind for several hours last night, including a couple
of boot-shutdown cycles of RedHat8, and lots of file copying/deletion 
in the guest to get the qcow2 image up to 8GB and generally cause a lot
of disk IO.  I got no memory errors whatsoever from Valgrind and got no
filesystem corruption, so I guess I didn't trigger the bug.

Really the first thing to do is establish a reliable way to reproduce it.

J


On Friday 16 March 2007 17:00, Ben Taylor wrote:
  J M Cerqueira Esteves [EMAIL PROTECTED] wrote:
  herbie hancock wrote:
   Hello, i had also a reproducible disk crash:
   info of the last good image, size is about 3,5GB
  
   I never experienced such a bad problem with qemu before, maybe it is a
   problem with qcow2 format ?
 
  After the problems with qcow2 images which I reported here a few weeks
  ago, I've only been using qcow images (under QEMU 0.9.0), without such
  surprises.  So it seems qemu has some bug related to qcow2 images,
  maybe manifesting itself only after they get larger than 4GB...

 I suspect I saw problems with qcow2 images as well.  I was able to suspend
 a Solaris Nevada B58 install and use savevm about 30% into the install and
 restart it later.  As the image completed, the file system went all to hell
 with corruption that was impossible to fix.  At the time, I attributed it
 to the Solaris install (thinking it might have something to do with the
 cmpxchg8b bug that was later fixed), but with the multiple reports I've
 seen, I now suspect I saw the same thing.

 I'm testing conversion of a qcow image to a qcow2 image.  We'll see how
 that goes

 Ben




Re: [Qemu-devel] [PATCH] Reducing X communication bandwidth, take 2

2007-03-16 Thread Julian Seward
On Friday 16 March 2007 18:40, Anthony Liguori wrote:
 Hi Julian,

 Julian Seward wrote:
  Here is a somewhat revised version of a patch I first made nearly
  three years ago.  The original thread is
 
  http://lists.gnu.org/archive/html/qemu-devel/2004-07/msg00263.html
 
  It still uses a shadow frame buffer.  Fabrice mentioned this is not
  necessary

 I thought about this a little and here's what I came up with.

 I think we could change vga_draw_line* so that as part of the drawing
 process, it compared the newly generated pixel with the previous pixel
 value and returned back the min, max x-coordinate that changed.

 Since we tend to only extend the vertical dirty range by a couple
 pixels, this should be a relatively cheap way of reducing the size of
 the update region.

Sounds plausible - having said that, I have no familiarity with the VGA
code.  Also sounds like a cleaner solution than mine.

Is there something which guarantees that the vertical dirty range only
overshoots by some small number of pixels?  (thinking more about it ..
it doesn't matter - finding min/max that changed for each line will also
make it possible to identify the vertical limits of the change).
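
A minimal sketch of the per-line min/max idea, to make the discussion
concrete.  It assumes 32-bit pixels and a shadow copy of the previous
scanline; the function name and signature are invented for illustration and
do not correspond to anything in the QEMU VGA code.

```c
#include <stdint.h>

/* Compare a freshly drawn scanline against its shadow copy, update the
   shadow, and report the inclusive x range that changed.
   Returns 1 if any pixel changed (and sets *xmin, *xmax), else 0. */
int line_dirty_span(uint32_t *shadow, const uint32_t *fresh, int width,
                    int *xmin, int *xmax)
{
    int lo = width, hi = -1;
    int x;

    for (x = 0; x < width; x++) {
        if (shadow[x] != fresh[x]) {
            if (x < lo)
                lo = x;
            hi = x;
            shadow[x] = fresh[x];
        }
    }
    if (hi < 0)
        return 0;       /* line unchanged */
    *xmin = lo;
    *xmax = hi;
    return 1;
}
```

Running this over every line in the nominal dirty range gives the vertical
limits as well: the first and last lines for which it returns 1.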

Will this work also for the CL542x adaptor?  (Does that fall in the category
of vga?)  My current hack works both with and without -std-vga and I think
that's because it lives underneath both, in the connection to SDL.

J




Re: [Qemu-devel] Redundant repz prefixes in generated amd64 code

2007-03-16 Thread Julian Seward

  ifeq ($(ARCH),x86_64)
 +OP_CFLAGS+= -mtune=nocona -W -Wall -O4
  BASE_LDFLAGS+=-Wl,-T,$(SRC_PATH)/$(ARCH).ld
  endif

That works.  Thanks.

J




Re: [Qemu-devel] QCOW(2) image corruption under QEMU 0.9.0 reproducible

2007-03-15 Thread Julian Seward

Something similar happened to me.  At first I thought it was a hardware
(host) problem and so do not have good details - this is from memory.

- 0.9.0, binary build from qemu.org
- i386 openSUSE 10.2 host
- RedHat 8 guest
- .qcow2 image, max size 8GB
- using the Accelerator but not -kernel-kqemu

- first sign of trouble was the ext3 driver in RedHat8 complaining of
  something about disk geometry (something % 4 was not zero when it
  should have been)

- this is when the disk image was about 3G in size

- shortly thereafter the disk image was so corrupted that e2fsck
  could not fix it

- IIRC, ls -l on the corrupted image showed some implausibly huge size
  (> 100GB ?) leading me to believe the file had some huge block of
  zeroes in the middle

J


On Wednesday 14 March 2007 23:20, herbie hancock wrote:
 Hello, i had also a reproducible disk crash:
 info of the last good image, size is about 3,5GB

 image: debian4_0.dsk file format: qcow2 virtual size: 16G (16777216000
 bytes) disk size: 3.3G cluster_size: 4096

 As soon as the image increases to a size of about 7,9 GB the emulator locks
 up, and after a restart the image is not readable any more. The start
 of the image is filled with zeros, the signature of the file at the start
 (QFI )  is overwritten.

 I tried it two times, started with an intact image with the size above and
 both times the image was corrupted.

 Host: WIN2K
 Guest: Debian 4.0 Etch.
 qemu: 0.90 (build date 2007-02-19, the version that comes with
 http://www.davereyn.co.uk/qem/setupqemuk40.exe)

 I tested the image above with virtualbox (installed backup of the qemu disk
 with acronis trueimage and bart pe boot cd) , started with the above image,
 and the problems are gone, image is now filled with more than 9GB, no
 problem so far.

 I never experienced such a bad problem with qemu before, maybe it is a
 problem with qcow2 format ?

 Bye
 HR


[Qemu-devel] RFC: This project needs a stable branch

2007-03-15 Thread Julian Seward

I am a great fan of QEMU, and have used it more or less continuously
for the past 2+ years.  Over that time I've installed and operated
various Linux and Windows guests with varying degrees of success.

The recently released 0.9.0 seems a big step forward in the
stability/usability department, which is excellent.  But there are
still residual worries -- for example, qcow2 images corrupted for no
obvious reason -- which, whilst a boring problem, is important for
folks like me who want to run VMs 24x7 with the hope of complete
reliability.

Pretty much all mature projects which have achieved widespread usage
have one or more stable branches along with the main development
branch (trunk).  Think GCC, the kernel, KDE, ... the list is endless.

Maintaining a stable branch is extra hassle and overhead, but it is
the standard way to operate, for reasons which are obvious: the
majority of users care more about stability, reliability and usability
than they do about the latest new features, and delivering stability
from a branch used for bleeding-edge development work is pretty much
impossible.  That is not, of course, a criticism of the bleeding edge
developers, since it is they who ultimately drive the project along.

I am writing to propose that a stable branch be made from the 0.9.0
release point.  The aim would be to maximise stability for (IMO) the
subset of functionality that has the largest potential user base:
i386-softmmu + Accelerator and x86_64-softmmu + Accelerator, but
excluding -kernel-kqemu due to limitations described in
http://qemu.org/kqemu-doc.html#SEC7.

Subsequent releases of the branch would contain no functionality
enhancements, but just bug fixes, with the eventual aim of achieving
'it just works' status for any x86/x86_64 guest I try to install/run.
I know that's a tall order, and that 0.9.0 may not be able to supply
that for all guests.  But it is an important goal to strive for.

My impression is that (at least as I perceive it) the lack of emphasis
on maximising stability on a stable branch, and the lack of a bug
tracker, is artificially restricting QEMU's user base, and therefore
indirectly its long term prospects.  This is a shame, because QEMU is
a very remarkable and useful project, which should be used (and
usable) by everybody and anybody.

J


___
Qemu-devel mailing list
Qemu-devel@nongnu.org
http://lists.nongnu.org/mailman/listinfo/qemu-devel


Re: [Qemu-devel] RFC: This project needs a stable branch

2007-03-15 Thread Julian Seward

On Thursday 15 March 2007 13:48, Anthony Liguori wrote:
 I'm not necessarily sure I agree that a stable branch is the best thing
 to have (versus aiming for never introducing regressions).

Aiming for no regressions is a worthy aim, but I believe unachievable
in a project of any size.  For sure it's impossible if there is ever a
need to make large-scale infrastructural changes, which inevitably is
occasionally the case if the project is to live a long time.

For example, if the dyngen/gcc-based backend is replaced by a
self-contained handwritten one, I would be amazed if there were not
a few obscure regressions whilst the new backend is brought up to the 
same level of stability as the current one.  At least, that is what
I know from my own code generator hacking.

J




Re: [Qemu-devel] [PATCH] Reducing X communication bandwidth, take 2

2007-03-14 Thread Julian Seward
On Wednesday 14 March 2007 04:57, Mark Williamson wrote:
   Here is a somewhat revised version of a patch I first made nearly
   three years ago.  The original thread is
 
  The idea here is quite similar to what the VNC server does now.
 
  If this is desirable for SDL too, then perhaps we should find a way to
  fold this into common code?
 
  Although, is there a compelling reason to use SDL over X instead of VNC?

 I sometimes do this sort of thing because it Just Works with no manual
 configuring of port forwarding etc.  I don't necessarily like to do it for
 extended usage but it is very convenient.

Yes.  VNC is all very nice (I use it a lot) but is hassle to set up,
what with making holes in firewalls and/or port forwarding etc.  This
patch has the "it just works" property.

In fact I obtained (by far) the best remote X performance by using both
this patch and making the remote X connection with 
ssh -XC -o CompressionLevel=1.  The patch knocks out the majority of
the data, and the ssh compression squashed what remained by more than a
factor of 10.  Doing this it was hard to tell that QEMU was not running
locally.

J




Re: [Qemu-devel] SSE 'maxps' instruction bug?

2007-03-13 Thread Julian Seward

 QEMU and Core 2 Duo disagree on the handling of NaNs it seems.

 http://courses.ece.uiuc.edu/ece390/books/labmanual/inst-ref-simd.html
 - this implies that MAXPS should leave the NaNs alone, no idea how
 normative that is though (and no IA32 manual at hand)

Having looked at an IA32 manual I'd say the inst-ref-simd.html
description agrees with it, so the Core 2 behaviour is what 
qemu should do.
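
For reference, the per-lane rule can be written down as a minimal C sketch.
It assumes the documented MAXPS behaviour that the destination lane is kept
only when it compares strictly greater than the source lane, so the source
lane is returned whenever either input is a NaN.  The helper names are mine
and not anything in QEMU.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as an IEEE single without aliasing tricks. */
static float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* One 32-bit lane of MAXPS (hypothetical helper, not QEMU code).
   dst is the destination register lane, src the source operand lane.
   Any comparison involving a NaN is false, so a NaN in either lane
   makes the source lane the result, bit for bit. */
static uint32_t maxps_lane(uint32_t dst_bits, uint32_t src_bits)
{
    float dst = bits_to_float(dst_bits);
    float src = bits_to_float(src_bits);
    return (dst > src) ? dst_bits : src_bits;
}
```

With the second lane of the test case (destination 0x7fd6378d, a NaN, and
source 0xb9b9dc22) this returns the source lane, which is what the Core 2
printed as 22dcb9b9.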

J






[Qemu-devel] [PATCH] Reducing X communication bandwidth, take 2

2007-03-13 Thread Julian Seward

Here is a somewhat revised version of a patch I first made nearly
three years ago.  The original thread is 

http://lists.gnu.org/archive/html/qemu-devel/2004-07/msg00263.html

The patch makes QEMU's graphics emulation much more usable over remote
X connections, by reducing the amount of data sent to the X server.
This is particularly noticeable for small display updates, most
importantly mouse cursor movements, which become faster and so 
generally make the guest's GUI more pleasant to use.

Compared to the original patch, this one:

- is relative to 0.9.0

- handles screen->format->BytesPerPixel values of both 2 and 4,
  so it handles most guest depths - I tested 8, 16, 24bpp.

- has unrolled comparison/copy loops for the depth 2 case.  Most
  of the comparisons are short (<= 64 bytes) so I don't see much
  point in taking the overhead of a call to memcmp/memcpy.

- most importantly, is optional and disabled by default, so that
  default performance is unchanged.  To use it you need the new
  -remote-x11 flag (perhaps -low-bandwidth-x11 would be a better 
  name).

It still uses a shadow frame buffer.  Fabrice mentioned this is not
necessary

 http://lists.gnu.org/archive/html/qemu-devel/2004-07/msg00279.html

but I can't see how to get rid of it and still check for redundant
updates in NxN pixel blocks (N=32 by default).  The point of checking
NxN squares is that mouse pointer pixmaps are square and so the most
common display updates - mouse pointer movements - are often reduced
to transmission of a single 32x32 block using this strategy.

The shadow framebuffer is only allocated when -remote-x11 is present,
so the patch has no effect on default memory use.
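
The block-checking idea can be sketched in a few lines of C.  This is only an
illustration of the strategy described above, not the code in the patch: it
uses memcmp/memcpy instead of the unrolled loops, and the function names and
the flat byte-array framebuffer are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK 32   /* N in the NxN blocks */

/* Compare one block of the live framebuffer with the shadow copy.
   If they differ, bring the shadow up to date and return 1. */
static int block_changed(const uint8_t *live, uint8_t *shadow,
                         size_t pitch, size_t bpp,
                         int x, int y, int w, int h)
{
    int changed = 0;
    assert(w > 0 && h > 0);
    for (int row = y; row < y + h; row++) {
        size_t off = (size_t)row * pitch + (size_t)x * bpp;
        size_t len = (size_t)w * bpp;
        if (memcmp(live + off, shadow + off, len) != 0) {
            memcpy(shadow + off, live + off, len);
            changed = 1;
        }
    }
    return changed;
}

/* Walk an update rectangle in BLOCK x BLOCK tiles and count the tiles
   that really changed; only those would need sending to the X server. */
static int count_dirty_blocks(const uint8_t *live, uint8_t *shadow,
                              size_t pitch, size_t bpp,
                              int x, int y, int w, int h)
{
    int dirty = 0;
    for (int by = y; by < y + h; by += BLOCK) {
        int bh = (y + h - by < BLOCK) ? (y + h - by) : BLOCK;
        for (int bx = x; bx < x + w; bx += BLOCK) {
            int bw = (x + w - bx < BLOCK) ? (x + w - bx) : BLOCK;
            dirty += block_changed(live, shadow, pitch, bpp, bx, by, bw, bh);
        }
    }
    return dirty;
}
```

A pointer movement that touches one tile of a 64x64 update rectangle makes
this report a single dirty block, and a repeated update of the same area
reports none.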

I measured the bandwidth saving roughly by resuming a vm snapshot
containing a web browser showing a page with a lot of links.  I moved
the pointer slowly over the links (so they change colour) and scrolled
up and down a bit; about 1/2 minute of activity in total.  I tried to do
the same with and without -remote-x11.  Without -remote-x11, 163Mbyte
was transmitted to the X server; with it, 20.6Mbyte was, about an 8:1
reduction.

J


diff -u -r Orig/qemu-0.9.0/sdl.c qemu-0.9.0/sdl.c
--- Orig/qemu-0.9.0/sdl.c   2007-02-05 23:01:54.0 +
+++ qemu-0.9.0/sdl.c2007-03-13 22:16:40.0 +
@@ -29,6 +29,8 @@
 #include <signal.h>
 #endif
 
+#include <assert.h>
+
 static SDL_Surface *screen;
 static int gui_grab; /* if true, all keyboard/mouse events are grabbed */
 static int last_vm_running;
@@ -44,17 +46,232 @@
 static SDL_Cursor *sdl_cursor_hidden;
 static int absolute_enabled = 0;
 
+/* Mechanism to reduce the total amount of data transmitted to the X
+   server, often quite dramatically.  Keep a shadow copy of video
+   memory in alt_pixels, and when asked to update a rectangle, use
+   the shadow copy to establish areas which are the same, and so do
+   not need updating.
+*/
+
+static void* alt_pixels = NULL;
+
+#define THRESH 32
+
+/* Return 1 if the area [x .. x+w-1, y .. y+h-1] is different from
+   the old version and so needs updating. */
+static int cmpArea_16bit ( int x, int y, int w, int h )
+{
+  int i, j;
+  unsigned int sll;
+  unsigned short* p1base = (unsigned short*)screen->pixels;
+  unsigned short* p2base = (unsigned short*)alt_pixels;
+  assert(screen->format->BytesPerPixel == 2);
+  if (w == 0 || h == 0)
+    return 0;
+  assert(w > 0 && h > 0);
+  sll = ((unsigned int)screen->pitch) >> 1;
+  for (j = y; j < y+h; j++) {
+    unsigned short* p1p = &p1base[j * sll + x];
+    unsigned short* p2p = &p2base[j * sll + x];
+    for (i = 0; i < w-5; i += 5) {
+      if (p1p[i+0] != p2p[i+0]) return 1;
+      if (p1p[i+1] != p2p[i+1]) return 1;
+      if (p1p[i+2] != p2p[i+2]) return 1;
+      if (p1p[i+3] != p2p[i+3]) return 1;
+      if (p1p[i+4] != p2p[i+4]) return 1;
+    }
+    for (/*fixup*/; i < w; i++) {
+      if (p1p[i+0] != p2p[i+0]) return 1;
+    }
+  }
+  return 0;
+}
+static void copyArea_16bit ( int x, int y, int w, int h )
+{
+  int i, j;
+  unsigned int sll;
+  unsigned short* p1base = (unsigned short*)screen->pixels;
+  unsigned short* p2base = (unsigned short*)alt_pixels;
+  assert(screen->format->BytesPerPixel == 2);
+  sll = ((unsigned int)screen->pitch) >> 1;
+  if (w == 0 || h == 0)
+    return;
+  assert(w > 0 && h > 0);
+  for (j = y; j < y+h; j++) {
+    unsigned short* p1p = &p1base[j * sll + x];
+    unsigned short* p2p = &p2base[j * sll + x];
+    for (i = 0; i < w-5; i += 5) {
+      p2p[i+0] = p1p[i+0];
+      p2p[i+1] = p1p[i+1];
+      p2p[i+2] = p1p[i+2];
+      p2p[i+3] = p1p[i+3];
+      p2p[i+4] = p1p[i+4];
+    }
+    for (/*fixup*/; i < w; i++) {
+      p2p[i+0] = p1p[i+0];
+    }
+  }
+}
+
+static int cmpArea_32bit ( int x, int y, int w, int h )
+{
+  int   i, j;
+  unsigned int  sll;
+  unsigned int* p1base = (unsigned int*)screen->pixels;
+  unsigned int* p2base = (unsigned int*)alt_pixels;
+  assert(screen->format->BytesPerPixel == 4);
+  sll = ((unsigned int)screen->pitch)

Re: [Qemu-devel] SSE 'maxps' instruction bug?

2007-03-13 Thread Julian Seward

 0.9.0, or that the compiler/host combination used to build the qemu
 binary Julian is running generated bad code for the float compares.

I used gcc 3.4.6 bootstrapped as normal ('make bootstrap; make install')
on a 64-bit machine.  If it is qemu generating bad code due to variations
in gcc behaviour, that's another argument in favour of scrapping the gcc 
3.X based backend and using a self contained, handwritten insn selector
and register allocator.

J




[Qemu-devel] SSE 'maxps' instruction bug?

2007-03-12 Thread Julian Seward

The program below tests the 'maxps' instruction.  When run on 
qemu-0.9.0, host amd64, guest x86, guest OS redhat8, it prints:

   f9a511d1 8d37d67f b34825b8 e2f40739

scp the binary to a Core 2 (real) machine and run:

   f9a511d1 22dcb9b9 b34825b8 e2f40739

Second 32-bit word is completely different.

This is 0.9.0 compiled from source using gcc-3.4.6, host openSuSE
10.2 on a Core 2 Duo in 64-bit mode.

Any ideas?  I grepped the 0.9.0 sources for maxps but couldn't
figure out where/how it is handled.

J



#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <malloc.h>
#include <string.h>

typedef  unsigned char  V128[16];
typedef  signed int     Int;

static void showV128 ( V128* v )
{
   Int i;
   for (i = 0; i < 16; i++) {
      printf("%02x", (Int)(*v)[i]);
      if (i > 0 && (i % 4) == 3) printf(" ");
   }
}

static V128 arg1 = { 0x28,0x9b,0x57,0xf7,0x22,0xdc,0xb9,0xb9,
                     0x0a,0xb3,0x8a,0xcf,0x73,0xbb,0xe4,0x0b };
static V128 arg2 = { 0xf9,0xa5,0x11,0xd1,0x8d,0x37,0xd6,0x7f,
                     0xb3,0x48,0x25,0xb8,0xe2,0xf4,0x07,0x39 };
static V128 res;

int main ( int argc, char** argv )
{
   __asm__ __volatile__(
      "movups (%0),%%xmm6\n\t"
      "movups (%1),%%xmm7\n\t"
      "maxps %%xmm6,%%xmm7\n\t"
      "movups %%xmm7, (%2)\n\t"
      : : "r"(&arg1), "r"(&arg2), "r"(&res)
      : "xmm6", "xmm7"
   );
   showV128( &res );
   printf("\n");
   return 0;
}

/* Output on qemu-0.9.0, host amd64, guest x86, guest OS redhat8:
   f9a511d1 8d37d67f b34825b8 e2f40739

   Run same binary on a Core 2:
   f9a511d1 22dcb9b9 b34825b8 e2f40739

   Second 32-bit word is completely different.
*/




Re: [Qemu-devel] How to get 1280x1024 display from guest running Xorg?

2007-02-23 Thread Julian Seward

Thanks for the feedback.  Since I do not wish to be involved in a
"great battle" (as you so nicely put it) I'll stick with VMware (sigh).

J


On Wednesday 21 February 2007 15:05, Robin Atwood wrote:
 On Wednesday 21 Feb 2007, Julian Seward wrote:
  (replying off list)
 
  So you have Solaris 10 (x86 ?) running on qemu-0.9 ?  Is it stable?
  Does it work?  I have it running on vmware-5.5.3 but would prefer to
  move to running it on qemu if possible; however I've had mixed
  results with qemu in the past and don't want to spend loads of time
  on failed attempts to get it to work.  Hence the question.

 It was a great battle to install but now it is stable. Do the following
 things:
 1. install from the DVD image
 2. Use the text console install
 3. At the end of the install, back up the image file *before* the first
 reboot
 4. If during the first boot of the image, you get a segfault,
 restore and try again until you get to a prompt. Ignore any service
 failures. (the filesystem seems prone to corruption at the first boot.)
 5. If you have problems caused by damaged files, re-install choosing
 the Update option: this will restore the damaged files.

 After that, I was able to boot reliably into X. However, the filesystem
 seems very fragile if not shut down cleanly, so take regular backups!

 HTH
 -Robin.




Re: [Qemu-devel] [PATCH 1/2] Escape filenames in monitor

2006-12-17 Thread Julian Seward
On Saturday 16 December 2006 21:11, Anthony Liguori wrote:
 info block is impossible to parse reliably because there is no escaping
 done on the filename.

Don't you also need to convert \ to \\ ?  Else any \ which was in
the original string will confuse the parser of the escaped output.
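
To make the point concrete, here is a minimal sketch of an escaper that
treats the backslash itself as one of the characters to escape.  It is not
the code from the patch; the function name and the choice of which other
characters count as special (space and double quote here) are assumptions
for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Copy 'in' to 'out', putting a backslash before every special
   character, including backslash itself.  Returns the length written
   (excluding the terminating NUL), or (size_t)-1 if 'out' is too small.
   Hypothetical helper, not the monitor code. */
static size_t escape_filename(const char *in, char *out, size_t outsz)
{
    size_t o = 0;
    assert(in != NULL && out != NULL);
    for (; *in != '\0'; in++) {
        int special = (*in == '\\' || *in == ' ' || *in == '"');
        size_t need = special ? 2 : 1;
        if (o + need >= outsz)          /* keep room for the NUL */
            return (size_t)-1;
        if (special)
            out[o++] = '\\';
        out[o++] = *in;
    }
    if (o >= outsz)
        return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

Because a literal backslash always comes out doubled, a parser that sees a
backslash in the escaped output knows it is an escape introducer and never
an original character.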

J




Re: [Qemu-devel] Make -std-vga the default?

2006-06-28 Thread Julian Seward

  Really? My win2k install couldn't do anything useful with -std-vga.
  It would only do the very basic 640x480x4 mode. I'm fairly sure win9x
  can't do anything useful with straight VGA either.

 Same here.  Also std-vga seemed to be slower than cirrus when I tried
 it recently on my linux guests, although I haven't actually measured
 anything.

My mistake; Win2K doesn't like -std-vga.  I confused 2K and XP.

   Overall it seems to work much better than the default 5446

 Julian, in what way is std-vga better than the cirrus emulation?

I can go above 1024x768, which is realistically something I need in
order to use QEMU as a viable replacement for VMware.

With SuSE 10.1 guest I can't even get 1024x768 with Cirrus.  SuSE
claims it's doing 1024x768 but what I get is 1024x600.

  In my experience the Cirrus emulation just works, and is supported
  by pretty much every OS out of the box. AFAIK Windows earlier than XP
  needs additional 3rd party drivers to support anonymous VESA
  hardware.

I agree that avoiding additional drivers is good.  However it seems that
both cirrus and std-vga have their shortcomings and neither is an ideal
out-of-the-box solution right now.

J




Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour

2006-06-21 Thread Julian Seward

 It appears that cvttps2dq is indeed the only exception in the range,
 combined patch that fixes both movd?q2d?q and cvttps2dq is attached.

 I don't have any kind of SSE on this machine so would appreciate if
 someone would run tests/test-i386 with the patch attached.

That works for me.  Thanks.  Valgrind's integer/x87/MMX/SSE/SSE2 tests 
now all pass on i386-softmmu.  I didn't try tests/test-i386 though.

Fabrice, can you commit this?

J




[Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour

2006-06-20 Thread Julian Seward

The SSE2 instructions cvttps2dq, movdq2q, movq2dq do not behave
correctly, as shown by the attached program.  It should print

  cvttps2dq_1 ... ok
  cvttps2dq_2 ... ok
  movdq2q_1 ... ok
  movq2dq_1 ... ok

but instead produces

  cvttps2dq_1 ... ok
  cvttps2dq_2 ... not ok
result0.sd[0] = 12 (expected 12)
result0.sd[1] = 3 (expected 56)
result0.sd[2] = -2147483648 (expected 43)
result0.sd[3] = 3 (expected 87)
  movdq2q_1 ... not ok
result0.uq[0] = 1302123111658042420 (expected 5124095577148911)
  movq2dq_1 ... not ok
result0.uq[0] = 1302123111658042420 (expected 5124095577148911)
result0.uq[1] = 6221254864647256184 (expected 0)

I looked at QEMU's instruction decoders for these, and compared them
to Valgrind's, but could not see what the problem was.  The decode
logic looks OK.  Maybe the problem is elsewhere.

J

---

#include <math.h>
#include <setjmp.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>

typedef union {
  char sb[1];
  unsigned char ub[1];
} reg8_t;

typedef union {
  char sb[2];
  unsigned char ub[2];
  short sw[1];
  unsigned short uw[1];
} reg16_t;

typedef union {
  char sb[4];
  unsigned char ub[4];
  short sw[2];
  unsigned short uw[2];
  long int sd[1];
  unsigned long int ud[1];
  float ps[1];
} reg32_t;

typedef union {
  char sb[8];
  unsigned char ub[8];
  short sw[4];
  unsigned short uw[4];
  long int sd[2];
  unsigned long int ud[2];
  long long int sq[1];
  unsigned long long int uq[1];
  float ps[2];
  double pd[1];
} reg64_t __attribute__ ((aligned (8)));

typedef union {
  char sb[16];
  unsigned char ub[16];
  short sw[8];
  unsigned short uw[8];
  long int sd[4];
  unsigned long int ud[4];
  long long int sq[2];
  unsigned long long int uq[2];
  float ps[4];
  double pd[2];
} reg128_t __attribute__ ((aligned (16)));

static sigjmp_buf catchpoint;

static void handle_sigill(int signum)
{
   siglongjmp(catchpoint, 1);
}

__attribute__((unused))
static int eq_float(float f1, float f2)
{
   return f1 == f2 || fabsf(f1 - f2) < fabsf(f1) * 1.5 * pow(2,-12);
}

__attribute__((unused))
static int eq_double(double d1, double d2)
{
   return d1 == d2 || fabs(d1 - d2) < fabs(d1) * 1.5 * pow(2,-12);
}

static void cvttps2dq_1(void)
{
   reg128_t arg0 = { .ps = { 12.34F, 56.78F, 43.21F, 87.65F } };
   reg128_t arg1 = { .sd = { 1L, 2L, 3L, 4L } };
   reg128_t result0;
   char state[108];

   if (sigsetjmp(catchpoint, 1) == 0)
   {
      asm(
         "fsave %3\n"
         "movlps 0%0, %%xmm4\n"
         "movhps 8%0, %%xmm4\n"
         "movlps 0%1, %%xmm5\n"
         "movhps 8%1, %%xmm5\n"
         "cvttps2dq %%xmm4, %%xmm5\n"
         "movlps %%xmm5, 0%2\n"
         "movhps %%xmm5, 8%2\n"
         "frstor %3\n"
         :
         : "m" (arg0), "m" (arg1), "m" (result0), "m" (state[0])
         : "xmm4", "xmm5"
      );

      if (result0.sd[0] == 12L && result0.sd[1] == 56L && result0.sd[2] == 43L &&
         result0.sd[3] == 87L )
      {
         printf("cvttps2dq_1 ... ok\n");
      }
      else
      {
         printf("cvttps2dq_1 ... not ok\n");
         printf("  result0.sd[0] = %ld (expected %ld)\n", result0.sd[0], 12L);
         printf("  result0.sd[1] = %ld (expected %ld)\n", result0.sd[1], 56L);
         printf("  result0.sd[2] = %ld (expected %ld)\n", result0.sd[2], 43L);
         printf("  result0.sd[3] = %ld (expected %ld)\n", result0.sd[3], 87L);
      }
   }
   else
   {
      printf("cvttps2dq_1 ... failed\n");
   }

   return;
}

static void cvttps2dq_2(void)
{
   reg128_t arg0 = { .ps = { 12.34F, 56.78F, 43.21F, 87.65F } };
   reg128_t arg1 = { .sd = { 1L, 2L, 3L, 4L } };
   reg128_t result0;
   char state[108];

   if (sigsetjmp(catchpoint, 1) == 0)
   {
      asm(
         "fsave %3\n"
         "movlps 0%1, %%xmm5\n"
         "movhps 8%1, %%xmm5\n"
         "cvttps2dq %0, %%xmm5\n"
         "movlps %%xmm5, 0%2\n"
         "movhps %%xmm5, 8%2\n"
         "frstor %3\n"
         :
         : "m" (arg0), "m" (arg1), "m" (result0), "m" (state[0])
         : "xmm4", "xmm5"
      );

      if (result0.sd[0] == 12L && result0.sd[1] == 56L && result0.sd[2] == 43L &&
         result0.sd[3] == 87L )
      {
         printf("cvttps2dq_2 ... ok\n");
      }
      else
      {
         printf("cvttps2dq_2 ... not ok\n");
         printf("  result0.sd[0] = %ld (expected %ld)\n", result0.sd[0], 12L);
         printf("  result0.sd[1] = %ld (expected %ld)\n", result0.sd[1], 56L);
         printf("  result0.sd[2] = %ld (expected %ld)\n", result0.sd[2], 43L);
         printf("  result0.sd[3] = %ld (expected %ld)\n", result0.sd[3], 87L);
      }
   }
   else
   {
      printf("cvttps2dq_2 ... failed\n");
   }

   return;
}

static void movdq2q_1(void)
{
   reg128_t arg0 = { .uq = { 0x012345678abcdefULL, 0xfedcba9876543210ULL } };
   reg64_t arg1 = { .uq = { 0x1212121234343434ULL } };
   reg64_t result0;
   char state[108];

   if (sigsetjmp(catchpoint, 1) == 0)
   {
      asm(
         "fsave %3\n"
         "movlps 0%0, %%xmm4\n"
         "movhps 8%0, %%xmm4\n"
         "movq %1, %%mm6\n"
 movdq2q 

Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour

2006-06-20 Thread Julian Seward
On Tuesday 20 June 2006 12:29, malc wrote:

 The signature of movdq2q is Pq, VRq and for movq2dq - Vo, PRq it appears
 that translate.c gets it backwards, attached patch should deal with it.

Cool.

 As for cvttps2dq i ran it with interpreter which uses outdated(i.e. non
 soft-float) conversion routines and it passed, so my guess would be that
 this is float32_to_int32_round_to_zero vs (int32_t) cast issue.

I had a feeling this is a garbage-in-memory (or regs, or somewhere)
problem.  Reason is that the wrong results kept changing as I cut
the full test program down to just the small one I posted.  Can you
try on a vanilla build of i386-softmmu from cvs?

J




Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour

2006-06-20 Thread Julian Seward

  [EMAIL PROTECTED] qemu]$ gcc -msse2 sse2test.c -o sse2test
  [EMAIL PROTECTED] qemu]$ ./sse2test
  cvttps2dq_1 ... failed
  cvttps2dq_2 ... failed
  movdq2q_1 ... failed
  movq2dq_1 ... failed
 
  what am i doing wrong here ?

 Running it on a CPU without SSE2, if i'm allowed to venture a guess.

Yup.  Try 'strace ./sse2test' and see if it gets SIGILLs thrown at it.

J




Re: [Qemu-devel] cvttps2dq, movdq2q, movq2dq incorrect behaviour

2006-06-20 Thread Julian Seward

Malc, your sse-movq.patch works for me.  Thanks.

 soft-float was a red herring, translate.c is at fault here (interpreter
 does not use it, hence behaved correctly)

 translate.c:3009
 if (b1 >= 2 && ((b >= 0x50 && b <= 0x5f) ||
                 b == 0xc2)) {
     /* specific case for SSE single instructions */
     if (b1 == 2) {
         /* 32 bit access */
         gen_op_ld_T0_A0[OT_LONG + s->mem_index]();
         gen_op_movl_env_T0(offsetof(CPUX86State,xmm_t0.XMM_L(0)));
     } else {
         /* 64 bit access */
         gen_ldq_env_A0[s->mem_index >> 2](offsetof(CPUX86State,xmm_t0.XMM_D(0)));
     }
 } else {
     gen_ldo_env_A0[s->mem_index >> 2](op2_offset);
 }

 cvttps2dq is 0x5b(b=0x5b) with repn prefix (b1=2) the above code is
 optimized a bit more than it should have been, as it loads only 4 bytes
 into xmm_t0 instead of 16.

Uh, fine, but I don't understand how/what to fix.  Can you advise?

J




[Qemu-devel] [PATCH] Bug in target-i386/helper.c:helper_fxam_ST0

2006-06-19 Thread Julian Seward

I've been doing some instruction set testing on i386-softmmu, 
with the aim of seeing if I can find any anomalies which might
be the cause of the Win2K SP4 installation failure.

helper_fxam_ST0 doesn't correctly distinguish infinities from
nans, and thereby causes programs that use the x86 'fxam'
instruction to occasionally produce incorrect results.  That
instruction is quite often used as part of transcendentals, for 
example pow, exp, log.

On a Linux guest, for example, it causes the libc call pow(0.6, inf)
to produce inf when it should produce zero, and causes about 20
cases in the FP correctness suite I'm using to fail.

The test case below shows the problem.  It should produce

0x4000: 0.000000
0x4200: -0.000000
0x0500: inf
0x0700: -inf
0x0300: nan
0x0100: nan
0x0400: 0.000000
0x0600: -0.000000
0x0400: 1.230000
0x0600: -1.230000

but instead produces (omitting the correct cases)

0x0100: inf
0x0300: -inf

What's strange is the logic in helper_fxam_ST0 looks correct.
The distinguish-nans-from-infinities part is

    if (expdif == MAXEXPD) {
        if (MANTD(temp) == 0)
            env->fpus |=  0x500 /*Infinity*/;
        else
            env->fpus |=  0x100 /*NaN*/;
    }

I suspect the check is correct for 52-bit mantissas (64-bit floats)
but not for 64-bit mantissas (80-bit floats), as per these notes:

/* 80 and 64-bit floating point formats:

   80-bit:

    S  0       0------0   zero
    S  0       0X-----X   denormals
    S  1-7FFE  1X-----X   normals (all normals have leading 1)
    S  7FFF    10-----0   infinity
    S  7FFF    10X----X   snan
    S  7FFF    11X----X   qnan

   S is the sign bit.  For runs X--X, at least one of the Xs must be
   nonzero.  Exponent is 15 bits, fractional part is 63 bits, and
   there is an explicitly represented leading 1, and a sign bit,
   giving 80 in total.

   64-bit avoids the confusion of an explicitly represented leading 1
   and so is simpler:

    S  0      0------0   zero
    S  0      X------X   denormals
    S  1-7FE  any        normals
    S  7FF    0------0   infinity
    S  7FF    0X-----X   snan
    S  7FF    1X-----X   qnan

   Exponent is 11 bits, fractional part is 52 bits, and there is a
   sign bit, giving 64 in total.
*/

For 52-bit mantissas, the mantissa zero-vs-nonzero check is correct.
But for 64-bit mantissas, where the leading 1 is stored explicitly,
the check needs to be
    if (MANTD(temp) == 0x8000000000000000ULL)
and indeed setting it to that makes the test program run correctly.
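
As a cross-check of the table above, here is a small standalone sketch of
the classification for the 80-bit format.  It is not the QEMU helper: the
function name and the split into sign/exponent/mantissa arguments are
assumptions made for illustration, and it ignores the unsupported encodings
(pseudo-denormals, unnormals, empty registers).

```c
#include <assert.h>
#include <stdint.h>

/* Return the C3/C2/C1/C0 bits of the x87 status word that fxam would
   set for an 80-bit value given as sign, 15-bit exponent and 64-bit
   mantissa (with its explicit leading 1).  C1 (0x0200) is the sign. */
static unsigned fxam_flags80(unsigned sign, unsigned exp15, uint64_t mant64)
{
    unsigned flags = sign ? 0x0200u : 0u;
    assert(exp15 <= 0x7FFFu);
    if (exp15 == 0x7FFFu) {
        /* Infinity has only the explicit leading 1 set. */
        if (mant64 == 0x8000000000000000ULL)
            flags |= 0x0500u;   /* infinity */
        else
            flags |= 0x0100u;   /* NaN */
    } else if (exp15 == 0) {
        if (mant64 == 0)
            flags |= 0x4000u;   /* zero */
        else
            flags |= 0x4400u;   /* denormal */
    } else {
        flags |= 0x0400u;       /* normal */
    }
    return flags;
}
```

Comparing the mantissa against zero instead would send every infinity down
the NaN branch, which matches the wrong 0x0100/0x0300 results shown earlier.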

Patch and testcase follow.

I'm still seeing cases where x87-based computation on qemu winds
up with a NaN when it shouldn't.  I think that's a separate problem.
Will investigate.

J

-

Index: target-i386/helper.c
===
RCS file: /sources/qemu/qemu/target-i386/helper.c,v
retrieving revision 1.65
diff -r1.65 helper.c
2952a2953,2955
> #   ifdef USE_X86LDOUBLE
>         if (MANTD(temp) == 0x8000000000000000ULL)
> #   else
2953a2957
> #   endif


-


#include <stdio.h>
#include <math.h>

/* FPU flag masks */
#define X86G_FC_SHIFT_C3   14
#define X86G_FC_SHIFT_C2   10
#define X86G_FC_SHIFT_C1   9
#define X86G_FC_SHIFT_C0   8

#define X86G_FC_MASK_C3    (1 << X86G_FC_SHIFT_C3)
#define X86G_FC_MASK_C2    (1 << X86G_FC_SHIFT_C2)
#define X86G_FC_MASK_C1    (1 << X86G_FC_SHIFT_C1)
#define X86G_FC_MASK_C0    (1 << X86G_FC_SHIFT_C0)

#define MASK_C3210 (X86G_FC_MASK_C3 | X86G_FC_MASK_C2 | X86G_FC_MASK_C1 | \
                    X86G_FC_MASK_C0)
double d;
int i;

extern void do_fxam ( void );

asm(
"\n"
"do_fxam:\n"
"\txorl %eax,%eax\n"
"\tfldl d\n"
"\tfxam\n"
"\tfnstsw %ax\n"
"\tffree %st(0)\n"
"\tmovl %eax, i\n"
"\tret\n"
);


double inf ( void ) { return 1.0 / 0.0; }
double nAn ( void ) { return 0.0 / 0.0; }
double den ( void ) { return 9.1e-220 / 1e100; }
double nor ( void ) { return 1.23; }

/* Try positive and negative variants of: zero, infinity,
   nAn, denorm and normal */

int main ( void )
{
   d =  0.0;   do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -0.0;   do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );

   d =  inf(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -inf(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );

   d =  nAn(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -nAn(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );

   d =  den(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -den(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );

   d =  nor(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   d = -nor(); do_fxam(); printf("0x%04x: %f\n", i & MASK_C3210, d );
   return 0;
}





Re: [Qemu-devel] Re: invisible wall patch

2006-06-17 Thread Julian Seward
On Saturday 17 June 2006 18:03, Rick Vernam wrote:
 On Saturday 17 June 2006 11:32, Alex wrote:
  This patch has been around for a while but never committed to the
  mainstream.

Huh?  Fabrice committed it some time around Tuesday.  I've been
using it 8+ hours/day since then and it seems fine to me.

J




Re: [Qemu-devel] VMware Player

2006-06-15 Thread Julian Seward
On Thursday 15 June 2006 14:18, WaxDragon wrote:
 On 6/15/06, kadil [EMAIL PROTECTED] wrote:
  On Wed, 2006-06-14 at 18:10 +0200, Oliver Gerlich wrote:
  Real world, gui's are just so easy & desirable, especially if the gui is
  consistent across os's, and part of the original distro.  I think
  take-up would be huge (well huge-er, current takeup is huge)
 
  Kim

 Some of us appreciate the fact that qemu has no GUI per se.  ;0)

Sure.  But to 'sell' the project to wider audience, which may be
helpful for its longer term development, a GUI is necessary.
Usability engineering isn't as much fun as hacking the JIT, or 
whatever, but in the end usability counts.  A lot.

J




[Qemu-devel] invisible wall patch

2006-06-13 Thread Julian Seward

Could somebody please commit, or at least consider committing, 
Anthony Liguori's invisible wall patch, shown at 
http://lists.gnu.org/archive/html/qemu-devel/2006-05/msg00112.html ?

Without it, QEMU is essentially unusable on my SuSE 10 host; with it,
the mouse stuff works perfectly.  A couple of other people on that
thread had similar experiences with it.

J




Re: [Qemu-devel] getting the 5446 in 1152x864 mode

2006-06-09 Thread Julian Seward
On Wednesday 07 June 2006 12:49, Julian Seward wrote:
  On Wednesday 07 June 2006 02:31, Ben Taylor wrote:
  I've been able to get 1152x864 out of the Solaris Xorg gdm5446 driver
  so there must be something else that's causing you problems.  I think
  I've gotten win98se to do it as well.

 Thanks for the confirmation.  So, I re-tried (extensively) to get
 1152x864.

1152x864 doesn't work in WinXP either, and I am also still getting
'invisible wall' mouse problems.  I wonder if my build is broken;
but I'm not sure how.  It's a clean build of cvs from two days
ago, using gcc 3.4.3 on SuSE 10 (x86).  The only change I made is to
increase the translation cache size.

J

$ cvs diff -rHEAD
Index: exec-all.h
===
RCS file: /sources/qemu/qemu/exec-all.h,v
retrieving revision 1.47
diff -r1.47 exec-all.h
140c140
< #define CODE_GEN_BUFFER_SIZE (16 * 1024 * 1024)
---
> #define CODE_GEN_BUFFER_SIZE (64 * 1024 * 1024)
149c149
< #define CODE_GEN_AVG_BLOCK_SIZE 128
---
> #define CODE_GEN_AVG_BLOCK_SIZE 256 //128




Re: [Qemu-devel] getting the 5446 in 1152x864 mode

2006-06-07 Thread Julian Seward
 On Wednesday 07 June 2006 02:31, Ben Taylor wrote:
 I've been able to get 1152x864 out of the Solaris Xorg gdm5446 driver
 so there must be something else that's causing you problems.  I think I've
 gotten win98se to do it as well.

Thanks for the confirmation.  So, I re-tried (extensively) to get
1152x864.  That resolution is listed by Windows as possible at 70Hz
and 75Hz (monitor), so I set the monitor refresh rates to those values
in the Windows display settings, but still no success.  Even with
rebooting after changing the settings.

I also still sometimes hit the 'invisible wall' mouse pointer
problem that was discussed on this list a few weeks ago.

J




Re: [Qemu-devel] PATCH: fix bgr color mapping on qemu on Solaris/SPARC

2006-05-10 Thread Julian Seward

 autodetect what color format to use. Also putting if inside the inner loop
 of the low-level conversion routines is a bad idea.

While that's true per se, maybe it's not such a big deal.  The branch is
going to be perfectly predictable since the condition stays the same 
for the entire run, so I'd be surprised if you even lost one host cycle
per iteration overall.  Basically the hardware will fold it out.

J




Re: [Qemu-devel] [PATCH] Fix overflow conditions for MIPS add / subtract

2006-04-28 Thread Julian Seward

  -if ((T0 >> 31) ^ (T1 >> 31) ^ (tmp >> 31)) {
  +if (((tmp ^ T1 ^ (-1)) & (T0 ^ T1)) >> 31) {
  +   /* operands of same sign, result different sign */
  CALL_FROM_TB1(do_raise_exception_direct, EXCP_OVERFLOW);
  }

 I see this went in, but - huh?  The math doesn't make sense.

 T0 ^ T1 - operands of different sign
 tmp ^ T1 ^ (-1) - result has same sign as T1

The definitive reference for all this bit twiddling magic and
much more besides is an excellent book, Hacker's Delight, by
Hank Warren.  It has loads of stuff about integer overflow and
whatnot.
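
For anyone without the book to hand, the usual sign-bit formulations look
roughly like this.  This is a standalone sketch of the textbook checks, not
the MIPS op code under discussion, and the function names are made up for
the example.

```c
#include <stdint.h>

/* Signed 32-bit addition overflows when both operands have the same
   sign and the wrapped sum has the opposite sign.  Equivalently, the
   sum's sign differs from the sign of both operands. */
static int add_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;               /* wraps, no undefined behaviour */
    return (int)(((ua ^ sum) & (ub ^ sum)) >> 31);
}

/* Signed 32-bit subtraction a - b overflows when the operands have
   different signs and the wrapped result's sign differs from a's. */
static int sub_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t diff = ua - ub;
    return (int)(((ua ^ ub) & (ua ^ diff)) >> 31);
}
```

Doing the arithmetic on unsigned values keeps the wraparound well defined in
C; only the top bit of each XOR matters, which is why the final shift by 31
yields 0 or 1.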

J




[Qemu-devel] Emulation differences, qemu-system-x86_64 vs Athlon64

2006-04-12 Thread Julian Seward

Recently I've been playing with CVS qemu-system (softmmu) on amd64
and had some stability problems.  I decided to run Valgrind's amd64
instruction-set tests (derived from qemu's) to see if they picked up
anything.  Resulting diffs are attached.

There are a bunch of differences for the C flag for rotates
(rol/ror) by multiples of the word size.  I don't think these
are significant, but who knows.

Perhaps more worrying are the 20 or so lines at the bottom
of the diff.  These I believe are for double-to-int/short
conversions for a value which is out of range for an int/short;
the hardware produces 0x80000000/0x8000 respectively, which is
the integer indefinite; QEMU produces zero.  I can imagine some
obscure routine somewhere checking for integer indefinite after
conversion and being confused as a result.

J


diffs-qemu-vs-Athlon64.txt.bz2
Description: BZip2 compressed data


Re: [Qemu-devel] Emulation differences, qemu-system-x86_64 vs Athlon64

2006-04-12 Thread Julian Seward

 I guess the problem comes from the usage of lrintl() on x86_64 in
 fpu/softfloat-native.c, but I cannot test it yet.

It might be that you have to pass in an extra value into those
float -> int conversion routines, which describes what to do if the
conversion is going to overflow.  That's because the behaviour is
different depending on the guest architecture.  x86/amd64 always
give 0x80000000, whereas ppc gives either 0x80000000 or 0x7FFFFFFF
depending on the sign of the argument (IIRC).
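
Something along these lines, as a sketch only.  The enum and function names are invented here and are not softfloat's API, and the sketch truncates toward zero rather than honouring the guest's current rounding mode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical overflow policies; not taken from QEMU's softfloat. */
enum ovf_policy {
    OVF_INDEFINITE, /* always INT32_MIN (bit pattern 0x80000000), x86 style */
    OVF_SATURATE    /* INT32_MIN or INT32_MAX by sign, as described for ppc */
};

/* Convert a double to int32_t, truncating toward zero.  The extra
 * argument says what to return when the value does not fit. */
int32_t double_to_int32(double x, enum ovf_policy policy)
{
    /* Both bounds are exactly representable as doubles. */
    if (x >= -2147483648.0 && x < 2147483648.0)
        return (int32_t)x;

    if (policy == OVF_INDEFINITE)
        return INT32_MIN;

    /* NaN also lands here; every comparison with NaN is false, so it
     * is treated like a negative overflow in this sketch. */
    return (x > 0.0) ? INT32_MAX : INT32_MIN;
}
```

Each guest front end would then pass the policy matching its architecture instead of relying on whatever the host's lrintl() happens to return.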

J





[Qemu-devel] Translation cache sizes

2006-04-07 Thread Julian Seward

Using qemu from cvs simulating x86-softmmu (no kqemu) on x86,
booting SuSE 9.1 and getting to the xdm (kdm?) graphical login
screen, requires making about 1088000 translations, and the
translation cache is flushed 17 times.  Booting is not too bad,
but once user-mode starts to run the translation cache is pretty
much hammered.

I made 2 changes: 

* increase CODE_GEN_BUFFER_SIZE from 16*1024*1024
  to 64*1024*1024, 

* observe that CODE_GEN_AVG_BLOCK_SIZE of 128
  for the softmmu case is too low; my measurements put it
  at about 247.  So I changed it to 256.

With those changes in place, the same boot-to-kdm process 
requires only about 57 translations to be made, and 2 
cache flushes to happen.  Of course the cost is an extra
48M of memory use.
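
The arithmetic behind the second change, as a small sketch.  It assumes the number of block slots is derived as buffer size divided by the average block size estimate; the helper name is mine.

```c
#include <assert.h>
#include <stddef.h>

/* How many translated blocks of a given average size fit in a buffer. */
size_t blocks_that_fit(size_t buffer_bytes, size_t avg_block_bytes)
{
    return buffer_bytes / avg_block_bytes;
}
```

With the old settings, blocks_that_fit(16*1024*1024, 128) gives 131072 slots, but at the measured average of 247 bytes only 67923 blocks actually fit, so the buffer fills long before the slots run out.  With the new settings, blocks_that_fit(64*1024*1024, 256) gives 262144 slots against 271695 blocks at 247 bytes each, so the two limits are roughly matched.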

J




Re: [Qemu-devel] vmware puts up specs for it's disk format

2006-04-05 Thread Julian Seward

  Basically, if u want split images to be supported in qemu, speak up now.
  ;)

 I speak! 

Me too!  I've always used split images with vmware; they are 
easier to manage than files tens of gigabytes long.

J




[Qemu-devel] Experiences with qemu-system-ppc

2006-04-04 Thread Julian Seward

What are the prospects for qemu-system-ppc being improved to the
point where I can do a vanilla install of popular ppc32-linux
distros?

I've tried a vanilla install of Debian Sarge (3.1).  That went
smoothly, except for the stage right at the end, where the disk
image is made bootable.  The resulting disk image is, despite
much messing around, not bootable (using QEMU from cvs).

Vesselin Peev made a ppc32 Debian Sarge image with a specially
modified external kernel to work around this problem 
(http://free.oszoo.org/ftp/images/debian_sarge_ppc.tar.torrent)
and that does work.  Unfortunately QEMU consumes 100% host CPU
even when the virtual machine is idle, which means it's not useful
for a long-running virtual server.  And the reliance on a specially
modified kernel means normal updating/upgrading of the virtual
machine isn't possible.

My goal is to run a farm of QEMU virtual ppc32 machines to do
overnight builds/tests of Valgrind on different ppc32 distros.

J




Re: [Qemu-devel] [PATCH] SSE3 emulation

2006-02-18 Thread Julian Seward

 it, if someone could test this patch and/or give me links to programs
 (Linux/Win98) that use SSE3 instructions (and preferably also prove

Valgrind (current svn trunk) has some pretty extensive SSE/SSE2 tests;
you could use those.

J




Re: [Qemu-devel] patch for qemu with newer gcc-3.4.x (support repz retq optimization for amd processors correctly)

2005-11-09 Thread Julian Seward

The use of gcc to generate the back end in QEMU's early days was a 
clever way to get the project up and running quickly.  But surely
now it would be better to transition to a handwritten backend, so
as to be independent of future changes in gcc, and generally more robust?

J

On Wednesday 09 November 2005 19:51, Igor Kovalenko wrote:
 Paul Brook wrote:
  Notice the 'repz mov' sequence, which seems to be undocumented
  instruction. It seems to work somehow but chokes valgrind decoder.
  The following patch (against current CVS) fixes this problem,
 
  This patch is incorrect.
 
  It could match any number of other instructions that happen to end in
  0xf3. eg
 
 0:   c7 45 00 00 00 00 f3    movl   $0xf3000000,0x0(%ebp)
 7:   c3                      ret
 
  IIRC the rep; ret sequence is to avoid a pipeline stall on Athlon CPUs.
   Try tuning for a different CPU.
 
  Paul
 
  Index: dyngen.c
  ===
  RCS file: /cvsroot/qemu/qemu/dyngen.c,v
  retrieving revision 1.40
  diff -u -r1.40 dyngen.c
  --- dyngen.c    27 Apr 2005 19:55:58 -  1.40
  +++ dyngen.c    9 Nov 2005 19:12:38 -
  @@ -1387,6 +1387,12 @@
               error("empty code for %s", name);
           if (p_end[-1] == 0xc3) {
               len--;
  +            /* This can be 'rep ; ret' optimized return sequence,
  +             * need to check further and strip the 'rep' prefix
  +             */
  +            if (len != 0 && p_end[-2] == 0xf3) {
  +                len--;
  +            }
           } else {
               error("ret or jmp expected at the end of %s", name);
           }

 OK I missed that...
 Then a discussion about gcc-4 turns into something much more interesting :)
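
Paul's objection above can be reproduced in a few lines.  This is a standalone sketch that mirrors the check in the quoted patch; strip_ret_naive is a made-up name and not a dyngen function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirror of the patched check: drop a trailing ret (0xc3) and, if the
 * byte before it is 0xf3, drop that as well.  Returns the new length,
 * or -1 if the code does not end in ret. */
long strip_ret_naive(const uint8_t *code, size_t len)
{
    if (len == 0 || code[len - 1] != 0xc3)
        return -1;
    len--;
    /* Looks only at the byte value, not at instruction boundaries. */
    if (len != 0 && code[len - 1] == 0xf3)
        len--;
    return (long)len;
}
```

Fed the bytes c7 45 00 00 00 00 f3 c3 (the movl/ret pair above), it returns 6 where 7 would be correct, cutting the last immediate byte off the movl.  Fed 90 f3 c3 (nop, then rep ret), it returns 1 as intended.  Telling the two cases apart needs knowledge of where the last instruction starts, which a byte comparison alone does not have.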

