Re: [Qemu-devel] Re: [BUG] Regression: readonly raw images no longer work

2010-02-15 Thread Markus Armbruster
Paolo Bonzini pbonz...@redhat.com writes:

 On 02/13/2010 10:40 PM, Stefan Weil wrote:
 This command used to work, but fails now:

 $ i386-softmmu/qemu -snapshot /dev/sda
 qemu: could not open disk image /dev/sda: Permission denied

 $ ls -l /dev/sda
 brw-rw-r-- 1 root disk 8, 0 13. Feb 08:55 /dev/sda

 The original file of a snapshot needs only read access,
 but QEMU tries read/write access and fails.

 Variants of above command using -hda or -drive
 also fail with the same error message.

 I did not test whether the regression affects other
 kinds of images, too. Maybe only raw images trigger
 no longer work.

 Caused by

 commit 03cbdac7efc20994d0a87015e24e835d0139df7b
 Author: Naphtali Sprei nsp...@redhat.com  2010-01-17 15:48:15
 Committer: Anthony Liguori aligu...@us.ibm.com  2010-01-20 15:25:22
 Follows: v0.12.0-rc0

 Disable fall-back to read-only when cannot open drive's file for
 read-write

 Signed-off-by: Naphtali Sprei nsp...@redhat.com
 Signed-off-by: Anthony Liguori aligu...@us.ibm.com

 ... because before that it was working like this:

 open(/dev/sda, O_RDWR|O_SYNC|O_CLOEXEC) = -1 EACCES
 open(/dev/sda, O_RDONLY|O_SYNC|O_CLOEXEC) = 10

 BTW, because of other bugs in the middle of the history you need a
 command line -hda /dev/null -drive file=/dev/sda,snapshot=on to
 bisect it.

Maybe drive_enable_snapshot() should set readonly in opts as well.




Re: [Qemu-devel] Re: [PATCH v2] qemu-kvm: Speed up of the dirty-bitmap-traveling

2010-02-15 Thread Alexander Graf

On 15.02.2010, at 07:12, OHMURA Kei wrote:

 dirty-bitmap-traveling is carried out by byte size in qemu-kvm.c.
 But We think that dirty-bitmap-traveling by long size is faster than by byte

We think? I mean - yes, I think so too. But have you actually measured it? 
How much improvement are we talking here?
Is it still faster when a bswap is involved?

Alex



[Qemu-devel] Re: [PATCH] virtio-serial: don't set MULTIPORT for 1 port dev

2010-02-15 Thread Gerd Hoffmann

On 02/12/10 15:23, Amit Shah wrote:

On (Fri) Feb 12 2010 [15:42:14], Michael S. Tsirkin wrote:

Since commit 98b19252cf1bd97c54bc4613f3537c5ec0aae263, all
serial devices declare MULTIPORT feature.
To allow 0.12 compatibility, we should clear this when
max_nr_ports is 1.


In addition to this, setting max_nr_ports to 1 is needed when -M 0.12 is
selected.


Indeed.


However, is this the only way to do it? Gerd?


Is there a qdev property for max_nr_ports?  Then simply adding a compat 
property will do the trick.


cheers,
  Gerd





[Qemu-devel] Re: [PATCH 2/2] net/macvtap: add vhost suppor

2010-02-15 Thread Arnd Bergmann
On Sunday 14 February 2010, Michael S. Tsirkin wrote:
  @@ -473,7 +477,7 @@ static struct socket *get_socket(int fd)
sock = get_raw_socket(fd);
if (!IS_ERR(sock))
return sock;
  - sock = get_tun_socket(fd);
  + sock = get_tap_socket(fd);
if (!IS_ERR(sock))
return sock;
return ERR_PTR(-ENOTSOCK);
 
 This will also need a dependency on macvtap in Kconfig.
 See how it's done for tun.

Ok, I'll add that.

Thanks,

Arnd




[Qemu-devel] Re: [PATCH] virtio-serial: don't set MULTIPORT for 1 port dev

2010-02-15 Thread Amit Shah
On (Mon) Feb 15 2010 [10:03:34], Gerd Hoffmann wrote:
 On 02/12/10 15:23, Amit Shah wrote:
 On (Fri) Feb 12 2010 [15:42:14], Michael S. Tsirkin wrote:
 Since commit 98b19252cf1bd97c54bc4613f3537c5ec0aae263, all
 serial devices declare MULTIPORT feature.
 To allow 0.12 compatibility, we should clear this when
 max_nr_ports is 1.

 In addition to this, setting max_nr_ports to 1 is needed when -M 0.12 is
 selected.

 Indeed.

 However, is this the only way to do it? Gerd?

 Is there a qdev property for max_nr_ports?  Then simply adding a compat  
 property will do the trick.

Something like this (I can split it into two patches before submission):


diff --git a/hw/pc.c b/hw/pc.c
index 5b29f3b..a975934 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1055,12 +1055,32 @@ void cmos_set_s3_resume(void)
 }
 
 static QEMUMachine pc_machine = {
+.name = pc-0.13,
+.alias = pc,
+.desc = Standard PC,
+.init = pc_init_pci,
+.is_default = 1,
+.max_cpus = 255,
+};
+
+static QEMUMachine pc_machine_v0_12 = {
 .name = pc-0.12,
 .alias = pc,
 .desc = Standard PC,
 .init = pc_init_pci,
 .max_cpus = 255,
-.is_default = 1,
+.compat_props = (GlobalProperty[]) {
+{
+.driver   = virtio-serial-pci,
+.property = max_nr_ports,
+.value= stringify(1),
+},{
+.driver   = virtio-serial-pci,
+.property = vectors,
+.value= stringify(0),
+},
+{ /* end of list */ }
+}
 };
 
 static QEMUMachine pc_machine_v0_11 = {
@@ -1074,6 +1094,14 @@ static QEMUMachine pc_machine_v0_11 = {
 .property = vectors,
 .value= stringify(0),
 },{
+.driver   = virtio-serial-pci,
+.property = max_nr_ports,
+.value= stringify(1),
+},{
+.driver   = virtio-serial-pci,
+.property = vectors,
+.value= stringify(0),
+},{
 .driver   = ide-drive,
 .property = ver,
 .value= 0.11,
@@ -1105,6 +1133,14 @@ static QEMUMachine pc_machine_v0_10 = {
 .property = class,
 .value= stringify(PCI_CLASS_DISPLAY_OTHER),
 },{
+.driver   = virtio-serial-pci,
+.property = max_nr_ports,
+.value= stringify(1),
+},{
+.driver   = virtio-serial-pci,
+.property = vectors,
+.value= stringify(0),
+},{
 .driver   = virtio-net-pci,
 .property = vectors,
 .value= stringify(0),
@@ -1139,6 +1175,7 @@ static QEMUMachine isapc_machine = {
 static void pc_machine_init(void)
 {
 qemu_register_machine(pc_machine);
+qemu_register_machine(pc_machine_v0_12);
 qemu_register_machine(pc_machine_v0_11);
 qemu_register_machine(pc_machine_v0_10);
 qemu_register_machine(isapc_machine);

Amit




[Qemu-devel] Re: EHCI support in QEMU

2010-02-15 Thread Jan Kiszka
Taimoor Mirza wrote:
 Hi all,
 
  
 
 I downloaded version 0.12.2 of QEMU and I am unable to find EHCI support in 
 it. Does QEMU support EHCI emulation? Do I need to download some other patch 
 for it? QEMU documentation also does not tell anything about EHCI.
 

QEMU does not support USB 2.0 / EHCI yet. There were patches flying
around here, but so far no one stepped up and seriously pushed that
towards mainline.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux




[Qemu-devel] Re: [PATCH] virtio-serial: don't set MULTIPORT for 1 port dev

2010-02-15 Thread Michael S. Tsirkin
On Mon, Feb 15, 2010 at 02:54:01PM +0530, Amit Shah wrote:
 On (Mon) Feb 15 2010 [10:03:34], Gerd Hoffmann wrote:
  On 02/12/10 15:23, Amit Shah wrote:
  On (Fri) Feb 12 2010 [15:42:14], Michael S. Tsirkin wrote:
  Since commit 98b19252cf1bd97c54bc4613f3537c5ec0aae263, all
  serial devices declare MULTIPORT feature.
  To allow 0.12 compatibility, we should clear this when
  max_nr_ports is 1.
 
  In addition to this, setting max_nr_ports to 1 is needed when -M 0.12 is
  selected.
 
  Indeed.
 
  However, is this the only way to do it? Gerd?
 
  Is there a qdev property for max_nr_ports?  Then simply adding a compat  
  property will do the trick.
 
 Something like this (I can split it into two patches before submission):
 

Right. So, can you ack my patch pls?

 diff --git a/hw/pc.c b/hw/pc.c
 index 5b29f3b..a975934 100644
 --- a/hw/pc.c
 +++ b/hw/pc.c
 @@ -1055,12 +1055,32 @@ void cmos_set_s3_resume(void)
  }
  
  static QEMUMachine pc_machine = {
 +.name = pc-0.13,
 +.alias = pc,
 +.desc = Standard PC,
 +.init = pc_init_pci,
 +.is_default = 1,
 +.max_cpus = 255,
 +};
 +
 +static QEMUMachine pc_machine_v0_12 = {
  .name = pc-0.12,
  .alias = pc,
  .desc = Standard PC,
  .init = pc_init_pci,
  .max_cpus = 255,
 -.is_default = 1,
 +.compat_props = (GlobalProperty[]) {
 +{
 +.driver   = virtio-serial-pci,
 +.property = max_nr_ports,
 +.value= stringify(1),
 +},{
 +.driver   = virtio-serial-pci,
 +.property = vectors,
 +.value= stringify(0),
 +},
 +{ /* end of list */ }
 +}
  };
  
  static QEMUMachine pc_machine_v0_11 = {
 @@ -1074,6 +1094,14 @@ static QEMUMachine pc_machine_v0_11 = {
  .property = vectors,
  .value= stringify(0),
  },{
 +.driver   = virtio-serial-pci,
 +.property = max_nr_ports,
 +.value= stringify(1),
 +},{
 +.driver   = virtio-serial-pci,
 +.property = vectors,
 +.value= stringify(0),
 +},{
  .driver   = ide-drive,
  .property = ver,
  .value= 0.11,
 @@ -1105,6 +1133,14 @@ static QEMUMachine pc_machine_v0_10 = {
  .property = class,
  .value= stringify(PCI_CLASS_DISPLAY_OTHER),
  },{
 +.driver   = virtio-serial-pci,
 +.property = max_nr_ports,
 +.value= stringify(1),
 +},{
 +.driver   = virtio-serial-pci,
 +.property = vectors,
 +.value= stringify(0),
 +},{
  .driver   = virtio-net-pci,
  .property = vectors,
  .value= stringify(0),
 @@ -1139,6 +1175,7 @@ static QEMUMachine isapc_machine = {
  static void pc_machine_init(void)
  {
  qemu_register_machine(pc_machine);
 +qemu_register_machine(pc_machine_v0_12);
  qemu_register_machine(pc_machine_v0_11);
  qemu_register_machine(pc_machine_v0_10);
  qemu_register_machine(isapc_machine);
 
   Amit




Re: [Qemu-devel] qemu-ppc can't run static uClibc binaries.

2010-02-15 Thread Rob Landley
On Sunday 14 February 2010 08:41:00 Alexander Graf wrote:
 Am Sun 14 Feb 2010 09:36:27 AM CET schrieb Rob Landley r...@landley.net:
  On Thursday 11 February 2010 06:32:12 Alexander Graf wrote:
  Rob Landley wrote:
   Static binaries that run under the Linux kernel don't run under
   qemu-ppc. For example, the prebuilt busybox binaries here:
  
 http://busybox.net/downloads/binaries/1.16.0/busybox-powerpc
  
   Don't run under qemu-ppc, but runs just fine under qemu-system-ppc
   with the image at:
  
  
   http://impactlinux.com/fwl/downloads/binaries/system-image-powerpc.tar
  .bz 2
  
   The reason is that the powerpc spec that qemu was written to is for
   AIX, not for Linux, and thus the register layout qemu application
   emulation provides for powerpc doesn't match what the kernel is
   actually doing.
  
   For dynamically linked executables, the dynamic linker reorganizes the
   register contents to match the AIX spec from IBM, but statically
   linked binaries get what the kernel provides directly.  Thus binaries
   statically linked against uClibc won't run under qemu-ppc, but run
   under qemu-system-ppc just fine.
  
   I tracked down this problem in 2007:
  
 http://landley.net/notes-2007.html#28-03-2007
  
   And reported it on the list at the time:
  
 http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00713.html
 http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00720.html
 http://lists.gnu.org/archive/html/qemu-devel/2007-04/msg00315.html
  
   However, the then-maintainer of powerpc believed nobody else ever had
   the right to touch her code:
  
 http://lists.gnu.org/archive/html/qemu-devel/2007-04/msg00198.html
  
   And I was unable to convince her that insisting reality change to
   match a spec which wasn't even for the right platform was not a useful
   approach. Thus the binary in the first link still won't run under
   qemu-ppc three years later, despite running fine under a real Linux
   kernel.
 
  Patches are always welcome. The only thing you might want to make sure
  is that dynamically linked binaries also still continue to work :-).
 
  Attached.
 
  This may help explain the issue:
 
http://sources.redhat.com/ml/libc-alpha/2003-03/msg00272.html
 
  It's not a question of dynamically linked Linux binaries.  They work
   just fine
  with either register layout.  The dynamic linker converts the Linux
  layout to the AIX layout, and is reentrant so it won't do it a second
  time if it's already been converted.
 
  The problem is that BSD wants the AIX layout, and hence this comment
   in linux-
  user/elfload.c function init_thread():
 
  /* Note that isn't exactly what regular kernel does
   * but this is what the ABI wants and is needed to allow
   * execution of PPC BSD programs.
   */
 
  I.E. whoever wrote this already knows it's not what the Linux kernel is
  actually doing, and they're not doing it for Linux, they're doing it for
  BSD.
 
  The fix is probably to add #ifdef CONFIG_BSD around the appropriate chunk
  of code.  Attached is a patch to do that (plus tweaks to make the you
  have an unused variable, break the build! logic shut up about it).
 
  (Yes, I tested that a dynamically linked hello world still worked for
  me.)

 I don't see why it would fail. The link above states that for
 statically linked binaries, r1 points to all the variables. For
 dynamically linked ones, you also get pointers in some regs.

 So the only case I can imagine that this breaks anything is that
 uClibc requires register state to be 0.

Yes, r3 (which is the exit code from the exec syscall, and thus 0 if it 
worked).  In the BSD layout, it's argc (which can never be 0).

  http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00720.html

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds




Re: [Qemu-devel] qemu-ppc can't run static uClibc binaries.

2010-02-15 Thread Alexander Graf

On 15.02.2010, at 12:10, Rob Landley wrote:

 On Sunday 14 February 2010 08:41:00 Alexander Graf wrote:
 Am Sun 14 Feb 2010 09:36:27 AM CET schrieb Rob Landley r...@landley.net:
 On Thursday 11 February 2010 06:32:12 Alexander Graf wrote:
 Rob Landley wrote:
 Static binaries that run under the Linux kernel don't run under
 qemu-ppc. For example, the prebuilt busybox binaries here:
 
  http://busybox.net/downloads/binaries/1.16.0/busybox-powerpc
 
 Don't run under qemu-ppc, but runs just fine under qemu-system-ppc
 with the image at:
 
 
 http://impactlinux.com/fwl/downloads/binaries/system-image-powerpc.tar
 .bz 2
 
 The reason is that the powerpc spec that qemu was written to is for
 AIX, not for Linux, and thus the register layout qemu application
 emulation provides for powerpc doesn't match what the kernel is
 actually doing.
 
 For dynamically linked executables, the dynamic linker reorganizes the
 register contents to match the AIX spec from IBM, but statically
 linked binaries get what the kernel provides directly.  Thus binaries
 statically linked against uClibc won't run under qemu-ppc, but run
 under qemu-system-ppc just fine.
 
 I tracked down this problem in 2007:
 
  http://landley.net/notes-2007.html#28-03-2007
 
 And reported it on the list at the time:
 
  http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00713.html
  http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00720.html
  http://lists.gnu.org/archive/html/qemu-devel/2007-04/msg00315.html
 
 However, the then-maintainer of powerpc believed nobody else ever had
 the right to touch her code:
 
  http://lists.gnu.org/archive/html/qemu-devel/2007-04/msg00198.html
 
 And I was unable to convince her that insisting reality change to
 match a spec which wasn't even for the right platform was not a useful
 approach. Thus the binary in the first link still won't run under
 qemu-ppc three years later, despite running fine under a real Linux
 kernel.
 
 Patches are always welcome. The only thing you might want to make sure
 is that dynamically linked binaries also still continue to work :-).
 
 Attached.
 
 This may help explain the issue:
 
  http://sources.redhat.com/ml/libc-alpha/2003-03/msg00272.html
 
 It's not a question of dynamically linked Linux binaries.  They work
 just fine
 with either register layout.  The dynamic linker converts the Linux
 layout to the AIX layout, and is reentrant so it won't do it a second
 time if it's already been converted.
 
 The problem is that BSD wants the AIX layout, and hence this comment
 in linux-
 user/elfload.c function init_thread():
 
/* Note that isn't exactly what regular kernel does
 * but this is what the ABI wants and is needed to allow
 * execution of PPC BSD programs.
 */
 
 I.E. whoever wrote this already knows it's not what the Linux kernel is
 actually doing, and they're not doing it for Linux, they're doing it for
 BSD.
 
 The fix is probably to add #ifdef CONFIG_BSD around the appropriate chunk
 of code.  Attached is a patch to do that (plus tweaks to make the you
 have an unused variable, break the build! logic shut up about it).
 
 (Yes, I tested that a dynamically linked hello world still worked for
 me.)
 
 I don't see why it would fail. The link above states that for
 statically linked binaries, r1 points to all the variables. For
 dynamically linked ones, you also get pointers in some regs.
 
 So the only case I can imagine that this breaks anything is that
 uClibc requires register state to be 0.
 
 Yes, r3 (which is the exit code from the exec syscall, and thus 0 if it 
 worked).  In the BSD layout, it's argc (which can never be 0).
 
  http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00720.html

So what you really want is something like

#ifdef CONFIG_LINUX_USER
/* exec return value is always 0 */
env-gpr[3] = 0;
#endif

just after the #endif in your patch. If you had inlined your patch I could've 
commented it there.


Alex



[Qemu-devel] Re: qemu-ppc can't run static uClibc binaries.

2010-02-15 Thread Michael S. Tsirkin
On Mon, Feb 15, 2010 at 06:58:33AM -0600, Rob Landley wrote:
 On Monday 15 February 2010 05:19:24 Alexander Graf wrote:
  On 15.02.2010, at 12:10, Rob Landley wrote:
   On Sunday 14 February 2010 08:41:00 Alexander Graf wrote:
   So the only case I can imagine that this breaks anything is that
   uClibc requires register state to be 0.
  
   Yes, r3 (which is the exit code from the exec syscall, and thus 0 if it
   worked).  In the BSD layout, it's argc (which can never be 0).
  
http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00720.html
 
  So what you really want is something like
 
  #ifdef CONFIG_LINUX_USER
  /* exec return value is always 0 */
  env-gpr[3] = 0;
  #endif
 
  just after the #endif in your patch. If you had inlined your patch I
  could've commented it there.
 
 Unfortunately kmail plays fast and loose with whitespace when I inline stuff. 
  
 (Not always, but I can't tell by inspection when it's decided it was hungry 
 for tabs or wanted to throw in that horrible UTF8 escaped whitespace.)

See Documentation/email-clients.txt under linux source tree.




Re: [Qemu-devel] qemu-ppc can't run static uClibc binaries.

2010-02-15 Thread Alexander Graf

On 15.02.2010, at 13:58, Rob Landley wrote:

 On Monday 15 February 2010 05:19:24 Alexander Graf wrote:
 On 15.02.2010, at 12:10, Rob Landley wrote:
 On Sunday 14 February 2010 08:41:00 Alexander Graf wrote:
 So the only case I can imagine that this breaks anything is that
 uClibc requires register state to be 0.
 
 Yes, r3 (which is the exit code from the exec syscall, and thus 0 if it
 worked).  In the BSD layout, it's argc (which can never be 0).
 
 http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00720.html
 
 So what you really want is something like
 
 #ifdef CONFIG_LINUX_USER
 /* exec return value is always 0 */
 env-gpr[3] = 0;
 #endif
 
 just after the #endif in your patch. If you had inlined your patch I
 could've commented it there.
 
 Unfortunately kmail plays fast and loose with whitespace when I inline stuff. 
  
 (Not always, but I can't tell by inspection when it's decided it was hungry 
 for tabs or wanted to throw in that horrible UTF8 escaped whitespace.)

git-send-mail is your friend :-).

 I didn't explicitly set it because they're initialized to zero in function 
 main() on line 2654 of linux-user/main.c.  (Any regs we don't explicitly set 
 to some other value start out zeroed in qemu.)

So it should work already?

 If you prefer to make the requirements explicit, that works too, but a 
 comment 
 might do just as well.  (I tend to prefer removing unnecessary work Linux 
 doesn't need done, rather than adding extra code to undo the unnecessary work 
 afterwards.  Force of habit from years on busybox and such.)

Well, I personally prefer to always use the same code paths whenever possible. 
That makes the code less prone to failure in odd configurations. And we have a 
lot of different combinations of those in Qemu.

But this is Riku's call. He's the linux-user maintainer.


Alex



[Qemu-devel] [PATCH 1/3] pc: Bump up pc version to 0.13 and add a 0.12 compat version

2010-02-15 Thread Amit Shah
The version 0.13 will be the new default and compatibility options will
be added to the 0.12 version.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/pc.c |9 -
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 6fbe98b..7c9a24e 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1052,7 +1052,7 @@ void cmos_set_s3_resume(void)
 }
 
 static QEMUMachine pc_machine = {
-.name = pc-0.12,
+.name = pc-0.13,
 .alias = pc,
 .desc = Standard PC,
 .init = pc_init_pci,
@@ -1060,6 +1060,13 @@ static QEMUMachine pc_machine = {
 .is_default = 1,
 };
 
+static QEMUMachine pc_machine_v0_12 = {
+.name = pc-0.12,
+.desc = Standard PC,
+.init = pc_init_pci,
+.max_cpus = 255,
+};
+
 static QEMUMachine pc_machine_v0_11 = {
 .name = pc-0.11,
 .desc = Standard PC, qemu 0.11,
-- 
1.6.2.5





[Qemu-devel] [PATCH 2/3] pc: Add backward compatibility options for virtio-serial

2010-02-15 Thread Amit Shah
virtio-serial-pci can support multiple ports in the current development
version that will become 0.13. Add compatibility options for the 0.12
and 0.11 pc machine types.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/pc.c |   29 +
 1 files changed, 29 insertions(+), 0 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 7c9a24e..49a95ba 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1065,6 +1065,18 @@ static QEMUMachine pc_machine_v0_12 = {
 .desc = Standard PC,
 .init = pc_init_pci,
 .max_cpus = 255,
+.compat_props = (GlobalProperty[]) {
+{
+.driver   = virtio-serial-pci,
+.property = max_nr_ports,
+.value= stringify(1),
+},{
+.driver   = virtio-serial-pci,
+.property = vectors,
+.value= stringify(0),
+},
+{ /* end of list */ }
+}
 };
 
 static QEMUMachine pc_machine_v0_11 = {
@@ -1078,6 +1090,14 @@ static QEMUMachine pc_machine_v0_11 = {
 .property = vectors,
 .value= stringify(0),
 },{
+.driver   = virtio-serial-pci,
+.property = max_nr_ports,
+.value= stringify(1),
+},{
+.driver   = virtio-serial-pci,
+.property = vectors,
+.value= stringify(0),
+},{
 .driver   = ide-drive,
 .property = ver,
 .value= 0.11,
@@ -1109,6 +1129,14 @@ static QEMUMachine pc_machine_v0_10 = {
 .property = class,
 .value= stringify(PCI_CLASS_DISPLAY_OTHER),
 },{
+.driver   = virtio-serial-pci,
+.property = max_nr_ports,
+.value= stringify(1),
+},{
+.driver   = virtio-serial-pci,
+.property = vectors,
+.value= stringify(0),
+},{
 .driver   = virtio-net-pci,
 .property = vectors,
 .value= stringify(0),
@@ -1143,6 +1171,7 @@ static QEMUMachine isapc_machine = {
 static void pc_machine_init(void)
 {
 qemu_register_machine(pc_machine);
+qemu_register_machine(pc_machine_v0_12);
 qemu_register_machine(pc_machine_v0_11);
 qemu_register_machine(pc_machine_v0_10);
 qemu_register_machine(isapc_machine);
-- 
1.6.2.5





[Qemu-devel] [PATCH 3/3] virtio-serial: don't set MULTIPORT for 1 port dev

2010-02-15 Thread Amit Shah
From: Michael S. Tsirkin m...@redhat.com

Since commit 98b19252cf1bd97c54bc4613f3537c5ec0aae263, all
serial devices declare MULTIPORT feature.
To allow 0.12 compatibility, we should clear this when
max_nr_ports is 1.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index ab456ea..d0e0219 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -335,8 +335,10 @@ static void handle_input(VirtIODevice *vdev, VirtQueue *vq)
 
 static uint32_t get_features(VirtIODevice *vdev, uint32_t features)
 {
-features |= (1  VIRTIO_CONSOLE_F_MULTIPORT);
-
+VirtIOSerial *vser = DO_UPCAST(VirtIOSerial, vdev, vdev);
+if (vser-bus-max_nr_ports  1) {
+features |= (1  VIRTIO_CONSOLE_F_MULTIPORT);
+}
 return features;
 }
 
-- 
1.6.2.5





[Qemu-devel] Re: [PATCH] virtio-serial: don't set MULTIPORT for 1 port dev

2010-02-15 Thread Michael S. Tsirkin
On Mon, Feb 15, 2010 at 06:51:31PM +0530, Amit Shah wrote:
 On (Fri) Feb 12 2010 [15:42:14], Michael S. Tsirkin wrote:
  Since commit 98b19252cf1bd97c54bc4613f3537c5ec0aae263, all
  serial devices declare MULTIPORT feature.
  To allow 0.12 compatibility, we should clear this when
  max_nr_ports is 1.
  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 Acked-by: Amit Shah amit.s...@redhat.com
 
 A couple more patches are needed to introduce a 0.12 machine type and
 default to 1 max_nr_ports for machine types  0.13, which I'll be
 sending shortly.

Note that we also need to fill in virtio feature bits for the rest of
devices, per machine type.  Care to deal with it?

 I'll also make this patch 3/3 in that series, just for completenes..
 
   Amit
 




[Qemu-devel] Re: [PATCH 2/3] pc: Add backward compatibility options for virtio-serial

2010-02-15 Thread Gerd Hoffmann

@@ -1143,6 +1171,7 @@ static QEMUMachine isapc_machine = {
  static void pc_machine_init(void)
  {
  qemu_register_machine(pc_machine);
+qemu_register_machine(pc_machine_v0_12);
  qemu_register_machine(pc_machine_v0_11);
  qemu_register_machine(pc_machine_v0_10);
  qemu_register_machine(isapc_machine);


This chunk should have been in patch 1/3.
Otherwise the whole series looks fine to me.

cheers,
  Gerd




[Qemu-devel] Re: [PATCH 2/3] pc: Add backward compatibility options for virtio-serial

2010-02-15 Thread Amit Shah
On (Mon) Feb 15 2010 [15:36:34], Gerd Hoffmann wrote:
 @@ -1143,6 +1171,7 @@ static QEMUMachine isapc_machine = {
   static void pc_machine_init(void)
   {
   qemu_register_machine(pc_machine);
 +qemu_register_machine(pc_machine_v0_12);
   qemu_register_machine(pc_machine_v0_11);
   qemu_register_machine(pc_machine_v0_10);
   qemu_register_machine(isapc_machine);

 This chunk should have been in patch 1/3.
 Otherwise the whole series looks fine to me.

Yep, missed that. Fixed in v2.

Amit




[Qemu-devel] Re: [PATCH v2 1/3] pc: Bump up pc version to 0.13 and add a 0.12 compat version

2010-02-15 Thread Gerd Hoffmann

On 02/15/10 16:13, Amit Shah wrote:

The version 0.13 will be the new default and compatibility options will
be added to the 0.12 version.


Whole series looks good now.

Acked-by: Gerd Hoffmann kra...@redhat.com

cheers,
  Gerd





[Qemu-devel] Re: [PATCH] virtio-serial: don't set MULTIPORT for 1 port dev

2010-02-15 Thread Amit Shah
On (Mon) Feb 15 2010 [16:07:28], Michael S. Tsirkin wrote:
 On Mon, Feb 15, 2010 at 06:51:31PM +0530, Amit Shah wrote:
  On (Fri) Feb 12 2010 [15:42:14], Michael S. Tsirkin wrote:
   Since commit 98b19252cf1bd97c54bc4613f3537c5ec0aae263, all
   serial devices declare MULTIPORT feature.
   To allow 0.12 compatibility, we should clear this when
   max_nr_ports is 1.
   
   Signed-off-by: Michael S. Tsirkin m...@redhat.com
  
  Acked-by: Amit Shah amit.s...@redhat.com
  
  A couple more patches are needed to introduce a 0.12 machine type and
  default to 1 max_nr_ports for machine types  0.13, which I'll be
  sending shortly.
 
 Note that we also need to fill in virtio feature bits for the rest of
 devices, per machine type.  Care to deal with it?

Yes, that has to be done (and at least for s390 and ppc for
virtio-serial). I've not yet seen how they work, but they should be
similar to pc (if at all they have similar versioning support).

Amit




[Qemu-devel] [PATCH v2 3/3] virtio-serial: don't set MULTIPORT for 1 port dev

2010-02-15 Thread Amit Shah
From: Michael S. Tsirkin m...@redhat.com

Since commit 98b19252cf1bd97c54bc4613f3537c5ec0aae263, all
serial devices declare MULTIPORT feature.
To allow 0.12 compatibility, we should clear this when
max_nr_ports is 1.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Amit Shah amit.s...@redhat.com
---
 hw/virtio-serial-bus.c |6 --
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index ab456ea..d0e0219 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -335,8 +335,10 @@ static void handle_input(VirtIODevice *vdev, VirtQueue *vq)
 
 static uint32_t get_features(VirtIODevice *vdev, uint32_t features)
 {
-features |= (1  VIRTIO_CONSOLE_F_MULTIPORT);
-
+VirtIOSerial *vser = DO_UPCAST(VirtIOSerial, vdev, vdev);
+if (vser-bus-max_nr_ports  1) {
+features |= (1  VIRTIO_CONSOLE_F_MULTIPORT);
+}
 return features;
 }
 
-- 
1.6.2.5





[Qemu-devel] [PATCH v2 1/3] pc: Bump up pc version to 0.13 and add a 0.12 compat version

2010-02-15 Thread Amit Shah
The version 0.13 will be the new default and compatibility options will
be added to the 0.12 version.

Signed-off-by: Amit Shah amit.s...@redhat.com
---
v2: register the 0.12 version in this patch instead of the next one.

 hw/pc.c |   10 +-
 1 files changed, 9 insertions(+), 1 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 6fbe98b..c27ad0f 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -1052,7 +1052,7 @@ void cmos_set_s3_resume(void)
 }
 
 static QEMUMachine pc_machine = {
-.name = pc-0.12,
+.name = pc-0.13,
 .alias = pc,
 .desc = Standard PC,
 .init = pc_init_pci,
@@ -1060,6 +1060,13 @@ static QEMUMachine pc_machine = {
 .is_default = 1,
 };
 
+static QEMUMachine pc_machine_v0_12 = {
+.name = pc-0.12,
+.desc = Standard PC,
+.init = pc_init_pci,
+.max_cpus = 255,
+};
+
 static QEMUMachine pc_machine_v0_11 = {
 .name = pc-0.11,
 .desc = Standard PC, qemu 0.11,
@@ -1136,6 +1143,7 @@ static QEMUMachine isapc_machine = {
 static void pc_machine_init(void)
 {
 qemu_register_machine(pc_machine);
+qemu_register_machine(pc_machine_v0_12);
 qemu_register_machine(pc_machine_v0_11);
 qemu_register_machine(pc_machine_v0_10);
 qemu_register_machine(isapc_machine);
-- 
1.6.2.5





[Qemu-devel] [PATCH] tcg: Add consistency checks for op definitions

2010-02-15 Thread Stefan Weil
When compiled with CONFIG_DEBUG_TCG, this code looks
for missing, duplicate and wrong entries in the
op definitions.

Errors will raise an assertion at program start
(all checks are done in the initial phase).

The current code contains such errors, at least for
i386 guest on i386 host.

Signed-off-by: Stefan Weil w...@mail.berlios.de
---
 tcg/tcg.c |   21 +
 tcg/tcg.h |3 +++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/tcg/tcg.c b/tcg/tcg.c
index 9949814..e6a1caf 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -981,9 +981,16 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef 
*tdefs)
 op = tdefs-op;
 assert(op = 0  op  NB_OPS);
 def = tcg_op_defs[op];
+#if defined(CONFIG_DEBUG_TCG)
+/* Duplicate entry in op definitions? */
+assert(!def-used);
+def-used = 1;
+#endif
 nb_args = def-nb_iargs + def-nb_oargs;
 for(i = 0; i  nb_args; i++) {
 ct_str = tdefs-args_ct_str[i];
+/* Incomplete TCGTargetOpDef entry? */
+assert(ct_str != NULL);
 tcg_regset_clear(def-args_ct[i].u.regs);
 def-args_ct[i].ct = 0;
 if (ct_str[0] = '0'  ct_str[0] = '9') {
@@ -1018,6 +1025,9 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef 
*tdefs)
 }
 }
 
+/* TCGTargetOpDef entry with too much information? */
+assert(i == TCG_MAX_OP_ARGS || tdefs-args_ct_str[i] == NULL);
+
 /* sort the constraints (XXX: this is just an heuristic) */
 sort_constraints(def, 0, def-nb_oargs);
 sort_constraints(def, def-nb_oargs, def-nb_iargs);
@@ -1035,6 +1045,17 @@ void tcg_add_target_add_op_defs(const TCGTargetOpDef 
*tdefs)
 tdefs++;
 }
 
+#if defined(CONFIG_DEBUG_TCG)
+for (op = 0; op  ARRAY_SIZE(tcg_op_defs); op++) {
+if (op  INDEX_op_call || op == INDEX_op_debug_insn_start) {
+/* Wrong entry in op definitions? */
+assert(!tcg_op_defs[op].used);
+} else {
+/* Missing entry in op definitions? */
+assert(tcg_op_defs[op].used);
+}
+}
+#endif
 }
 
 #ifdef USE_LIVENESS_ANALYSIS
diff --git a/tcg/tcg.h b/tcg/tcg.h
index b218abe..aca9f27 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -404,6 +404,9 @@ typedef struct TCGOpDef {
 uint16_t copy_size;
 TCGArgConstraint *args_ct;
 int *sorted_args;
+#if defined(CONFIG_DEBUG_TCG)
+int used;
+#endif
 } TCGOpDef;
 
 typedef struct TCGTargetOpDef {
-- 
1.6.6.1





[Qemu-devel] [PATCH 0/3] qcow2: Rewrite alloc_refcount_block

2010-02-15 Thread Kevin Wolf
The current implementation of alloc_refcount_block and grow_refcount_table has
fundamental problems regarding error handling. There are some places where an
I/O error means that the image is going to be corrupted. I have found that the
only way to fix this is to completely rewrite the thing.

Just sending as an RFC to the list hasn't generated a lot of comments (to be
precise, not a single one). This is a critical part of qcow2 and needs reviews.
So let's try it another way: People in CC, please give it a review. Sooner or
later some of you will need to do so anyway.

Kevin Wolf (3):
  qcow2: Factor next_refcount_table_size out
  qcow2: Rewrite alloc_refcount_block/grow_refcount_table
  qcow2: More check for qemu-img check

 block/qcow2-refcount.c |  334 +++-
 1 files changed, 244 insertions(+), 90 deletions(-)





[Qemu-devel] [PATCH 1/3] qcow2: Factor next_refcount_table_size out

2010-02-15 Thread Kevin Wolf
When the refcount table grows, it doesn't only grow by one entry but reserves
some space for future refcount blocks. The algorithm to calculate the number of
entries stays the same with the fixes, so factor it out before replacing the
rest.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-refcount.c |   34 +++---
 1 files changed, 23 insertions(+), 11 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 2fdc26b..0e2ecd7 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -123,6 +123,28 @@ static int get_refcount(BlockDriverState *bs, int64_t 
cluster_index)
 return be16_to_cpu(s-refcount_block_cache[block_index]);
 }
 
+/*
+ * Rounds the refcount table size up to avoid growing the table for each single
+ * refcount block that is allocated.
+ */
+static unsigned int next_refcount_table_size(BDRVQcowState *s,
+unsigned int min_size)
+{
+unsigned int refcount_table_clusters = 0;
+unsigned int new_table_size = 1;
+
+while (min_size  new_table_size) {
+if (refcount_table_clusters == 0) {
+refcount_table_clusters = 1;
+} else {
+refcount_table_clusters = (refcount_table_clusters * 3 + 1) / 2;
+}
+new_table_size = refcount_table_clusters  (s-cluster_bits - 3);
+}
+
+return new_table_size;
+}
+
 static int grow_refcount_table(BlockDriverState *bs, int min_size)
 {
 BDRVQcowState *s = bs-opaque;
@@ -136,17 +158,7 @@ static int grow_refcount_table(BlockDriverState *bs, int 
min_size)
 if (min_size = s-refcount_table_size)
 return 0;
 /* compute new table size */
-refcount_table_clusters = s-refcount_table_size  (s-cluster_bits - 3);
-for(;;) {
-if (refcount_table_clusters == 0) {
-refcount_table_clusters = 1;
-} else {
-refcount_table_clusters = (refcount_table_clusters * 3 + 1) / 2;
-}
-new_table_size = refcount_table_clusters  (s-cluster_bits - 3);
-if (min_size = new_table_size)
-break;
-}
+new_table_size = next_refcount_table_size(s, min_size);
 #ifdef DEBUG_ALLOC2
 printf(grow_refcount_table from %d to %d\n,
s-refcount_table_size,
-- 
1.6.6





[Qemu-devel] [PATCH 3/3] qcow2: More checks for qemu-img check

2010-02-15 Thread Kevin Wolf
Implement some more refcount block related checks

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-refcount.c |   12 
 1 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index e7fdf64..e50fb2a 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -1065,9 +1065,21 @@ int qcow2_check_refcounts(BlockDriverState *bs)
 for(i = 0; i  s-refcount_table_size; i++) {
 int64_t offset;
 offset = s-refcount_table[i];
+
+/* Refcount blocks are cluster aligned */
+if (offset  (s-cluster_size - 1)) {
+fprintf(stderr, ERROR refcount block %d is not 
+cluster aligned; refcount table entry corrupted\n, i);
+errors++;
+}
+
 if (offset != 0) {
 errors += inc_refcounts(bs, refcount_table, nb_clusters,
   offset, s-cluster_size);
+if (refcount_table[offset / s-cluster_size] != 1) {
+fprintf(stderr, ERROR refcount block %d refcount=%d\n,
+i, refcount_table[offset / s-cluster_size]);
+}
 }
 }
 
-- 
1.6.6





[Qemu-devel] [PATCH 2/3] qcow2: Rewrite alloc_refcount_block/grow_refcount_table

2010-02-15 Thread Kevin Wolf
The current implementation of alloc_refcount_block and grow_refcount_table has
fundamental problems regarding error handling. There are some places where an
I/O error means that the image is going to be corrupted. I have found that the
only way to fix this is to completely rewrite the thing.

In detail, the problem is that the refcount blocks itself are allocated using
alloc_refcount_noref (to avoid endless recursion when updating the refcount of
the new refcount block, which migh access just the same refcount block but its
allocation is not yet completed...). Only at the end of the refcount allocation
the refcount of the refcount block is increased. If an error happens in
between, the refcount block is in use, but has a refcount of zero and will
likely be overwritten later.

The new approach is explained in comments in the code. The trick is basically
to let new refcount blocks describe their own refcount, so their refcount will
be automatically changed when they are hooked up in the refcount table.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-refcount.c |  306 ++--
 1 files changed, 218 insertions(+), 88 deletions(-)

diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index 0e2ecd7..e7fdf64 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -27,7 +27,7 @@
 #include block/qcow2.h
 
 static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
-static int update_refcount(BlockDriverState *bs,
+static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
 int64_t offset, int64_t length,
 int addend);
 
@@ -145,114 +145,244 @@ static unsigned int 
next_refcount_table_size(BDRVQcowState *s,
 return new_table_size;
 }
 
-static int grow_refcount_table(BlockDriverState *bs, int min_size)
+
+/* Checks if two offsets are described by the same refcount block */
+static int in_same_refcount_block(BDRVQcowState *s, uint64_t offset_a,
+uint64_t offset_b)
+{
+uint64_t block_a = offset_a  (2 * s-cluster_bits - REFCOUNT_SHIFT);
+uint64_t block_b = offset_b  (2 * s-cluster_bits - REFCOUNT_SHIFT);
+
+return (block_a == block_b);
+}
+
+/*
+ * Loads a refcount block. If it doesn't exist yet, it is allocated first
+ * (including growing the refcount table if needed).
+ *
+ * Returns the offset of the refcount block on success or -errno in error case
+ */
+static int64_t alloc_refcount_block(BlockDriverState *bs, int64_t 
cluster_index)
 {
 BDRVQcowState *s = bs-opaque;
-int new_table_size, new_table_size2, refcount_table_clusters, i, ret;
-uint64_t *new_table;
-int64_t table_offset;
-uint8_t data[12];
-int old_table_size;
-int64_t old_table_offset;
+unsigned int refcount_table_index;
+uint64_t refcount_block_offset;
+int ret;
+
+/* Find the refcount block for the given cluster */
+refcount_table_index = cluster_index  (s-cluster_bits - REFCOUNT_SHIFT);
+if (refcount_table_index = s-refcount_table_size) {
+refcount_block_offset = 0;
+} else {
+refcount_block_offset = s-refcount_table[refcount_table_index];
+}
+
+/* If it's already there, we're done */
+if (refcount_block_offset) {
+if (refcount_block_offset != s-refcount_block_cache_offset) {
+ret = load_refcount_block(bs, refcount_block_offset);
+if (ret  0) {
+return ret;
+}
+}
+return refcount_block_offset;
+}
+
+/*
+ * If we came here, we need to allocate something. Something is at least
+ * a cluster for the new refcount block. It may also include a new refcount
+ * table if the old refcount table is too small.
+ *
+ * Note that allocating clusters here needs some special care:
+ *
+ * - We can't use the normal qcow2_alloc_clusters(), it would try to
+ *   increase the refcount and very likely we would end up with an endless
+ *   recursion. Instead we must place the refcount blocks in a way that
+ *   they can describe them themselves.
+ *
+ * - We need to consider that at this point we are inside update_refcounts
+ *   and doing the initial refcount increase. This means that some clusters
+ *   have already been allocated by the caller, but their refcount isn't
+ *   accurate yet. free_cluster_index tells us where this allocation ends
+ *   as long as we don't overwrite it by freeing clusters.
+ *
+ * - alloc_clusters_noref and qcow2_free_clusters may load a different
+ *   refcount block into the cache
+ */
+
+if (cache_refcount_updates) {
+write_refcount_block(s);
+}
+
+/* Allocate the refcount block itself and mark it as used */
+uint64_t new_block = alloc_clusters_noref(bs, s-cluster_size);
+memset(s-refcount_block_cache, 0, s-cluster_size);
+s-refcount_block_cache_offset = new_block;
 
-if (min_size = 

[Qemu-devel] [PATCH] QEMU e820 reservation patch

2010-02-15 Thread Jes Sorensen

Hi,

Kevin and I have agreed on the approach for this one now. So here is
the latest version of the patch for QEMU, submitting e820 reservation
entries via fw_cfg.

Cheers,
Jes

Use qemu-cfg to provide the BIOS with an optional table of e820 entries.

Notify the BIOS of the location of the TSS+EPT range to by reserving
it via the e820 table.

This matches a corresponding patch for Seabios, however older versions
of Seabios will default to the hardcoded address range and stay
compatible with current QEMU.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com

---
 hw/pc.c   |   35 +++
 hw/pc.h   |   10 ++
 target-i386/kvm.c |8 
 3 files changed, 53 insertions(+)

Index: qemu/hw/pc.c
===
--- qemu.orig/hw/pc.c
+++ qemu/hw/pc.c
@@ -59,6 +59,7 @@
 #define FW_CFG_ACPI_TABLES (FW_CFG_ARCH_LOCAL + 0)
 #define FW_CFG_SMBIOS_ENTRIES (FW_CFG_ARCH_LOCAL + 1)
 #define FW_CFG_IRQ0_OVERRIDE (FW_CFG_ARCH_LOCAL + 2)
+#define FW_CFG_E820_TABLE (FW_CFG_ARCH_LOCAL + 3)
 
 #define MAX_IDE_BUS 2
 
@@ -67,6 +68,21 @@ static RTCState *rtc_state;
 static PITState *pit;
 static PCII440FXState *i440fx_state;
 
+#define E820_NR_ENTRIES16
+
+struct e820_entry {
+uint64_t address;
+uint64_t length;
+uint32_t type;
+};
+
+struct e820_table {
+uint32_t count;
+struct e820_entry entry[E820_NR_ENTRIES];
+};
+
+static struct e820_table e820_table;
+
 typedef struct isa_irq_state {
 qemu_irq *i8259;
 qemu_irq *ioapic;
@@ -435,6 +451,23 @@ static void bochs_bios_write(void *opaqu
 }
 }
 
+int e820_add_entry(uint64_t address, uint64_t length, uint32_t type)
+{
+int index = e820_table.count;
+struct e820_entry *entry;
+
+if (index = E820_NR_ENTRIES)
+return -EBUSY;
+entry = e820_table.entry[index];
+
+entry-address = address;
+entry-length = length;
+entry-type = type;
+
+e820_table.count++;
+return e820_table.count;
+}
+
 static void *bochs_bios_init(void)
 {
 void *fw_cfg;
@@ -466,6 +499,8 @@ static void *bochs_bios_init(void)
 if (smbios_table)
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SMBIOS_ENTRIES,
  smbios_table, smbios_len);
+fw_cfg_add_bytes(fw_cfg, FW_CFG_E820_TABLE, (uint8_t *)e820_table,
+ sizeof(struct e820_table));
 
 /* allocate memory for the NUMA channel: one (64bit) word for the number
  * of nodes, one word for each VCPU-node and one word for each node to
Index: qemu/hw/pc.h
===
--- qemu.orig/hw/pc.h
+++ qemu/hw/pc.h
@@ -150,4 +150,14 @@ void isa_cirrus_vga_init(void);
 void isa_ne2000_init(int base, int irq, NICInfo *nd);
 
 int cpu_is_bsp(CPUState *env);
+
+/* e820 types */
+#define E820_RAM1
+#define E820_RESERVED   2
+#define E820_ACPI   3
+#define E820_NVS4
+#define E820_UNUSABLE   5
+
+int e820_add_entry(uint64_t, uint64_t, uint32_t);
+
 #endif
Index: qemu/target-i386/kvm.c
===
--- qemu.orig/target-i386/kvm.c
+++ qemu/target-i386/kvm.c
@@ -24,6 +24,7 @@
 #include cpu.h
 #include gdbstub.h
 #include host-utils.h
+#include hw/pc.h
 
 #ifdef CONFIG_KVM_PARA
 #include linux/kvm_para.h
@@ -362,6 +363,13 @@ int kvm_arch_init(KVMState *s, int smp_c
  * as unavaible memory.  FIXME, need to ensure the e820 map deals with
  * this?
  */
+/*
+ * Tell fw_cfg to notify the BIOS to reserve the range.
+ */
+if (e820_add_entry(0xfffbc000, 0x4000, E820_RESERVED)  0) {
+perror(e820_add_entry() table is full);
+exit(1);
+}
 return kvm_vm_ioctl(s, KVM_SET_TSS_ADDR, 0xfffbd000);
 }
 


[Qemu-devel] [PATCH] Seabios e820 reservation portion

2010-02-15 Thread Jes Sorensen

Hi,

This is the Seabios part to match my e820 reservation via fw_cfg patch.

Cheers,
Jes

Read optional table of e820 entries from qemu_cfg

Read optional table of e820 entries through qemu_cfg, allowing QEMU to
provide the location of KVM's switch area etc. rather than rely on
hard coded values.

For now, fall back to the old hard coded values for the TSS and EPT
switch page for compatibility reasons. Compatibility code could
possibly be removed in the future.

Signed-off-by: Jes Sorensen jes.soren...@redhat.com

---
 src/paravirt.c |   17 +
 src/paravirt.h |9 +
 src/post.c |   13 -
 3 files changed, 38 insertions(+), 1 deletion(-)

Index: seabios/src/paravirt.c
===
--- seabios.orig/src/paravirt.c
+++ seabios/src/paravirt.c
@@ -132,6 +132,23 @@ u16 qemu_cfg_smbios_entries(void)
 return cnt;
 }
 
+u32 qemu_cfg_e820_entries(void)
+{
+u32 cnt;
+
+if (!qemu_cfg_present)
+return 0;
+
+qemu_cfg_read_entry(cnt, QEMU_CFG_E820_TABLE, sizeof(cnt));
+return cnt;
+}
+
+void* qemu_cfg_e820_load_next(void *addr)
+{
+qemu_cfg_read(addr, sizeof(struct e820_entry));
+return addr;
+}
+
 struct smbios_header {
 u16 length;
 u8 type;
Index: seabios/src/paravirt.h
===
--- seabios.orig/src/paravirt.h
+++ seabios/src/paravirt.h
@@ -36,6 +36,7 @@ static inline int kvm_para_available(voi
 #define QEMU_CFG_ACPI_TABLES   (QEMU_CFG_ARCH_LOCAL + 0)
 #define QEMU_CFG_SMBIOS_ENTRIES(QEMU_CFG_ARCH_LOCAL + 1)
 #define QEMU_CFG_IRQ0_OVERRIDE (QEMU_CFG_ARCH_LOCAL + 2)
+#define QEMU_CFG_E820_TABLE(QEMU_CFG_ARCH_LOCAL + 3)
 
 extern int qemu_cfg_present;
 
@@ -61,8 +62,16 @@ typedef struct QemuCfgFile {
 char name[56];
 } QemuCfgFile;
 
+struct e820_entry {
+u64 address;
+u64 length;
+u32 type;
+};
+
 u16 qemu_cfg_first_file(QemuCfgFile *entry);
 u16 qemu_cfg_next_file(QemuCfgFile *entry);
 u32 qemu_cfg_read_file(QemuCfgFile *entry, void *dst, u32 maxlen);
+u32 qemu_cfg_e820_entries(void);
+void* qemu_cfg_e820_load_next(void *addr);
 
 #endif
Index: seabios/src/post.c
===
--- seabios.orig/src/post.c
+++ seabios/src/post.c
@@ -135,10 +135,21 @@ ram_probe(void)
  , E820_RESERVED);
 add_e820(BUILD_BIOS_ADDR, BUILD_BIOS_SIZE, E820_RESERVED);
 
-if (kvm_para_available())
+u32 count = qemu_cfg_e820_entries();
+if (count) {
+struct e820_entry entry;
+int i;
+
+for (i = 0; i  count; i++) {
+qemu_cfg_e820_load_next(entry);
+add_e820(entry.address, entry.length, entry.type);
+}
+} else if (kvm_para_available()) {
+// Backwards compatibility - provide hard coded range.
 // 4 pages before the bios, 3 pages for vmx tss pages, the
 // other page for EPT real mode pagetable
 add_e820(0xfffbc000, 4*4096, E820_RESERVED);
+}
 
 dprintf(1, Ram Size=0x%08x (0x%08x%08x high)\n
 , RamSize, (u32)(RamSizeOver4G  32), (u32)RamSizeOver4G);


[Qemu-devel] sparc32 fix spurious dma interrupts v2

2010-02-15 Thread Artyom Tarasenko
Don't raise irq when not enabled.
Raise irq on enabling if DMA_INTR is set
Don't clear irq unless it was raised by DMA, as there are other irq sources
Don't set DMA_INTR bit spuriously.

v1-v2:
 - Don't clear irq unless it was raised by DMA
 - Raise irq on enabling if DMA_INTR is set
 - Assume revertion of 787cfbc432bf1d353a77cbdb613754f3963371a3

Signed-off-by: Artyom Tarasenko atar4q...@gmail.com
---
diff --git a/hw/sparc32_dma.c b/hw/sparc32_dma.c
index faf6dbc..18ba035 100644
--- a/hw/sparc32_dma.c
+++ b/hw/sparc32_dma.c
@@ -3,6 +3,9 @@
  *
  * Copyright (c) 2006 Fabrice Bellard
  *
+ * Modifications:
+ *  2010-Feb-14 Artyom Tarasenko : reworked irq generation
+ *
  * Permission is hereby granted, free of charge, to any person obtaining a copy
  * of this software and associated documentation files (the Software), to 
deal
  * in the Software without restriction, including without limitation the rights
@@ -125,13 +128,19 @@ static void dma_set_irq(void *opaque, int irq, int level)
 {
 DMAState *s = opaque;
 if (level) {
-DPRINTF(Raise IRQ\n);
 s-dmaregs[0] |= DMA_INTR;
-qemu_irq_raise(s-irq);
+if (s-dmaregs[0]  DMA_INTREN) {
+DPRINTF(Raise IRQ\n);
+qemu_irq_raise(s-irq);
+}
 } else {
-s-dmaregs[0] = ~DMA_INTR;
-DPRINTF(Lower IRQ\n);
-qemu_irq_lower(s-irq);
+if (s-dmaregs[0]  DMA_INTR) {
+s-dmaregs[0] = ~DMA_INTR;
+if (s-dmaregs[0]  DMA_INTREN) {
+DPRINTF(Lower IRQ\n);
+qemu_irq_lower(s-irq);
+}
+}
 }
 }
 
@@ -142,7 +151,6 @@ void espdma_memory_read(void *opaque, uint8_t *buf, int len)
 DPRINTF(DMA read, direction: %c, addr 0x%8.8x\n,
 s-dmaregs[0]  DMA_WRITE_MEM ? 'w': 'r', s-dmaregs[1]);
 sparc_iommu_memory_read(s-iommu, s-dmaregs[1], buf, len);
-s-dmaregs[0] |= DMA_INTR;
 s-dmaregs[1] += len;
 }
 
@@ -153,7 +161,6 @@ void espdma_memory_write(void *opaque, uint8_t *buf, int 
len)
 DPRINTF(DMA write, direction: %c, addr 0x%8.8x\n,
 s-dmaregs[0]  DMA_WRITE_MEM ? 'w': 'r', s-dmaregs[1]);
 sparc_iommu_memory_write(s-iommu, s-dmaregs[1], buf, len);
-s-dmaregs[0] |= DMA_INTR;
 s-dmaregs[1] += len;
 }
 
@@ -179,9 +186,16 @@ static void dma_mem_writel(void *opaque, 
target_phys_addr_t addr, uint32_t val)
 s-dmaregs[saddr], val);
 switch (saddr) {
 case 0:
-if (!(val  DMA_INTREN)) {
-DPRINTF(Lower IRQ\n);
-qemu_irq_lower(s-irq);
+if (val  DMA_INTREN) {
+if (val  DMA_INTR) {
+DPRINTF(Raise IRQ\n);
+qemu_irq_raise(s-irq);
+}
+} else {
+if (s-dmaregs[0]  (DMA_INTR | DMA_INTREN)) {
+DPRINTF(Lower IRQ\n);
+qemu_irq_lower(s-irq);
+}
 }
 if (val  DMA_RESET) {
 qemu_irq_raise(s-dev_reset);




[Qemu-devel] Re: sparc32 fix spurious dma interrupts v2

2010-02-15 Thread Blue Swirl
Thanks, applied.


On Mon, Feb 15, 2010 at 7:39 PM, Artyom Tarasenko
atar4q...@googlemail.com wrote:
 Don't raise irq when not enabled.
 Raise irq on enabling if DMA_INTR is set
 Don't clear irq unless it was raised by DMA, as there are other irq sources
 Don't set DMA_INTR bit spuriously.

 v1-v2:
  - Don't clear irq unless it was raised by DMA
  - Raise irq on enabling if DMA_INTR is set
  - Assume revertion of 787cfbc432bf1d353a77cbdb613754f3963371a3

 Signed-off-by: Artyom Tarasenko atar4q...@gmail.com
 ---
 diff --git a/hw/sparc32_dma.c b/hw/sparc32_dma.c
 index faf6dbc..18ba035 100644
 --- a/hw/sparc32_dma.c
 +++ b/hw/sparc32_dma.c
 @@ -3,6 +3,9 @@
  *
  * Copyright (c) 2006 Fabrice Bellard
  *
 + * Modifications:
 + *  2010-Feb-14 Artyom Tarasenko : reworked irq generation
 + *
  * Permission is hereby granted, free of charge, to any person obtaining a 
 copy
  * of this software and associated documentation files (the Software), to 
 deal
  * in the Software without restriction, including without limitation the 
 rights
 @@ -125,13 +128,19 @@ static void dma_set_irq(void *opaque, int irq, int 
 level)
  {
     DMAState *s = opaque;
     if (level) {
 -        DPRINTF(Raise IRQ\n);
         s-dmaregs[0] |= DMA_INTR;
 -        qemu_irq_raise(s-irq);
 +        if (s-dmaregs[0]  DMA_INTREN) {
 +            DPRINTF(Raise IRQ\n);
 +            qemu_irq_raise(s-irq);
 +        }
     } else {
 -        s-dmaregs[0] = ~DMA_INTR;
 -        DPRINTF(Lower IRQ\n);
 -        qemu_irq_lower(s-irq);
 +        if (s-dmaregs[0]  DMA_INTR) {
 +            s-dmaregs[0] = ~DMA_INTR;
 +            if (s-dmaregs[0]  DMA_INTREN) {
 +                DPRINTF(Lower IRQ\n);
 +                qemu_irq_lower(s-irq);
 +            }
 +        }
     }
  }

 @@ -142,7 +151,6 @@ void espdma_memory_read(void *opaque, uint8_t *buf, int 
 len)
     DPRINTF(DMA read, direction: %c, addr 0x%8.8x\n,
             s-dmaregs[0]  DMA_WRITE_MEM ? 'w': 'r', s-dmaregs[1]);
     sparc_iommu_memory_read(s-iommu, s-dmaregs[1], buf, len);
 -    s-dmaregs[0] |= DMA_INTR;
     s-dmaregs[1] += len;
  }

 @@ -153,7 +161,6 @@ void espdma_memory_write(void *opaque, uint8_t *buf, int 
 len)
     DPRINTF(DMA write, direction: %c, addr 0x%8.8x\n,
             s-dmaregs[0]  DMA_WRITE_MEM ? 'w': 'r', s-dmaregs[1]);
     sparc_iommu_memory_write(s-iommu, s-dmaregs[1], buf, len);
 -    s-dmaregs[0] |= DMA_INTR;
     s-dmaregs[1] += len;
  }

 @@ -179,9 +186,16 @@ static void dma_mem_writel(void *opaque, 
 target_phys_addr_t addr, uint32_t val)
             s-dmaregs[saddr], val);
     switch (saddr) {
     case 0:
 -        if (!(val  DMA_INTREN)) {
 -            DPRINTF(Lower IRQ\n);
 -            qemu_irq_lower(s-irq);
 +        if (val  DMA_INTREN) {
 +            if (val  DMA_INTR) {
 +                DPRINTF(Raise IRQ\n);
 +                qemu_irq_raise(s-irq);
 +            }
 +        } else {
 +            if (s-dmaregs[0]  (DMA_INTR | DMA_INTREN)) {
 +                DPRINTF(Lower IRQ\n);
 +                qemu_irq_lower(s-irq);
 +            }
         }
         if (val  DMA_RESET) {
             qemu_irq_raise(s-dev_reset);





[Qemu-devel] Re: QError conversion problems: putting errors in context

2010-02-15 Thread Markus Armbruster
Anthony Liguori aligu...@linux.vnet.ibm.com writes:

 Hi Markus,

 On 02/12/2010 11:48 AM, Markus Armbruster wrote:
 Our QError conversions were pretty straightforward so far.  For example,
 when we found

  monitor_printf(mon, device is not removable\n);

 in eject_device(), we created the obvious QError class for it:

 #define QERR_DEVICE_NOT_REMOVABLE \
  { 'class': 'DeviceNotRemovable', 'data': { 'device': %s } }

 with the obvious pretty-print template:

  {
  .error_fmt = QERR_DEVICE_NOT_REMOVABLE,
  .desc  = Device %(device) is not removable,
  },

 and replaced the print with

  qemu_error_new(QERR_DEVICE_NOT_REMOVABLE,
 bdrv_get_device_name(bs));

 This even improved the message from device is not removable to device
 hda is not removable.  Commit 2c2a6bb8.


 We also ran into cases like this one, in qdev_device_add():

 -qemu_error(Device \%s\ not found.  Try -device '?' for a 
 list.\n,
 -   driver);
 +qemu_error_new(QERR_DEVICE_NOT_FOUND, driver);

 Here, the pretty-print template became

  .desc  = The %(device) device has not been found,

 Note the loss of the helpful Try -device '?' for a list. part.  That's
 because the same error is used in other places, where that piece of
 advice doesn't apply.  Commit 3ced9f7a.


 For similar reasons, Invalid CPU index changed to the more generic
 Invalid parameter index in monitor command cpu (commit cc0c4185).
 More examples in my private tree: set_link's invalid link status 'off';
 only 'up' or 'down' valid becomes Invalid parameter up_or_down, and
 migrate's migration already in progress becomes Already in progress.


 Most of the conversions touched only monitor code.  Things get much more
 complicated for code shared between the monitor and other places.  For
 instance, there's this one in qdev_device_add():

  qemu_error(-device: no driver specified\n);

 Of course, this message is pretty pathetic already, in several ways:

 * A command line can contain several -device, and the error message
leaves finding the offending one to the user.

 * It's even worse with configuration files.  FILE:LINENR is par for that
course.

 * It talks about -device even when we're in the monitor or a
configuration file.  To trigger it in the monitor, try device_add
a=b.

  From QMP's point of view, the appropriate error to use is this one:

 #define QERR_MISSING_PARAMETER \
  { 'class': 'MissingParameter', 'data': { 'name': %s } }

 It's pretty-print template is

  .desc  = Parameter %(name) is missing,

 This changes the message -device: no driver specified to Parameter
 driver is missing.  We go from bad to worse.


 I think we have a few related but nevertheless distinct issues here:

 * We need to identify the error message's object.  Proper identification
depends on the context in which the code is executing:

- Command line: the option, with its argument, if any.

- Config file: filename, line number.

- Monitor: command (this is trivial)

This is what's lacking in the no driver specified example, even
before conversion to QError.

Dragging context information along all over the place so we can use it
to make better error messages would be cumbersome and invasive.  There
must be a better way.

Note that qemu_error_new() already has a primitive notion of context:
it distinguishes between human monitor, QMP monitor, and anything
else.  This could be refined, so it can add suitable context
information to the error without encumbering its callers.  For
instance, a hypothetical error message pants on fire, emitted with

qemu_error_new(QERR_PANTS_ON_FILE)

should become

- qemu-system-x86_64: -device pants,color=black: pants on fire in
  the context of command line option -device pants,color=black,

- vm.cfg:123: pants on fire in the context of configuration file
  vm.cfg, line 123,

- pants on fire in the human monitor, and

- { error: { class: PantsOnFire, data: {},
   desc: pants on fire } }
  in QMP.

 * Occasionally, we'd like to add help, but not for all uses of the same
error, and not in all contexts.

This is the Try -device '?' for a list example.  We'd like to print
that if we're working for -device (user qdev_device_add(), context
command line).  And maybe Try device_add ? for a list if we're
working for device_add (same user, context human monitor).

I believe this is rare enough to permit an ad hoc solution, like this:

  qemu_error_new(QERR_DEVICE_NOT_FOUND, driver);
  printf_if_cmdline(Try -device '?' for a list.\n);
  printf_if_human_monitor(Try device_add ? for a list.\n);

Wouldn't work for the monitor right now, because the actual printing
of the error gets delayed until the handler returns, but that's

[Qemu-devel] Re: [PATCH] QEMU e820 reservation patch

2010-02-15 Thread Stefano Stabellini
On Mon, 15 Feb 2010, Jes Sorensen wrote:
 Hi,
 
 Kevin and I have agreed on the approach for this one now. So here is
 the latest version of the patch for QEMU, submitting e820 reservation
 entries via fw_cfg.
 

I think the interface is fine and it is perfectly usable by Xen as well.

Cheers,

Stefano




[Qemu-devel] [PATCH 1/7] Move TARGET_PHYS_ADDR_SPACE_BITS to target-*/cpu.h.

2010-02-15 Thread Richard Henderson
Removes a set of ifdefs from exec.c.

Introduce TARGET_VIRT_ADDR_SPACE_BITS for all targets other
than Alpha.  This will be used for page_find_alloc, which is
supposed to be using virtual addresses in the first place.
---
 exec.c  |   17 -
 target-alpha/cpu.h  |4 +++-
 target-arm/cpu.h|3 +++
 target-cris/cpu.h   |3 +++
 target-i386/cpu.h   |   11 +++
 target-m68k/cpu.h   |3 +++
 target-microblaze/cpu.h |3 +++
 target-mips/mips-defs.h |4 
 target-ppc/cpu.h|   17 +
 target-s390x/cpu.h  |5 +
 target-sh4/cpu.h|3 +++
 target-sparc/cpu.h  |8 
 12 files changed, 63 insertions(+), 18 deletions(-)

diff --git a/exec.c b/exec.c
index 8389c54..766568b 100644
--- a/exec.c
+++ b/exec.c
@@ -62,23 +62,6 @@
 
 #define SMC_BITMAP_USE_THRESHOLD 10
 
-#if defined(TARGET_SPARC64)
-#define TARGET_PHYS_ADDR_SPACE_BITS 41
-#elif defined(TARGET_SPARC)
-#define TARGET_PHYS_ADDR_SPACE_BITS 36
-#elif defined(TARGET_ALPHA)
-#define TARGET_PHYS_ADDR_SPACE_BITS 42
-#define TARGET_VIRT_ADDR_SPACE_BITS 42
-#elif defined(TARGET_PPC64)
-#define TARGET_PHYS_ADDR_SPACE_BITS 42
-#elif defined(TARGET_X86_64)
-#define TARGET_PHYS_ADDR_SPACE_BITS 42
-#elif defined(TARGET_I386)
-#define TARGET_PHYS_ADDR_SPACE_BITS 36
-#else
-#define TARGET_PHYS_ADDR_SPACE_BITS 32
-#endif
-
 static TranslationBlock *tbs;
 int code_gen_max_blocks;
 TranslationBlock *tb_phys_hash[CODE_GEN_PHYS_HASH_SIZE];
diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h
index c0dff4b..c144b4b 100644
--- a/target-alpha/cpu.h
+++ b/target-alpha/cpu.h
@@ -41,7 +41,9 @@
 
 #define TARGET_PAGE_BITS 13
 
-#define VA_BITS 43
+/* ??? EV4 has 34 phys addr bits, EV5 has 40, EV6 has 44.  */
+#define TARGET_PHYS_ADDR_SPACE_BITS44
+#define TARGET_VIRT_ADDR_SPACE_BITS(30 + TARGET_PAGE_BITS)
 
 /* Alpha major type */
 enum {
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 4a1c53f..3892db4 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -405,6 +405,9 @@ void cpu_arm_set_cp_io(CPUARMState *env, int cpnum,
 #define TARGET_PAGE_BITS 10
 #endif
 
+#define TARGET_PHYS_ADDR_SPACE_BITS 32
+#define TARGET_VIRT_ADDR_SPACE_BITS 32
+
 #define cpu_init cpu_arm_init
 #define cpu_exec cpu_arm_exec
 #define cpu_gen_code cpu_arm_gen_code
diff --git a/target-cris/cpu.h b/target-cris/cpu.h
index 8ff86d9..063a240 100644
--- a/target-cris/cpu.h
+++ b/target-cris/cpu.h
@@ -200,6 +200,9 @@ enum {
 #define TARGET_PAGE_BITS 13
 #define MMAP_SHIFT TARGET_PAGE_BITS
 
+#define TARGET_PHYS_ADDR_SPACE_BITS 32
+#define TARGET_VIRT_ADDR_SPACE_BITS 32
+
 #define cpu_init cpu_cris_init
 #define cpu_exec cpu_cris_exec
 #define cpu_gen_code cpu_cris_gen_code
diff --git a/target-i386/cpu.h b/target-i386/cpu.h
index 216b00e..7fb84db 100644
--- a/target-i386/cpu.h
+++ b/target-i386/cpu.h
@@ -872,6 +872,17 @@ uint64_t cpu_get_tsc(CPUX86State *env);
 
 #define TARGET_PAGE_BITS 12
 
+#ifdef TARGET_X86_64
+#define TARGET_PHYS_ADDR_SPACE_BITS 52
+/* ??? This is really 48 bits, sign-extended, but the only thing 
+   accessible to userland with bit 48 set is the VSYSCALL, and that
+   is handled via other mechanisms.  */
+#define TARGET_VIRT_ADDR_SPACE_BITS 47
+#else
+#define TARGET_PHYS_ADDR_SPACE_BITS 36
+#define TARGET_VIRT_ADDR_SPACE_BITS 32
+#endif
+
 #define cpu_init cpu_x86_init
 #define cpu_exec cpu_x86_exec
 #define cpu_gen_code cpu_x86_gen_code
diff --git a/target-m68k/cpu.h b/target-m68k/cpu.h
index 68a7e41..b2f37ec 100644
--- a/target-m68k/cpu.h
+++ b/target-m68k/cpu.h
@@ -210,6 +210,9 @@ void register_m68k_insns (CPUM68KState *env);
 #define TARGET_PAGE_BITS 10
 #endif
 
+#define TARGET_PHYS_ADDR_SPACE_BITS 32
+#define TARGET_VIRT_ADDR_SPACE_BITS 32
+
 #define cpu_init cpu_m68k_init
 #define cpu_exec cpu_m68k_exec
 #define cpu_gen_code cpu_m68k_gen_code
diff --git a/target-microblaze/cpu.h b/target-microblaze/cpu.h
index 1bf4875..5999386 100644
--- a/target-microblaze/cpu.h
+++ b/target-microblaze/cpu.h
@@ -253,6 +253,9 @@ enum {
 #define TARGET_PAGE_BITS 12
 #define MMAP_SHIFT TARGET_PAGE_BITS
 
+#define TARGET_PHYS_ADDR_SPACE_BITS 32
+#define TARGET_VIRT_ADDR_SPACE_BITS 32
+
 #define cpu_init cpu_mb_init
 #define cpu_exec cpu_mb_exec
 #define cpu_gen_code cpu_mb_gen_code
diff --git a/target-mips/mips-defs.h b/target-mips/mips-defs.h
index 54e80f1..0f6a956 100644
--- a/target-mips/mips-defs.h
+++ b/target-mips/mips-defs.h
@@ -8,6 +8,10 @@
 #define TARGET_PAGE_BITS 12
 #define MIPS_TLB_MAX 128
 
+/* ??? MIPS64 no doubt has a larger address space.  */
+#define TARGET_PHYS_ADDR_SPACE_BITS 32
+#define TARGET_VIRT_ADDR_SPACE_BITS 32
+
 #if defined(TARGET_MIPS64)
 #define TARGET_LONG_BITS 64
 #else
diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h
index d15bba1..c91f8fe 100644
--- a/target-ppc/cpu.h
+++ b/target-ppc/cpu.h
@@ -29,6 +29,20 @@
 #define TARGET_LONG_BITS 64
 #define TARGET_PAGE_BITS 12
 
+/* Note that the official physical address space bits 

[Qemu-devel] [PATCH 3/7] Fix last page errors in page_set_flags and page_check_range.

2010-02-15 Thread Richard Henderson
The addr  end comparison prevents the last page from being
iterated; an iteration based on length avoids this problem.
---
 exec.c |   39 +--
 1 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/exec.c b/exec.c
index 766568b..1521ce3 100644
--- a/exec.c
+++ b/exec.c
@@ -2230,27 +2230,27 @@ int page_get_flags(target_ulong address)
 return p-flags;
 }
 
-/* modify the flags of a page and invalidate the code if
-   necessary. The flag PAGE_WRITE_ORG is positioned automatically
-   depending on PAGE_WRITE */
+/* Modify the flags of a page and invalidate the code if necessary.
+   The flag PAGE_WRITE_ORG is positioned automatically depending
+   on PAGE_WRITE.  The mmap_lock should already be held.  */
 void page_set_flags(target_ulong start, target_ulong end, int flags)
 {
-PageDesc *p;
-target_ulong addr;
+target_ulong addr, len;
 
-/* mmap_lock should already be held.  */
 start = start  TARGET_PAGE_MASK;
 end = TARGET_PAGE_ALIGN(end);
-if (flags  PAGE_WRITE)
+
+if (flags  PAGE_WRITE) {
 flags |= PAGE_WRITE_ORG;
-for(addr = start; addr  end; addr += TARGET_PAGE_SIZE) {
-p = page_find_alloc(addr  TARGET_PAGE_BITS);
-/* We may be called for host regions that are outside guest
-   address space.  */
-if (!p)
-return;
-/* if the write protection is set, then we invalidate the code
-   inside */
+}
+
+for (addr = start, len = end - start;
+ len != 0;
+ len -= TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
+PageDesc *p = page_find_alloc(addr  TARGET_PAGE_BITS, 1);
+
+/* If the write protection bit is set, then we invalidate
+   the code inside.  */
 if (!(p-flags  PAGE_WRITE) 
 (flags  PAGE_WRITE) 
 p-first_tb) {
@@ -2266,14 +2266,17 @@ int page_check_range(target_ulong start, target_ulong 
len, int flags)
 target_ulong end;
 target_ulong addr;
 
-if (start + len  start)
-/* we've wrapped around */
+if (start + len - 1  start) {
+/* We've wrapped around.  */
 return -1;
+}
 
 end = TARGET_PAGE_ALIGN(start+len); /* must do before we loose bits in the 
next step */
 start = start  TARGET_PAGE_MASK;
 
-for(addr = start; addr  end; addr += TARGET_PAGE_SIZE) {
+for (addr = start, len = end - start;
+ len != 0;
+ len -= TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
 p = page_find(addr  TARGET_PAGE_BITS);
 if( !p )
 return -1;
-- 
1.6.6





[Qemu-devel] [PATCH 7/7] Assert arguments in range for guest address space.

2010-02-15 Thread Richard Henderson
We don't expect host addresses within page_set_flags or
page_check_range.
---
 exec.c |   15 +++
 1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/exec.c b/exec.c
index bb712ec..10673fc 100644
--- a/exec.c
+++ b/exec.c
@@ -2327,6 +2327,14 @@ void page_set_flags(target_ulong start, target_ulong 
end, int flags)
 {
 target_ulong addr, len;
 
+/* This function should never be called with addresses outside the
+   guest address space.  If this assert fires, it probably indicates
+   a missing call to h2g_valid.  */
+#if HOST_LONG_BITS  TARGET_VIRT_ADDR_SPACE_BITS
+assert(end  (1ul  TARGET_VIRT_ADDR_SPACE_BITS));
+#endif
+assert(start  end);
+
 start = start  TARGET_PAGE_MASK;
 end = TARGET_PAGE_ALIGN(end);
 
@@ -2356,6 +2364,13 @@ int page_check_range(target_ulong start, target_ulong 
len, int flags)
 target_ulong end;
 target_ulong addr;
 
+/* This function should never be called with addresses outside the
+   guest address space.  If this assert fires, it probably indicates
+   a missing call to h2g_valid.  */
+#if HOST_LONG_BITS  TARGET_VIRT_ADDR_SPACE_BITS
+assert(start  (1ul  TARGET_VIRT_ADDR_SPACE_BITS));
+#endif
+
 if (start + len - 1  start) {
 /* We've wrapped around.  */
 return -1;
-- 
1.6.6





[Qemu-devel] [PATCH 5/7] linux-user: Use h2g_valid in qemu_vmalloc.

2010-02-15 Thread Richard Henderson
---
 linux-user/mmap.c |9 -
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 25fc0b2..65fdc33 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -80,16 +80,15 @@ void mmap_unlock(void)
 void *qemu_vmalloc(size_t size)
 {
 void *p;
-unsigned long addr;
+
 mmap_lock();
 /* Use map and mark the pages as used.  */
 p = mmap(NULL, size, PROT_READ | PROT_WRITE,
  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
 
-addr = (unsigned long)p;
-if (addr == (target_ulong) addr) {
-/* Allocated region overlaps guest address space.
-   This may recurse.  */
+if (h2g_valid(p)) {
+/* Allocated region overlaps guest address space. This may recurse.  */
+unsigned long addr = h2g(p);
 page_set_flags(addr  TARGET_PAGE_MASK, TARGET_PAGE_ALIGN(addr + size),
PAGE_RESERVED);
 }
-- 
1.6.6





[Qemu-devel] [PATCH 2/7] Use TARGET_VIRT_ADDR_SPACE_BITS in h2g_valid.

2010-02-15 Thread Richard Henderson
Previously, only 32-bit guests had a proper check for the
validity of the virtual address.  Extend that check to 64-bit
guests with a restricted virtual address space.
---
 cpu-all.h |   16 +++-
 1 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index 1ccc9a8..b81641f 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -634,16 +634,22 @@ extern int have_guest_base;
 
 /* All direct uses of g2h and h2g need to go away for usermode softmmu.  */
 #define g2h(x) ((void *)((unsigned long)(x) + GUEST_BASE))
+
+#if HOST_LONG_BITS == TARGET_VIRT_ADDR_SPACE_BITS
+#define h2g_valid(x) 1
+#else
+#define h2g_valid(x) ({ \
+unsigned long __guest = (unsigned long)(x) - GUEST_BASE; \
+__guest  (1ul  TARGET_VIRT_ADDR_SPACE_BITS); \
+})
+#endif
+
 #define h2g(x) ({ \
 unsigned long __ret = (unsigned long)(x) - GUEST_BASE; \
 /* Check if given address fits target address space */ \
-assert(__ret == (abi_ulong)__ret); \
+assert(h2g_valid(x)); \
 (abi_ulong)__ret; \
 })
-#define h2g_valid(x) ({ \
-unsigned long __guest = (unsigned long)(x) - GUEST_BASE; \
-(__guest == (abi_ulong)__guest); \
-})
 
 #define saddr(x) g2h(x)
 #define laddr(x) g2h(x)
-- 
1.6.6





[Qemu-devel] [PATCH 0/7] Multi-level page tables and userland mapping fixes, v2

2010-02-15 Thread Richard Henderson
Changes since v1:
  * Sparc virt and phys address range corrections.
  * Unrelated changes removed.
  * Assertions added for guest address space.


r~



Richard Henderson (7):
  Move TARGET_PHYS_ADDR_SPACE_BITS to target-*/cpu.h.
  Use TARGET_VIRT_ADDR_SPACE_BITS in h2g_valid.
  Fix last page errors in page_set_flags and page_check_range.
  Implement multi-level page tables.
  linux-user: Use h2g_valid in qemu_vmalloc.
  linux-user: Fix mmap_find_vma returning invalid addresses.
  Assert arguments in range for guest address space.

 cpu-all.h   |   23 ++-
 exec.c  |  513 ---
 linux-user/main.c   |7 +-
 linux-user/mmap.c   |  111 ---
 target-alpha/cpu.h  |4 +-
 target-arm/cpu.h|3 +
 target-cris/cpu.h   |3 +
 target-i386/cpu.h   |   11 +
 target-m68k/cpu.h   |3 +
 target-microblaze/cpu.h |3 +
 target-mips/mips-defs.h |4 +
 target-ppc/cpu.h|   17 ++
 target-s390x/cpu.h  |5 +
 target-sh4/cpu.h|3 +
 target-sparc/cpu.h  |8 +
 15 files changed, 465 insertions(+), 253 deletions(-)





[Qemu-devel] [PATCH 6/7] linux-user: Fix mmap_find_vma returning invalid addresses.

2010-02-15 Thread Richard Henderson
Don't return addresses that aren't properly aligned for the guest,
e.g. when the guest has a larger page size than the host.  Don't
return addresses that are outside the virtual address space for the
target, by paying proper attention to the h2g/g2h macros.

At the same time, place the default mapping base for 64-bit guests
(on 64-bit hosts) outside the low 4G.  Consistently interpret
mmap_next_start in the guest address space.
---
 linux-user/main.c |7 +---
 linux-user/mmap.c |  102 
 2 files changed, 80 insertions(+), 29 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index a0d8ce7..7db9fc3 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2725,12 +2725,9 @@ int main(int argc, char **argv, char **envp)
 /*
  * Read in mmap_min_addr kernel parameter.  This value is used
  * When loading the ELF image to determine whether guest_base
- * is needed.
- *
- * When user has explicitly set the quest base, we skip this
- * test.
+ * is needed.  It is also used in mmap_find_vma.
  */
-if (!have_guest_base) {
+{
 FILE *fp;
 
 if ((fp = fopen(/proc/sys/vm/mmap_min_addr, r)) != NULL) {
diff --git a/linux-user/mmap.c b/linux-user/mmap.c
index 65fdc33..ad00b6f 100644
--- a/linux-user/mmap.c
+++ b/linux-user/mmap.c
@@ -264,12 +264,15 @@ static int mmap_frag(abi_ulong real_start,
 return 0;
 }
 
-#if defined(__CYGWIN__)
+#if HOST_LONG_BITS == 64  TARGET_ABI_BITS == 64
+# define TASK_UNMAPPED_BASE  (1ul  38)
+#elif defined(__CYGWIN__)
 /* Cygwin doesn't have a whole lot of address space.  */
-static abi_ulong mmap_next_start = 0x1800;
+# define TASK_UNMAPPED_BASE  0x1800
 #else
-static abi_ulong mmap_next_start = 0x4000;
+# define TASK_UNMAPPED_BASE  0x4000
 #endif
+static abi_ulong mmap_next_start = TASK_UNMAPPED_BASE;
 
 unsigned long last_brk;
 
@@ -281,19 +284,24 @@ unsigned long last_brk;
  */
 abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
 {
-void *ptr;
+void *ptr, *prev;
 abi_ulong addr;
-
-size = HOST_PAGE_ALIGN(size);
-start = qemu_host_page_mask;
+int wrapped, repeat;
 
 /* If 'start' == 0, then a default start address is used. */
-if (start == 0)
+if (start == 0) {
 start = mmap_next_start;
+} else {
+start = qemu_host_page_mask;
+}
+
+size = HOST_PAGE_ALIGN(size);
 
 addr = start;
+wrapped = repeat = 0;
+prev = 0;
 
-for(;;) {
+for (;; prev = ptr) {
 /*
  * Reserve needed memory area to avoid a race.
  * It should be discarded using:
@@ -301,31 +309,77 @@ abi_ulong mmap_find_vma(abi_ulong start, abi_ulong size)
  *  - mremap() with MREMAP_FIXED flag
  *  - shmat() with SHM_REMAP flag
  */
-ptr = mmap((void *)(unsigned long)addr, size, PROT_NONE,
+ptr = mmap(g2h(addr), size, PROT_NONE,
MAP_ANONYMOUS|MAP_PRIVATE|MAP_NORESERVE, -1, 0);
 
 /* ENOMEM, if host address space has no memory */
-if (ptr == MAP_FAILED)
+if (ptr == MAP_FAILED) {
 return (abi_ulong)-1;
+}
 
-/* If address fits target address space we've found what we need */
-if ((unsigned long)ptr + size - 1 = (abi_ulong)-1)
-break;
+/* Count the number of sequential returns of the same address.
+   This is used to modify the search algorithm below.  */
+repeat = (ptr == prev ? repeat + 1 : 0);
 
-/* Unmap and try again with new page */
+if (h2g_valid(ptr + size - 1)) {
+addr = h2g(ptr);
+
+if ((addr  ~TARGET_PAGE_MASK) == 0) {
+/* Success.  */
+if (start == mmap_next_start  addr = TASK_UNMAPPED_BASE) {
+mmap_next_start = addr + size;
+}
+return addr;
+}
+
+/* The address is not properly aligned for the target.  */
+switch (repeat) {
+case 0:
+/* Assume the result that the kernel gave us is the
+   first with enough free space, so start again at the
+   next higher target page.  */
+addr = TARGET_PAGE_ALIGN(addr);
+break;
+case 1:
+/* Sometimes the kernel decides to perform the allocation
+   at the top end of memory instead.  */
+addr = TARGET_PAGE_MASK;
+break;
+case 2:
+/* Start over at low memory.  */
+addr = 0;
+break;
+default:
+/* Fail.  This unaligned block must the last.  */
+addr = -1;
+break;
+}
+} else {
+/* Since the result the kernel gave didn't fit, start
+   again at low memory.  If any repetition, fail.  */
+addr = 

[Qemu-devel] [PATCH 4/7] Implement multi-level page tables.

2010-02-15 Thread Richard Henderson
Use TARGET_VIRT_ADDR_SPACE_BITS for the virtual memory map based off
of l1_map.  This rewrites page_find_alloc, page_flush_tb, and
walk_memory_regions.

Use TARGET_PHYS_ADDR_SPACE_BITS for the physical memory map based off
of l1_phys_map.  This rewrites page_phys_find_alloc and
phys_page_for_each.
---
 cpu-all.h |7 +-
 exec.c|  442 +
 2 files changed, 271 insertions(+), 178 deletions(-)

diff --git a/cpu-all.h b/cpu-all.h
index b81641f..510f0b4 100644
--- a/cpu-all.h
+++ b/cpu-all.h
@@ -745,8 +745,11 @@ extern unsigned long qemu_host_page_mask;
 #define PAGE_RESERVED  0x0020
 
 void page_dump(FILE *f);
-int walk_memory_regions(void *,
-int (*fn)(void *, unsigned long, unsigned long, unsigned long));
+
+typedef int (*walk_memory_regions_fn)(void *, unsigned long,
+  unsigned long, unsigned long);
+int walk_memory_regions(void *, walk_memory_regions_fn);
+
 int page_get_flags(target_ulong address);
 void page_set_flags(target_ulong start, target_ulong end, int flags);
 int page_check_range(target_ulong start, target_ulong len, int flags);
diff --git a/exec.c b/exec.c
index 1521ce3..bb712ec 100644
--- a/exec.c
+++ b/exec.c
@@ -141,28 +141,47 @@ typedef struct PhysPageDesc {
 ram_addr_t region_offset;
 } PhysPageDesc;
 
+/* Size of the L2 (and L3, etc) page tables.  */
 #define L2_BITS 10
-#if defined(CONFIG_USER_ONLY)  defined(TARGET_VIRT_ADDR_SPACE_BITS)
-/* XXX: this is a temporary hack for alpha target.
- *  In the future, this is to be replaced by a multi-level table
- *  to actually be able to handle the complete 64 bits address space.
- */
-#define L1_BITS (TARGET_VIRT_ADDR_SPACE_BITS - L2_BITS - TARGET_PAGE_BITS)
+#define L2_SIZE (1  L2_BITS)
+
+/* The bits remaining after N lower levels of page tables.  */
+#define P_L1_BITS_REM \
+((TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS) % L2_BITS)
+#define V_L1_BITS_REM \
+((TARGET_VIRT_ADDR_SPACE_BITS - TARGET_PAGE_BITS) % L2_BITS)
+
+/* Size of the L1 page table.  Avoid silly small sizes.  */
+#if P_L1_BITS_REM  4
+#define P_L1_BITS  (P_L1_BITS_REM + L2_BITS)
 #else
-#define L1_BITS (32 - L2_BITS - TARGET_PAGE_BITS)
+#define P_L1_BITS  P_L1_BITS_REM
 #endif
 
-#define L1_SIZE (1  L1_BITS)
-#define L2_SIZE (1  L2_BITS)
+#if V_L1_BITS_REM  4
+#define V_L1_BITS  (V_L1_BITS_REM + L2_BITS)
+#else
+#define V_L1_BITS  V_L1_BITS_REM
+#endif
+
+#define P_L1_SIZE  ((target_phys_addr_t)1  P_L1_BITS)
+#define V_L1_SIZE  ((target_ulong)1  V_L1_BITS)
+
+#define P_L1_SHIFT (TARGET_PHYS_ADDR_SPACE_BITS - TARGET_PAGE_BITS - P_L1_BITS)
+#define V_L1_SHIFT (TARGET_VIRT_ADDR_SPACE_BITS - TARGET_PAGE_BITS - V_L1_BITS)
 
 unsigned long qemu_real_host_page_size;
 unsigned long qemu_host_page_bits;
 unsigned long qemu_host_page_size;
 unsigned long qemu_host_page_mask;
 
-/* XXX: for system emulation, it could just be an array */
-static PageDesc *l1_map[L1_SIZE];
-static PhysPageDesc **l1_phys_map;
+/* This is a multi-level map on the virtual address space.
+   The bottom level has pointers to PageDesc.  */
+static void *l1_map[V_L1_SIZE];
+
+/* This is a multi-level map on the physical address space.
+   The bottom level has pointers to PhysPageDesc.  */
+static void *l1_phys_map[P_L1_SIZE];
 
 #if !defined(CONFIG_USER_ONLY)
 static void io_mem_init(void);
@@ -247,130 +266,158 @@ static void page_init(void)
 while ((1  qemu_host_page_bits)  qemu_host_page_size)
 qemu_host_page_bits++;
 qemu_host_page_mask = ~(qemu_host_page_size - 1);
-l1_phys_map = qemu_vmalloc(L1_SIZE * sizeof(void *));
-memset(l1_phys_map, 0, L1_SIZE * sizeof(void *));
 
 #if !defined(_WIN32)  defined(CONFIG_USER_ONLY)
 {
-long long startaddr, endaddr;
 FILE *f;
-int n;
 
-mmap_lock();
 last_brk = (unsigned long)sbrk(0);
+
 f = fopen(/proc/self/maps, r);
 if (f) {
+mmap_lock();
+
 do {
-n = fscanf (f, %llx-%llx %*[^\n]\n, startaddr, endaddr);
-if (n == 2) {
-startaddr = MIN(startaddr,
-(1ULL  TARGET_PHYS_ADDR_SPACE_BITS) - 1);
-endaddr = MIN(endaddr,
-(1ULL  TARGET_PHYS_ADDR_SPACE_BITS) - 1);
-page_set_flags(startaddr  TARGET_PAGE_MASK,
-   TARGET_PAGE_ALIGN(endaddr),
-   PAGE_RESERVED); 
+unsigned long startaddr, endaddr;
+int n;
+
+n = fscanf (f, %lx-%lx %*[^\n]\n, startaddr, endaddr);
+
+if (n == 2  h2g_valid(startaddr)) {
+startaddr = h2g(startaddr)  TARGET_PAGE_MASK;
+
+if (h2g_valid(endaddr)) {
+endaddr = h2g(endaddr);
+} else {
+endaddr = ~0ul;
+}
+

Re: [Qemu-devel] Re: [PATCH 2/2] tcg-sparc: Implement setcond, setcond2.

2010-02-15 Thread Richard Henderson
On 02/13/2010 02:01 PM, Blue Swirl wrote:
 On Tue, Feb 9, 2010 at 11:37 PM, Richard Henderson r...@twiddle.net wrote:
 ---
  tcg/sparc/tcg-target.c |  126 
 
  1 files changed, 126 insertions(+), 0 deletions(-)
 
 Something's wrong with the patch...

Oops.  The tree wasn't properly committed when I
extracted the patch.  Here's that last again.


r~
---
commit 8f76ac8882ff2b0d9db402352b9cf632cc92f84f
Author: Richard Henderson r...@twiddle.net
Date:   Mon Feb 15 13:19:49 2010 -0800

tcg-sparc: Implement setcond, setcond2.

diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c
index dd7a598..5b4347a 100644
--- a/tcg/sparc/tcg-target.c
+++ b/tcg/sparc/tcg-target.c
@@ -194,6 +194,7 @@ static inline int tcg_target_const_match(tcg_target_long 
val,
 #define INSN_RS2(x) (x)
 #define INSN_ASI(x) ((x)  5)
 
+#define INSN_IMM11(x) ((1  13) | ((x)  0x7ff))
 #define INSN_IMM13(x) ((1  13) | ((x)  0x1fff))
 #define INSN_OFF19(x) (((x)  2)  0x07)
 #define INSN_OFF22(x) (((x)  2)  0x3f)
@@ -217,6 +218,9 @@ static inline int tcg_target_const_match(tcg_target_long 
val,
 #define COND_VC0xf
 #define BA (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x2))
 
+#define MOVCC_ICC  (1  18)
+#define MOVCC_XCC  (1  18 | 1  12)
+
 #define ARITH_ADD  (INSN_OP(2) | INSN_OP3(0x00))
 #define ARITH_ADDCC (INSN_OP(2) | INSN_OP3(0x10))
 #define ARITH_AND  (INSN_OP(2) | INSN_OP3(0x01))
@@ -233,6 +237,7 @@ static inline int tcg_target_const_match(tcg_target_long 
val,
 #define ARITH_MULX (INSN_OP(2) | INSN_OP3(0x09))
 #define ARITH_UDIVX (INSN_OP(2) | INSN_OP3(0x0d))
 #define ARITH_SDIVX (INSN_OP(2) | INSN_OP3(0x2d))
+#define ARITH_MOVCC (INSN_OP(2) | INSN_OP3(0x2c))
 
 #define SHIFT_SLL  (INSN_OP(2) | INSN_OP3(0x25))
 #define SHIFT_SRL  (INSN_OP(2) | INSN_OP3(0x26))
@@ -580,6 +585,109 @@ static void tcg_out_brcond2_i32(TCGContext *s, int cond,
 }
 #endif
 
+static void tcg_out_setcond_i32(TCGContext *s, int cond, TCGArg ret,
+TCGArg c1, TCGArg c2, int c2const)
+{
+TCGArg t;
+
+/* For 32-bit comparisons, we can play games with ADDX/SUBX.  */
+switch (cond) {
+case TCG_COND_EQ:
+case TCG_COND_NE:
+if (c2 != 0) {
+tcg_out_arithc(s, ret, c1, c2, c2const, ARITH_XOR);
+}
+c1 = TCG_REG_G0, c2 = ret, c2const = 0;
+cond = (cond == TCG_COND_EQ ? TCG_COND_LEU : TCG_COND_LTU);
+   break;
+
+case TCG_COND_GTU:
+case TCG_COND_GEU:
+if (c2const  c2 != 0) {
+tcg_out_movi_imm13(s, TCG_REG_I5, c2);
+c2 = TCG_REG_I5;
+}
+t = c1, c1 = c2, c2 = t, c2const = 0;
+cond = tcg_swap_cond(cond);
+break;
+
+case TCG_COND_LTU:
+case TCG_COND_LEU:
+break;
+
+default:
+tcg_out_cmp(s, c1, c2, c2const);
+#if defined(__sparc_v9__) || defined(__sparc_v8plus__)
+tcg_out_movi_imm13(s, ret, 0);
+tcg_out32 (s, ARITH_MOVCC | INSN_RD(ret)
+   | INSN_RS1(tcg_cond_to_bcond[cond])
+   | MOVCC_ICC | INSN_IMM11(1));
+#else
+t = gen_new_label();
+tcg_out_branch_i32(s, INSN_COND(tcg_cond_to_bcond[cond], 1), t);
+tcg_out_movi_imm13(s, ret, 1);
+tcg_out_movi_imm13(s, ret, 0);
+tcg_out_label(s, t, (tcg_target_long)s-code_ptr);
+#endif
+return;
+}
+
+tcg_out_cmp(s, c1, c2, c2const);
+if (cond == TCG_COND_LTU) {
+tcg_out_arithi(s, ret, TCG_REG_G0, 0, ARITH_ADDX);
+} else {
+tcg_out_arithi(s, ret, TCG_REG_G0, -1, ARITH_SUBX);
+}
+}
+
+#if TCG_TARGET_REG_BITS == 64
+static void tcg_out_setcond_i64(TCGContext *s, int cond, TCGArg ret,
+TCGArg c1, TCGArg c2, int c2const)
+{
+tcg_out_cmp(s, c1, c2, c2const);
+tcg_out_movi_imm13(s, ret, 0);
+tcg_out32 (s, ARITH_MOVCC | INSN_RD(ret)
+   | INSN_RS1(tcg_cond_to_bcond[cond])
+   | MOVCC_XCC | INSN_IMM11(1));
+}
+#else
+static void tcg_out_setcond2_i32(TCGContext *s, int cond, TCGArg ret,
+ TCGArg al, TCGArg ah,
+ TCGArg bl, int blconst,
+ TCGArg bh, int bhconst)
+{
+int lab;
+
+switch (cond) {
+case TCG_COND_EQ:
+tcg_out_setcond_i32(s, TCG_COND_EQ, TCG_REG_I5, al, bl, blconst);
+tcg_out_setcond_i32(s, TCG_COND_EQ, ret, ah, bh, bhconst);
+tcg_out_arith(s, ret, ret, TCG_REG_I5, ARITH_AND);
+break;
+
+case TCG_COND_NE:
+tcg_out_setcond_i32(s, TCG_COND_NE, TCG_REG_I5, al, al, blconst);
+tcg_out_setcond_i32(s, TCG_COND_NE, ret, ah, bh, bhconst);
+tcg_out_arith(s, ret, ret, TCG_REG_I5, ARITH_OR);
+break;
+
+default:
+lab = gen_new_label();
+
+tcg_out_cmp(s, ah, bh, bhconst);
+tcg_out_branch_i32(s, INSN_COND(tcg_cond_to_bcond[cond], 1), lab);
+tcg_out_movi_imm13(s, ret, 1);
+

Re: [Qemu-devel] Heads up: glibc preadv emulation breaks qemu on older kernels

2010-02-15 Thread Christoph Hellwig
On Fri, Feb 12, 2010 at 11:28:57AM -0600, Anthony Liguori wrote:
 Yes.  The BSDs tend to not play stupid emulation games in the libc, so
 changes of these kinds of messups to happen are far less.

 
 In all fairness, I seem to recall there being a problem with the kernel 
 implementation of preadv/pwritev too.

I would be surprised by that as it's just an entirely trivial entry
point to long existant infrastructure.  I even checked the commit logs
and that code hasn't been touched at all since the initial commit.
So if people have problems with it, please report it..





Re: [Qemu-devel] Heads up: glibc preadv emulation breaks qemu on older kernels

2010-02-15 Thread Christoph Hellwig
On Fri, Feb 12, 2010 at 06:47:55PM +, Richard W.M. Jones wrote:
 I don't think there's a good solution for this at the qemu level.
 However I have a working patch for glibc:
 
 https://bugzilla.redhat.com/show_bug.cgi?id=563103#c13

Any chance you could just persuade Uli to drop the optimization?  I
think users are generally better off to just do it themselves, as they
have to do that anyway for portable applications.





Re: [Qemu-devel] Heads up: glibc preadv emulation breaks qemu on older kernels

2010-02-15 Thread Richard W.M. Jones
On Mon, Feb 15, 2010 at 11:17:27PM +0100, Christoph Hellwig wrote:
 On Fri, Feb 12, 2010 at 06:47:55PM +, Richard W.M. Jones wrote:
  I don't think there's a good solution for this at the qemu level.
  However I have a working patch for glibc:
  
  https://bugzilla.redhat.com/show_bug.cgi?id=563103#c13
 
 Any chance you could just persuade Uli to drop the optimization?  I
 think users are generally better off to just do it themselves, as they
 have to do that anyway for portable applications.

I agree.  It seems like the sort of thing which is better in gnulib
than in glibc.  That way programs can choose to use gnulib for
portability, versus having to use it with glibc.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://et.redhat.com/~rjones/libguestfs/
See what it can do: http://et.redhat.com/~rjones/libguestfs/recipes.html




Re: [Qemu-devel] [PATCH 1/3] Add FreeBSD/ppc host ppc_init_cacheline_sizes() implementation.

2010-02-15 Thread Juergen Lock
On Mon, Feb 15, 2010 at 06:15:41AM +0300, malc wrote:
 On Sun, 14 Feb 2010, Juergen Lock wrote:
 
  Submitted by: Andreas Tobler andre...@fgznet.ch
  
  Signed-off-by: Juergen Lock n...@jelal.kn-bremen.de
  
  --- a/cache-utils.c
  +++ b/cache-utils.c
  @@ -57,6 +57,23 @@
   }
   #endif
   
  +#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
  +#include stdio.h
  +#include sys/types.h
  +#include sys/sysctl.h
  +
  +static void ppc_init_cacheline_sizes(void)
  +{
  +size_t len = 4;
  +unsigned cacheline;
  +
  +sysctlbyname (machdep.cacheline_size, cacheline, len, NULL, 0);
 
 Error handling missing.

Mmh I suspect thats a sysctl that simply can't fail but of course checking
still doesn't hurt:

Submitted by: Andreas Tobler andre...@fgznet.ch

Signed-off-by: Juergen Lock n...@jelal.kn-bremen.de

--- a/cache-utils.c
+++ b/cache-utils.c
@@ -57,6 +57,27 @@ static void ppc_init_cacheline_sizes(voi
 }
 #endif
 
+#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
+#include stdio.h
+#include sys/types.h
+#include sys/sysctl.h
+
+static void ppc_init_cacheline_sizes(void)
+{
+size_t len = 4;
+unsigned cacheline;
+
+if (sysctlbyname (machdep.cacheline_size, cacheline, len, NULL, 0)) {
+fprintf(stderr, sysctlbyname machdep.cacheline_size failed: %s\n,
+strerror(errno));
+exit(1);
+}
+
+qemu_cache_conf.dcache_bsize = cacheline;
+qemu_cache_conf.icache_bsize = cacheline;
+}
+#endif
+
 #ifdef __linux__
 void qemu_cache_utils_init(char **envp)
 {




Re: [Qemu-devel] [PATCH 1/3] Add FreeBSD/ppc host ppc_init_cacheline_sizes() implementation.

2010-02-15 Thread Juergen Lock
On Mon, Feb 15, 2010 at 06:16:07AM +0300, malc wrote:
 On Sun, 14 Feb 2010, Juergen Lock wrote:
 
  Submitted by: Andreas Tobler andre...@fgznet.ch
  
  Signed-off-by: Juergen Lock n...@jelal.kn-bremen.de
  
  --- a/cache-utils.c
  +++ b/cache-utils.c
  @@ -57,6 +57,23 @@
   }
   #endif
   
  +#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
 
 __FreeBSD_kernel__ is for something like Debian/kFreeBSD?

Yep, it is.  Support for it was added to qemu some time ago so I
figured I should do the same.

 Cheers,
Juergen




[Qemu-devel] Re: [PATCH 1/2] macvtap: rework object lifetime rules

2010-02-15 Thread Ed Swierk
On Sat, Feb 13, 2010 at 2:33 AM, Arnd Bergmann a...@arndb.de wrote:
 The original macvtap code has a number of problems
 resulting from the use of RCU for protecting the
 access to struct macvtap_queue from open files.

 This includes
 - need for GFP_ATOMIC allocations for skbs
 - potential deadlocks when copy_*_user sleeps
 - inability to work with vhost-net

 Changing the lifetime of macvtap_queue to always
 depend on the open file solves all these. The
 RCU reference simply moves one step down to
 the reference on the macvlan_dev, which we
 only need for nonblocking operations.

 Signed-off-by: Arnd Bergmann a...@arndb.de

This works for me.

Acked-by: Ed Swierk eswi...@aristanetworks.com




[Qemu-devel] Re: [PATCH] Seabios e820 reservation portion

2010-02-15 Thread Kevin O'Connor
On Mon, Feb 15, 2010 at 06:33:59PM +0100, Jes Sorensen wrote:
 Hi,
 
 This is the Seabios part to match my e820 reservation via fw_cfg patch.

This still has 'struct e820_entry' which is too similar to 'struct
e820entry' in memmap.h.  Otherwise, it looks good to me.

-Kevin




[Qemu-devel] Re: qemu-ppc can't run static uClibc binaries.

2010-02-15 Thread Rob Landley
On Monday 15 February 2010 07:08:33 Michael S. Tsirkin wrote:
 On Mon, Feb 15, 2010 at 06:58:33AM -0600, Rob Landley wrote:
  On Monday 15 February 2010 05:19:24 Alexander Graf wrote:
   On 15.02.2010, at 12:10, Rob Landley wrote:
On Sunday 14 February 2010 08:41:00 Alexander Graf wrote:
So the only case I can imagine that this breaks anything is that
uClibc requires register state to be 0.
   
Yes, r3 (which is the exit code from the exec syscall, and thus 0
if it worked).  In the BSD layout, it's argc (which can never be 0).
   
 http://lists.gnu.org/archive/html/qemu-devel/2007-03/msg00720.html
  
   So what you really want is something like
  
   #ifdef CONFIG_LINUX_USER
   /* exec return value is always 0 */
   env-gpr[3] = 0;
   #endif
  
   just after the #endif in your patch. If you had inlined your patch I
   could've commented it there.
 
  Unfortunately kmail plays fast and loose with whitespace when I inline
  stuff. (Not always, but I can't tell by inspection when it's decided it
  was hungry for tabs or wanted to throw in that horrible UTF8 escaped
  whitespace.)

 See Documentation/email-clients.txt under linux source tree.

I did.  That doesn't cover the different bugs in different versions, what 
happens when you use kmail under xfce, and so on.  (It also doesn't mention 
that you have to disable wordwrap for the entire message and hit return by 
hand at the end of each line to get kmail not to wrap inline includes.  Or 
that some versions of kmail embed NUL bytes into inline includes, for some 
reason.)

I could make it work, just didn't know this list had a tropism for inline.  
(Varies per list and I wander through a bunch of 'em.)  Over on the -hda sets 
/dev/hdc topic I posted a patch inline which was ignored because the behavior 
the Linux kernel has consistently had for the past decade or more isn't 
considered especially important.  Thus I didn't think you were likely 
following lkml tropes.

*shrug*  Now I know...

Rob
-- 
Latency is more important than throughput. It's that simple. - Linus Torvalds




Re: [Qemu-devel] [PATCH 1/3] Add FreeBSD/ppc host ppc_init_cacheline_sizes() implementation.

2010-02-15 Thread malc
On Mon, 15 Feb 2010, Juergen Lock wrote:

 On Mon, Feb 15, 2010 at 06:16:07AM +0300, malc wrote:
  On Sun, 14 Feb 2010, Juergen Lock wrote:
  
   Submitted by: Andreas Tobler andre...@fgznet.ch
   
   Signed-off-by: Juergen Lock n...@jelal.kn-bremen.de
   
   --- a/cache-utils.c
   +++ b/cache-utils.c
   @@ -57,6 +57,23 @@
}
#endif

   +#if defined(__FreeBSD__) || defined(__FreeBSD_kernel__)
  
  __FreeBSD_kernel__ is for something like Debian/kFreeBSD?
 
 Yep, it is.  Support for it was added to qemu some time ago so I
 figured I should do the same.
 

Perhaps it would be better to avoid this particular ifdefery and
just go with _CALL_SYSV, can you verify that:

~$ gcc -E -dM -x c /dev/null | grep SYSV

yields

#define _CALL_SYSV 1

on FreeBSD and Debian/kFreeBSD?

-- 
mailto:av1...@comtv.ru