[Qemu-devel] [PATCH 1/3] pc: add compat machine types for 0.14 and 0.15

2011-06-14 Thread Alexander Graf
We change virtual hardware from time to time, so in order to ensure that
the guest doesn't get confused, we have compatibility machines for the PC
machine.

This patch adds compatibility definitions for 0.14 and 0.15 versions.

Signed-off-by: Alexander Graf 
---
 hw/pc_piix.c |   32 +++-
 1 files changed, 31 insertions(+), 1 deletions(-)

diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 9a22a8a..3b3ef84 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -213,6 +213,18 @@ static void pc_init_pci(ram_addr_t ram_size,
  initrd_filename, cpu_model, 1, 1);
 }
 
+static void pc_init_pci_014(ram_addr_t ram_size,
+const char *boot_device,
+const char *kernel_filename,
+const char *kernel_cmdline,
+const char *initrd_filename,
+const char *cpu_model)
+{
+pc_init1(ram_size, boot_device,
+ kernel_filename, kernel_cmdline,
+ initrd_filename, cpu_model, 1, 1);
+}
+
 static void pc_init_pci_no_kvmclock(ram_addr_t ram_size,
 const char *boot_device,
 const char *kernel_filename,
@@ -258,7 +270,7 @@ static void pc_xen_hvm_init(ram_addr_t ram_size,
 #endif
 
 static QEMUMachine pc_machine = {
-.name = "pc-0.14",
+.name = "pc-0.16",
 .alias = "pc",
 .desc = "Standard PC",
 .init = pc_init_pci,
@@ -266,6 +278,22 @@ static QEMUMachine pc_machine = {
 .is_default = 1,
 };
 
+static QEMUMachine pc_machine_v0_15 = {
+.name = "pc-0.15",
+.desc = "Standard PC",
+.init = pc_init_pci_014,
+.max_cpus = 255,
+.is_default = 1,
+};
+
+static QEMUMachine pc_machine_v0_14 = {
+.name = "pc-0.14",
+.desc = "Standard PC",
+.init = pc_init_pci_014,
+.max_cpus = 255,
+.is_default = 1,
+};
+
 static QEMUMachine pc_machine_v0_13 = {
 .name = "pc-0.13",
 .desc = "Standard PC",
@@ -434,6 +462,8 @@ static QEMUMachine xenfv_machine = {
 static void pc_machine_init(void)
 {
 qemu_register_machine(&pc_machine);
+qemu_register_machine(&pc_machine_v0_15);
+qemu_register_machine(&pc_machine_v0_14);
 qemu_register_machine(&pc_machine_v0_13);
 qemu_register_machine(&pc_machine_v0_12);
 qemu_register_machine(&pc_machine_v0_11);
-- 
1.7.3.4




Re: [Qemu-devel] [PATCH] xen: avoid tracking the region 0xa0000 - 0xbffff

2011-06-14 Thread Alexander Graf

On 06/14/2011 05:24 PM, Stefano Stabellini wrote:

On Tue, 14 Jun 2011, Alexander Graf wrote:

On 14.06.2011, at 13:48, Stefano Stabellini wrote:


On Tue, 14 Jun 2011, Alexander Graf wrote:

On 03.06.2011, at 17:56,  
  wrote:


From: Stefano Stabellini

Xen can only do dirty bit tracking for one memory region, so we should
explicitly avoid trying to track the legacy VGA region between 0xa
and 0xb, rather than trying and failing.

Signed-off-by: Stefano Stabellini
---
xen-all.c |4 
1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/xen-all.c b/xen-all.c
index 9a5c3ec..1fdc2e8 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -218,6 +218,10 @@ static int xen_add_to_physmap(XenIOState *state,
if (get_physmapping(state, start_addr, size)) {
return 0;
}
+/* do not try to map legacy VGA memory */
+if (start_addr>= 0xa&&  start_addr + size<= 0xb) {

I don't quite like the hardcoded range here. What exactly is the issue? The fact 
that you can only map a single region? Then do a counter and fail when it's>  1.

That is what we were doing before: succeeding the first time and
failing from the second time on.
By "coincidence" the second time was the range 0xa-0xb so
everything worked as expected, but it wasn't obvious why.
I am just trying to make sure that one year from now it will be clear
just looking at the code why it works.



If you don't want to map the VGA region as memory slot, why not change the 
actual mapping code in the cirrus adapter?

Because I didn't want to introduce any ugly if (xen_enable()) in generic
code, if it is that simple to catch the issue from xen specific code.

Well sure, but 2 years from now yet another region will be introduced that 
might even be registered before the FB and everyone's puzzled again :). How 
about you print a warning when anyone tries to map anything after the first 
map? Or - as Jan suggests - implement multiple regions.

If you prefer, you could even check for the VGA range as "known broken" and 
only print warnings on others.

I can do that (actually we already do it) but it wouldn't change the
fact that if somebody modifies hw/cirrus_vga.c:map_linear_vram to call
cpu_register_physical_memory_log for the legacy range first, it would
break xen, that is the problem I was trying to solve.

In order to make the code more reliable, and also catch the scenario
where another region is registered before the framebuffer, we could do
something like this, but it is not very pretty:


I agree, but it's probably the most pretty one I've seen so far. I'll 
let others comment on it too before taking it in.



Alex




diff --git a/xen-all.c b/xen-all.c
index 9a5c3ec..de1e724 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -214,6 +214,7 @@ static int xen_add_to_physmap(XenIOState *state,
  unsigned long i = 0;
  int rc = 0;
  XenPhysmap *physmap = NULL;
+RAMBlock *block;

  if (get_physmapping(state, start_addr, size)) {
  return 0;
@@ -221,7 +222,16 @@ static int xen_add_to_physmap(XenIOState *state,
  if (size<= 0) {
  return -1;
  }
+/* only add the vga vram to physmap */
+QLIST_FOREACH(block,&ram_list.blocks, next) {
+if (!strcmp(block->idstr, "vga.vram")&&  block->offset == phys_offset
+&&  start_addr>  0xb) {
+goto go_physmap;
+}
+}
+return -1;

+go_physmap:
  DPRINTF("mapping vram to %llx - %llx, from %llx\n", start_addr, 
start_addr + size, phys_offset);
  for (i = 0; i<  size>>  TARGET_PAGE_BITS; i++) {
  unsigned long idx = (phys_offset>>  TARGET_PAGE_BITS) + i;





Re: [Qemu-devel] [PATCH] xen: avoid tracking the region 0xa0000 - 0xbffff

2011-06-14 Thread Stefano Stabellini
On Tue, 14 Jun 2011, Alexander Graf wrote:
> On 14.06.2011, at 13:48, Stefano Stabellini wrote:
> 
> > On Tue, 14 Jun 2011, Alexander Graf wrote:
> >> On 03.06.2011, at 17:56,  
> >>  wrote:
> >> 
> >>> From: Stefano Stabellini 
> >>> 
> >>> Xen can only do dirty bit tracking for one memory region, so we should
> >>> explicitly avoid trying to track the legacy VGA region between 0xa
> >>> and 0xb, rather than trying and failing.
> >>> 
> >>> Signed-off-by: Stefano Stabellini 
> >>> ---
> >>> xen-all.c |4 
> >>> 1 files changed, 4 insertions(+), 0 deletions(-)
> >>> 
> >>> diff --git a/xen-all.c b/xen-all.c
> >>> index 9a5c3ec..1fdc2e8 100644
> >>> --- a/xen-all.c
> >>> +++ b/xen-all.c
> >>> @@ -218,6 +218,10 @@ static int xen_add_to_physmap(XenIOState *state,
> >>>if (get_physmapping(state, start_addr, size)) {
> >>>return 0;
> >>>}
> >>> +/* do not try to map legacy VGA memory */
> >>> +if (start_addr >= 0xa && start_addr + size <= 0xb) {
> >> 
> >> I don't quite like the hardcoded range here. What exactly is the issue? 
> >> The fact that you can only map a single region? Then do a counter and fail 
> >> when it's > 1. 
> > 
> > That is what we were doing before: succeeding the first time and
> > failing from the second time on.
> > By "coincidence" the second time was the range 0xa-0xb so
> > everything worked as expected, but it wasn't obvious why.
> > I am just trying to make sure that one year from now it will be clear
> > just looking at the code why it works.
> > 
> > 
> >> If you don't want to map the VGA region as memory slot, why not change the 
> >> actual mapping code in the cirrus adapter?
> > 
> > Because I didn't want to introduce any ugly if (xen_enable()) in generic
> > code, if it is that simple to catch the issue from xen specific code.
> 
> Well sure, but 2 years from now yet another region will be introduced that 
> might even be registered before the FB and everyone's puzzled again :). How 
> about you print a warning when anyone tries to map anything after the first 
> map? Or - as Jan suggests - implement multiple regions.
> 
> If you prefer, you could even check for the VGA range as "known broken" and 
> only print warnings on others.

I can do that (actually we already do it) but it wouldn't change the
fact that if somebody modifies hw/cirrus_vga.c:map_linear_vram to call
cpu_register_physical_memory_log for the legacy range first, it would
break xen, that is the problem I was trying to solve.

In order to make the code more reliable, and also catch the scenario
where another region is registered before the framebuffer, we could do
something like this, but it is not very pretty:



diff --git a/xen-all.c b/xen-all.c
index 9a5c3ec..de1e724 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -214,6 +214,7 @@ static int xen_add_to_physmap(XenIOState *state,
 unsigned long i = 0;
 int rc = 0;
 XenPhysmap *physmap = NULL;
+RAMBlock *block;
 
 if (get_physmapping(state, start_addr, size)) {
 return 0;
@@ -221,7 +222,16 @@ static int xen_add_to_physmap(XenIOState *state,
 if (size <= 0) {
 return -1;
 }
+/* only add the vga vram to physmap */
+QLIST_FOREACH(block, &ram_list.blocks, next) {
+if (!strcmp(block->idstr, "vga.vram") && block->offset == phys_offset
+&& start_addr > 0xb) {
+goto go_physmap;
+}
+}
+return -1;
 
+go_physmap:
 DPRINTF("mapping vram to %llx - %llx, from %llx\n", start_addr, start_addr 
+ size, phys_offset);
 for (i = 0; i < size >> TARGET_PAGE_BITS; i++) {
 unsigned long idx = (phys_offset >> TARGET_PAGE_BITS) + i;



Re: [Qemu-devel] [PULL] libcacard libtoolized + usb-ccid fix

2011-06-14 Thread Alon Levy
On Thu, Jun 09, 2011 at 11:16:11AM +0300, Alon Levy wrote:
> Ping?
> 

Reping.

> On Tue, May 31, 2011 at 07:42:20PM +0300, Alon Levy wrote:
> > Hi,
> > 
> > This pull request includes the libcacard.la library target and a memory leak
> > fix by Markus. Please pull.
> > 
> > The following changes since commit b1d7d2b93a1d6b2d2848b616cc35acdf521c923c:
> > 
> >   Merge remote-tracking branch 'stefanha/trivial-patches' into staging 
> > (2011-05-31 08:23:11 -0500)
> > 
> > are available in the git repository at:
> > 
> >   git://anongit.freedesktop.org/~alon/qemu pull-libcacard-1
> > 
> > Alon Levy (2):
> >   configure: add libdir and --libdir
> >   libcacard: add libcacard.la target
> > 
> > Markus Armbruster (1):
> >   usb-ccid: Plug memory leak on qdev exit()
> > 
> >  Makefile   |   20 +++-
> >  Makefile.objs  |8 
> >  configure  |   17 -
> >  hw/usb-ccid.c  |   28 
> >  libcacard/Makefile |   32 
> >  rules.mak  |8 
> >  6 files changed, 87 insertions(+), 26 deletions(-)
> > 
> 



[Qemu-devel] [Bug 739785] Re: qemu-i386 user mode on ARMv5 host fails (bash: fork: Invalid argument)

2011-06-14 Thread Peter Maydell
I think you should be able to work around this with:
  echo 2 >/proc/cpu/alignment

which makes unaligned accesses fault and the kernel fix them up. This
will be slower than if qemu wasn't generating unaligned accesses in the
first place but it should at least make things run. (echo 0 ... will
give you the original behaviour back.)

(I have an armv5 system running now.)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/739785

Title:
  qemu-i386 user mode on ARMv5 host fails (bash: fork: Invalid argument)

Status in QEMU:
  New

Bug description:
  Good time of day everybody,

  I have been trying to make usermode qemu on ARM with plugapps
  (archlinux) with archlinux i386 chroot to work.

  1. I installed arch linux in a virtuabox and created a chroot for it with 
mkarchroot. Transferred it to my pogo plug into /i386/
  2. I comiled qemu-i386 static and put it into /i386/usr/bin/
  ./configure --static --disable-blobs --disable-system 
--target-list=i386-linux-user
  make

  3. I also compiled linux kernel 2.6.38 with CONFIG_BINFMT_MISC=y and 
installed it.
  uname -a
  Linux Plugbox 2.6.38 #4 PREEMPT Fri Mar 18 22:19:10 CDT 2011 armv5tel 
Feroceon 88FR131 rev 1 (v5l) Marvell SheevaPlug Reference Board GNU/Linux

  4. Added the following options into /etc/rc.local
  /sbin/modprobe binfmt_misc
  /bin/mount binfmt_misc -t binfmt_misc /proc/sys/fs/binfmt_misc
  echo 
':qemu-i386:M::\x7fELF\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x00\x03\x00:\xff\xff\xff\xff\xff\xfe\xfe\xff\xff\xff\xff\xff\xff\xff\xff\xff\xfb\xff\xff\xff:/usr/bin/qemu-i386:'
 >/proc/sys/fs/binfmt_misc/register

  5. Also copied ld-linux.so.3 (actually ld-2.13.so because ld-
  linux.so.3 is a link to that file) from /lib/ to /i386/lib/

  6.Now i chroot into /i386 and I get this:
  [root@Plugbox i386]# chroot .
  [II aI hnve ao n@P /]# pacman -Suy
  bash: fork: Invalid argument

  7.I also downloaded linux-user-test-0.3 from qemu website and ran the test:
  [root@Plugbox linux-user-test-0.3]# make
  ./qemu-linux-user.sh
  [qemu-i386]
  ../qemu-0.14.0/i386-linux-user/qemu-i386 -L ./gnemul/qemu-i386 i386/ls -l 
dummyfile
  BUG IN DYNAMIC LINKER ld.so: dl-version.c: 210: _dl_check_map_versions: 
Assertion `needed != ((void *)0)' failed!
  make: *** [test] Error 127

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/739785/+subscriptions



Re: [Qemu-devel] [PATCH] Fix signal handling when io-thread is disabled

2011-06-14 Thread Alexandre Raymond
Hi Jan,

Thanks for reviewing my patch.

> But please pull the now common pthread_sigmask out of the #ifdef.
Please see v2.

Alexandre



[Qemu-devel] [PATCH v2] Fix signal handling when io-thread is disabled

2011-06-14 Thread Alexandre Raymond
Changes since v1:
- take pthread_sigmask() out of the ifdef as it is now common
to both parts.

This fix effectively blocks, in the main thread, the signals handled
by signalfd or the compatibility signal thread.

This way, such signals are received synchronously in the main thread
through sigfd_handler() instead of triggering the signal handler
directly, asynchronously.

Signed-off-by: Alexandre Raymond 
---
 cpus.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/cpus.c b/cpus.c
index 4ab76f0..18a1522 100644
--- a/cpus.c
+++ b/cpus.c
@@ -399,7 +399,6 @@ static int qemu_signal_init(void)
 sigaddset(&set, SIGALRM);
 sigaddset(&set, SIG_IPI);
 sigaddset(&set, SIGBUS);
-pthread_sigmask(SIG_BLOCK, &set, NULL);
 #else
 sigemptyset(&set);
 sigaddset(&set, SIGBUS);
@@ -412,6 +411,7 @@ static int qemu_signal_init(void)
 sigaddset(&set, SIGALRM);
 }
 #endif
+pthread_sigmask(SIG_BLOCK, &set, NULL);
 
 sigfd = qemu_signalfd(&set);
 if (sigfd == -1) {
-- 
1.7.5




Re: [Qemu-devel] [PATCH] cirrus_vga: reset lfb_addr after a pci config write if the BAR is unmapped

2011-06-14 Thread Alexander Graf

On 06/03/2011 05:56 PM, stefano.stabell...@eu.citrix.com wrote:

From: Stefano Stabellini

If the cirrus_vga PCI BAR is unmapped than we should not only reset
map_addr but also lfb_addr, otherwise we'll keep trying to map
the old lfb_addr in map_linear_vram.


The patch looks good to me, but I'd love to get an ack from someone who 
knows the cirrus code before committing it.



Alex


Signed-off-by: Stefano Stabellini
---
  hw/cirrus_vga.c |5 -
  1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/hw/cirrus_vga.c b/hw/cirrus_vga.c
index 722cac7..3c5043e 100644
--- a/hw/cirrus_vga.c
+++ b/hw/cirrus_vga.c
@@ -3088,8 +3088,11 @@ static void pci_cirrus_write_config(PCIDevice *d,
  CirrusVGAState *s =&pvs->cirrus_vga;

  pci_default_write_config(d, address, val, len);
-if (s->vga.map_addr&&  d->io_regions[0].addr == PCI_BAR_UNMAPPED)
+if (s->vga.map_addr&&  d->io_regions[0].addr == PCI_BAR_UNMAPPED) {
  s->vga.map_addr = 0;
+s->vga.lfb_addr = 0;
+s->vga.lfb_end = 0;
+}
  cirrus_update_memory_access(s);
  }






[Qemu-devel] [PATCH v2] Add support for fd: protocol

2011-06-14 Thread Corey Bryant
sVirt provides SELinux MAC isolation for Qemu guest processes and their
corresponding resources (image files). sVirt provides this support
by labeling guests and resources with security labels that are stored
in file system extended attributes. Some file systems, such as NFS, do
not support the extended attribute security namespace, which is needed
for image file isolation when using the sVirt SELinux security driver
in libvirt.

The proposed solution entails a combination of Qemu, libvirt, and
SELinux patches that work together to isolate multiple guests' images
when they're stored in the same NFS mount. This results in an
environment where sVirt isolation and NFS image file isolation can both
be provided.

This patch contains the Qemu code to support this solution. I would
like to solicit input from the libvirt community prior to starting
the libvirt patch.

Currently, Qemu opens an image file in addition to performing the
necessary read and write operations. The proposed solution will move
the open out of Qemu and into libvirt. Once libvirt opens an image
file for the guest, it will pass the file descriptor to Qemu via a
new fd: protocol.

If the image file resides in an NFS mount, the following SELinux policy
changes will provide image isolation:

  - A new SELinux boolean is created (e.g. virt_read_write_nfs) to
allow Qemu (svirt_t) to only have SELinux read and write
permissions on nfs_t files

  - Qemu (svirt_t) also gets SELinux use permissions on libvirt
(virtd_t) file descriptors

Following is a sample invocation of Qemu using the fd: protocol on
the command line:

qemu -drive file=fd:4,format=qcow2

The fd: protocol is also supported by the drive_add monitor command.
This requires that the specified file descriptor is passed to the
monitor alongside a prior getfd monitor command.

There are some additional features provided by certain image types
where Qemu reopens the image file. All of these scenarios will be
unsupported for the fd: protocol, at least for this patch:

  - The -snapshot command line option
  - The savevm monitor command
  - The snapshot_blkdev monitor command
  - Starting Qemu with a backing file

The thought is that this support can be added in the future, but is
not required for the initial fd: support.

This patch was tested with the following formats: raw, cow, qcow,
qcow2, qed, and vmdk, using the fd: protocol from the command line
and the monitor. Tests were also run to verify existing file name
support and qemu-img were not regressed. Non-valid file descriptors,
fd: without format, snapshot and backing files were also tested.

Signed-off-by: Corey Bryant 
---
 block.c   |   16 ++
 block.h   |1 +
 block/cow.c   |5 +++
 block/qcow.c  |5 +++
 block/qcow2.c |5 +++
 block/qed.c   |4 ++
 block/raw-posix.c |   81 +++--
 block/vmdk.c  |5 +++
 block_int.h   |1 +
 blockdev.c|   10 ++
 monitor.c |5 +++
 monitor.h |1 +
 qemu-options.hx   |8 +++--
 qemu-tool.c   |5 +++
 14 files changed, 140 insertions(+), 12 deletions(-)

diff --git a/block.c b/block.c
index 24a25d5..500db84 100644
--- a/block.c
+++ b/block.c
@@ -536,6 +536,10 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
int flags,
 char tmp_filename[PATH_MAX];
 char backing_filename[PATH_MAX];
 
+if (bdrv_is_fd_protocol(bs)) {
+return -ENOTSUP;
+}
+
 /* if snapshot, we create a temporary backing file and open it
instead of opening 'filename' directly */
 
@@ -585,6 +589,10 @@ int bdrv_open(BlockDriverState *bs, const char *filename, 
int flags,
 
 /* Find the right image format driver */
 if (!drv) {
+/* format must be specified for fd: protocol */
+if (bdrv_is_fd_protocol(bs)) {
+return -ENOTSUP;
+}
 ret = find_image_format(filename, &drv);
 }
 
@@ -1460,6 +1468,11 @@ int bdrv_enable_write_cache(BlockDriverState *bs)
 return bs->enable_write_cache;
 }
 
+int bdrv_is_fd_protocol(BlockDriverState *bs)
+{
+return bs->fd_protocol;
+}
+
 /* XXX: no longer used */
 void bdrv_set_change_cb(BlockDriverState *bs,
 void (*change_cb)(void *opaque, int reason),
@@ -1964,6 +1977,9 @@ int bdrv_snapshot_create(BlockDriverState *bs,
 BlockDriver *drv = bs->drv;
 if (!drv)
 return -ENOMEDIUM;
+if (bdrv_is_fd_protocol(bs)) {
+return -ENOTSUP;
+}
 if (drv->bdrv_snapshot_create)
 return drv->bdrv_snapshot_create(bs, sn_info);
 if (bs->file)
diff --git a/block.h b/block.h
index da7d39c..dd46d52 100644
--- a/block.h
+++ b/block.h
@@ -182,6 +182,7 @@ int bdrv_is_removable(BlockDriverState *bs);
 int bdrv_is_read_only(BlockDriverState *bs);
 int bdrv_is_sg(BlockDriverState *bs);
 int bdrv_enable_write_cache(BlockDriverState *bs);
+int bdrv_is_fd_protocol(Bl

Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing

2011-06-14 Thread Stefano Stabellini
On Tue, 14 Jun 2011, Alexander Graf wrote:
> > static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> > {
> >   PCII440FXState *d = opaque;
> > @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char 
> > *device_name,
> >   d = pci_create_simple(b, 0, device_name);
> >   *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
> > 
> > -piix3 = DO_UPCAST(PIIX3State, dev,
> > -  pci_create_simple_multifunction(b, -1, true, 
> > "PIIX3"));
> > +if (xen_enabled()) {
> > +piix3 = DO_UPCAST(PIIX3State, dev,
> > +pci_create_simple_multifunction(b, -1, true, 
> > "PIIX3-xen"));
> > +pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> > +piix3, XEN_PIIX_NUM_PIRQS);
>  
>  But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the 
>  reason behind this change?
> >>> 
> >>> It is still a piix3, but also provides non-legacy interrupt links to the
> >>> IO-APIC.
> >>> The four pins of each PCI device on the bus not only are routed to the
> >>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
> >>> they are connected to the IO-APIC directly.
> >>> These additional routes can only be discovered through ACPI, so you need
> >>> matching ACPI tables. We used to build the old ACPI tables like this:
> >>> 
> >>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
> >>> printf("Name(PRTA, Package() {\n");
> >>> for ( dev = 1; dev < 32; dev++ )
> >>>   for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
> >>>   printf("Package(){0x%04x, %u, 0, %u},\n",
> >>>  dev, intx, ((dev*4+dev/8+intx)&31)+16);
> >>> printf("})\n");
> >>> 
> >> 
> >> Interesting concept, but completely non-standard and very much
> >> different from real hardware. Please at least add a comment there to
> >> show readers that Xen is doing a hack which is not at all related to
> >> how the PIIX really works.
> > 
> > Isn't this more a function of the "wires" on the motherboard than the
> > PIIX specifically? i.e. this just encodes the permutation of the wires
> > from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
> > which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
> 
> Interrupts with PCI work slightly different. PCI devices can map (themselves 
> or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These 
> get converted using PCI host controller specific logic to 4 interrupt lines 
> which then go into the IO-APIC.
> 
> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 
> 26 though.

The number of redirection entries in the IOAPIC can be discovered
reading from the IOAPICVER register and it is a property of a specific
model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
pins than the most popular IOAPIC used with PIIX3.

 
> I haven't seen a single case where PCI devices have a direct link to the 
> IO-APIC. I also have not seen any PCI host controller that exports more than 
> 4 interrupts. Giving each PCI device its own line, on top of that more than 
> ever could be in real hardware, is a plain hack IMHO.

Actually this happens quite often: if I am not mistaken all the GSIs
higher than 15 are actually the result of a direct connection between
an interrupt source and the IOAPIC. I have several on my testboxes.
Also give a look at the Intel Multiprocessor Specification, section
3.6.2.3: as you can see from the diagram in "Symmetric I/O Mode" all the
interrupts are routed through the IOAPIC directly.


> Did this really give you actual performance/latency/scalability gains? I 
> still think for devices that matter, we should go with MSI rather than 
> deriving from real hw.
> 

Not all the operating systems support MSIs, it is nice to be able to
avoid interrupt sharing without recurring to MSIs.
Also this is how Xen has been working for more then 5 years in HVM mode,
so this configuration is well tested and supported by most operating
systems (at least all the ones we tried so far).


In any case I think it is a good idea to add a comment to better explain
what we are doing, see below.



commit 973bb091a967fdec37a1bc8fe30d46a483d2903d
Author: Stefano Stabellini 
Date:   Tue May 17 12:10:36 2011 +

xen: fix interrupt routing

- remove i440FX-xen and i440fx_write_config_xen
we don't need to intercept pci config writes to i440FX anymore;

- introduce PIIX3-xen and piix3_write_config_xen
we do need to intercept pci config write to the PCI-ISA bridge to update
the PCI link routing;

- set the number of PIIX3-xen interrupts line to 128;

Signed-off-by: Stefano Stabellini 

diff --git a/hw/pc.h b/hw/pc.h
index 0dcbee7..6d5730b 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -176,7 +176,6 @@ struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
 PCIBus *i440fx_ini

Re: [Qemu-devel] [PATCH] xen: fix interrupt routing

2011-06-14 Thread Stefano Stabellini
On Tue, 14 Jun 2011, Jan Kiszka wrote:
> I bet the motivation is to have an IRQ route that is independent of
> QEMU, thus can be discovered inside the Xen kernel and then remains
> stable - or is simply hard-wired. Device assignment? Direct legacy IRQ
> injection from the kernel?
> 

This code predates me, so Keir might have more insights on the original
motivation, but you are correct, they are hard-wired and can be used for
device-assignment as well.



Re: [Qemu-devel] stale savannah git repo (was: Re: KVM call agenda for April 05)

2011-06-14 Thread Anthony Liguori

On 06/09/2011 05:14 AM, Peter Maydell wrote:

On 5 April 2011 09:29, Alexander Graf  wrote:

On 05.04.2011, at 08:01, Brad Hards wrote:

It isn't even easy to figure out what trees there are (apart from the
main one) and a google search for "qemu git" produces some misleading
links to savannah and places other than git://git.qemu.org/qemu.git.



I'd also like to see the savannah one just deactivated. Last time I
checked it wasn't synced, so you simply get an old snapshot which is
the worst case scenario for everyone.


Ping? Is it possible to either get the savannah git tree to be
automatically mirrored from git.qemu.org, or to remove it?

(This came up on IRC again today.)


We're working on migrating the qemu.org server to something more 
reliable that can also provide mailing list hosting.  We'll announce a 
migration date by EOW.


Regards,

Anthony Liguori


-- PMM






Re: [Qemu-devel] [PATCH] xen: fix interrupt routing

2011-06-14 Thread Jan Kiszka
On 2011-06-14 14:30, Alexander Graf wrote:
> 
> Am 14.06.2011 um 14:17 schrieb Ian Campbell :
> 
>> On Tue, 2011-06-14 at 10:25 +0100, Alexander Graf wrote:
>>> On 31.05.2011, at 13:05, Stefano Stabellini wrote:
>>>
 On Sat, 28 May 2011, Alexander Graf wrote:
>
> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
>
>> xen: fix interrupt routing
>>
>> - remove i440FX-xen and i440fx_write_config_xen
>> we don't need to intercept pci config writes to i440FX anymore;
>
> Why not? In which version? Did anything below change? What about compat 
> code? Older hypervisor versions?

 Nothing changed, I think it was a genuine mistake in the original patch
 series: the intention was to catch the pci config writes to 0x60-0x63 to
 reprogram the xen pci link routes accordingly (see
 xen_piix_pci_write_config_client).  If I am not mistaken a similar thing
 is done by non-Xen Qemu in piix3_update_irq_levels.


>> - introduce PIIX3-xen and piix3_write_config_xen
>> we do need to intercept pci config write to the PCI-ISA bridge to update
>> the PCI link routing;
>>
>> - set the number of PIIX3-xen interrupts lines to 128;
>>
>> Signed-off-by: Stefano Stabellini 
>>
>> diff --git a/hw/pc.h b/hw/pc.h
>> index 0dcbee7..6d5730b 100644
>> --- a/hw/pc.h
>> +++ b/hw/pc.h
>> @@ -176,7 +176,6 @@ struct PCII440FXState;
>> typedef struct PCII440FXState PCII440FXState;
>>
>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, 
>> qemu_irq *pic, ram_addr_t ram_size);
>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int 
>> *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
>> void i440fx_init_memory_mappings(PCII440FXState *d);
>>
>> /* piix4.c */
>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
>> index 9a22a8a..ba198de 100644
>> --- a/hw/pc_piix.c
>> +++ b/hw/pc_piix.c
>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
>>   isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
>>
>>   if (pci_enabled) {
>> -if (!xen_enabled()) {
>> -pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
>> ram_size);
>> -} else {
>> -pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, 
>> isa_irq, ram_size);
>> -}
>> +pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
>> ram_size);
>>   } else {
>>   pci_bus = NULL;
>>   i440fx_state = NULL;
>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
>> index 7f1c4cc..3d73a42 100644
>> --- a/hw/piix_pci.c
>> +++ b/hw/piix_pci.c
>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>>
>> #define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
>> #define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
>> +#define XEN_PIIX_NUM_PIRQS  128ULL
>> #define PIIX_PIRQC  0x60
>>
>> typedef struct PIIX3State {
>> @@ -78,6 +79,8 @@ struct PCII440FXState {
>> #define I440FX_SMRAM0x72
>>
>> static void piix3_set_irq(void *opaque, int pirq, int level);
>> +static void piix3_write_config_xen(PCIDevice *dev,
>> +   uint32_t address, uint32_t val, int len);
>>
>> /* return the global irq number corresponding to a given device irq
>>  pin. We could also use the bus number to have a more precise
>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
>>   }
>> }
>>
>> -static void i440fx_write_config_xen(PCIDevice *dev,
>> -uint32_t address, uint32_t val, int 
>> len)
>> -{
>> -xen_piix_pci_write_config_client(address, val, len);
>> -i440fx_write_config(dev, address, val, len);
>> -}
>> -
>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>> {
>>   PCII440FXState *d = opaque;
>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char 
>> *device_name,
>>   d = pci_create_simple(b, 0, device_name);
>>   *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>
>> -piix3 = DO_UPCAST(PIIX3State, dev,
>> -  pci_create_simple_multifunction(b, -1, true, 
>> "PIIX3"));
>> +if (xen_enabled()) {
>> +piix3 = DO_UPCAST(PIIX3State, dev,
>> +pci_create_simple_multifunction(b, -1, true, 
>> "PIIX3-xen"));
>> +pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>> +piix3, XEN_PIIX_NUM_PIRQS);
>
> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the 
> reason behind this change?

 It is still a piix3, but also provides non-legacy interrupt links to the
 IO-APIC.
 The four pins of each PCI device on the bus not only are routed

Re: [Qemu-devel] KVM call agenda for June 14

2011-06-14 Thread Anthony Liguori

On 06/13/2011 07:40 AM, Juan Quintela wrote:


Please send in any agenda items you are interested in covering.


- 0.15 release
- guest additions (virt-agent next steps)

Regards,

Anthony Liguori



thanks,
-juan








Re: [Qemu-devel] [PATCH RFC 0/3] basic support for composing sysbus devices

2011-06-14 Thread Anthony Liguori

On 06/13/2011 03:59 PM, Blue Swirl wrote:

On Sun, Jun 12, 2011 at 10:21 PM, Anthony Liguori  wrote:

On 06/12/2011 12:12 PM, Avi Kivity wrote:


On 06/10/2011 06:43 PM, Anthony Liguori wrote:



What exactly is so very wrong about buses that they need to die?


They force a device tree. The device model shouldn't be a tree, but a
directed graph.


Right. As an example, you configure PCI interrupt routing and the memory
controller by writing to a PCI device, which logically doesn't have
access to any of this stuff if it's behind the PCI bus.

However, I don't think buses should die. They should be available as an
easy way to model the devices that do follow the rules. But we should
also expose everything else for the exceptional cases.


It's perfectly fine to have a type called PCIBus that I440FX extends,
but qdev shouldn't have explicit knowledge of something called a "bus"
IMHO. Doing this forces a limited mechanism of connecting devices
because it creates an artificial tree (by implying a parent/child
relationship). It makes composition difficult if not impossible.


I think qdev buses are useful as long as they don't enforce their
interfaces. That is, a qdev that is a child of a qbus has access to the
qbus's interfaces, but also access to other stuff.


I see two independent data structures.  The first is the "instantiation
tree".

The instantiation tree may look like this:

+-- i440fx
|  |
|  +-- PIIX3
|  |  |
|  |  +-- mc146818a
|  |  +-- uart
|  |  +-- DMA
|  |  +-- keyboard controller
|  |  +-- (remaining platform ISA devices
|  |
|  +-- UHCI USB controller
|  +-- IDE controller
|
+-- e1000
+-- cirrus-vga
+-- virtio-balloon-pci
+-- IDE disk0

Instantiating i440fx makes a bunch of default stuff.  This is composition.
  Everything else requires explicit instantiation.  This is, strictly
speaking, the parent/child relationships.  If you destroy i440fx, all of
it's children have to also go away (by definition). Nothing about bus
relationship is implied here.  Even if i440fx exposes a PCI bus, the PIIX3
is a child of i440fx even though e1000 is not (even if they're both PCI
devices).


I actually like this slot idea in place of buses. But wouldn't there
be two classes of devices (or two APIs), slot devices and composable
devices?


All devices have properties.  We have this today in qdev.  What's 
missing is to have a properties who's type is a socket for another 
device.  We really want to be able to do:


static DeviceInfo i440fx_info = {
.name = "i440fx",
.props = (Property[]){
   DEFINE_PROP_PLUG(I440FXState, piix3),
   DEFINE_PROP_SOCKET(I440FXState, slot[0]),
   DEFINE_PROP_SOCKET(I440FXState, slot[1]),
   ...
},
};

Which suggests that we really need to move away from declarative device 
definitions.  It makes it hard to have variable numbers of properties.


In this case, piix3 would be defined as:

struct I440FXState {
PIIX3 piix3;
PCIDevice slots[32];
};

Which suggests we need an initfn to do the following:

void i440fx_initfn(...) {
   qdev_init_inplace(&dev->piix3, "PIIX3");
   dev->slot[1] = &dev->piix3->bus;
}

This gets hard to do well in C though.  I'm not sure how we could make 
DEFINE_PROP_PLUG/SOCKET type safe.


Regards,

Anthony Liguori



Re: [Qemu-devel] Question on virtio disk maximum index and maximum partition

2011-06-14 Thread Richard W.M. Jones

On Wed, Jun 01, 2011 at 05:56:02AM +0100, Stefan Hajnoczi wrote:
> Partitions are not at the virtio-blk level.  The guest operating
> system will see the virtio-blk disk and scan its partition table to
> determine which partitions are available.  The limit then depends on
> the partitioning scheme that you use (legacy boot record, GPT, etc).

But practically the Linux virtio implementation imposes a limit of 15
partitions (vda + vda1 .. vda15 = 16), since it encodes the partition
number in 4 bits.  If you have a virtio disk with more than 15
partitions, Linux guests will ignore the extra partitions.  This may
or may not matter, but it's worth considering.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into Xen guests.
http://et.redhat.com/~rjones/virt-p2v



Re: [Qemu-devel] [PATCH] xen: avoid tracking the region 0xa0000 - 0xbffff

2011-06-14 Thread Jan Kiszka
On 2011-06-14 13:50, Stefano Stabellini wrote:
> On Tue, 14 Jun 2011, Jan Kiszka wrote:
>> On 2011-06-14 12:54, Alexander Graf wrote:
>>>
>>> On 03.06.2011, at 17:56,  
>>>  wrote:
>>>
 From: Stefano Stabellini 

 Xen can only do dirty bit tracking for one memory region, so we should
 explicitly avoid trying to track the legacy VGA region between 0xa
 and 0xb, rather than trying and failing.

 Signed-off-by: Stefano Stabellini 
 ---
 xen-all.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

 diff --git a/xen-all.c b/xen-all.c
 index 9a5c3ec..1fdc2e8 100644
 --- a/xen-all.c
 +++ b/xen-all.c
 @@ -218,6 +218,10 @@ static int xen_add_to_physmap(XenIOState *state,
 if (get_physmapping(state, start_addr, size)) {
 return 0;
 }
 +/* do not try to map legacy VGA memory */
 +if (start_addr >= 0xa && start_addr + size <= 0xb) {
>>>
>>> I don't quite like the hardcoded range here. What exactly is the issue? The 
>>> fact that you can only map a single region? Then do a counter and fail when 
>>> it's > 1. If you don't want to map the VGA region as memory slot, why not 
>>> change the actual mapping code in the cirrus adapter?
>>
>> Err, please no "if (xen_enabled())" in that code. We just got rid of the
>> kvm_enabled() mess. And it doesn't scale, it would be required in e1000
>> as well e.g.
> 
> agreed

[Actually, e1000 is not using dirty logging but coalesced MMIO.]

> 
> 
>> BTW, if Xen is not able to track more than one dirty region, I think
>> it's time to fix that limitation. At some point it may no longer be
>> possible to work around it (who knows how the new memory API will look
>> like in this regard).
> 
> you are right, however it is not a simple fix and at present we don't
> actually need to track more than one region...

Well, you already miss dirty logged VGA/VBE memory access this way
(everything that goes to legacy VGA mem, not the framebuffer BAR). Grub
provides a really poor use experience in that mode.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] SdCard read/write performance is slow

2011-06-14 Thread Peter Maydell
On 14 June 2011 12:22, Sid Kapoor  wrote:
> I am using qemu-system-arm for ARM Realview PBX-A9 board. I feel that the
> performance (read/write) from the SdCard is slower than the case in which I
> was mounting my rootfs through NFS. The size of my rootfs is around 3 GB. I
> am making my SdCard image using qemu-img command and formatting it with ext3
> filesystem. Is this a common phenomenon with everyone? Or am I doing
> something wrong in the case of SdCard ?

I've seen very slow sdcard performance too; it's on my todo list
to try to investigate but hasn't made it to the top yet.

-- PMM



Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing

2011-06-14 Thread Alexander Graf

Am 14.06.2011 um 14:17 schrieb Ian Campbell :

> On Tue, 2011-06-14 at 10:25 +0100, Alexander Graf wrote:
>> On 31.05.2011, at 13:05, Stefano Stabellini wrote:
>> 
>>> On Sat, 28 May 2011, Alexander Graf wrote:
 
 On 26.05.2011, at 17:48, Stefano Stabellini wrote:
 
> xen: fix interrupt routing
> 
> - remove i440FX-xen and i440fx_write_config_xen
> we don't need to intercept pci config writes to i440FX anymore;
 
 Why not? In which version? Did anything below change? What about compat 
 code? Older hypervisor versions?
>>> 
>>> Nothing changed, I think it was a genuine mistake in the original patch
>>> series: the intention was to catch the pci config writes to 0x60-0x63 to
>>> reprogram the xen pci link routes accordingly (see
>>> xen_piix_pci_write_config_client).  If I am not mistaken a similar thing
>>> is done by non-Xen Qemu in piix3_update_irq_levels.
>>> 
>>> 
> - introduce PIIX3-xen and piix3_write_config_xen
> we do need to intercept pci config write to the PCI-ISA bridge to update
> the PCI link routing;
> 
> - set the number of PIIX3-xen interrupts lines to 128;
> 
> Signed-off-by: Stefano Stabellini 
> 
> diff --git a/hw/pc.h b/hw/pc.h
> index 0dcbee7..6d5730b 100644
> --- a/hw/pc.h
> +++ b/hw/pc.h
> @@ -176,7 +176,6 @@ struct PCII440FXState;
> typedef struct PCII440FXState PCII440FXState;
> 
> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, 
> qemu_irq *pic, ram_addr_t ram_size);
> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int 
> *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
> void i440fx_init_memory_mappings(PCII440FXState *d);
> 
> /* piix4.c */
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 9a22a8a..ba198de 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
>   isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
> 
>   if (pci_enabled) {
> -if (!xen_enabled()) {
> -pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
> ram_size);
> -} else {
> -pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, 
> isa_irq, ram_size);
> -}
> +pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
> ram_size);
>   } else {
>   pci_bus = NULL;
>   i440fx_state = NULL;
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index 7f1c4cc..3d73a42 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
> 
> #define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
> #define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
> +#define XEN_PIIX_NUM_PIRQS  128ULL
> #define PIIX_PIRQC  0x60
> 
> typedef struct PIIX3State {
> @@ -78,6 +79,8 @@ struct PCII440FXState {
> #define I440FX_SMRAM0x72
> 
> static void piix3_set_irq(void *opaque, int pirq, int level);
> +static void piix3_write_config_xen(PCIDevice *dev,
> +   uint32_t address, uint32_t val, int len);
> 
> /* return the global irq number corresponding to a given device irq
>  pin. We could also use the bus number to have a more precise
> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
>   }
> }
> 
> -static void i440fx_write_config_xen(PCIDevice *dev,
> -uint32_t address, uint32_t val, int 
> len)
> -{
> -xen_piix_pci_write_config_client(address, val, len);
> -i440fx_write_config(dev, address, val, len);
> -}
> -
> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> {
>   PCII440FXState *d = opaque;
> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char 
> *device_name,
>   d = pci_create_simple(b, 0, device_name);
>   *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
> 
> -piix3 = DO_UPCAST(PIIX3State, dev,
> -  pci_create_simple_multifunction(b, -1, true, 
> "PIIX3"));
> +if (xen_enabled()) {
> +piix3 = DO_UPCAST(PIIX3State, dev,
> +pci_create_simple_multifunction(b, -1, true, 
> "PIIX3-xen"));
> +pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> +piix3, XEN_PIIX_NUM_PIRQS);
 
 But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the 
 reason behind this change?
>>> 
>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>> IO-APIC.
>>> The four pins of each PCI device on the bus not only are routed to the
>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>> they are connected to the IO-APIC directly.
>>> These additi

Re: [Qemu-devel] [PATCH] xen: avoid tracking the region 0xa0000 - 0xbffff

2011-06-14 Thread Alexander Graf

On 14.06.2011, at 13:48, Stefano Stabellini wrote:

> On Tue, 14 Jun 2011, Alexander Graf wrote:
>> On 03.06.2011, at 17:56,  
>>  wrote:
>> 
>>> From: Stefano Stabellini 
>>> 
>>> Xen can only do dirty bit tracking for one memory region, so we should
>>> explicitly avoid trying to track the legacy VGA region between 0xa
>>> and 0xb, rather than trying and failing.
>>> 
>>> Signed-off-by: Stefano Stabellini 
>>> ---
>>> xen-all.c |4 
>>> 1 files changed, 4 insertions(+), 0 deletions(-)
>>> 
>>> diff --git a/xen-all.c b/xen-all.c
>>> index 9a5c3ec..1fdc2e8 100644
>>> --- a/xen-all.c
>>> +++ b/xen-all.c
>>> @@ -218,6 +218,10 @@ static int xen_add_to_physmap(XenIOState *state,
>>>if (get_physmapping(state, start_addr, size)) {
>>>return 0;
>>>}
>>> +/* do not try to map legacy VGA memory */
>>> +if (start_addr >= 0xa && start_addr + size <= 0xb) {
>> 
>> I don't quite like the hardcoded range here. What exactly is the issue? The 
>> fact that you can only map a single region? Then do a counter and fail when 
>> it's > 1. 
> 
> That is what we were doing before: succeeding the first time and
> failing from the second time on.
> By "coincidence" the second time was the range 0xa-0xb so
> everything worked as expected, but it wasn't obvious why.
> I am just trying to make sure that one year from now it will be clear
> just looking at the code why it works.
> 
> 
>> If you don't want to map the VGA region as memory slot, why not change the 
>> actual mapping code in the cirrus adapter?
> 
> Because I didn't want to introduce any ugly if (xen_enable()) in generic
> code, if it is that simple to catch the issue from xen specific code.

Well sure, but 2 years from now yet another region will be introduced that 
might even be registered before the FB and everyone's puzzled again :). How 
about you print a warning when anyone tries to map anything after the first 
map? Or - as Jan suggests - implement multiple regions.

If you prefer, you could even check for the VGA range as "known broken" and 
only print warnings on others.


Alex




Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing

2011-06-14 Thread Ian Campbell
On Tue, 2011-06-14 at 10:25 +0100, Alexander Graf wrote:
> On 31.05.2011, at 13:05, Stefano Stabellini wrote:
> 
> > On Sat, 28 May 2011, Alexander Graf wrote:
> >> 
> >> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
> >> 
> >>> xen: fix interrupt routing
> >>> 
> >>> - remove i440FX-xen and i440fx_write_config_xen
> >>> we don't need to intercept pci config writes to i440FX anymore;
> >> 
> >> Why not? In which version? Did anything below change? What about compat 
> >> code? Older hypervisor versions?
> > 
> > Nothing changed, I think it was a genuine mistake in the original patch
> > series: the intention was to catch the pci config writes to 0x60-0x63 to
> > reprogram the xen pci link routes accordingly (see
> > xen_piix_pci_write_config_client).  If I am not mistaken a similar thing
> > is done by non-Xen Qemu in piix3_update_irq_levels.
> > 
> > 
> >>> - introduce PIIX3-xen and piix3_write_config_xen
> >>> we do need to intercept pci config write to the PCI-ISA bridge to update
> >>> the PCI link routing;
> >>> 
> >>> - set the number of PIIX3-xen interrupts lines to 128;
> >>> 
> >>> Signed-off-by: Stefano Stabellini 
> >>> 
> >>> diff --git a/hw/pc.h b/hw/pc.h
> >>> index 0dcbee7..6d5730b 100644
> >>> --- a/hw/pc.h
> >>> +++ b/hw/pc.h
> >>> @@ -176,7 +176,6 @@ struct PCII440FXState;
> >>> typedef struct PCII440FXState PCII440FXState;
> >>> 
> >>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, 
> >>> qemu_irq *pic, ram_addr_t ram_size);
> >>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int 
> >>> *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
> >>> void i440fx_init_memory_mappings(PCII440FXState *d);
> >>> 
> >>> /* piix4.c */
> >>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> >>> index 9a22a8a..ba198de 100644
> >>> --- a/hw/pc_piix.c
> >>> +++ b/hw/pc_piix.c
> >>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
> >>>isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
> >>> 
> >>>if (pci_enabled) {
> >>> -if (!xen_enabled()) {
> >>> -pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
> >>> ram_size);
> >>> -} else {
> >>> -pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, 
> >>> isa_irq, ram_size);
> >>> -}
> >>> +pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
> >>> ram_size);
> >>>} else {
> >>>pci_bus = NULL;
> >>>i440fx_state = NULL;
> >>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> >>> index 7f1c4cc..3d73a42 100644
> >>> --- a/hw/piix_pci.c
> >>> +++ b/hw/piix_pci.c
> >>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
> >>> 
> >>> #define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
> >>> #define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
> >>> +#define XEN_PIIX_NUM_PIRQS  128ULL
> >>> #define PIIX_PIRQC  0x60
> >>> 
> >>> typedef struct PIIX3State {
> >>> @@ -78,6 +79,8 @@ struct PCII440FXState {
> >>> #define I440FX_SMRAM0x72
> >>> 
> >>> static void piix3_set_irq(void *opaque, int pirq, int level);
> >>> +static void piix3_write_config_xen(PCIDevice *dev,
> >>> +   uint32_t address, uint32_t val, int len);
> >>> 
> >>> /* return the global irq number corresponding to a given device irq
> >>>   pin. We could also use the bus number to have a more precise
> >>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
> >>>}
> >>> }
> >>> 
> >>> -static void i440fx_write_config_xen(PCIDevice *dev,
> >>> -uint32_t address, uint32_t val, int 
> >>> len)
> >>> -{
> >>> -xen_piix_pci_write_config_client(address, val, len);
> >>> -i440fx_write_config(dev, address, val, len);
> >>> -}
> >>> -
> >>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> >>> {
> >>>PCII440FXState *d = opaque;
> >>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char 
> >>> *device_name,
> >>>d = pci_create_simple(b, 0, device_name);
> >>>*pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
> >>> 
> >>> -piix3 = DO_UPCAST(PIIX3State, dev,
> >>> -  pci_create_simple_multifunction(b, -1, true, 
> >>> "PIIX3"));
> >>> +if (xen_enabled()) {
> >>> +piix3 = DO_UPCAST(PIIX3State, dev,
> >>> +pci_create_simple_multifunction(b, -1, true, 
> >>> "PIIX3-xen"));
> >>> +pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> >>> +piix3, XEN_PIIX_NUM_PIRQS);
> >> 
> >> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the 
> >> reason behind this change?
> > 
> > It is still a piix3, but also provides non-legacy interrupt links to the
> > IO-APIC.
> > The four pins of each PCI device on the bus not only are routed to the
> > normal four pirqs (programmed writing to 0x60-0x63, see above) but also
> > they are connected to the IO-APIC directly.
> > These additional routes can only be discovered through

Re: [Qemu-devel] [PATCH 03/12] Switch build system to accompanied kernel headers

2011-06-14 Thread Jan Kiszka
On 2011-06-14 13:28, Alexander Graf wrote:
> 
> On 14.06.2011, at 13:21, Jan Kiszka wrote:
> 
>> On 2011-06-14 13:11, Alexander Graf wrote:
>>>
>>> On 08.06.2011, at 16:10, Jan Kiszka wrote:
>>>
>>> This helps reducing our build-time checks for feature support in the
>>> available Linux kernel headers. And it helps users that do not have
>>> sufficiently recent headers installed on their build machine.
>>>
>>> Consequently, the patch removes and build-time checks for kvm and vhost
>>> in configure, the --kerneldir switch, and KVM_CFLAGS. Kernel headers are
>>> supposed to be provided by QEMU only.
>>>
>>> s390 needs some extra love as it carries redefinitions from kernel
>>> headers.
>>>
>>> Yes. I was wondering if we should unconditionally include the kernel 
>>> headers there. The problem I'm seeing there is that I don't know if that 
>>> would work fine on non-Linux hosts, as that code definitely gets compiled 
>>> there, while KVM code is not.
>>>
>>>
>>>
>>> [...]
>>>
>>> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
>>> index 4e5c391..b5e587f 100644
>>> --- a/target-s390x/cpu.h
>>> +++ b/target-s390x/cpu.h
>>> @@ -313,16 +313,6 @@ CPUState *s390_cpu_addr2state(uint16_t cpu_addr);
>>> /* from s390-virtio-bus */
>>> extern const target_phys_addr_t virtio_size;
>>>
>>> -#ifndef KVM_S390_SIGP_STOP
>>> -#define KVM_S390_SIGP_STOP  0
>>> -#define KVM_S390_PROGRAM_INT0
>>> -#define KVM_S390_SIGP_SET_PREFIX0
>>> -#define KVM_S390_RESTART0
>>> -#define KVM_S390_INT_VIRTIO 0
>>> -#define KVM_S390_INT_SERVICE0
>>> -#define KVM_S390_INT_EMERGENCY  0
>>> -#endif
>>> -
>>> #endif
>>> void cpu_lock(void);
>>> void cpu_unlock(void);
>>> diff --git a/target-s390x/op_helper.c b/target-s390x/op_helper.c
>>> index db03a79..9429698 100644
>>> --- a/target-s390x/op_helper.c
>>> +++ b/target-s390x/op_helper.c
>>> @@ -23,6 +23,7 @@
>>> #include "helpers.h"
>>> #include 
>>> #include "kvm.h"
>>> +#include 
>>>
>>> Have you tried to compile this on non-Linux?
>>
>> Sorry, I don't have "non-Linux" around. :) Do you expect build problems
>> with that header? Why?
> 
> Do you have a git tree handy? I can give a Mac build a try then.

Try git://git.kiszka.org/qemu-kvm.git queues/kvm-upstream

> 
>> BTW, if you depend on KVM_* constants for non-KVM builds, that looks a
>> bit fishy to me. Why is that code built at all in that setup?
> 
> The reason for that is that I didn't want to clutter the emulation code with 
> #ifdefs:
> 
> static void program_interrupt(CPUState *env, uint32_t code, int ilc)
> {
> qemu_log("program interrupt at %#" PRIx64 "\n", env->psw.addr);
> 
> if (kvm_enabled()) {
> kvm_s390_interrupt(env, KVM_S390_PROGRAM_INT, code);
> } else {
> env->int_pgm_code = code;
> env->int_pgm_ilc = ilc;
> env->exception_index = EXCP_PGM;
> cpu_loop_exit();
> }
> }
> 
> This breaks compilation when KVM_S390_PROGRAM_INT is not defined. I'm very 
> open to suggestions on how to improve this though :).

Callbacks? See cpu_interrupt_handler e.g.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] Fwd: How to increase ram beyond 756 MB ?

2011-06-14 Thread मनीष शर्मा
-- Forwarded message --
From: मनीष शर्मा 
Date: 2011/6/14
Subject: Re: [Qemu-devel] How to increase ram beyond 756 MB ?
To: Peter Maydell 


Thanks Peter.

I am able to change memory size in guest  to 1024 MB now.
After making small change in my kernel file realview_pbx.c.
To changed the address to 0x7000 instead of 0x8000
where qemu 0.14.0 is registering the memory.
realview_pbx_fixup () {

*-meminfo->bank[2].start = 0x8000   *
+ meminfo->bank[2].start = 0x7000
...
}

And to enable *High Memory support* in my kernel.
Now its showing 1024 MB.

Thanks a lot for such a quick reply.

regards,
Manish
2011/6/14 Peter Maydell 

> 2011/6/14 मनीष शर्मा :
> > I am using arm on x86, running Linux 2.6.33 as guest and fedora 14 as
> host.
> > And using Realview board inside Qemu Machine.
>
> Do you mean "realview-eb", "realview-eb-mpcore", "realview-pb-a9",
> or "realview-pbx-a9" ? They are different models.
>
> > Even though I passed 1024 MB as qemu memory in command line,
> > I can only see ~756MB given to guest. Also I tried passing more than
> > 1024 MB say 2048 MB But I am unable to see increase in RAM in Guest OS.
> > I checked the /proc/meminfo for guest OS memory details.
>
> For both "realview-eb" and "realview-eb-mpcore" the maximum memory
> allowed is 256MB, because the hardware we are modelled has an
> address space layout which does not permit more. If you're using
> those models, then to use more RAM you must use a different model.
>
> "realview-pb-a9" and "realview-pbx-a9" should both in theory
> support 1024MB of RAM as a maximum. Typically to make this work
> you need to both pass QEMU the "-m 1024" option (to model the
> extra RAM) and also pass the guest kernel an option like "mem=1024M".
> (However the pb-a9 hardware has 512MB of RAM as a maximum and
> we are providing extra modelled RAM in a "reserved" section of
> the address space, so it's possible the guest kernel wouldn't
> try to use it.)
>
> -- PMM
>



-- 
-मनीष



-- 
-मनीष


Re: [Qemu-devel] [PATCH 07/12] kvm: Drop KVM_CAP build dependencies

2011-06-14 Thread Alexander Graf

On 14.06.2011, at 13:19, Jan Kiszka wrote:

> On 2011-06-14 13:17, Alexander Graf wrote:
>> 
>> On 14.06.2011, at 13:07, Jan Kiszka wrote:
>> 
>>> On 2011-06-14 13:05, Alexander Graf wrote:
 
 On 08.06.2011, at 16:11, Jan Kiszka wrote:
 
> No longer needed with accompanied kernel headers. We are only left with
> build dependencies that are controlled by kvm arch headers.
 
 This should completely rule out all CAPs right? IIRC, all CAPs are defined 
 in generic code, so we don't get number overlaps.
>>> 
>>> "We are only left with build dependencies that are controlled by kvm
>>> arch headers." E.g. KVM_CAP_VCPU_EVENTS.
>>> 
 
> 
> CC: Alexander Graf 
> Signed-off-by: Jan Kiszka 
> ---
> kvm-all.c |8 
> 1 files changed, 0 insertions(+), 8 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index 4a9910a..cbc2532 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -757,21 +757,17 @@ int kvm_init(void)
>   s->coalesced_mmio = kvm_check_extension(s, KVM_CAP_COALESCED_MMIO);
> 
>   s->broken_set_mem_region = 1;
> -#ifdef KVM_CAP_JOIN_MEMORY_REGIONS_WORKS
>   ret = kvm_check_extension(s, KVM_CAP_JOIN_MEMORY_REGIONS_WORKS);
>   if (ret > 0) {
>   s->broken_set_mem_region = 0;
>   }
> -#endif
> 
> #ifdef KVM_CAP_VCPU_EVENTS
 
 ... which leaves the question why this one is still here :).
>>> 
>>> See above (does PPC support it?).
>> 
>> Well, regardless of support, the #ifdef shouldn't be required, right?
>> 
>> include/linux/kvm.h:518:#define KVM_CAP_VCPU_EVENTS 41
>> 
>> ... as long as there's a runtime check :)
> 
> Are all types and dependent constants available? Maybe it is the case
> here, but you cannot generally assume this if a CAP is per-arch.

Ah, I see what you mean. Maybe it'd be easier to use __KVM_HAVE_VCPU_EVENTS 
instead of the implicitly defined CAP? But then again, once all the version CAP 
#ifdefs are gone, the only ones left are the ones for arch features, so I guess 
your approach is valid :)


Alex




Re: [Qemu-devel] [PATCH] xen: avoid tracking the region 0xa0000 - 0xbffff

2011-06-14 Thread Stefano Stabellini
On Tue, 14 Jun 2011, Alexander Graf wrote:
> On 03.06.2011, at 17:56,  
>  wrote:
> 
> > From: Stefano Stabellini 
> > 
> > Xen can only do dirty bit tracking for one memory region, so we should
> > explicitly avoid trying to track the legacy VGA region between 0xa
> > and 0xb, rather than trying and failing.
> > 
> > Signed-off-by: Stefano Stabellini 
> > ---
> > xen-all.c |4 
> > 1 files changed, 4 insertions(+), 0 deletions(-)
> > 
> > diff --git a/xen-all.c b/xen-all.c
> > index 9a5c3ec..1fdc2e8 100644
> > --- a/xen-all.c
> > +++ b/xen-all.c
> > @@ -218,6 +218,10 @@ static int xen_add_to_physmap(XenIOState *state,
> > if (get_physmapping(state, start_addr, size)) {
> > return 0;
> > }
> > +/* do not try to map legacy VGA memory */
> > +if (start_addr >= 0xa && start_addr + size <= 0xb) {
> 
> I don't quite like the hardcoded range here. What exactly is the issue? The 
> fact that you can only map a single region? Then do a counter and fail when 
> it's > 1. 

That is what we were doing before: succeeding the first time and
failing from the second time on.
By "coincidence" the second time was the range 0xa-0xb so
everything worked as expected, but it wasn't obvious why.
I am just trying to make sure that one year from now it will be clear
just looking at the code why it works.


> If you don't want to map the VGA region as memory slot, why not change the 
> actual mapping code in the cirrus adapter?

Because I didn't want to introduce any ugly if (xen_enable()) in generic
code, if it is that simple to catch the issue from xen specific code.



[Qemu-devel] SdCard read/write performance is slow

2011-06-14 Thread Sid Kapoor
Hi,

I am using qemu-system-arm for ARM Realview PBX-A9 board. I feel that the
performance (read/write) from the SdCard is slower than the case in which I
was mounting my rootfs through NFS. The size of my rootfs is around 3 GB. I
am making my SdCard image using qemu-img command and formatting it with ext3
filesystem. Is this a common phenomenon with everyone? Or am I doing
something wrong in the case of SdCard ?

Thanks in advance.
-- 
Siddharth Kapoor


Re: [Qemu-devel] [PATCH] xen: avoid tracking the region 0xa0000 - 0xbffff

2011-06-14 Thread Stefano Stabellini
On Tue, 14 Jun 2011, Jan Kiszka wrote:
> On 2011-06-14 12:54, Alexander Graf wrote:
> > 
> > On 03.06.2011, at 17:56,  
> >  wrote:
> > 
> >> From: Stefano Stabellini 
> >>
> >> Xen can only do dirty bit tracking for one memory region, so we should
> >> explicitly avoid trying to track the legacy VGA region between 0xa
> >> and 0xb, rather than trying and failing.
> >>
> >> Signed-off-by: Stefano Stabellini 
> >> ---
> >> xen-all.c |4 
> >> 1 files changed, 4 insertions(+), 0 deletions(-)
> >>
> >> diff --git a/xen-all.c b/xen-all.c
> >> index 9a5c3ec..1fdc2e8 100644
> >> --- a/xen-all.c
> >> +++ b/xen-all.c
> >> @@ -218,6 +218,10 @@ static int xen_add_to_physmap(XenIOState *state,
> >> if (get_physmapping(state, start_addr, size)) {
> >> return 0;
> >> }
> >> +/* do not try to map legacy VGA memory */
> >> +if (start_addr >= 0xa && start_addr + size <= 0xb) {
> > 
> > I don't quite like the hardcoded range here. What exactly is the issue? The 
> > fact that you can only map a single region? Then do a counter and fail when 
> > it's > 1. If you don't want to map the VGA region as memory slot, why not 
> > change the actual mapping code in the cirrus adapter?
> 
> Err, please no "if (xen_enabled())" in that code. We just got rid of the
> kvm_enabled() mess. And it doesn't scale, it would be required in e1000
> as well e.g.

agreed


> BTW, if Xen is not able to track more than one dirty region, I think
> it's time to fix that limitation. At some point it may no longer be
> possible to work around it (who knows how the new memory API will look
> like in this regard).

you are right, however it is not a simple fix and at present we don't
actually need to track more than one region...



Re: [Qemu-devel] [PATCH 03/12] Switch build system to accompanied kernel headers

2011-06-14 Thread Jan Kiszka
On 2011-06-14 13:11, Alexander Graf wrote:
> 
> On 08.06.2011, at 16:10, Jan Kiszka wrote:
> 
> This helps reducing our build-time checks for feature support in the
> available Linux kernel headers. And it helps users that do not have
> sufficiently recent headers installed on their build machine.
> 
> Consequently, the patch removes and build-time checks for kvm and vhost
> in configure, the --kerneldir switch, and KVM_CFLAGS. Kernel headers are
> supposed to be provided by QEMU only.
> 
> s390 needs some extra love as it carries redefinitions from kernel
> headers.
> 
> Yes. I was wondering if we should unconditionally include the kernel headers 
> there. The problem I'm seeing there is that I don't know if that would work 
> fine on non-Linux hosts, as that code definitely gets compiled there, while 
> KVM code is not.
> 
> 
> 
> [...]
> 
> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 4e5c391..b5e587f 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -313,16 +313,6 @@ CPUState *s390_cpu_addr2state(uint16_t cpu_addr);
> /* from s390-virtio-bus */
> extern const target_phys_addr_t virtio_size;
> 
> -#ifndef KVM_S390_SIGP_STOP
> -#define KVM_S390_SIGP_STOP  0
> -#define KVM_S390_PROGRAM_INT0
> -#define KVM_S390_SIGP_SET_PREFIX0
> -#define KVM_S390_RESTART0
> -#define KVM_S390_INT_VIRTIO 0
> -#define KVM_S390_INT_SERVICE0
> -#define KVM_S390_INT_EMERGENCY  0
> -#endif
> -
> #endif
> void cpu_lock(void);
> void cpu_unlock(void);
> diff --git a/target-s390x/op_helper.c b/target-s390x/op_helper.c
> index db03a79..9429698 100644
> --- a/target-s390x/op_helper.c
> +++ b/target-s390x/op_helper.c
> @@ -23,6 +23,7 @@
> #include "helpers.h"
> #include 
> #include "kvm.h"
> +#include 
> 
> Have you tried to compile this on non-Linux?

Sorry, I don't have "non-Linux" around. :) Do you expect build problems
with that header? Why?

BTW, if you depend on KVM_* constants for non-KVM builds, that looks a
bit fishy to me. Why is that code built at all in that setup?

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 03/12] Switch build system to accompanied kernel headers

2011-06-14 Thread Alexander Graf

On 14.06.2011, at 13:21, Jan Kiszka wrote:

> On 2011-06-14 13:11, Alexander Graf wrote:
>> 
>> On 08.06.2011, at 16:10, Jan Kiszka wrote:
>> 
>> This helps reducing our build-time checks for feature support in the
>> available Linux kernel headers. And it helps users that do not have
>> sufficiently recent headers installed on their build machine.
>> 
>> Consequently, the patch removes and build-time checks for kvm and vhost
>> in configure, the --kerneldir switch, and KVM_CFLAGS. Kernel headers are
>> supposed to be provided by QEMU only.
>> 
>> s390 needs some extra love as it carries redefinitions from kernel
>> headers.
>> 
>> Yes. I was wondering if we should unconditionally include the kernel headers 
>> there. The problem I'm seeing there is that I don't know if that would work 
>> fine on non-Linux hosts, as that code definitely gets compiled there, while 
>> KVM code is not.
>> 
>> 
>> 
>> [...]
>> 
>> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
>> index 4e5c391..b5e587f 100644
>> --- a/target-s390x/cpu.h
>> +++ b/target-s390x/cpu.h
>> @@ -313,16 +313,6 @@ CPUState *s390_cpu_addr2state(uint16_t cpu_addr);
>> /* from s390-virtio-bus */
>> extern const target_phys_addr_t virtio_size;
>> 
>> -#ifndef KVM_S390_SIGP_STOP
>> -#define KVM_S390_SIGP_STOP  0
>> -#define KVM_S390_PROGRAM_INT0
>> -#define KVM_S390_SIGP_SET_PREFIX0
>> -#define KVM_S390_RESTART0
>> -#define KVM_S390_INT_VIRTIO 0
>> -#define KVM_S390_INT_SERVICE0
>> -#define KVM_S390_INT_EMERGENCY  0
>> -#endif
>> -
>> #endif
>> void cpu_lock(void);
>> void cpu_unlock(void);
>> diff --git a/target-s390x/op_helper.c b/target-s390x/op_helper.c
>> index db03a79..9429698 100644
>> --- a/target-s390x/op_helper.c
>> +++ b/target-s390x/op_helper.c
>> @@ -23,6 +23,7 @@
>> #include "helpers.h"
>> #include 
>> #include "kvm.h"
>> +#include 
>> 
>> Have you tried to compile this on non-Linux?
> 
> Sorry, I don't have "non-Linux" around. :) Do you expect build problems
> with that header? Why?

Do you have a git tree handy? I can give a Mac build a try then.

> BTW, if you depend on KVM_* constants for non-KVM builds, that looks a
> bit fishy to me. Why is that code built at all in that setup?

The reason for that is that I didn't want to clutter the emulation code with 
#ifdefs:

static void program_interrupt(CPUState *env, uint32_t code, int ilc)
{
qemu_log("program interrupt at %#" PRIx64 "\n", env->psw.addr);

if (kvm_enabled()) {
kvm_s390_interrupt(env, KVM_S390_PROGRAM_INT, code);
} else {
env->int_pgm_code = code;
env->int_pgm_ilc = ilc;
env->exception_index = EXCP_PGM;
cpu_loop_exit();
}
}

This breaks compilation when KVM_S390_PROGRAM_INT is not defined. I'm very open 
to suggestions on how to improve this though :).


Alex




Re: [Qemu-devel] [PATCH] Reset system before loadvm

2011-06-14 Thread Avi Kivity

On 06/14/2011 01:56 PM, Jan Kiszka wrote:

>
>  I believe it is not.  But regardless, we shouldn't add more incorrect
>  behaviour.

It depends on how the reset event is defined in QMP. As I see it, there
is nothing stated about reset reasons or sources. So emitting
information about the actually happening reset can't be incorrect. Just
like emitting the information about the VM stop/start around loadvm.


I don't think so.  Theoretically we could stop the vm, save a bit of 
state, reset it, and load the state back.  Did a reset occur?  Not from 
the user's point of view.


If a reset event is interesting (personally I don't think it is, so 
much, perhaps just for logging purposes), we should restrict it to user 
visible events (so it means either the user pressed the reset button or 
the guest reset itself).


--
error compiling committee.c: too many arguments to function




Re: [Qemu-devel] [PATCH 07/12] kvm: Drop KVM_CAP build dependencies

2011-06-14 Thread Jan Kiszka
On 2011-06-14 13:05, Alexander Graf wrote:
> 
> On 08.06.2011, at 16:11, Jan Kiszka wrote:
> 
>> No longer needed with accompanied kernel headers. We are only left with
>> build dependencies that are controlled by kvm arch headers.
> 
> This should completely rule out all CAPs right? IIRC, all CAPs are defined in 
> generic code, so we don't get number overlaps.

"We are only left with build dependencies that are controlled by kvm
arch headers." E.g. KVM_CAP_VCPU_EVENTS.

> 
>>
>> CC: Alexander Graf 
>> Signed-off-by: Jan Kiszka 
>> ---
>> kvm-all.c |8 
>> 1 files changed, 0 insertions(+), 8 deletions(-)
>>
>> diff --git a/kvm-all.c b/kvm-all.c
>> index 4a9910a..cbc2532 100644
>> --- a/kvm-all.c
>> +++ b/kvm-all.c
>> @@ -757,21 +757,17 @@ int kvm_init(void)
>> s->coalesced_mmio = kvm_check_extension(s, KVM_CAP_COALESCED_MMIO);
>>
>> s->broken_set_mem_region = 1;
>> -#ifdef KVM_CAP_JOIN_MEMORY_REGIONS_WORKS
>> ret = kvm_check_extension(s, KVM_CAP_JOIN_MEMORY_REGIONS_WORKS);
>> if (ret > 0) {
>> s->broken_set_mem_region = 0;
>> }
>> -#endif
>>
>> #ifdef KVM_CAP_VCPU_EVENTS
> 
> ... which leaves the question why this one is still here :).

See above (does PPC support it?).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH 07/12] kvm: Drop KVM_CAP build dependencies

2011-06-14 Thread Alexander Graf

On 14.06.2011, at 13:07, Jan Kiszka wrote:

> On 2011-06-14 13:05, Alexander Graf wrote:
>> 
>> On 08.06.2011, at 16:11, Jan Kiszka wrote:
>> 
>>> No longer needed with accompanied kernel headers. We are only left with
>>> build dependencies that are controlled by kvm arch headers.
>> 
>> This should completely rule out all CAPs right? IIRC, all CAPs are defined 
>> in generic code, so we don't get number overlaps.
> 
> "We are only left with build dependencies that are controlled by kvm
> arch headers." E.g. KVM_CAP_VCPU_EVENTS.
> 
>> 
>>> 
>>> CC: Alexander Graf 
>>> Signed-off-by: Jan Kiszka 
>>> ---
>>> kvm-all.c |8 
>>> 1 files changed, 0 insertions(+), 8 deletions(-)
>>> 
>>> diff --git a/kvm-all.c b/kvm-all.c
>>> index 4a9910a..cbc2532 100644
>>> --- a/kvm-all.c
>>> +++ b/kvm-all.c
>>> @@ -757,21 +757,17 @@ int kvm_init(void)
>>>s->coalesced_mmio = kvm_check_extension(s, KVM_CAP_COALESCED_MMIO);
>>> 
>>>s->broken_set_mem_region = 1;
>>> -#ifdef KVM_CAP_JOIN_MEMORY_REGIONS_WORKS
>>>ret = kvm_check_extension(s, KVM_CAP_JOIN_MEMORY_REGIONS_WORKS);
>>>if (ret > 0) {
>>>s->broken_set_mem_region = 0;
>>>}
>>> -#endif
>>> 
>>> #ifdef KVM_CAP_VCPU_EVENTS
>> 
>> ... which leaves the question why this one is still here :).
> 
> See above (does PPC support it?).

Well, regardless of support, the #ifdef shouldn't be required, right?

include/linux/kvm.h:518:#define KVM_CAP_VCPU_EVENTS 41

... as long as there's a runtime check :)


Alex




[Qemu-devel] [PATCH] hw/usb-musb.c: Don't misuse usb_packet_complete()

2011-06-14 Thread Peter Maydell
In musb_packet() handle final processing of non-asynchronous
USB packets by directly calling musb_schedule_cb() rather than
going through usb_packet_complete(). The latter will trigger
an assertion because the packet doesn't belong to a device.

Signed-off-by: Peter Maydell 
---
The fix here is as suggested by Gerd.

 hw/usb-musb.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/usb-musb.c b/hw/usb-musb.c
index 6037193..3eed855 100644
--- a/hw/usb-musb.c
+++ b/hw/usb-musb.c
@@ -611,7 +611,7 @@ static void musb_packet(MUSBState *s, MUSBEndPoint *ep,
 }
 
 ep->status[dir] = ret;
-usb_packet_complete(s->port.dev, &ep->packey[dir].p);
+musb_schedule_cb(s->port.dev, &ep->packey[dir].p);
 }
 
 static void musb_tx_packet_complete(USBPacket *packey, void *opaque)
-- 
1.7.1




Re: [Qemu-devel] [PATCH 09/12] kvm: ppc: Drop KVM_CAP build dependencies

2011-06-14 Thread Alexander Graf

On 08.06.2011, at 16:11, Jan Kiszka wrote:

> No longer needed with accompanied kernel headers.

What a nice code cleanup.

Reviewed-by: Alexander Graf 




Re: [Qemu-devel] [PATCH 05/12] kvm: ppc: Drop CONFIG_KVM_PPC_PVR

2011-06-14 Thread Alexander Graf

On 08.06.2011, at 16:10, Jan Kiszka wrote:

> Required header support is now unconditionally available.

Looks good to me :)

Reviewed-by: Alexander Graf 


Alex




Re: [Qemu-devel] [PATCH 03/12] Switch build system to accompanied kernel headers

2011-06-14 Thread Alexander Graf

On 08.06.2011, at 16:10, Jan Kiszka wrote:

> This helps reducing our build-time checks for feature support in the
> available Linux kernel headers. And it helps users that do not have
> sufficiently recent headers installed on their build machine.
> 
> Consequently, the patch removes and build-time checks for kvm and vhost
> in configure, the --kerneldir switch, and KVM_CFLAGS. Kernel headers are
> supposed to be provided by QEMU only.
> 
> s390 needs some extra love as it carries redefinitions from kernel
> headers.

Yes. I was wondering if we should unconditionally include the kernel headers 
there. The problem I'm seeing there is that I don't know if that would work 
fine on non-Linux hosts, as that code definitely gets compiled there, while KVM 
code is not.
> 
> 

[...]

> diff --git a/target-s390x/cpu.h b/target-s390x/cpu.h
> index 4e5c391..b5e587f 100644
> --- a/target-s390x/cpu.h
> +++ b/target-s390x/cpu.h
> @@ -313,16 +313,6 @@ CPUState *s390_cpu_addr2state(uint16_t cpu_addr);
> /* from s390-virtio-bus */
> extern const target_phys_addr_t virtio_size;
> 
> -#ifndef KVM_S390_SIGP_STOP
> -#define KVM_S390_SIGP_STOP  0
> -#define KVM_S390_PROGRAM_INT0
> -#define KVM_S390_SIGP_SET_PREFIX0
> -#define KVM_S390_RESTART0
> -#define KVM_S390_INT_VIRTIO 0
> -#define KVM_S390_INT_SERVICE0
> -#define KVM_S390_INT_EMERGENCY  0
> -#endif
> -
> #endif
> void cpu_lock(void);
> void cpu_unlock(void);
> diff --git a/target-s390x/op_helper.c b/target-s390x/op_helper.c
> index db03a79..9429698 100644
> --- a/target-s390x/op_helper.c
> +++ b/target-s390x/op_helper.c
> @@ -23,6 +23,7 @@
> #include "helpers.h"
> #include 
> #include "kvm.h"
> +#include 

Have you tried to compile this on non-Linux?


Alex



Re: [Qemu-devel] [PATCH 07/12] kvm: Drop KVM_CAP build dependencies

2011-06-14 Thread Jan Kiszka
On 2011-06-14 13:17, Alexander Graf wrote:
> 
> On 14.06.2011, at 13:07, Jan Kiszka wrote:
> 
>> On 2011-06-14 13:05, Alexander Graf wrote:
>>>
>>> On 08.06.2011, at 16:11, Jan Kiszka wrote:
>>>
 No longer needed with accompanied kernel headers. We are only left with
 build dependencies that are controlled by kvm arch headers.
>>>
>>> This should completely rule out all CAPs right? IIRC, all CAPs are defined 
>>> in generic code, so we don't get number overlaps.
>>
>> "We are only left with build dependencies that are controlled by kvm
>> arch headers." E.g. KVM_CAP_VCPU_EVENTS.
>>
>>>

 CC: Alexander Graf 
 Signed-off-by: Jan Kiszka 
 ---
 kvm-all.c |8 
 1 files changed, 0 insertions(+), 8 deletions(-)

 diff --git a/kvm-all.c b/kvm-all.c
 index 4a9910a..cbc2532 100644
 --- a/kvm-all.c
 +++ b/kvm-all.c
 @@ -757,21 +757,17 @@ int kvm_init(void)
s->coalesced_mmio = kvm_check_extension(s, KVM_CAP_COALESCED_MMIO);

s->broken_set_mem_region = 1;
 -#ifdef KVM_CAP_JOIN_MEMORY_REGIONS_WORKS
ret = kvm_check_extension(s, KVM_CAP_JOIN_MEMORY_REGIONS_WORKS);
if (ret > 0) {
s->broken_set_mem_region = 0;
}
 -#endif

 #ifdef KVM_CAP_VCPU_EVENTS
>>>
>>> ... which leaves the question why this one is still here :).
>>
>> See above (does PPC support it?).
> 
> Well, regardless of support, the #ifdef shouldn't be required, right?
> 
> include/linux/kvm.h:518:#define KVM_CAP_VCPU_EVENTS 41
> 
> ... as long as there's a runtime check :)

Are all types and dependent constants available? Maybe it is the case
here, but you cannot generally assume this if a CAP is per-arch.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] [PATCH 32/34] hw/usb-ohci.c: Ignore writes to HcPeriodCurrentED register

2011-06-14 Thread Gerd Hoffmann
From: Peter Maydell 

HcPeriodCurrentED is read-only, but Linux writes to it anyway; silently
ignore this rather than printing a warning message.

(Specifically, drivers/usb/host/ohci-hub.c:ohci_rh_resume() writes a
0, in at least kernels 2.6.25 through 2.6.39.)

Signed-off-by: Peter Maydell 
Signed-off-by: Gerd Hoffmann 
---
 hw/usb-ohci.c |4 
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index 401045a..ab77434 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -1575,6 +1575,10 @@ static void ohci_mem_write(void *ptr, target_phys_addr_t 
addr, uint32_t val)
 ohci->hcca = val & OHCI_HCCA_MASK;
 break;
 
+case 7: /* HcPeriodCurrentED */
+/* Ignore writes to this read-only register, Linux does them */
+break;
+
 case 8: /* HcControlHeadED */
 ohci->ctrl_head = val & OHCI_EDPTR_MASK;
 break;
-- 
1.7.1




[Qemu-devel] [PATCH 34/34] usb-uhci: fix expire time initialization.

2011-06-14 Thread Gerd Hoffmann
expire_time must be initialited when the guest activates the
usb scheduler, not at device creation time.

Signed-off-by: Gerd Hoffmann 
---
 hw/usb-uhci.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/usb-uhci.c b/hw/usb-uhci.c
index 1503373..75cd231 100644
--- a/hw/usb-uhci.c
+++ b/hw/usb-uhci.c
@@ -424,6 +424,8 @@ static void uhci_ioport_writew(void *opaque, uint32_t addr, 
uint32_t val)
 case 0x00:
 if ((val & UHCI_CMD_RS) && !(s->cmd & UHCI_CMD_RS)) {
 /* start frame processing */
+s->expire_time = qemu_get_clock_ns(vm_clock) +
+(get_ticks_per_sec() / FRAME_TIMER_FREQ);
 qemu_mod_timer(s->frame_timer, qemu_get_clock_ns(vm_clock));
 s->status &= ~UHCI_STS_HCHALTED;
 } else if (!(val & UHCI_CMD_RS)) {
@@ -1131,8 +1133,6 @@ static int usb_uhci_common_initfn(UHCIState *s)
 usb_port_location(&s->ports[i].port, NULL, i+1);
 }
 s->frame_timer = qemu_new_timer_ns(vm_clock, uhci_frame_timer, s);
-s->expire_time = qemu_get_clock_ns(vm_clock) +
-(get_ticks_per_sec() / FRAME_TIMER_FREQ);
 s->num_ports_vmstate = NB_PORTS;
 QTAILQ_INIT(&s->async_pending);
 
-- 
1.7.1




Re: [Qemu-devel] [PATCH] xen: avoid tracking the region 0xa0000 - 0xbffff

2011-06-14 Thread Jan Kiszka
On 2011-06-14 12:54, Alexander Graf wrote:
> 
> On 03.06.2011, at 17:56,  
>  wrote:
> 
>> From: Stefano Stabellini 
>>
>> Xen can only do dirty bit tracking for one memory region, so we should
>> explicitly avoid trying to track the legacy VGA region between 0xa
>> and 0xb, rather than trying and failing.
>>
>> Signed-off-by: Stefano Stabellini 
>> ---
>> xen-all.c |4 
>> 1 files changed, 4 insertions(+), 0 deletions(-)
>>
>> diff --git a/xen-all.c b/xen-all.c
>> index 9a5c3ec..1fdc2e8 100644
>> --- a/xen-all.c
>> +++ b/xen-all.c
>> @@ -218,6 +218,10 @@ static int xen_add_to_physmap(XenIOState *state,
>> if (get_physmapping(state, start_addr, size)) {
>> return 0;
>> }
>> +/* do not try to map legacy VGA memory */
>> +if (start_addr >= 0xa && start_addr + size <= 0xb) {
> 
> I don't quite like the hardcoded range here. What exactly is the issue? The 
> fact that you can only map a single region? Then do a counter and fail when 
> it's > 1. If you don't want to map the VGA region as memory slot, why not 
> change the actual mapping code in the cirrus adapter?

Err, please no "if (xen_enabled())" in that code. We just got rid of the
kvm_enabled() mess. And it doesn't scale, it would be required in e1000
as well e.g.

BTW, if Xen is not able to track more than one dirty region, I think
it's time to fix that limitation. At some point it may no longer be
possible to work around it (who knows how the new memory API will look
like in this regard).

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



[Qemu-devel] [PATCH 33/34] hw/usb-ohci.c: Implement remote wakeup

2011-06-14 Thread Gerd Hoffmann
From: Peter Maydell 

Implement the wakeup callback in the OHCI USBPortOps, so that when
a downstream device wakes up it correctly causes the OHCI controller
to come out of suspend.

Signed-off-by: Peter Maydell 
Signed-off-by: Gerd Hoffmann 
---
 hw/usb-ohci.c |   17 +
 1 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/hw/usb-ohci.c b/hw/usb-ohci.c
index ab77434..832dcd6 100644
--- a/hw/usb-ohci.c
+++ b/hw/usb-ohci.c
@@ -367,6 +367,22 @@ static void ohci_detach(USBPort *port1)
 ohci_set_interrupt(s, OHCI_INTR_RHSC);
 }
 
+static void ohci_wakeup(USBDevice *dev)
+{
+USBBus *bus = usb_bus_from_device(dev);
+OHCIState *s = container_of(bus, OHCIState, bus);
+int portnum = dev->port->index;
+OHCIPort *port = &s->rhport[portnum];
+if (port->ctrl & OHCI_PORT_PSS) {
+DPRINTF("usb-ohci: port %d: wakeup\n", portnum);
+port->ctrl |= OHCI_PORT_PSSC;
+port->ctrl &= ~OHCI_PORT_PSS;
+if ((s->ctl & OHCI_CTL_HCFS) == OHCI_USB_SUSPEND) {
+ohci_set_interrupt(s, OHCI_INTR_RD);
+}
+}
+}
+
 /* Reset the controller */
 static void ohci_reset(void *opaque)
 {
@@ -1675,6 +1691,7 @@ static CPUWriteMemoryFunc * const ohci_writefn[3]={
 static USBPortOps ohci_port_ops = {
 .attach = ohci_attach,
 .detach = ohci_detach,
+.wakeup = ohci_wakeup,
 .complete = ohci_async_complete_packet,
 };
 
-- 
1.7.1




[Qemu-devel] [PULL] usb patch queue

2011-06-14 Thread Gerd Hoffmann
  Hi,

The USB patch queue has been rebased, got a minor fix (wrong comment in
patch #8, spotted by David Ahern) and three new patches.  I'm just
posting the three new patches to avoid spamming the list with 30
identical patches ...

please pull,
  Gerd

The following changes since commit 0b862cedf36d927818c50584ddd611b0370673df:

  configure: Detect and don't try to use older libcurl (2011-06-13 21:16:27 
+0200)

are available in the git repository at:
  git://git.kraxel.org/qemu usb.16

Brad Hards (3):
  usb: Add defines for USB Serial Bus Release Number register
  usb: Use defines for serial bus release number register for UHCI
  usb: Use defines for serial bus release number register for EHCI

Gerd Hoffmann (18):
  usb-linux: catch ENODEV in more places.
  usb-ehci: trace mmio and usbsts
  usb-ehci: trace state machine changes
  usb-ehci: trace port state
  usb-ehci: improve mmio tracing
  usb-ehci: trace buffer copy
  usb-ehci: add queue data struct
  usb-ehci: multiqueue support
  usb-ehci: fix offset writeback in ehci_buffer_rw
  usb-ehci: fix error handling.
  usb: cancel async packets on unplug
  usb-ehci: drop EXECUTING checks.
  usb-ehci: itd handling fixes.
  usb-ehci: split trace calls to handle arg count limits
  usb: documentation update
  usb-linux: only cleanup in host_close when host_open was successful.
  usb: don't call usb_host_device_open from vl.c
  usb-uhci: fix expire time initialization.

Hans de Goede (9):
  ehci: fix a number of unused-but-set-variable warnings (new with gcc-4.6)
  usb-linux: Get speed from sysfs rather then from the connectinfo ioctl
  usb-linux: Teach about super speed
  usb-linux: Don't do perror when errno is not set
  usb-linux: Ensure devep != 0
  usb-linux: Don't try to open the same device twice
  usb-linux: Enlarge buffer for descriptors to 8192 bytes
  usb-bus: Add knowledge of USB_SPEED_SUPER to usb_speed helper
  usb-bus: Don't detach non attached devices on device exit

Kevin O'Connor (2):
  Fix USB mouse Set_Protocol behavior
  The USB tablet should not claim boot protocol support.

Peter Maydell (2):
  hw/usb-ohci.c: Ignore writes to HcPeriodCurrentED register
  hw/usb-ohci.c: Implement remote wakeup

 docs/usb2.txt  |   85 
 hw/milkymist-softusb.c |   10 +-
 hw/usb-bus.c   |   10 +-
 hw/usb-ehci.c  | 1198 
 hw/usb-hid.c   |5 +-
 hw/usb-musb.c  |   23 +-
 hw/usb-ohci.c  |   37 ++-
 hw/usb-uhci.c  |   32 ++-
 hw/usb.h   |   14 +-
 trace-events   |   20 +
 usb-linux.c|   96 +++--
 vl.c   |6 +-
 12 files changed, 990 insertions(+), 546 deletions(-)



Re: [Qemu-devel] [PATCH 07/12] kvm: Drop KVM_CAP build dependencies

2011-06-14 Thread Alexander Graf

On 08.06.2011, at 16:11, Jan Kiszka wrote:

> No longer needed with accompanied kernel headers. We are only left with
> build dependencies that are controlled by kvm arch headers.

This should completely rule out all CAPs right? IIRC, all CAPs are defined in 
generic code, so we don't get number overlaps.

> 
> CC: Alexander Graf 
> Signed-off-by: Jan Kiszka 
> ---
> kvm-all.c |8 
> 1 files changed, 0 insertions(+), 8 deletions(-)
> 
> diff --git a/kvm-all.c b/kvm-all.c
> index 4a9910a..cbc2532 100644
> --- a/kvm-all.c
> +++ b/kvm-all.c
> @@ -757,21 +757,17 @@ int kvm_init(void)
> s->coalesced_mmio = kvm_check_extension(s, KVM_CAP_COALESCED_MMIO);
> 
> s->broken_set_mem_region = 1;
> -#ifdef KVM_CAP_JOIN_MEMORY_REGIONS_WORKS
> ret = kvm_check_extension(s, KVM_CAP_JOIN_MEMORY_REGIONS_WORKS);
> if (ret > 0) {
> s->broken_set_mem_region = 0;
> }
> -#endif
> 
> #ifdef KVM_CAP_VCPU_EVENTS

... which leaves the question why this one is still here :).


Alex




Re: [Qemu-devel] [PATCH] Reset system before loadvm

2011-06-14 Thread Jan Kiszka
On 2011-06-14 12:50, Avi Kivity wrote:
> On 06/14/2011 09:19 AM, Jan Kiszka wrote:
>> On 2011-06-12 19:13, Avi Kivity wrote:
>> >  On 06/11/2011 12:05 PM, Jan Kiszka wrote:
>> >>  From: Jan Kiszka
>> >>
>> >>  In case we load the vmstate during incoming migration, we start
>> from a
>> >>  clean, default machine state as we went through system reset
>> before. But
>> >>  if we load from a snapshot, the machine can be in any state. That can
>> >>  cause troubles if loading an older image which does not carry all
>> state
>> >>  information the executing QEMU requires. Almost no device takes
>> care of
>> >>  this scenario.
>> >>
>> >>  However, fixing this is trivial. We just need to issue a system reset
>> >>  during loadvm as well.
>>
> 
>> >>  +qemu_system_reset();
>> >>ret = qemu_loadvm_state(f);
>> >>
>> >>qemu_fclose(f);
>> >
>> >  Should we suppress the reset event sent out on the monitor?  After
>> all,
>> >  it's the result of an internal implementation choice, not something
>> the
>> >  user or the guest did.
>>
>> We already issue this pattern during -loadvm or -incoming - or is the
>> monitor not yet connected at this point?
> 
> I believe it is not.  But regardless, we shouldn't add more incorrect
> behaviour.

It depends on how the reset event is defined in QMP. As I see it, there
is nothing stated about reset reasons or sources. So emitting
information about the actually happening reset can't be incorrect. Just
like emitting the information about the VM stop/start around loadvm.

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



Re: [Qemu-devel] [PATCH] Reset system before loadvm

2011-06-14 Thread Avi Kivity

On 06/14/2011 09:19 AM, Jan Kiszka wrote:

On 2011-06-12 19:13, Avi Kivity wrote:
>  On 06/11/2011 12:05 PM, Jan Kiszka wrote:
>>  From: Jan Kiszka
>>
>>  In case we load the vmstate during incoming migration, we start from a
>>  clean, default machine state as we went through system reset before. But
>>  if we load from a snapshot, the machine can be in any state. That can
>>  cause troubles if loading an older image which does not carry all state
>>  information the executing QEMU requires. Almost no device takes care of
>>  this scenario.
>>
>>  However, fixing this is trivial. We just need to issue a system reset
>>  during loadvm as well.




>>  +qemu_system_reset();
>>ret = qemu_loadvm_state(f);
>>
>>qemu_fclose(f);
>
>  Should we suppress the reset event sent out on the monitor?  After all,
>  it's the result of an internal implementation choice, not something the
>  user or the guest did.

We already issue this pattern during -loadvm or -incoming - or is the
monitor not yet connected at this point?


I believe it is not.  But regardless, we shouldn't add more incorrect 
behaviour.


--
error compiling committee.c: too many arguments to function





Re: [Qemu-devel] How to increase ram beyond 756 MB ?

2011-06-14 Thread Peter Maydell
2011/6/14 मनीष शर्मा :
> I am using arm on x86, running Linux 2.6.33 as guest and fedora 14 as host.
> And using Realview board inside Qemu Machine.

Do you mean "realview-eb", "realview-eb-mpcore", "realview-pb-a9",
or "realview-pbx-a9" ? They are different models.

> Even though I passed 1024 MB as qemu memory in command line,
> I can only see ~756MB given to guest. Also I tried passing more than
> 1024 MB say 2048 MB But I am unable to see increase in RAM in Guest OS.
> I checked the /proc/meminfo for guest OS memory details.

For both "realview-eb" and "realview-eb-mpcore" the maximum memory
allowed is 256MB, because the hardware we are modelled has an
address space layout which does not permit more. If you're using
those models, then to use more RAM you must use a different model.

"realview-pb-a9" and "realview-pbx-a9" should both in theory
support 1024MB of RAM as a maximum. Typically to make this work
you need to both pass QEMU the "-m 1024" option (to model the
extra RAM) and also pass the guest kernel an option like "mem=1024M".
(However the pb-a9 hardware has 512MB of RAM as a maximum and
we are providing extra modelled RAM in a "reserved" section of
the address space, so it's possible the guest kernel wouldn't
try to use it.)

-- PMM



[Qemu-devel] How to increase ram beyond 756 MB ?

2011-06-14 Thread मनीष शर्मा
Hi All,

I am using arm on x86, running Linux 2.6.33 as guest and fedora 14 as host.
And using Realview board inside Qemu Machine.

Even though I passed 1024 MB as qemu memory in command line,
I can only see ~756MB given to guest. Also I tried passing more than
1024 MB say 2048 MB But I am unable to see increase in RAM in Guest OS.
I checked the /proc/meminfo for guest OS memory details.

What do I missed ? Any help is higly appreciated.

regards,
Manish

-- 
-मनीष


Re: [Qemu-devel] [PATCH] xen: fix interrupt routing

2011-06-14 Thread Alexander Graf

On 31.05.2011, at 13:05, Stefano Stabellini wrote:

> On Sat, 28 May 2011, Alexander Graf wrote:
>> 
>> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
>> 
>>> xen: fix interrupt routing
>>> 
>>> - remove i440FX-xen and i440fx_write_config_xen
>>> we don't need to intercept pci config writes to i440FX anymore;
>> 
>> Why not? In which version? Did anything below change? What about compat 
>> code? Older hypervisor versions?
> 
> Nothing changed, I think it was a genuine mistake in the original patch
> series: the intention was to catch the pci config writes to 0x60-0x63 to
> reprogram the xen pci link routes accordingly (see
> xen_piix_pci_write_config_client).  If I am not mistaken a similar thing
> is done by non-Xen Qemu in piix3_update_irq_levels.
> 
> 
>>> - introduce PIIX3-xen and piix3_write_config_xen
>>> we do need to intercept pci config write to the PCI-ISA bridge to update
>>> the PCI link routing;
>>> 
>>> - set the number of PIIX3-xen interrupts lines to 128;
>>> 
>>> Signed-off-by: Stefano Stabellini 
>>> 
>>> diff --git a/hw/pc.h b/hw/pc.h
>>> index 0dcbee7..6d5730b 100644
>>> --- a/hw/pc.h
>>> +++ b/hw/pc.h
>>> @@ -176,7 +176,6 @@ struct PCII440FXState;
>>> typedef struct PCII440FXState PCII440FXState;
>>> 
>>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, 
>>> qemu_irq *pic, ram_addr_t ram_size);
>>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, 
>>> qemu_irq *pic, ram_addr_t ram_size);
>>> void i440fx_init_memory_mappings(PCII440FXState *d);
>>> 
>>> /* piix4.c */
>>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
>>> index 9a22a8a..ba198de 100644
>>> --- a/hw/pc_piix.c
>>> +++ b/hw/pc_piix.c
>>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
>>>isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
>>> 
>>>if (pci_enabled) {
>>> -if (!xen_enabled()) {
>>> -pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
>>> ram_size);
>>> -} else {
>>> -pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, 
>>> isa_irq, ram_size);
>>> -}
>>> +pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, 
>>> ram_size);
>>>} else {
>>>pci_bus = NULL;
>>>i440fx_state = NULL;
>>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
>>> index 7f1c4cc..3d73a42 100644
>>> --- a/hw/piix_pci.c
>>> +++ b/hw/piix_pci.c
>>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>>> 
>>> #define PIIX_NUM_PIC_IRQS   16  /* i8259 * 2 */
>>> #define PIIX_NUM_PIRQS  4ULL/* PIRQ[A-D] */
>>> +#define XEN_PIIX_NUM_PIRQS  128ULL
>>> #define PIIX_PIRQC  0x60
>>> 
>>> typedef struct PIIX3State {
>>> @@ -78,6 +79,8 @@ struct PCII440FXState {
>>> #define I440FX_SMRAM0x72
>>> 
>>> static void piix3_set_irq(void *opaque, int pirq, int level);
>>> +static void piix3_write_config_xen(PCIDevice *dev,
>>> +   uint32_t address, uint32_t val, int len);
>>> 
>>> /* return the global irq number corresponding to a given device irq
>>>   pin. We could also use the bus number to have a more precise
>>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
>>>}
>>> }
>>> 
>>> -static void i440fx_write_config_xen(PCIDevice *dev,
>>> -uint32_t address, uint32_t val, int 
>>> len)
>>> -{
>>> -xen_piix_pci_write_config_client(address, val, len);
>>> -i440fx_write_config(dev, address, val, len);
>>> -}
>>> -
>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>> {
>>>PCII440FXState *d = opaque;
>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char 
>>> *device_name,
>>>d = pci_create_simple(b, 0, device_name);
>>>*pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>> 
>>> -piix3 = DO_UPCAST(PIIX3State, dev,
>>> -  pci_create_simple_multifunction(b, -1, true, 
>>> "PIIX3"));
>>> +if (xen_enabled()) {
>>> +piix3 = DO_UPCAST(PIIX3State, dev,
>>> +pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>> +pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>> +piix3, XEN_PIIX_NUM_PIRQS);
>> 
>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason 
>> behind this change?
> 
> It is still a piix3, but also provides non-legacy interrupt links to the
> IO-APIC.
> The four pins of each PCI device on the bus not only are routed to the
> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
> they are connected to the IO-APIC directly.
> These additional routes can only be discovered through ACPI, so you need
> matching ACPI tables. We used to build the old ACPI tables like this:
> 
> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
> printf("Name(PRTA, Package() {\n");
> for ( dev = 1; dev < 32; dev++ )
>for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>printf("Package

[Qemu-devel] [Bug 785668] Re: bonding inside a bridge does not update ARP correctly when bridged net accessed from within a VM

2011-06-14 Thread Raul Morera
** Also affects: qemu
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/785668

Title:
  bonding inside a bridge does not update ARP correctly when bridged net
  accessed from within a VM

Status in QEMU:
  New
Status in “linux” package in Ubuntu:
  Confirmed
Status in “qemu-kvm” package in Ubuntu:
  Invalid

Bug description:
  Binary package hint: qemu-kvm

  Description: Ubuntu 10.4.2
  Release: 10.04

  When setting a KVM host with a bond0 interface made of eth0 and eth1
  and using this bond0 interface for a bridge to KVM VMs, the ARP tables
  do not get updated correctly so it is not possible for a VM to reach
  an IP on the bridged network up until that remote system has pinged
  the VM itself.

  Reproducible: 100%, with any of the load balancing mode

  How to reproduce the problem

  - Prerequisites:
  1 One KVM system with 10.04.02 server,  configured as a virtual host is 
needed. The following /etc/network/interfaces was used for the test :

  # grep  -v ^# /etc/network/interfaces
  auto lo
  iface lo inet loopback

  auto bond0
  iface bond0 inet manual
   post-up ifconfig $IFACE up
   pre-down ifconfig $IFACE down
   bond-slaves none
   bond_mode balance-rr
   bond-downdelay 250
   bond-updelay 120
  auto eth0
  allow-bond0 eth0
  iface eth0 inet manual
   bond-master bond0
  auto eth1
  allow-bond0 eth1
  iface eth1 inet manual
   bond-master bond0

  auto br0
  iface br0 inet dhcp
   # dns-* options are implemented by the resolvconf package, if installed
   bridge-ports bond0
   bridge-stp off
   bridge-fd 9
   bridge-hello 2
   bridge-maxage 12
   bridge_max_wait 0

  One VM running Maverick 10.10 server, standard installation, using the
  following /etc/network/interfaces file :

  grep -v ^# /etc/network/interfaces

  auto lo
  iface lo inet loopback

  auto eth0
  iface eth0 inet static
  address 10.153.107.92
  netmask 255.255.255.0
  network 10.153.107.0
  broadcast 10.153.107.255

  --
  On a remote bridged network system :
  $ arp -an
  ? (10.153.107.188) à 00:1c:c4:6a:c0:dc [ether] sur tap0
  ? (16.1.1.1) à 00:17:33:e9:ee:3c [ether] sur wlan0
  ? (10.153.107.52) à 00:1c:c4:6a:c0:de [ether] sur tap0

  On KVMhost
  $ arp -an
  ? (10.153.107.80) at ee:99:73:68:f0:a5 [ether] on br0

  On VM
  $ arp -an
  ? (10.153.107.61) at  on eth0

  1) Test #1 : ping from VM (10.153.107.92) to remote bridged network
  system (10.153.107.80) :

  - On remote bridged network system :
  caribou@marvin:~$ arp -an
  ? (10.153.107.188) à 00:1c:c4:6a:c0:dc [ether] sur tap0
  ? (16.1.1.1) à 00:17:33:e9:ee:3c [ether] sur wlan0
  ? (10.153.107.52) à 00:1c:c4:6a:c0:de [ether] sur tap0

  - On KVMhost
  ubuntu@VMhost:~$ arp -an
  ? (10.153.107.80) at ee:99:73:68:f0:a5 [ether] on br0

  - On VM
  ubuntu@vm1:~$ ping 10.153.107.80
  PING 10.153.107.80 (10.153.107.80) 56(84) bytes of data.
  From 10.153.107.92 icmp_seq=1 Destination Host Unreachable
  From 10.153.107.92 icmp_seq=2 Destination Host Unreachable
  From 10.153.107.92 icmp_seq=3 Destination Host Unreachable
  ^C
  --- 10.153.107.80 ping statistics ---
  4 packets transmitted, 0 received, +3 errors, 100% packet loss, time 3010ms
  pipe 3
  ubuntu@vm1:~$ arp -an
  ? (10.153.107.61) at  on eth0
  ? (10.153.107.80) at  on eth0

  2) Test #2 : ping from remote bridged network system (10.153.107.80)
  to VM (10.153.107.92) :

  - On remote bridged network system :
  $ ping 10.153.107.92
  PING 10.153.107.92 (10.153.107.92) 56(84) bytes of data.
  64 bytes from 10.153.107.92: icmp_req=1 ttl=64 time=327 ms
  64 bytes from 10.153.107.92: icmp_req=2 ttl=64 time=158 ms
  64 bytes from 10.153.107.92: icmp_req=3 ttl=64 time=157 ms
  ^C
  --- 10.153.107.92 ping statistics ---
  3 packets transmitted, 3 received, 0% packet loss, time 2003ms
  rtt min/avg/max/mdev = 157.289/214.500/327.396/79.831 ms
  caribou@marvin:~$ arp -an
  ? (10.153.107.188) à 00:1c:c4:6a:c0:dc [ether] sur tap0
  ? (16.1.1.1) à 00:17:33:e9:ee:3c [ether] sur wlan0
  ? (10.153.107.52) à 00:1c:c4:6a:c0:de [ether] sur tap0
  ? (10.153.107.92) à 52:54:00:8c:e0:3c [ether] sur tap0

  - On KVMhost
  $ arp -an
  ? (10.153.107.80) at ee:99:73:68:f0:a5 [ether] on br0

  - On VM
  arp -an
  ? (10.153.107.61) at  on eth0
  ? (10.153.107.80) at ee:99:73:68:f0:a5 [ether] on eth0

  3) Test #3 : New ping from VM (10.153.107.92) to remote bridged network 
system (10.153.107.80) :
  - On remote bridged network system :
  $ arp -an
  ? (10.153.107.188) à 00:1c:c4:6a:c0:dc [ether] sur tap0
  ? (16.1.1.1) à 00:17:33:e9:ee:3c [ether] sur wlan0
  ? (10.153.107.52) à 00:1c:c4:6a:c0:de [ether] sur tap0
  ? (10.153.107.92) à 52:54:00:8c:e0:3c [ether] sur tap0

  - On KVMhost
  ubuntu@VMhost:~$ arp -an
  ? (10.153.107.80) at ee:99:73:68:f0:a5 [ether] on br0

  - On VM
  ubuntu@vm1:~$ ping 10.153.107.80
  PING 10.153.107.80 (10.153.107.80) 56

Re: [Qemu-devel] QEMU API for modelling of user's hardware

2011-06-14 Thread Markus Armbruster
wzab  writes:

> I've found the following source of information regarding writing of device
> models for QEMU:
> http://www.linux-kvm.org/wiki/images/f/fe/2010-forum-armbru-qdev.pdf
> 
>  
>
> Is there any better or more detailed description (except of sources
> themselves ;-) )?

I'm not aware of any.  Yes, this sucks.



Re: [Qemu-devel] [PATCH 2/2] qemu-io: Fix if scoping bug

2011-06-14 Thread Markus Armbruster
Devin Nakamura  writes:

> Fix a bug caused by lack of braces in if statement

You describe the bug's cause.  That's good.  Please also describe the
bug's effect, i.e. what exactly is broken for users.

>
> Signed-off-by: Devin Nakamura 
> ---
>  qemu-io.c |4 ++--
>  1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/qemu-io.c b/qemu-io.c
> index 53adb76..1c4f684 100644
> --- a/qemu-io.c
> +++ b/qemu-io.c
> @@ -433,12 +433,12 @@ static int read_f(int argc, char **argv)
>  return 0;
>  }
>
> -if (!pflag)
> +if (!pflag){

Put a space between ) and {

>  if (offset & 0x1ff) {
>  printf("offset %" PRId64 " is not sector aligned\n",
> offset);
>  return 0;
> -
> +}
>  if (count & 0x1ff) {
>  printf("count %d is not sector aligned\n",
> count);



Re: [Qemu-devel] virtio scsi host draft specification, v3

2011-06-14 Thread Hannes Reinecke

On 06/10/2011 04:35 PM, Paolo Bonzini wrote:

If requests are placed on arbitrary queues you'll inevitably run on
locking issues to ensure strict request ordering.
I would add here:

If a device uses more than one queue it is the responsibility of the
device to ensure strict request ordering.


Applied with s/device/guest/g.


Please do not rely in bus/target/lun here. These are leftovers from
parallel SCSI and do not have any meaning on modern SCSI
implementation (eg FC or SAS). Rephrase that to

The lun field is the Logical Unit Number as defined in SAM.


Ok.


  The status byte is written by the device to be the SCSI status
  code.


?? I doubt that exists. Make that:

The status byte is written by the device to be the status code as
defined in SAM.


Ok.


  The response byte is written by the device to be one of the
  following:

  - VIRTIO_SCSI_S_OK when the request was completed and the
  status byte
is filled with a SCSI status code (not necessarily "GOOD").

  - VIRTIO_SCSI_S_UNDERRUN if the content of the CDB requires
  transferring
more data than is available in the data buffers.

  - VIRTIO_SCSI_S_ABORTED if the request was cancelled due to a
  reset
or another task management function.

  - VIRTIO_SCSI_S_FAILURE for other host or guest error. In
  particular,
if neither dataout nor datain is empty, and the
VIRTIO_SCSI_F_INOUT
feature has not been negotiated, the request will be
immediately
returned with a response equal to VIRTIO_SCSI_S_FAILURE.


And, of course:

VIRTIO_SCSI_S_DISCONNECT if the request could not be processed due
to a communication failure (eg device was removed or could not be
reached).


Ok.


This specification implies a strict one-to-one mapping between host
and target. IE there is no way of specifying more than one target
per host.


Actually no, the intention is to use hierarchical LUNs to support
more than one target per host.


Can't.

Hierarchical LUNs is a target-internal representation.
The initiator (ie guest OS) should _not_ try to assume anything 
about the internal structure and just use the LUN as an opaque number.


Reason being that the LUN addressing is not unique, and there are 
several choices on how to represent a given LUN.

So the consensus here is that different LUN numbers represent
different physical devices, regardless on the (internal) LUN 
representation.
Which in turn means we cannot use the LUN number to convey anything 
else than a device identification relative to a target.


Cf SAM-3 paragraph 4.8:

A logical unit number is a field (see 4.9) containing 64 bits that 
identifies the logical unit within a SCSI target device

when accessed by a SCSI target port.

IE the LUN is dependent on the target, but you cannot make 
assumptions on the target.


Consequently, it's in the hosts' responsibility to figure out the 
targets in the system. After that it invokes the 'scan' function 
from the SCSI midlayer.

You can't start from a LUN and try to figure out the targets ...

If you want to support more than on target per host you need some 
sort of enumeration/callback which allows the host to figure out

the number of available targets.
But in general the targets are referenced by the target port 
identifier as specified in the appropriate standard (eg FC or SAS).

Sadly, we don't have any standard to fall back on for this.

If, however, we decide to expose some details about the backend, we 
could be using the values from the backend directly.

EG we could be forwarding the SCSI target port identifier here
(if backed by real hardware) or creating our own SAS-type
identifier when backed by qemu block. Then we could just query
the backend via a new command on the controlq
(eg 'list target ports') and wouldn't have to worry about any 
protocol specific details here.


Of course, when doing so we would be lose the ability to freely 
remap LUNs. But then remapping LUNs doesn't gain you much imho.

Plus you could always use qemu block backend here if you want
to hide the details.
But we would be finally able to use NPIV for KVM, something
I wanted to do for a _long_ time.

I personally _really_ would like to see the real backing device 
details exposed to the guest.
Otherwise the more advanced stuff like persistent reservations 
becomes _really_ hard if not impossible.


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)



Re: [Qemu-devel] [PATCH][uq/master] kvm: x86: Save/restore FPU OP, IP and DP

2011-06-14 Thread Jan Kiszka
On 2011-06-14 10:23, Avi Kivity wrote:
> On 06/14/2011 09:10 AM, Jan Kiszka wrote:
>> On 2011-06-13 10:45, Avi Kivity wrote:
>> >  On 06/11/2011 12:23 PM, Jan Kiszka wrote:
>> >>  From: Jan Kiszka
>> >>
>> >>  These FPU states are properly maintained by KVM but not yet by
>> TCG. So
>> >>  far we unconditionally set them to 0 in the guest which may cause
>> >>  state corruptions - not only during migration.
>> >>
>> >>
>> >>  -#define CPU_SAVE_VERSION 12
>> >>  +#define CPU_SAVE_VERSION 13
>> >>
>> >
>> >  Incrementing the version number seems excessive - I can't imagine a
>> >  real-life guest will break due to fp pointer corruption
>> >
>> >  However, I don't think we have a mechanism for optional state.  We
>> >  discussed this during the 18th VMState Subsection Symposium and IIRC
>> >  agreed to re-raise the issue when we encountered it, which appears
>> to be
>> >  now.
>> >
>>
>> Whatever we invent, it has to be backported as well to allow that
>> infamous traveling back in time, migrating VMs from newer to older
>> versions.
>>
>> Would that backporting be simpler if we used an unconditional subsection
>> for the additional states?
> 
> Most likely.  It depends on what mechanism we use.
> 
> Let's spend some time to think about what it would be like.  This patch
> is not urgent, is it? (i.e. it was discovered by code inspection, not
> live migration that caught the cpu between an instruction that caused a
> math exception and the exception handler).

Right, not urgent, should just make it into 0.15 in the end.

Jan



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH][uq/master] kvm: x86: Save/restore FPU OP, IP and DP

2011-06-14 Thread Avi Kivity

On 06/14/2011 09:10 AM, Jan Kiszka wrote:

On 2011-06-13 10:45, Avi Kivity wrote:
>  On 06/11/2011 12:23 PM, Jan Kiszka wrote:
>>  From: Jan Kiszka
>>
>>  These FPU states are properly maintained by KVM but not yet by TCG. So
>>  far we unconditionally set them to 0 in the guest which may cause
>>  state corruptions - not only during migration.
>>
>>
>>  -#define CPU_SAVE_VERSION 12
>>  +#define CPU_SAVE_VERSION 13
>>
>
>  Incrementing the version number seems excessive - I can't imagine a
>  real-life guest will break due to fp pointer corruption
>
>  However, I don't think we have a mechanism for optional state.  We
>  discussed this during the 18th VMState Subsection Symposium and IIRC
>  agreed to re-raise the issue when we encountered it, which appears to be
>  now.
>

Whatever we invent, it has to be backported as well to allow that
infamous traveling back in time, migrating VMs from newer to older versions.

Would that backporting be simpler if we used an unconditional subsection
for the additional states?


Most likely.  It depends on what mechanism we use.

Let's spend some time to think about what it would be like.  This patch 
is not urgent, is it? (i.e. it was discovered by code inspection, not 
live migration that caught the cpu between an instruction that caused a 
math exception and the exception handler).


--
error compiling committee.c: too many arguments to function




[Qemu-devel] [RFC PATCH] virtio-9p: Use clone approach to fix TOCTOU vulnerability

2011-06-14 Thread M. Mohan Kumar
[RFC PATCH] virtio-9p: Use clone approach to fix TOCTOU vulnerability

In passthrough security model, following a symbolic link in the server
side could result in TOCTTOU vulnerability.

Use clone system call to create a thread which runs in chrooted
environment. All passthrough model file operations are done from this
thread to avoid TOCTTOU vulnerability.

Signed-off-by: Venkateswararao Jujjuri 
Signed-off-by: M. Mohan Kumar 
---
 fsdev/file-op-9p.h |1 +
 hw/9pfs/virtio-9p-coth.c   |  105 +--
 hw/9pfs/virtio-9p-coth.h   |   13 +-
 hw/9pfs/virtio-9p-device.c |7 +++-
 hw/9pfs/virtio-9p.h|6 ++-
 5 files changed, 124 insertions(+), 8 deletions(-)

diff --git a/fsdev/file-op-9p.h b/fsdev/file-op-9p.h
index 5d088d4..c54f6bf 100644
--- a/fsdev/file-op-9p.h
+++ b/fsdev/file-op-9p.h
@@ -60,6 +60,7 @@ typedef struct FsContext
 SecModel fs_sm;
 uid_t uid;
 int open_flags;
+int enable_chroot;
 struct xattr_operations **xops;
 /* fs driver specific data */
 void *private;
diff --git a/hw/9pfs/virtio-9p-coth.c b/hw/9pfs/virtio-9p-coth.c
index e61b656..aa71a83 100644
--- a/hw/9pfs/virtio-9p-coth.c
+++ b/hw/9pfs/virtio-9p-coth.c
@@ -17,6 +17,8 @@
 #include "qemu-thread.h"
 #include "qemu-coroutine.h"
 #include "virtio-9p-coth.h"
+#include 
+#include "qemu-error.h"
 
 /* v9fs glib thread pool */
 static V9fsThPool v9fs_pool;
@@ -56,7 +58,96 @@ static void v9fs_thread_routine(gpointer data, gpointer 
user_data)
 } while (len == -1 && errno == EINTR);
 }
 
-int v9fs_init_worker_threads(void)
+static int v9fs_chroot_function(void *arg)
+{
+GError *err;
+V9fsChrootState *csp = arg;
+FsContext *fs_ctx = csp->fs_ctx;
+V9fsThPool *p = &v9fs_pool;
+int notifier_fds[2];
+
+if (chroot(fs_ctx->fs_root) < 0) {
+error_report("chroot");
+goto error;
+}
+
+if (qemu_pipe(notifier_fds) == -1) {
+return -1;
+}
+p->pool = g_thread_pool_new(v9fs_thread_routine, p, 10, TRUE, &err);
+if (!p->pool) {
+error_report("g_thread_pool_new");
+goto error;
+}
+
+p->completed = g_async_queue_new();
+if (!p->completed) {
+/*
+ * We are going to terminate.
+ * So don't worry about cleanup
+ */
+goto error;
+}
+p->rfd = notifier_fds[0];
+p->wfd = notifier_fds[1];
+
+fcntl(p->rfd, F_SETFL, O_NONBLOCK);
+fcntl(p->wfd, F_SETFL, O_NONBLOCK);
+
+qemu_set_fd_handler(p->rfd, v9fs_qemu_process_req_done, NULL, NULL);
+
+/* Signal parent thread that thread pools are created */
+g_cond_signal(csp->cond);
+g_mutex_lock(csp->mutex_term);
+/* TODO: Wake up cs->terminate during 9p hotplug support */
+g_cond_wait(csp->terminate, csp->mutex);
+g_mutex_unlock(csp->mutex_term);
+g_mutex_free(csp->mutex);
+g_cond_free(csp->cond);
+g_mutex_free(csp->mutex_term);
+g_cond_free(csp->terminate);
+qemu_free(csp->stack);
+qemu_free(csp);
+return 0;
+error:
+p->pool = NULL;
+g_cond_signal(csp->cond);
+return 0;
+}
+
+static int v9fs_clone_chroot_th(FsContext *fs_ctx)
+{
+pid_t pid;
+V9fsChrootState *cs;
+
+cs = qemu_malloc(sizeof(*cs));
+cs->stack = qemu_malloc(THREAD_STACK);
+cs->mutex = g_mutex_new();
+cs->cond = g_cond_new();
+cs->mutex_term = g_mutex_new();
+cs->terminate = g_cond_new();
+cs->fs_ctx = fs_ctx;
+
+g_mutex_lock(cs->mutex);
+pid = clone(&v9fs_chroot_function, cs->stack + THREAD_STACK, CLONE_FILES |
+CLONE_SIGHAND | CLONE_VM | CLONE_THREAD, cs);
+if (pid == -1) {
+error_report("clone");
+g_mutex_unlock(cs->mutex);
+g_mutex_free(cs->mutex);
+g_cond_free(cs->cond);
+g_mutex_free(cs->mutex_term);
+g_cond_free(cs->terminate);
+qemu_free(cs->stack);
+qemu_free(cs);
+return pid;
+}
+g_cond_wait(cs->cond, cs->mutex);
+g_mutex_unlock(cs->mutex);
+return pid;
+}
+
+int v9fs_init_worker_threads(FsContext *fs_ctx)
 {
 int notifier_fds[2];
 V9fsThPool *p = &v9fs_pool;
@@ -66,14 +157,18 @@ int v9fs_init_worker_threads(void)
 g_thread_init(NULL);
 }
 
-if (qemu_pipe(notifier_fds) == -1) {
-return -1;
+if (fs_ctx->enable_chroot) {
+return v9fs_clone_chroot_th(fs_ctx);
+} else {
+p->pool = g_thread_pool_new(v9fs_thread_routine, p, -1, FALSE, NULL);
 }
-
-p->pool = g_thread_pool_new(v9fs_thread_routine, p, -1, FALSE, NULL);
 if (!p->pool) {
 return -1;
 }
+if (qemu_pipe(notifier_fds) == -1) {
+return -1;
+}
+
 p->completed = g_async_queue_new();
 if (!p->completed) {
 /*
diff --git a/hw/9pfs/virtio-9p-coth.h b/hw/9pfs/virtio-9p-coth.h
index 4630080..422354a 100644
--- a/hw/9pfs/virtio-9p-coth.h
+++ b/hw/9pfs/virtio-9p-coth.h
@@ -20,6 +20,8 @@
 #include "virtio-9p.h"
 #include 
 
+#define

Re: [Qemu-devel] [Qemu-trivial] [PATCH] Fix typo in cpus.c

2011-06-14 Thread Stefan Hajnoczi
On Mon, Jun 13, 2011 at 10:16:36PM -0400, Alexandre Raymond wrote:
> filed -> failed
> 
> Signed-off-by: Alexandre Raymond 
> ---
>  cpus.c |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

Thanks, applied to the trivial patches tree:
http://repo.or.cz/w/qemu/stefanha.git/shortlog/refs/heads/trivial-patches

Stefan



Re: [Qemu-devel] [PATCH 03/12] VMDK: probe for mono flat image

2011-06-14 Thread Stefan Hajnoczi
On Sat, Jun 04, 2011 at 08:40:50AM +0800, Fam Zheng wrote:
> vmdk_probe for mono flat images.
> 
> Signed-off-by: Fam Zheng 
> ---
>  block/vmdk.c |   13 ++---
>  1 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/block/vmdk.c b/block/vmdk.c
> index f787528..bf8d02a 100644
> --- a/block/vmdk.c
> +++ b/block/vmdk.c
> @@ -101,10 +101,17 @@ static int vmdk_probe(const uint8_t *buf, int
> buf_size, const char *filename)
>  return 0;
>  magic = be32_to_cpu(*(uint32_t *)buf);
>  if (magic == VMDK3_MAGIC ||
> -magic == VMDK4_MAGIC)
> +magic == VMDK4_MAGIC) {
>  return 100;
> -else
> -return 0;
> +} else {
> +char *cid_p, *ct_p, *extent_p;
> +cid_p = strstr((char *)buf, "CID");
> +ct_p = strstr((char *)buf, "createType");
> +extent_p = strstr((char *)buf, "RW");
> +if (cid_p && ct_p && extent_p)
> +return 100;

NUL-terminated string functions cannot be used for probing because the
input file may be invalid.  If the magic number matches but there is no
NUL in the buffer then the strstr(3) will run off the end of the buffer.

Also note that the specification says "The descriptor file is not
case-sensitive".  "cid", "CiD", and "CID" should all be allowed.

Do non-monolithic vmdk images always have "# Disk DescriptorFile" as the
first line?  Perhaps you can test for that using memcmp(3) instead.

Stefan



Re: [Qemu-devel] [PATCH v2] block/rbd: Remove unused local variable

2011-06-14 Thread Kevin Wolf
Am 10.06.2011 22:05, schrieb Stefan Weil:
> Variable 'snap' is assigned a value that is never used.
> Remove snap and the related code.
> 
> v2:
>   The unused variable which was in function rbd_open is now in function
>   qemu_rbd_create, so the patch needed an update.
> 
> Cc: Christian Brunner 
> Cc: Josh Durgin 
> Cc: Kevin Wolf 
> Signed-off-by: Stefan Weil 

Thanks, applied to the block branch.

Kevin



Re: [Qemu-devel] [PATCH] CPU consumption optimization of 'qemu-img convert' using bdrv_is_allocated()

2011-06-14 Thread Dmitry Konishchev
On Mon, Jun 13, 2011 at 1:13 PM, Dmitry Konishchev  wrote:
> I haven't done this because in this case I have to pass too lot of
> local variables to this function. Just not sure that it'll look
> better. But if you mind I surely can do this.
Should I?

-- 
Dmitry Konishchev
mailto:konishc...@gmail.com



Re: [Qemu-devel] usb-musb: calls usb_packet_complete() on packets with no owner

2011-06-14 Thread Gerd Hoffmann

  Hi,


 usb_packet_complete(s->port.dev,&ep->packey[dir].p);

which will call usb_packet_complete() on packets which did not
return USB_RET_ASYNC from usb_handle_packet, and so trips this
assert.



Any suggestions about what the right way to fix this is?
(I'm a bit confused about the comment that usb_packet_complete
is to 'notify the controller' -- usb-musb is the controller...)


The usual work flow is that the usb device (usb-msd.c does this for 
example) kicks the I/O, then returns USB_RET_ASYNC, and when the I/O is 
completed it calls usb_packet_complete().


In case of the musb controller this will call musb_schedule_cb() which 
is hooked into musb_port_ops->complete.  So simply calling 
musb_schedule_cb() directly for non-async packets should work I think.


cheers,
  Gerd




<    1   2