date:20110719

Re: [Qemu-devel] [PATCH] Fix duplicate device reset

2011-07-19 Thread Stefan Weil


Am 19.07.2011 04:39, schrieb Isaku Yamahata:

 Thank you for addressing this. Similar patches were proposed but
 unfortunately weren't merged.

 The reason for the qdev_register_reset() in vl.c is to keep the reset order.
 The reset for main_system_bus shouldn't be registered by qbus_create_inplace().
 But the check, bus != main_system_bus, doesn't work as intended because
 main_system_bus is NULL in early qdev creation.
 So there are several possible ways to fix this.

 - Don't care about the reset order
    your patch +
    remove if (bus != main_system_bus) in qbus_create_inplace()

 - Keep the reset order
    - instantiate main_system_bus early,
      so the check, bus != main_system_bus in qbus_create_inplace(), will work.
    or
    - fix the check, bus != main_system_bus in qbus_create_inplace(), somehow

 thanks,


Hi,

my patch does not remove sysbus_get_default(),
so the reset order is kept because main_system_bus
is instantiated by this call.

Cheers,
Stefan

Re: [Qemu-devel] failed migration makes monitor stuck

2011-07-19 Thread Michael Tokarev

Ping?  Anyone know this area of code?  Can we just remove
one monitor_resume() call from migrate_fd_put_buffer() ?

09.07.2011 15:07, Michael Tokarev wrote:
 After some debugging I found a programming error in
 error handling in migration, but I'm not sure how to
 fix it.
 
 When migration starts, the monitor gets suspended by calling the
 monitor_suspend() routine, which increments the associated
 suspend_cnt counter.
 
 At the end of migration, in migrate_fd_cleanup(),
 monitor_resume() gets called, which decrements the
 counter.
 
 But monitor_resume() gets also called from another
 place, in migrate_fd_put_buffer(), in case we
 encountered a write error.
 
 So, suppose a tcp endpoint has disconnected, or the
 exec: program terminated due to error or whatnot --
 in all these cases write will fail, and we'll call
 monitor_resume() twice as a result: once in this
 place in migrate_fd_put_buffer(), and once more at
 the end in migrate_fd_cleanup().
 
 This results in suspend_cnt being decremented twice,
 with the resultant value being -1.
 
 So monitor_can_read() will return 0 from now on, since
 it compares suspend_cnt with 0.  And hence, monitor will
 stop working.
 
 To me it looks like the monitor_resume() call should be
 removed from migrate_fd_put_buffer(), but I'm not sure
 _why_ it was there in the first place.
 
 There's more: monitor_suspend() gets called from within
 protocol handlers (using migrate_fd_monitor_suspend()
 routine), -- are we sure that all current and future
 protocol handlers will call this function?
 
 Thanks!
 
 /mjt
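
The double decrement described above can be reproduced with a small model. This is only a sketch under stated assumptions: SketchMonitor and the sketch_* helpers are hypothetical stand-ins, not QEMU's monitor code. It assumes that resume decrements the counter unconditionally and that reading is allowed only while the counter is exactly 0, as the report states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the monitor; only the counter matters here. */
typedef struct {
    int suspend_cnt;
} SketchMonitor;

static void sketch_suspend(SketchMonitor *mon)
{
    mon->suspend_cnt++;
}

static void sketch_resume(SketchMonitor *mon)
{
    mon->suspend_cnt--;    /* unconditional decrement (assumption) */
}

/* Input is accepted only while the counter is exactly zero. */
static bool sketch_can_read(const SketchMonitor *mon)
{
    return mon->suspend_cnt == 0;
}

/* Replays a migration whose write fails; returns the final counter. */
static int sketch_failed_migration(SketchMonitor *mon)
{
    assert(mon != NULL);
    sketch_suspend(mon);   /* migration starts */
    sketch_resume(mon);    /* write-error path, as in migrate_fd_put_buffer() */
    sketch_resume(mon);    /* cleanup path, as in migrate_fd_cleanup() */
    return mon->suspend_cnt;
}
```

Starting from a counter of 0, sketch_failed_migration() leaves it at -1, after which sketch_can_read() stays false. Whether the right fix is to drop one of the two resume calls or to guard the decrement is the open question of this thread; the model only shows why the monitor stops responding.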

Re: [Qemu-devel] [PATCH 0/4] scsi fixes

2011-07-19 Thread Hannes Reinecke


On 07/12/2011 03:37 PM, Kevin Wolf wrote:

Am 11.07.2011 15:02, schrieb Hannes Reinecke:

Hi all,

these are some fixes I found during debugging my megasas HBA emulation.
This time I've sent them as a separate patchset for inclusion.
All of them have been acked, so please apply.

Hannes Reinecke (4):
   iov: Update parameter usage in iov_(to|from)_buf()
   scsi: Add 'hba_private' to SCSIRequest
   scsi-disk: Fixup debugging statement
   scsi-disk: Mask out serial number EVPD


Thanks, applied all to the block branch.


Any chance to have them pulled into the main tree?
My megasas emulation relies on them, and it feels a bit
stupid to send a patch relying on some fixes not in mainline.
At the same time it's really stupid to resend the entire
patchset again ...

Thanks.

Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

[Qemu-devel] buildbot failure in qemu on s390-next_i386_debian_5_0

2011-07-19 Thread qemu

The Buildbot has detected a new failure on builder s390-next_i386_debian_5_0 
while building qemu.
Full details are available at:
 http://buildbot.b1-systems.de/qemu/builders/s390-next_i386_debian_5_0/builds/38

Buildbot URL: http://buildbot.b1-systems.de/qemu/

Buildslave for this Build: yuzuki

Build Reason: The Nightly scheduler named 'nightly_s390-next' triggered this 
build
Build Source Stamp: [branch s390-next] HEAD
Blamelist: 

BUILD FAILED: failed git

sincerely,
 -The Buildbot

[Qemu-devel] buildbot failure in qemu on s390-next_x86_64_debian_5_0

2011-07-19 Thread qemu

The Buildbot has detected a new failure on builder s390-next_x86_64_debian_5_0 
while building qemu.
Full details are available at:
 
http://buildbot.b1-systems.de/qemu/builders/s390-next_x86_64_debian_5_0/builds/38

Buildbot URL: http://buildbot.b1-systems.de/qemu/

Buildslave for this Build: yuzuki

Build Reason: The Nightly scheduler named 'nightly_s390-next' triggered this 
build
Build Source Stamp: [branch s390-next] HEAD
Blamelist: 

BUILD FAILED: failed git

sincerely,
 -The Buildbot

Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Ian Campbell

On Tue, 2011-07-19 at 03:53 +0100, Wei Liu wrote:
 On Mon, 2011-07-18 at 11:03 +0100, Ian Campbell wrote:
  On Mon, 2011-07-18 at 10:51 +0100, Wei Liu wrote:
   Bug resend.
   
   This bug was reported about one month ago. QEMU fails to start with
   Xen unstable. I found that it has not been fixed in the latest Xen
   unstable. BIOS is SeaBIOS (with Xen patch).
  
  Please use current mainline seabios.git -- it does not require any
  additional patches.
  
  http://wiki.xensource.com/xenwiki/QEMUUpstream also includes an updated
  SeaBIOS .config which you might try.
  
  Ian.
  
 
 Thanks Ian. This bug is fixed. But I spotted a new bug with Stefano's
 xen-next QEMU.
 
 --mainline seabios with xen-next--
 (XEN) HVM7: HVM Loader
 (XEN) HVM7: Detected Xen v4.2-unstable
 (XEN) HVM7: Xenbus rings @0xfeffc000, event channel 2
 (XEN) HVM7: System requested SeaBIOS
 (XEN) HVM7: CPU speed is 2993 MHz
 (XEN) irq.c:264: Dom7 PCI link 0 changed 0 -> 5
 (XEN) HVM7: PCI-ISA link 0 routed to IRQ5
 (XEN) irq.c:264: Dom7 PCI link 1 changed 0 -> 10
 (XEN) HVM7: PCI-ISA link 1 routed to IRQ10
 (XEN) irq.c:264: Dom7 PCI link 2 changed 0 -> 11
 (XEN) HVM7: PCI-ISA link 2 routed to IRQ11
 (XEN) irq.c:264: Dom7 PCI link 3 changed 0 -> 5
 (XEN) HVM7: PCI-ISA link 3 routed to IRQ5
 (XEN) HVM7: *** HVMLoader assertion '(devfn != PCI_ISA_DEVFN) ||
 ((vendor_id == 0x8086) &&
 (XEN) HVM7:  (device_id == 0x7000))' failed at pci.c:78
 (XEN) HVM7: *** HVMLoader crashed.

Anthony posted a patch for this to qemu-devel a few weeks back, I think
it was "hw/piix_pci.c: Fix PIIX3-xen to initialize ids" (did I see a
pull request for it recently? If so then it might be in the main tree by
now...)

 If I use Anthony's old QEMU tree, qemu-dm-15, HVM boots up. But there
 are issues with irq binding.

I don't know about this one I'm afraid.

Ian.

 --mainline seabios with qemu-dm-15--
 (XEN) irq.c:344: Dom6 callback via changed to Direct Vector 0xf3
 (XEN) irq.c:1979: dom6: pirq 16 or emuirq 8 already mapped
 (XEN) irq.c:1979: dom6: pirq 16 or emuirq 12 already mapped
 (XEN) irq.c:1979: dom6: pirq 16 or emuirq 1 already mapped
 (XEN) irq.c:1979: dom6: pirq 16 or emuirq 6 already mapped
 (XEN) irq.c:1979: dom6: pirq 16 or emuirq 4 already mapped
 (XEN) irq.c:1979: dom6: pirq 16 or emuirq 7 already mapped
 (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
 (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
 (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
 (XEN) event_channel.c:341:d6 EVTCHNOP failure: error -17
 
 
 Wei.

Re: [Qemu-devel] [PATCH] Fix duplicate device reset

2011-07-19 Thread Isaku Yamahata

On Tue, Jul 19, 2011 at 07:56:41AM +0200, Stefan Weil wrote:
 Am 19.07.2011 04:39, schrieb Isaku Yamahata:
 Thank you for addressing this. Similar patches were proposed but
 unfortunately weren't merged.

 The reason for the qdev_register_reset() in vl.c is to keep the reset order.
 The reset for main_system_bus shouldn't be registered by qbus_create_inplace().
 But the check, bus != main_system_bus, doesn't work as intended because
 main_system_bus is NULL in early qdev creation.
 So there are several possible ways to fix this.

 - Don't care about the reset order
    your patch +
    remove if (bus != main_system_bus) in qbus_create_inplace()

 - Keep the reset order
    - instantiate main_system_bus early,
      so the check, bus != main_system_bus in qbus_create_inplace(), will work.
    or
    - fix the check, bus != main_system_bus in qbus_create_inplace(), somehow

 thanks,

 Hi,

 my patch does not remove sysbus_get_default(),
 so the reset order is kept because main_system_bus
 is instantiated by this call.

Yes, your patch doesn't change the order from the existing code.
But I think the existing order is not the intended one.
During machine creation, someone may call sysbus_get_default(),
so the reset for main_system_bus may not be the last one registered.

Changeset 80376c3f tried to keep the reset order, but failed.
That's the issue.
-- 
yamahata
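
Why the check cannot work can be shown with a small model. This is a sketch, not QEMU code: SketchBus and all sketch_* names are hypothetical, and it assumes that the global pointer is assigned only after the bus has been created, which is what "main_system_bus is NULL in early qdev creation" suggests.

```c
#include <assert.h>
#include <stddef.h>

typedef struct SketchBus {
    const char *name;
} SketchBus;

static SketchBus *sketch_main_system_bus;   /* NULL until first use */
static int sketch_resets_registered;

/* Models bus creation: register a reset handler unless this is the
 * main system bus. */
static void sketch_bus_create_inplace(SketchBus *bus, const char *name)
{
    assert(bus != NULL);
    bus->name = name;
    /* While the main bus itself is being created the global is still
     * NULL, so this comparison is true and a handler gets registered. */
    if (bus != sketch_main_system_bus) {
        sketch_resets_registered++;
    }
}

/* Models lazy instantiation of the main system bus. */
static SketchBus *sketch_get_default(void)
{
    static SketchBus storage;
    if (sketch_main_system_bus == NULL) {
        sketch_bus_create_inplace(&storage, "main-system-bus");
        sketch_main_system_bus = &storage;   /* assigned too late */
    }
    return sketch_main_system_bus;
}

/* Models the explicit registration done at machine setup. */
static int sketch_machine_init(void)
{
    SketchBus *bus = sketch_get_default();
    assert(bus != NULL);
    sketch_resets_registered++;              /* second registration */
    return sketch_resets_registered;
}
```

In this model sketch_machine_init() returns 2, i.e. the main bus ends up with two reset registrations. Assigning the global before creating the bus, or creating the bus earlier, would make the comparison behave as intended.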

[Qemu-devel] [PATCH] avoid core reading with bdrv_read (qemu-io)

2011-07-19 Thread Frediano Ziglio

This patch applies to Kevin's coroutine-block branch and avoids a core
dump. It fixes the "qcow: Use coroutines" patch. Test case:

$ ./qemu-img create -f qcow aaa.img 1G
Formatting 'aaa.img', fmt=qcow size=1073741824 encryption=off
$ ./qemu-io aaa.img
qemu-io read 1024 1024
Segmentation fault

Signed-off-by: Frediano Ziglio fredd...@gmail.com
---
 block/qcow.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/block/qcow.c b/block/qcow.c
index 6f7973c..1386e92 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -573,7 +573,8 @@ static int qcow_aio_read_cb(void *opaque)

     if (acb->nb_sectors == 0) {
         /* request completed */
-        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
+        if (acb->orig_buf)
+            qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         return 0;
     }

@@ -648,6 +649,7 @@ static int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,

     if (acb->qiov->niov > 1) {
         qemu_vfree(acb->orig_buf);
+        acb->orig_buf = NULL;
     }
     qemu_aio_release(acb);

@@ -729,6 +731,7 @@ static int qcow_co_writev(BlockDriverState *bs, int64_t sector_num,

     if (acb->qiov->niov > 1) {
         qemu_vfree(acb->orig_buf);
+        acb->orig_buf = NULL;
     }
     qemu_aio_release(acb);

-- 
1.7.1

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen

On 07/18/11 16:08, Stefan Hajnoczi wrote:
 On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensen jes.soren...@redhat.com wrote:
 I have been updating the live snapshot wiki for qemu to try and cover
 the commands we will want for async snapshot handling too.

 http://wiki.qemu.org/Features/Snapshots
 
 Regarding fd passing, do we even support SELinux today with backing files?

Not sure I understand what you mean. The current code should be happy to
take an existing file or a raw device for the snapshot.

Jes

Re: [Qemu-devel] [PATCH 0/4] scsi fixes

2011-07-19 Thread Kevin Wolf

Am 19.07.2011 08:31, schrieb Hannes Reinecke:
 On 07/12/2011 03:37 PM, Kevin Wolf wrote:
 Am 11.07.2011 15:02, schrieb Hannes Reinecke:
 Hi all,

 these are some fixes I found during debugging my megasas HBA emulation.
 This time I've sent them as a separate patchset for inclusion.
 All of them have been acked, so please apply.

 Hannes Reinecke (4):
iov: Update parameter usage in iov_(to|from)_buf()
scsi: Add 'hba_private' to SCSIRequest
scsi-disk: Fixup debugging statement
scsi-disk: Mask out serial number EVPD

 Thanks, applied all to the block branch.

 Any chance to have them pulled into the main tree?
 My megasas emulation relies on them, and it feels a bit
 stupid to send a patch relying on some fixes not in mainline.
 At the same time it's really stupid to resend the entire
 patchset again ...

I'm hoping to send a pull request today, now that the VMDK patches look
good finally.

Anyway, I don't think that not having the patches in master yet should
stop you from going forward with the next patches. They will go through
the block tree anyway, so basing them on that tree is fine (and you
wouldn't be the first one to do that). Just state in PATCH 0/n for the
reviewers that it depends on the other patches.

Kevin

Re: [Qemu-devel] [PATCH] avoid core reading with bdrv_read (qemu-io)

2011-07-19 Thread Kevin Wolf

Am 19.07.2011 09:33, schrieb Frediano Ziglio:
 This patch applies to Kevin's coroutine-block branch and avoids a core
 dump. It fixes the "qcow: Use coroutines" patch. Test case:
 
 $ ./qemu-img create -f qcow aaa.img 1G
 Formatting 'aaa.img', fmt=qcow size=1073741824 encryption=off
 $ ./qemu-io aaa.img
 qemu-io read 1024 1024
 Segmentation fault
 
 Signed-off-by: Frediano Ziglio fredd...@gmail.com

Thanks for the report. I'll update the patch, but in a slightly
different way that matches the old code better:

diff --git a/block/qcow.c b/block/qcow.c
index 6f7973c..6447c2a 100644
--- a/block/qcow.c
+++ b/block/qcow.c
@@ -573,7 +573,6 @@ static int qcow_aio_read_cb(void *opaque)

     if (acb->nb_sectors == 0) {
         /* request completed */
-        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         return 0;
     }

@@ -647,6 +646,7 @@ static int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
     qemu_co_mutex_unlock(&s->lock);

     if (acb->qiov->niov > 1) {
+        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
         qemu_vfree(acb->orig_buf);
     }
     qemu_aio_release(acb);

[Qemu-devel] [PULL] virtio-serial: Fixes, trace points

2011-07-19 Thread Amit Shah

Hi Anthony,

Please pull for trace points for virtio-serial/console code and a fix
for a host process closing chardev connection causing an abort().

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

  Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 19:43:00 
+)

are available in the git repository at:
  git://git.kernel.org/pub/scm/qemu/amit/virtio-serial.git for-anthony

Amit Shah (4):
  virtio-serial-bus: Add trace events
  virtio-console: Add some trace events
  virtio-serial-bus: Fix trailing \n in error_report string
  virtio-console: Prevent abort()s in case of host chardev close

 hw/virtio-console.c|   25 +++--
 hw/virtio-serial-bus.c |9 -
 trace-events   |   11 +++
 3 files changed, 42 insertions(+), 3 deletions(-)


Amit

[Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Frediano Ziglio

Hi,
  I'm exercising myself in the block I/O layer and decided to test the
coroutine branch because I find it easier to use than normal
callbacks. In the normal code there are a lot of lines of source that
save/restore state and declare callbacks, and it is not that easy to
follow the normal flow. In the end I would like to create a new
image format to get rid of some performance problems I encounter using
writethrough and snapshots. I have some questions regarding block I/O
and also coroutines.

1- Threading model. I don't understand it. I can see that the aio pool
routines do not contain locking code, so I think the aio layer is mainly
executed in a single thread. I saw the introduction of some locking using
coroutines, so I think coroutines are now called from different threads
and need locks (the current implementation serializes all device
operations).

2- Memory considerations for coroutines. Besides coroutines allowing more
readable code, I wonder if somebody has considered memory. For every
coroutine a separate stack has to be allocated. For instance, the
ucontext and win32 implementations use 4 MB. Assuming 128 concurrent AIO
requests, this requires about 512 MB of RAM (mostly only committed but
not used, and coroutines are reused).
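
The estimate above is simple multiplication, spelled out here as a sketch. It assumes one coroutine stack per in-flight request and reads the 4 MB figure as 4 MiB; the sketch_* names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MIB (1024ULL * 1024ULL)

/* Total stack reservation for a number of live coroutines. */
static uint64_t sketch_total_stack_bytes(uint64_t coroutines,
                                         uint64_t stack_bytes)
{
    /* guard against overflow of the product */
    assert(stack_bytes == 0 || coroutines <= UINT64_MAX / stack_bytes);
    return coroutines * stack_bytes;
}

/* Same value, expressed in MiB (rounded down). */
static uint64_t sketch_total_stack_mib(uint64_t coroutines,
                                       uint64_t stack_bytes)
{
    return sketch_total_stack_bytes(coroutines, stack_bytes) / SKETCH_MIB;
}
```

With 128 coroutines and 4 MiB stacks, sketch_total_stack_mib() gives 512. This is reserved address space; how much of it is actually touched depends on how deep each stack grows.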

About snapshots and block I/O, I think that using external snapshots
would make some things easier. By external snapshot I mean
creating a new image whose backing file is the current image file and
using this new image for future operations. This would allow, for instance:
- supporting snapshots with every format (even raw)
- making snapshot backups using external programs (even from different
hosts using a clustered file system, and without many locking issues, as
the original image is now read-only)
- converting images live (just snapshot, qemu-img convert, remove snapshot)

Regards
  Frediano

Re: [Qemu-devel] [PATCH 0/4] scsi fixes

2011-07-19 Thread Hannes Reinecke


On 07/19/2011 09:39 AM, Kevin Wolf wrote:

Am 19.07.2011 08:31, schrieb Hannes Reinecke:

On 07/12/2011 03:37 PM, Kevin Wolf wrote:

Am 11.07.2011 15:02, schrieb Hannes Reinecke:

Hi all,

these are some fixes I found during debugging my megasas HBA emulation.
This time I've sent them as a separate patchset for inclusion.
All of them have been acked, so please apply.

Hannes Reinecke (4):
iov: Update parameter usage in iov_(to|from)_buf()
scsi: Add 'hba_private' to SCSIRequest
scsi-disk: Fixup debugging statement
scsi-disk: Mask out serial number EVPD


Thanks, applied all to the block branch.


Any chance to have them pulled into the main tree?
My megasas emulation relies on them, and it feels a bit
stupid to send a patch relying on some fixes not in mainline.
At the same time it's really stupid to resend the entire
patchset again ...


I'm hoping to send a pull request today, now that the VMDK patches look
good finally.

Anyway, I don't think that not having the patches in master yet should
stop you from going forward with the next patches. They will go through
the block tree anyway, so basing them on that tree is fine (and you
wouldn't be the first one to do that). Just state in PATCH 0/n for the
reviewers that it depends on the other patches.


Well, the remaining patch is 'just' the megasas emulation itself.
And I want to make reviewing that as easy as possible, so that it's
not held up again by complaints about missing patches.


Cheers,

Hannes
--
Dr. Hannes Reinecke   zSeries  Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

Re: [Qemu-devel] [PATCH] avoid core reading with bdrv_read (qemu-io)

2011-07-19 Thread Frediano Ziglio

2011/7/19 Kevin Wolf kw...@redhat.com:
 Am 19.07.2011 09:33, schrieb Frediano Ziglio:
 This patch applies to Kevin's coroutine-block branch and avoids a core
 dump. It fixes the "qcow: Use coroutines" patch. Test case:

 $ ./qemu-img create -f qcow aaa.img 1G
 Formatting 'aaa.img', fmt=qcow size=1073741824 encryption=off
 $ ./qemu-io aaa.img
 qemu-io read 1024 1024
 Segmentation fault

 Signed-off-by: Frediano Ziglio fredd...@gmail.com

 Thanks for the report. I'll update the patch, but in a slightly
 different way that matches the old code better:

 diff --git a/block/qcow.c b/block/qcow.c
 index 6f7973c..6447c2a 100644
 --- a/block/qcow.c
 +++ b/block/qcow.c
 @@ -573,7 +573,6 @@ static int qcow_aio_read_cb(void *opaque)

      if (acb->nb_sectors == 0) {
          /* request completed */
 -        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
          return 0;
      }

 @@ -647,6 +646,7 @@ static int qcow_co_readv(BlockDriverState *bs, int64_t sector_num,
      qemu_co_mutex_unlock(&s->lock);

      if (acb->qiov->niov > 1) {
 +        qemu_iovec_from_buffer(acb->qiov, acb->orig_buf, acb->qiov->size);
          qemu_vfree(acb->orig_buf);
      }
      qemu_aio_release(acb);


Yes, my patch also removed some dangling pointers, which I don't like
but which are not a problem with the current code.
In case of ret < 0 (error), your code could copy data that is probably
not initialized. I don't know if the data is used in case of failure, but
if the memory is shared with the guest (I don't know, perhaps using
virtio) this leads to security issues. Also, some memory debuggers like
valgrind might not like that copy.

Frediano
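
The concern about copying after a failed read can be sketched with a simplified request. This is not the qcow code: SketchRequest and sketch_finish_read() are hypothetical, and the request is reduced to one bounce buffer and one destination. The copy back happens only when a bounce buffer exists and the read succeeded.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified read request. */
typedef struct {
    unsigned char *dest;     /* caller-visible buffer */
    unsigned char *bounce;   /* temporary linear buffer, NULL if unused */
    size_t size;             /* bytes to copy back */
} SketchRequest;

/* Copies the bounce buffer back only if there is one and ret signals
 * success. Returns 1 if data was copied, 0 otherwise. */
static int sketch_finish_read(SketchRequest *req, int ret)
{
    assert(req != NULL);
    if (req->bounce == NULL || ret < 0) {
        return 0;            /* nothing valid to hand to the caller */
    }
    memcpy(req->dest, req->bounce, req->size);
    return 1;
}
```

On an error return the destination is left untouched, so possibly uninitialized bounce data never reaches memory the caller (or a guest) can see. Whether the real completion path needs this guard depends on how the error value is used afterwards, which the thread leaves open.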

Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Wei Liu

On Tue, 2011-07-19 at 08:14 +0100, Ian Campbell wrote:
 On Tue, 2011-07-19 at 03:53 +0100, Wei Liu wrote:
  On Mon, 2011-07-18 at 11:03 +0100, Ian Campbell wrote:
   On Mon, 2011-07-18 at 10:51 +0100, Wei Liu wrote:
    Bug resend.
    
    This bug was reported about one month ago. QEMU fails to start with
    Xen unstable. I found that it has not been fixed in the latest Xen
    unstable. BIOS is SeaBIOS (with Xen patch).
   
   Please use current mainline seabios.git -- it does not require any
   additional patches.
   
   http://wiki.xensource.com/xenwiki/QEMUUpstream also includes an updated
   SeaBIOS .config which you might try.
   
   Ian.
   
  
  Thanks Ian. This bug is fixed. But I spotted a new bug with Stefano's
  xen-next QEMU.
  
  --mainline seabios with xen-next--
  (XEN) HVM7: HVM Loader
  (XEN) HVM7: Detected Xen v4.2-unstable
  (XEN) HVM7: Xenbus rings @0xfeffc000, event channel 2
  (XEN) HVM7: System requested SeaBIOS
  (XEN) HVM7: CPU speed is 2993 MHz
  (XEN) irq.c:264: Dom7 PCI link 0 changed 0 -> 5
  (XEN) HVM7: PCI-ISA link 0 routed to IRQ5
  (XEN) irq.c:264: Dom7 PCI link 1 changed 0 -> 10
  (XEN) HVM7: PCI-ISA link 1 routed to IRQ10
  (XEN) irq.c:264: Dom7 PCI link 2 changed 0 -> 11
  (XEN) HVM7: PCI-ISA link 2 routed to IRQ11
  (XEN) irq.c:264: Dom7 PCI link 3 changed 0 -> 5
  (XEN) HVM7: PCI-ISA link 3 routed to IRQ5
  (XEN) HVM7: *** HVMLoader assertion '(devfn != PCI_ISA_DEVFN) ||
  ((vendor_id == 0x8086) &&
  (XEN) HVM7:  (device_id == 0x7000))' failed at pci.c:78
  (XEN) HVM7: *** HVMLoader crashed.
 
 Anthony posted a patch for this to qemu-devel a few weeks back, I think
 it was "hw/piix_pci.c: Fix PIIX3-xen to initialize ids" (did I see a
 pull request for it recently? If so then it might be in the main tree by
 now...)
 

Good, this is it.

But this patch has not been pulled into the tree yet.

Wei.

Re: [Qemu-devel] [PATCH 0/4] scsi fixes

2011-07-19 Thread Alexander Graf


On 19.07.2011, at 10:10, Hannes Reinecke wrote:

 On 07/19/2011 09:39 AM, Kevin Wolf wrote:
 Am 19.07.2011 08:31, schrieb Hannes Reinecke:
 On 07/12/2011 03:37 PM, Kevin Wolf wrote:
 Am 11.07.2011 15:02, schrieb Hannes Reinecke:
 Hi all,
 
 these are some fixes I found during debugging my megasas HBA emulation.
 This time I've sent them as a separate patchset for inclusion.
 All of them have been acked, so please apply.
 
 Hannes Reinecke (4):
iov: Update parameter usage in iov_(to|from)_buf()
scsi: Add 'hba_private' to SCSIRequest
scsi-disk: Fixup debugging statement
scsi-disk: Mask out serial number EVPD
 
 Thanks, applied all to the block branch.
 
 Any chance to have them pulled into the main tree?
 My megasas emulation relies on them, and it feels a bit
 stupid to send a patch relying on some fixes not in mainline.
 At the same time it's really stupid to resend the entire
 patchset again ...
 
 I'm hoping to send a pull request today, now that the VMDK patches look
 good finally.
 
 Anyway, I don't think that not having the patches in master yet should
 stop you from going forward with the next patches. They will go through
 the block tree anyway, so basing them on that tree is fine (and you
 wouldn't be the first one to do that). Just state in PATCH 0/n for the
 reviewers that it depends on the other patches.
 
 Well, the remaining patch is 'just' the megasas emulation itself.
 And I want to make reviewing that as easy as possible, so that it's
 not held up again by complaints about missing patches.

Yes, no worries. Just state in the patch description that it's based on the 
block branch and actually do base it on it - the endianness specific ld./st. 
patches are upstream, so everything you need should be in that branch.


Alex

Re: [Qemu-devel] [PATCH] USB: add usb network redirection support

2011-07-19 Thread Hans de Goede


Hi,

On 07/18/2011 04:33 PM, Gerd Hoffmann wrote:

On 07/18/11 09:13, Hans de Goede wrote:

This patch adds support for a usb-redir device, which takes a chardev
as a communication channel to an actual usbdevice using the usbredir protocol.

Compiling the usb-redir device requires usbredir-0.3 to be installed for
the usbredir protocol parser, usbredir-0.3 also contains a server for
redirecting usb traffic from an actual usb device. You can get the 0.3
release of usbredir here:
http://people.fedoraproject.org/~jwrdegoede/usbredir-0.3.tar.bz2
(getting a more formal site for it is a WIP)


Looks good overall. scripts/checkpatch.pl has a bunch of code style
complaints which need to be fixed.


Sorry, I should have thought of running checkpatch myself before submitting,
new version coming up.

Regards,

Hans

[Qemu-devel] [PATCH] USB: add usb network redirection support

2011-07-19 Thread Hans de Goede

This patch adds support for a usb-redir device, which takes a chardev
as a communication channel to an actual usbdevice using the usbredir protocol.

Compiling the usb-redir device requires usbredir-0.3 to be installed for
the usbredir protocol parser, usbredir-0.3 also contains a server for
redirecting usb traffic from an actual usb device. You can get the 0.3
release of usbredir here:
http://people.fedoraproject.org/~jwrdegoede/usbredir-0.3.tar.bz2
(getting a more formal site for it is a WIP)

Example usage:
1) Start usbredirserver for a usb device:
sudo usbredirserver 045e:0772
2) Start qemu with usb2 support + a chardev talking to usbredirserver +
   a usb-redir device using this chardev:
qemu ... \
  -readconfig docs/ich9-ehci-uhci.cfg \
  -chardev socket,id=usbredirchardev,host=localhost,port=4000 \
  -device usb-redir,chardev=usbredirchardev,id=usbredirdev

Signed-off-by: Hans de Goede hdego...@redhat.com
---
 Makefile.objs |1 +
 configure |   28 ++
 usb-redir.c   | 1218 +
 3 files changed, 1247 insertions(+), 0 deletions(-)
 create mode 100644 usb-redir.c

diff --git a/Makefile.objs b/Makefile.objs
index cea15e4..ad69fbc 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -205,6 +205,7 @@ hw-obj-$(CONFIG_HPET) += hpet.o
 hw-obj-$(CONFIG_APPLESMC) += applesmc.o
 hw-obj-$(CONFIG_SMARTCARD) += usb-ccid.o ccid-card-passthru.o
 hw-obj-$(CONFIG_SMARTCARD_NSS) += ccid-card-emulated.o
+hw-obj-$(CONFIG_USB_REDIR) += usb-redir.o
 
 # PPC devices
 hw-obj-$(CONFIG_OPENPIC) += openpic.o
diff --git a/configure b/configure
index 88159ac..843bbd8 100755
--- a/configure
+++ b/configure
@@ -177,6 +177,7 @@ spice=
 rbd=
 smartcard=
 smartcard_nss=
+usb_redir=
 opengl=
 
 # parse CC options first
@@ -743,6 +744,10 @@ for opt do
   ;;
   --enable-smartcard-nss) smartcard_nss=yes
   ;;
+  --disable-usb-redir) usb_redir=no
+  ;;
+  --enable-usb-redir) usb_redir=yes
+  ;;
   *) echo ERROR: unknown option $opt; show_help=yes
   ;;
   esac
@@ -1018,6 +1023,8 @@ echo   --disable-smartcard  disable smartcard 
support
 echo   --enable-smartcard   enable smartcard support
 echo   --disable-smartcard-nss  disable smartcard nss support
 echo   --enable-smartcard-nss   enable smartcard nss support
+echo   --disable-usb-redir  disable usb network redirection support
+echo   --enable-usb-redir   enable usb network redirection support
 echo 
 echo NOTE: The object files are built at the place where configure is 
launched
 exit 1
@@ -2371,6 +2378,22 @@ if test $smartcard = no ; then
 smartcard_nss=no
 fi
 
+# check for usbredirparser for usb network redirection support
+if test "$usb_redir" != "no" ; then
+    if $pkg_config libusbredirparser >/dev/null 2>&1 ; then
+        usb_redir="yes"
+        usb_redir_cflags=$($pkg_config --cflags libusbredirparser 2>/dev/null)
+        usb_redir_libs=$($pkg_config --libs libusbredirparser 2>/dev/null)
+        QEMU_CFLAGS="$QEMU_CFLAGS $usb_redir_cflags"
+        LIBS="$LIBS $usb_redir_libs"
+    else
+        if test "$usb_redir" = "yes"; then
+            feature_not_found "usb-redir"
+        fi
+        usb_redir="no"
+    fi
+fi
+
 ##
 
 ##
@@ -2617,6 +2640,7 @@ echo spice support $spice
 echo rbd support   $rbd
 echo xfsctl support$xfs
 echo nss used  $smartcard_nss
+echo usb net redir $usb_redir
 echo OpenGL support$opengl
 
 if test $sdl_too_old = yes; then
@@ -2910,6 +2934,10 @@ if test $smartcard_nss = yes ; then
   echo "CONFIG_SMARTCARD_NSS=y" >> $config_host_mak
 fi
 
+if test "$usb_redir" = "yes" ; then
+  echo "CONFIG_USB_REDIR=y" >> $config_host_mak
+fi
+
 if test "$opengl" = "yes" ; then
   echo "CONFIG_OPENGL=y" >> $config_host_mak
 fi
diff --git a/usb-redir.c b/usb-redir.c
new file mode 100644
index 000..e212993
--- /dev/null
+++ b/usb-redir.c
@@ -0,0 +1,1218 @@
+/*
+ * USB redirector usb-guest
+ *
+ * Copyright (c) 2011 Red Hat, Inc.
+ *
+ * Red Hat Authors:
+ * Hans de Goede hdego...@redhat.com
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION

[Qemu-devel] External COW format for raw images

2011-07-19 Thread Robert Wang

As you know, the raw image format is very popular, but it does NOT
support copy-on-write. A raw image file can NOT be used as a copy
destination, so image streaming/Live Block Copy will NOT work.

To fix this, we need to add a new block driver, raw-cow, to QEMU. Once
it is finished, we can use qemu-img like this:
qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img my_vm.raw-cow

1) ubuntu.img is the backing file, my_vm.img is a raw file, and
my_vm.raw-cow stores a COW bitmap related to my_vm.img.

2) If every bit in the COW bitmap is set to dirty, we can get all
information from my_vm.img and can ignore ubuntu.img and my_vm.raw-cow
from then on.

To implement this, I think I can follow these steps:
1) Add a new member to the BlockDriverState struct:
char raw_file[1024];
This member will track the raw_file parameter given on the command line
for the raw-cow file.

2)  * Create a new file, block/raw-cow.c. It will be much like a
mixture of block/cow.c and block/raw.c.

So I will change some functions in cow.c and raw.c to non-static so that
raw-cow.c can reuse them. When a read operation occurs, determine whether
the dirty flag in the raw-cow image is set. If true, read directly from the
raw file. After a write operation, set the related dirty flag in the raw-cow
image. Other functions might also need to be modified.

* Of course, the format_name member of the BlockDriver struct will be
"raw-cow". In order to keep the relationship with the raw file (like
my_vm.img), the raw_cow_header struct should be
struct raw_cow_header {
    uint32_t magic;
    uint32_t version;
    char backing_file[1024];
    char raw_file[1024];    /* added */
    int32_t mtime;
    uint64_t size;
    uint32_t sectorsize;
};
* raw_cow_create_options should be cow_create_options plus one
additional member:
{
    .name = BLOCK_OPT_RAW_FILE,
    .type = OPT_STRING,
    .help = "Raw file name"
},

3) Add bdrv_get_raw_filename to the img_info function of qemu-img.c. In
bdrv_get_raw_filename, if the format of the image file is raw-cow,
print the related raw file.

Do you think my approach is right?
Thank you.
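
The read and write rules proposed above can be sketched with a plain bit array. This is a minimal model, not the proposed driver: the sketch_* names and the one-bit-per-sector layout are assumptions, and real I/O is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Where a sector has to be read from. */
enum sketch_source {
    SKETCH_FROM_BACKING = 0,   /* sector never written: use backing file */
    SKETCH_FROM_RAW = 1        /* sector is dirty: use the raw file */
};

static int sketch_bit_is_set(const uint8_t *bitmap, uint64_t sector)
{
    return (bitmap[sector / 8] >> (sector % 8)) & 1;
}

static void sketch_bit_set(uint8_t *bitmap, uint64_t sector)
{
    bitmap[sector / 8] |= (uint8_t)(1u << (sector % 8));
}

/* Read path: the dirty bit decides which file holds valid data. */
static enum sketch_source sketch_read_source(const uint8_t *bitmap,
                                             uint64_t sector)
{
    assert(bitmap != 0);
    return sketch_bit_is_set(bitmap, sector) ? SKETCH_FROM_RAW
                                             : SKETCH_FROM_BACKING;
}

/* Write path: once the data is in the raw file, mark the sector dirty. */
static void sketch_after_write(uint8_t *bitmap, uint64_t sector)
{
    sketch_bit_set(bitmap, sector);
}
```

A sector starts out as SKETCH_FROM_BACKING and switches to SKETCH_FROM_RAW after the first write to it. Once every bit is set, the backing file is never consulted again, which matches point 2) of the proposal.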

[Qemu-devel] Using checkpatch.pl to check coding style added to wiki

2011-07-19 Thread Stefan Hajnoczi

I just updated SubmitAPatch to mention running scripts/checkpatch.pl
before submitting patches:
http://wiki.qemu.org/Contribute/SubmitAPatch

"Follow the coding style and run scripts/checkpatch.pl <patchfile>
before submitting"

Checkpatch.pl makes it easy to follow the coding style and eliminates
the "please add a curly brace and resend the patch" hassle :).

Stefan

Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Stefan Hajnoczi

2011/7/19 Robert Wang wdon...@linux.vnet.ibm.com:
 2)  * Create a new file, block/raw-cow.c. It will be much like a
 mixture of block/cow.c and block/raw.c.

 So I will change some functions in cow.c and raw.c to non-static so that
 raw-cow.c can reuse them. When a read operation occurs, determine whether
 the dirty flag in the raw-cow image is set. If true, read directly from the
 raw file. After a write operation, set the related dirty flag in the raw-cow
 image. Other functions might also need to be modified.

The block/cow.c driver is inefficient because it does I/O for each
bitmap set/test operation.  I think doing this more efficiently means
basically rewriting the bitmap code to keep a writethrough bitmap in
memory.
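
To make the "writethrough bitmap in memory" idea concrete, here is a minimal
sketch. It is not code from block/cow.c: the DemoBitmap type and the
demo_bitmap_* names are invented for illustration, and plain stdio stands in
for the block layer. Tests are answered from the in-memory copy, and a set
operation touches the file only when a bit actually changes.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical type: an on-disk bitmap mirrored in memory. */
typedef struct {
    FILE *fp;        /* file that holds the on-disk bitmap */
    long base;       /* byte offset of the bitmap inside the file */
    uint8_t *bits;   /* in-memory copy, one bit per cluster/sector */
    size_t nbytes;   /* size of the copy in bytes */
} DemoBitmap;

/* Load the bitmap once; later tests need no I/O at all. */
int demo_bitmap_open(DemoBitmap *b, FILE *fp, long base, size_t nbits)
{
    size_t got;

    if (nbits == 0) {
        return -1;
    }
    b->fp = fp;
    b->base = base;
    b->nbytes = (nbits + 7) / 8;
    b->bits = calloc(b->nbytes, 1);
    if (!b->bits) {
        return -1;
    }
    if (fseek(fp, base, SEEK_SET) != 0) {
        free(b->bits);
        b->bits = NULL;
        return -1;
    }
    got = fread(b->bits, 1, b->nbytes, fp);
    (void)got; /* a short read leaves the remaining bits clear */
    return 0;
}

/* Test a bit purely from memory. */
int demo_bitmap_test(const DemoBitmap *b, size_t bit)
{
    return (b->bits[bit / 8] >> (bit % 8)) & 1;
}

/* Set a bit; write the changed byte through to the file immediately. */
int demo_bitmap_set(DemoBitmap *b, size_t bit)
{
    uint8_t old = b->bits[bit / 8];
    uint8_t updated = old | (uint8_t)(1u << (bit % 8));

    if (updated == old) {
        return 0; /* already set: no I/O needed */
    }
    b->bits[bit / 8] = updated;
    if (fseek(b->fp, b->base + (long)(bit / 8), SEEK_SET) != 0 ||
        fwrite(&updated, 1, 1, b->fp) != 1 ||
        fflush(b->fp) != 0) {
        b->bits[bit / 8] = old; /* keep memory and disk consistent */
        return -1;
    }
    return 0;
}
```

The point of the design is that reads never hit the disk for bitmap lookups,
while every state change is still persisted before the call returns.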

Regarding the file header, the mtime field is not really useful - there is
no interface to read it and no feature makes use of it.  The
sectorsize field could also be dropped.  The true sector size comes
from the underlying storage that contains this image file.  Especially
in the cache=none (O_DIRECT) case we need to honor the underlying
sector size and I'm not sure the sectorsize field helps.

Stefan

Re: [Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Kevin Wolf

Am 19.07.2011 10:06, schrieb Frediano Ziglio:
   I'm exercise myself in block I/O layer and I decided to test
 coroutine branch cause I find it easier to use instead of normal
 callback. Looking at normal code there are a lot of rows in source to
 save/restore state and declare callbacks and is not that easier to
 understand the normal flow. 

Yes. This is one of the reasons why we're trying to switch to
coroutines. QED is a prototype for a fully asynchronous callback-based
image format, and sometimes it's really hard to follow its code paths.
That the real functionality gets lost in the noise of transferring state
doesn't really help with readability either.

 At the end I would like to create a new
 image format to get rid of some performance problem I encounter using
 writethrough and snapshots. I have some questions regard block I/O and
 also coroutines

No. A new image format is the wrong answer, whatever the question may
be. :-)

If writethrough doesn't perform well with the existing format drivers,
fix the existing format drivers. You need very good reasons to convince
me that qcow2 can't do what your new format could do.

The solution for slow writethrough mode in qcow2 is probably to make
requests parallel, even if they touch metadata. This is a change that
becomes possible relatively easily once we have switched to coroutines.

What exactly is the problem with snapshots? Saving/loading internal
snapshots is too slow, or general performance with an image that has
snapshots? I think Luiz reported the first one a while ago, and it
should be easy enough to fix (use Qcow2Cache in writeback mode during
the refcount update).

 1- threading model. I don't understand it. I can see that aio pool
 routines does not contain locking code so I think aio layer is mainly
 executed in a single thread. I saw introduction of some locking using
 coroutines so I think coroutines are now called from different threads
 and needs lock (current implementation serialize all device
 operations)

You can view coroutines as threads with cooperative scheduling. That is,
unlike threads a coroutine is never interrupted by a scheduler, but it
can only call qemu_coroutine_yield(), which transfers control to a
different coroutine. Compared to threads this simplifies locking a bit
because you exactly know at which point other code may run.

But of course, even though you know where it happens, you have other
code running in the middle of your function,  so there can be a need to
lock things, which is why there are things like a CoMutex.

They are still all running in the same thread.
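
The cooperative scheduling described above can be illustrated with a tiny
standalone sketch. It does not use QEMU's coroutine API; it assumes a
platform that provides POSIX ucontext (for example glibc), and all demo_*
names are invented. The only points where control changes hands are the
explicit swapcontext() calls, and everything runs in one thread.

```c
#include <ucontext.h>

static ucontext_t demo_main_ctx;
static ucontext_t demo_co_ctx;
static char demo_stack[64 * 1024];

int demo_trace[4];          /* records the order in which code ran */
static int demo_trace_len;

/* Hypothetical stand-in for a yield: hand control back to the caller. */
static void demo_yield(void)
{
    swapcontext(&demo_co_ctx, &demo_main_ctx);
}

/* Coroutine body: runs until it yields, later continues after the yield. */
static void demo_entry(void)
{
    demo_trace[demo_trace_len++] = 1;
    demo_yield();
    demo_trace[demo_trace_len++] = 3;
}

/* Drive the coroutine from the caller; returns the number of steps. */
int demo_run(void)
{
    demo_trace_len = 0;

    getcontext(&demo_co_ctx);
    demo_co_ctx.uc_stack.ss_sp = demo_stack;
    demo_co_ctx.uc_stack.ss_size = sizeof(demo_stack);
    demo_co_ctx.uc_link = &demo_main_ctx; /* where to go when entry returns */
    makecontext(&demo_co_ctx, demo_entry, 0);

    swapcontext(&demo_main_ctx, &demo_co_ctx); /* run until the yield */
    demo_trace[demo_trace_len++] = 2;
    swapcontext(&demo_main_ctx, &demo_co_ctx); /* resume until it finishes */
    demo_trace[demo_trace_len++] = 4;

    return demo_trace_len;
}
```

Because the coroutine is suspended only at demo_yield(), shared state needs
protecting only across such points, which is the role of something like a
CoMutex.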

 2- memory considerations on coroutines. Beside coroutines allow more
 readable code I wonder if somebody considered memory. For every
 coroutines a different stack has to be allocated. For instance
 ucontext and win32 implementation use 4mb. Assuming 128 concurrent AIO
 this require about 512mb of ram (mostly only committed but not used
 and coroutines are reused).

128 concurrent requests is a lot. And even then, it's only virtual
memory. I doubt that we're actually using much more than we do in the
old code with the AIOCBs (which will disappear and become local
variables when we complete the conversion).

 About snapshot and block i/o I think that using external snapshot
 would help making some stuff easier. By external snapshot I mean
 creating a new image with backing file as current image file and using
 this new image for future operations. This would allow for instance
 - support snapshot with every format (even raw)
 - making snapshot backup using external programs (even from different
 hosts using clustered file system and without many locking issues as
 original image is now read-only)
 - convert images live (just snapshot, qemu-img convert, remove snapshot)

These are things that are actively worked on. snapshot_blkdev is a
monitor command that already exists and does exactly what you describe.
For the rest, live block copy and image streaming are the keywords that
you should be looking for. We've had quite some discussions on these in
the past few weeks. You may also be interested in this wiki page:
http://wiki.qemu.org/Features/LiveBlockMigration

Kevin

[Qemu-devel] [PULL 00/21] Block patches

2011-07-19 Thread Kevin Wolf

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

  Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 19:43:00 
+)

are available in the git repository at:
  git://repo.or.cz/qemu/kevin.git for-anthony

Devin Nakamura (2):
  qemu-io: Fix formatting
  qemu-io: Fix if scoping bug

Fam Zheng (12):
  VMDK: introduce VmdkExtent
  VMDK: bugfix, align offset to cluster in get_whole_cluster
  VMDK: probe for monolithicFlat images
  VMDK: separate vmdk_open by format version
  VMDK: add field BDRVVmdkState.desc_offset
  VMDK: flush multiple extents
  VMDK: move 'static' cid_update flag to bs field
  VMDK: change get_cluster_offset return type
  VMDK: open/read/write for monolithicFlat image
  VMDK: create different subformats
  VMDK: fix coding style
  block: add bdrv_get_allocated_file_size() operation

Hannes Reinecke (4):
  iov: Update parameter usage in iov_(to|from)_buf()
  scsi: Add 'hba_private' to SCSIRequest
  scsi-disk: Fixup debugging statement
  scsi-disk: Mask out serial number EVPD

Luiz Capitulino (2):
  qemu-options.hx: Document missing -drive options
  qemu-config: Document -drive options

MORITA Kazutaka (1):
  sheepdog: add full data preallocation support

 block.c                |   19 +
 block.h                |    1 +
 block/raw-posix.c      |   21 +
 block/raw-win32.c      |   29 +
 block/sheepdog.c       |   71 ++-
 block/vmdk.c           | 1297 
 block_int.h            |    2 +
 hw/esp.c               |    2 +-
 hw/lsi53c895a.c        |   22 +-
 hw/scsi-bus.c          |    9 +-
 hw/scsi-disk.c         |   21 +-
 hw/scsi-generic.c      |    5 +-
 hw/scsi.h              |   10 +-
 hw/spapr_vscsi.c       |   29 +-
 hw/usb-msd.c           |    9 +-
 hw/virtio-net.c        |    2 +-
 hw/virtio-serial-bus.c |    2 +-
 iov.c                  |   49 +-
 iov.h                  |   10 +-
 qemu-config.c          |    6 +
 qemu-img.c             |   31 +-
 qemu-io.c              | 2653 
 qemu-options.hx        |    8 +
 23 files changed, 2462 insertions(+), 1846 deletions(-)

[Qemu-devel] [PATCH 12/21] VMDK: probe for monolithicFlat images

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Probe with the same behavior as VMware.
Recognize the image as a monolithicFlat descriptor file when the file is text
and the first effective line (not a comment line starting with '#' and not a
blank line) is either 'version=1' or 'version=2'. No spaces or upper case
characters are accepted.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |   45 +++--
 1 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 03a4619..f8a815c 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -103,10 +103,51 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
         return 0;
     magic = be32_to_cpu(*(uint32_t *)buf);
     if (magic == VMDK3_MAGIC ||
-        magic == VMDK4_MAGIC)
+        magic == VMDK4_MAGIC) {
         return 100;
-    else
+    } else {
+        const char *p = (const char *)buf;
+        const char *end = p + buf_size;
+        while (p < end) {
+            if (*p == '#') {
+                /* skip comment line */
+                while (p < end && *p != '\n') {
+                    p++;
+                }
+                p++;
+                continue;
+            }
+            if (*p == ' ') {
+                while (p < end && *p == ' ') {
+                    p++;
+                }
+                /* skip '\r' if windows line endings used. */
+                if (p < end && *p == '\r') {
+                    p++;
+                }
+                /* only accept blank lines before 'version=' line */
+                if (p == end || *p != '\n') {
+                    return 0;
+                }
+                p++;
+                continue;
+            }
+            if (end - p >= strlen("version=X\n")) {
+                if (strncmp("version=1\n", p, strlen("version=1\n")) == 0 ||
+                    strncmp("version=2\n", p, strlen("version=2\n")) == 0) {
+                    return 100;
+                }
+            }
+            if (end - p >= strlen("version=X\r\n")) {
+                if (strncmp("version=1\r\n", p, strlen("version=1\r\n")) == 0 ||
+                    strncmp("version=2\r\n", p, strlen("version=2\r\n")) == 0) {
+                    return 100;
+                }
+            }
+            return 0;
+        }
         return 0;
+    }
 }
 
 #define CHECK_CID 1
-- 
1.7.6
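
For readers who want to try the probing rule from the patch above outside of
QEMU, here is a small standalone sketch. The names demo_starts_with and
demo_is_vmdk_descriptor are made up for illustration and are not part of
block/vmdk.c. One deliberate difference, which is an assumption of this
sketch and not the patch's behavior: a completely empty line is also treated
as blank.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the bytes at p (bounded by end) begin with the string s. */
static int demo_starts_with(const char *p, const char *end, const char *s)
{
    size_t n = strlen(s);
    return (size_t)(end - p) >= n && memcmp(p, s, n) == 0;
}

/* Return 1 if buf looks like a monolithicFlat descriptor, else 0. */
int demo_is_vmdk_descriptor(const char *buf, size_t len)
{
    const char *p = buf;
    const char *end = buf + len;

    while (p < end) {
        const char *q = p;

        if (*p == '#') {
            /* skip the whole comment line */
            while (p < end && *p != '\n') {
                p++;
            }
            if (p < end) {
                p++;
            }
            continue;
        }
        /* a blank line is optional spaces, optional '\r', then '\n' */
        while (q < end && *q == ' ') {
            q++;
        }
        if (q < end && *q == '\r') {
            q++;
        }
        if (q < end && *q == '\n') {
            p = q + 1;
            continue;
        }
        if (q != p) {
            return 0; /* leading whitespace before content is rejected */
        }
        /* first effective line: must be exactly version=1 or version=2 */
        return demo_starts_with(p, end, "version=1\n") ||
               demo_starts_with(p, end, "version=2\n") ||
               demo_starts_with(p, end, "version=1\r\n") ||
               demo_starts_with(p, end, "version=2\r\n");
    }
    return 0;
}
```

Bounding every comparison by the buffer end matters here because a probe
buffer is not guaranteed to be NUL-terminated.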

[Qemu-devel] [PATCH 03/21] qemu-io: Fix if scoping bug

2011-07-19 Thread Kevin Wolf

From: Devin Nakamura devin...@gmail.com

Fix a bug caused by lack of braces in if statement

Lack of braces means that if (count & 0x1ff) is never reached

Signed-off-by: Devin Nakamura devin...@gmail.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-io.c |3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/qemu-io.c b/qemu-io.c
index e3c825f..a553d0c 100644
--- a/qemu-io.c
+++ b/qemu-io.c
@@ -449,7 +449,7 @@ static int read_f(int argc, char **argv)
         return 0;
     }
 
-    if (!pflag)
+    if (!pflag) {
         if (offset & 0x1ff) {
             printf("offset %" PRId64 " is not sector aligned\n",
                    offset);
@@ -460,6 +460,7 @@ static int read_f(int argc, char **argv)
                    count);
             return 0;
         }
+    }
 
     buf = qemu_io_alloc(count, 0xab);
 
 
-- 
1.7.6
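
As a general illustration of why the braces matter (a sketch with invented
function names, not the qemu-io code itself): without braces an if governs
only the single statement that follows it, so a second check written after
it is not under the control of the outer condition. The 0x1ff mask tests
alignment to 512-byte sectors.

```c
#include <stdint.h>

/* Without braces: only the offset check is governed by !pflag. */
int demo_check_unbraced(int pflag, int64_t offset, int64_t count)
{
    if (!pflag)
        if (offset & 0x1ff)
            return -1;      /* offset is not sector aligned */
    if (count & 0x1ff)      /* runs regardless of pflag */
        return -2;          /* count is not sector aligned */
    return 0;
}

/* With braces: both alignment checks are skipped when pflag is set. */
int demo_check_braced(int pflag, int64_t offset, int64_t count)
{
    if (!pflag) {
        if (offset & 0x1ff) {
            return -1;
        }
        if (count & 0x1ff) {
            return -2;
        }
    }
    return 0;
}
```

With pflag set and a count of 100, the first variant still rejects the
request while the second accepts it.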

[Qemu-devel] [PATCH 01/21] sheepdog: add full data preallocation support

2011-07-19 Thread Kevin Wolf

From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp

This introduces a qemu-img create option for sheepdog which allows the
data to be fully preallocated (note that sheepdog always preallocates
metadata).

The option is disabled by default; you need to enable it as
follows:

qemu-img create sheepdog:test -o preallocation=full 1G

Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp
Signed-off-by: FUJITA Tomonori fujita.tomon...@lab.ntt.co.jp
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/sheepdog.c |   71 +++--
 1 files changed, 68 insertions(+), 3 deletions(-)

diff --git a/block/sheepdog.c b/block/sheepdog.c
index 80d106c..77a4de5 100644
--- a/block/sheepdog.c
+++ b/block/sheepdog.c
@@ -1286,6 +1286,49 @@ static int do_sd_create(char *filename, int64_t vdi_size,
     return 0;
 }
 
+static int sd_prealloc(const char *filename)
+{
+    BlockDriverState *bs = NULL;
+    uint32_t idx, max_idx;
+    int64_t vdi_size;
+    void *buf = qemu_mallocz(SD_DATA_OBJ_SIZE);
+    int ret;
+
+    ret = bdrv_file_open(&bs, filename, BDRV_O_RDWR);
+    if (ret < 0) {
+        goto out;
+    }
+
+    vdi_size = bdrv_getlength(bs);
+    if (vdi_size < 0) {
+        ret = vdi_size;
+        goto out;
+    }
+    max_idx = DIV_ROUND_UP(vdi_size, SD_DATA_OBJ_SIZE);
+
+    for (idx = 0; idx < max_idx; idx++) {
+        /*
+         * The created image can be a cloned image, so we need to read
+         * a data from the source image.
+         */
+        ret = bdrv_pread(bs, idx * SD_DATA_OBJ_SIZE, buf, SD_DATA_OBJ_SIZE);
+        if (ret < 0) {
+            goto out;
+        }
+        ret = bdrv_pwrite(bs, idx * SD_DATA_OBJ_SIZE, buf, SD_DATA_OBJ_SIZE);
+        if (ret < 0) {
+            goto out;
+        }
+    }
+out:
+    if (bs) {
+        bdrv_delete(bs);
+    }
+    qemu_free(buf);
+
+    return ret;
+}
+
 static int sd_create(const char *filename, QEMUOptionParameter *options)
 {
     int ret;
@@ -1295,13 +1338,15 @@ static int sd_create(const char *filename, QEMUOptionParameter *options)
     BDRVSheepdogState s;
     char vdi[SD_MAX_VDI_LEN], tag[SD_MAX_VDI_TAG_LEN];
     uint32_t snapid;
+    int prealloc = 0;
+    const char *vdiname;
 
-    strstart(filename, "sheepdog:", (const char **)&filename);
+    strstart(filename, "sheepdog:", &vdiname);
 
     memset(&s, 0, sizeof(s));
     memset(vdi, 0, sizeof(vdi));
     memset(tag, 0, sizeof(tag));
-    if (parse_vdiname(&s, filename, vdi, &snapid, tag) < 0) {
+    if (parse_vdiname(&s, vdiname, vdi, &snapid, tag) < 0) {
         error_report("invalid filename");
         return -EINVAL;
     }
@@ -1311,6 +1356,16 @@ static int sd_create(const char *filename, QEMUOptionParameter *options)
             vdi_size = options->value.n;
         } else if (!strcmp(options->name, BLOCK_OPT_BACKING_FILE)) {
             backing_file = options->value.s;
+        } else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
+            if (!options->value.s || !strcmp(options->value.s, "off")) {
+                prealloc = 0;
+            } else if (!strcmp(options->value.s, "full")) {
+                prealloc = 1;
+            } else {
+                error_report("Invalid preallocation mode: '%s'",
+                             options->value.s);
+                return -EINVAL;
+            }
         }
         options++;
     }
@@ -1348,7 +1403,12 @@ static int sd_create(const char *filename, QEMUOptionParameter *options)
         bdrv_delete(bs);
     }
 
-    return do_sd_create((char *)vdi, vdi_size, base_vid, &vid, 0, s.addr, s.port);
+    ret = do_sd_create(vdi, vdi_size, base_vid, &vid, 0, s.addr, s.port);
+    if (!prealloc || ret) {
+        return ret;
+    }
+
+    return sd_prealloc(filename);
 }
 
 static void sd_close(BlockDriverState *bs)
@@ -1984,6 +2044,11 @@ static QEMUOptionParameter sd_create_options[] = {
         .type = OPT_STRING,
         .help = "File name of a base image"
     },
+    {
+        .name = BLOCK_OPT_PREALLOC,
+        .type = OPT_STRING,
+        .help = "Preallocation mode (allowed values: off, full)"
+    },
     { NULL }
 };
 
-- 
1.7.6
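
The two small decisions in the patch above (how the preallocation option
string is interpreted, and how many data objects the loop has to touch) can
be sketched in isolation. Everything named demo_* or DEMO_* is hypothetical,
and the 4 MiB object size is an assumption made for this sketch, standing in
for SD_DATA_OBJ_SIZE.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Assumed object size for illustration only (4 MiB). */
#define DEMO_OBJ_SIZE (UINT64_C(4) << 20)
#define DEMO_DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Map the option value to a flag: missing or "off" -> 0, "full" -> 1. */
int demo_parse_prealloc(const char *value, int *prealloc)
{
    if (!value || strcmp(value, "off") == 0) {
        *prealloc = 0;
        return 0;
    }
    if (strcmp(value, "full") == 0) {
        *prealloc = 1;
        return 0;
    }
    return -EINVAL; /* any other mode is rejected */
}

/* Number of objects a read-then-write-back pass must cover. */
uint64_t demo_object_count(uint64_t vdi_size)
{
    return DEMO_DIV_ROUND_UP(vdi_size, DEMO_OBJ_SIZE);
}
```

Rounding up means a trailing partial object is still visited, so the last
bytes of the image are preallocated as well.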

[Qemu-devel] [PATCH 10/21] VMDK: introduce VmdkExtent

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Introduce a VmdkExtent array in BDRVVmdkState so that it can hold multiple
image extents, in preparation for multiple file image support.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |  348 +-
 1 files changed, 246 insertions(+), 102 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 922b23d..3b78583 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -60,7 +60,11 @@ typedef struct {
 
 #define L2_CACHE_SIZE 16
 
-typedef struct BDRVVmdkState {
+typedef struct VmdkExtent {
+    BlockDriverState *file;
+    bool flat;
+    int64_t sectors;
+    int64_t end_sector;
     int64_t l1_table_offset;
     int64_t l1_backup_table_offset;
     uint32_t *l1_table;
@@ -74,7 +78,13 @@ typedef struct BDRVVmdkState {
     uint32_t l2_cache_counts[L2_CACHE_SIZE];
 
     unsigned int cluster_sectors;
+} VmdkExtent;
+
+typedef struct BDRVVmdkState {
     uint32_t parent_cid;
+    int num_extents;
+    /* Extent array with num_extents entries, ascend ordered by address */
+    VmdkExtent *extents;
 } BDRVVmdkState;
 
 typedef struct VmdkMetaData {
@@ -105,6 +115,19 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
 #define DESC_SIZE 20*SECTOR_SIZE   // 20 sectors of 512 bytes each
 #define HEADER_SIZE 512            // first sector of 512 bytes
 
+static void vmdk_free_extents(BlockDriverState *bs)
+{
+    int i;
+    BDRVVmdkState *s = bs->opaque;
+
+    for (i = 0; i < s->num_extents; i++) {
+        qemu_free(s->extents[i].l1_table);
+        qemu_free(s->extents[i].l2_cache);
+        qemu_free(s->extents[i].l1_backup_table);
+    }
+    qemu_free(s->extents);
+}
+
 static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 {
     char desc[DESC_SIZE];
@@ -358,11 +381,50 @@ static int vmdk_parent_open(BlockDriverState *bs)
     return 0;
 }
 
+/* Create and append extent to the extent array. Return the added VmdkExtent
+ * address. return NULL if allocation failed. */
+static VmdkExtent *vmdk_add_extent(BlockDriverState *bs,
+                           BlockDriverState *file, bool flat, int64_t sectors,
+                           int64_t l1_offset, int64_t l1_backup_offset,
+                           uint32_t l1_size,
+                           int l2_size, unsigned int cluster_sectors)
+{
+    VmdkExtent *extent;
+    BDRVVmdkState *s = bs->opaque;
+
+    s->extents = qemu_realloc(s->extents,
+                              (s->num_extents + 1) * sizeof(VmdkExtent));
+    extent = &s->extents[s->num_extents];
+    s->num_extents++;
+
+    memset(extent, 0, sizeof(VmdkExtent));
+    extent->file = file;
+    extent->flat = flat;
+    extent->sectors = sectors;
+    extent->l1_table_offset = l1_offset;
+    extent->l1_backup_table_offset = l1_backup_offset;
+    extent->l1_size = l1_size;
+    extent->l1_entry_sectors = l2_size * cluster_sectors;
+    extent->l2_size = l2_size;
+    extent->cluster_sectors = cluster_sectors;
+
+    if (s->num_extents > 1) {
+        extent->end_sector = (*(extent - 1)).end_sector + extent->sectors;
+    } else {
+        extent->end_sector = extent->sectors;
+    }
+    bs->total_sectors = extent->end_sector;
+    return extent;
+}
+
+
 static int vmdk_open(BlockDriverState *bs, int flags)
 {
     BDRVVmdkState *s = bs->opaque;
     uint32_t magic;
-    int l1_size, i;
+    int i;
+    uint32_t l1_size, l1_entry_sectors;
+    VmdkExtent *extent = NULL;
 
     if (bdrv_pread(bs->file, 0, &magic, sizeof(magic)) != sizeof(magic))
         goto fail;
@@ -370,32 +432,34 @@ static int vmdk_open(BlockDriverState *bs, int flags)
     magic = be32_to_cpu(magic);
     if (magic == VMDK3_MAGIC) {
         VMDK3Header header;
-
-        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header)) != sizeof(header))
+        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
+                != sizeof(header)) {
             goto fail;
-        s->cluster_sectors = le32_to_cpu(header.granularity);
-        s->l2_size = 1 << 9;
-        s->l1_size = 1 << 6;
-        bs->total_sectors = le32_to_cpu(header.disk_sectors);
-        s->l1_table_offset = le32_to_cpu(header.l1dir_offset) << 9;
-        s->l1_backup_table_offset = 0;
-        s->l1_entry_sectors = s->l2_size * s->cluster_sectors;
+        }
+        extent = vmdk_add_extent(bs, bs->file, false,
+                              le32_to_cpu(header.disk_sectors),
+                              le32_to_cpu(header.l1dir_offset) << 9, 0,
+                              1 << 6, 1 << 9, le32_to_cpu(header.granularity));
     } else if (magic == VMDK4_MAGIC) {
         VMDK4Header header;
-
-        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header)) != sizeof(header))
+        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
+                != sizeof(header)) {
             goto fail;
-        bs->total_sectors =
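
The bookkeeping behind end_sector can be shown in a reduced form. This is a
sketch under assumptions, not the driver code: the Demo* types keep only the
fields needed here, realloc replaces qemu_realloc, and demo_find_extent is a
hypothetical lookup helper that does not appear in the hunks shown above.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int64_t sectors;     /* length of this extent in sectors */
    int64_t end_sector;  /* first sector number after this extent */
} DemoExtent;

typedef struct {
    int num_extents;
    DemoExtent *extents; /* ordered by ascending address */
} DemoState;

/* Append an extent; end_sector accumulates over all earlier extents. */
DemoExtent *demo_add_extent(DemoState *s, int64_t sectors)
{
    DemoExtent *grown;
    DemoExtent *e;

    grown = realloc(s->extents, (s->num_extents + 1) * sizeof(*grown));
    if (!grown) {
        return NULL;
    }
    s->extents = grown;
    e = &grown[s->num_extents];
    e->sectors = sectors;
    if (s->num_extents > 0) {
        e->end_sector = grown[s->num_extents - 1].end_sector + sectors;
    } else {
        e->end_sector = sectors;
    }
    s->num_extents++;
    return e;
}

/* Hypothetical lookup: first extent whose end_sector is past sector_num. */
DemoExtent *demo_find_extent(DemoState *s, int64_t sector_num)
{
    int i;

    for (i = 0; i < s->num_extents; i++) {
        if (sector_num < s->extents[i].end_sector) {
            return &s->extents[i];
        }
    }
    return NULL; /* beyond the end of the image */
}
```

Note that a pointer returned by demo_add_extent may be invalidated by a
later append, because the array can move when it grows.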

[Qemu-devel] [PATCH 08/21] qemu-options.hx: Document missing -drive options

2011-07-19 Thread Kevin Wolf

From: Luiz Capitulino lcapitul...@redhat.com

They are 'werror', 'rerror' and 'readonly'.

Signed-off-by: Luiz Capitulino lcapitul...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-options.hx |8 
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index e6d7adc..64114dd 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -160,6 +160,14 @@ an untrusted format header.
 This option specifies the serial number to assign to the device.
 @item addr=@var{addr}
 Specify the controller's PCI address (if=virtio only).
+@item werror=@var{action},rerror=@var{action}
+Specify which @var{action} to take on write and read errors. Valid actions are:
+ignore (ignore the error and try to continue), stop (pause QEMU),
+report (report the error to the guest), enospc (pause QEMU only if the
+host disk is full; report the error to the guest otherwise).
+The default setting is @option{werror=enospc} and @option{rerror=report}.
+@item readonly
+Open drive @option{file} as read-only. Guest write attempts will fail.
 @end table
 
 By default, writethrough caching is used for all block device.  This means that
-- 
1.7.6

[Qemu-devel] [PATCH 04/21] iov: Update parameter usage in iov_(to|from)_buf()

2011-07-19 Thread Kevin Wolf

From: Hannes Reinecke h...@suse.de

iov_to_buf() has an 'offset' parameter, iov_from_buf() hasn't.
This patch adds the missing parameter to iov_from_buf().
It also renames the 'offset' parameter to 'iov_off' to
emphasize it's the offset into the iovec and not the buffer.

Signed-off-by: Hannes Reinecke h...@suse.de
Acked-by: Alexander Graf ag...@suse.de
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/virtio-net.c|2 +-
 hw/virtio-serial-bus.c |2 +-
 iov.c  |   49 ++-
 iov.h  |   10 
 4 files changed, 34 insertions(+), 29 deletions(-)

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index 6997e02..a32cc01 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -657,7 +657,7 @@ static ssize_t virtio_net_receive(VLANClientState *nc, const uint8_t *buf, size_
 
         /* copy in packet.  ugh */
         len = iov_from_buf(sg, elem.in_num,
-                           buf + offset, size - offset);
+                           buf + offset, 0, size - offset);
         total += len;
         offset += len;
         /* If buffers can't be merged, at this point we
diff --git a/hw/virtio-serial-bus.c b/hw/virtio-serial-bus.c
index 7f6db7b..53c58d0 100644
--- a/hw/virtio-serial-bus.c
+++ b/hw/virtio-serial-bus.c
@@ -103,7 +103,7 @@ static size_t write_to_port(VirtIOSerialPort *port,
         }
 
         len = iov_from_buf(elem.in_sg, elem.in_num,
-                           buf + offset, size - offset);
+                           buf + offset, 0, size - offset);
         offset += len;
 
         virtqueue_push(vq, &elem, len);
diff --git a/iov.c b/iov.c
index 588cd04..1e02791 100644
--- a/iov.c
+++ b/iov.c
@@ -14,56 +14,61 @@
 
 #include "iov.h"
 
-size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-                    const void *buf, size_t size)
+size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
+                    const void *buf, size_t iov_off, size_t size)
 {
-    size_t offset;
+    size_t iovec_off, buf_off;
     unsigned int i;
 
-    offset = 0;
-    for (i = 0; offset < size && i < iovcnt; i++) {
-        size_t len;
+    iovec_off = 0;
+    buf_off = 0;
+    for (i = 0; i < iov_cnt && size; i++) {
+        if (iov_off < (iovec_off + iov[i].iov_len)) {
+            size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off, size);
 
-        len = MIN(iov[i].iov_len, size - offset);
+            memcpy(iov[i].iov_base + (iov_off - iovec_off), buf + buf_off, len);
 
-        memcpy(iov[i].iov_base, buf + offset, len);
-        offset += len;
+            buf_off += len;
+            iov_off += len;
+            size -= len;
+        }
+        iovec_off += iov[i].iov_len;
     }
-    return offset;
+    return buf_off;
 }
 
-size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
-                  void *buf, size_t offset, size_t size)
+size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt,
+                  void *buf, size_t iov_off, size_t size)
 {
     uint8_t *ptr;
-    size_t iov_off, buf_off;
+    size_t iovec_off, buf_off;
     unsigned int i;
 
     ptr = buf;
-    iov_off = 0;
+    iovec_off = 0;
     buf_off = 0;
-    for (i = 0; i < iovcnt && size; i++) {
-        if (offset < (iov_off + iov[i].iov_len)) {
-            size_t len = MIN((iov_off + iov[i].iov_len) - offset , size);
+    for (i = 0; i < iov_cnt && size; i++) {
+        if (iov_off < (iovec_off + iov[i].iov_len)) {
+            size_t len = MIN((iovec_off + iov[i].iov_len) - iov_off , size);
 
-            memcpy(ptr + buf_off, iov[i].iov_base + (offset - iov_off), len);
+            memcpy(ptr + buf_off, iov[i].iov_base + (iov_off - iovec_off), len);
 
             buf_off += len;
-            offset += len;
+            iov_off += len;
             size -= len;
         }
-        iov_off += iov[i].iov_len;
+        iovec_off += iov[i].iov_len;
     }
     return buf_off;
 }
 
-size_t iov_size(const struct iovec *iov, const unsigned int iovcnt)
+size_t iov_size(const struct iovec *iov, const unsigned int iov_cnt)
 {
     size_t len;
     unsigned int i;
 
     len = 0;
-    for (i = 0; i < iovcnt; i++) {
+    for (i = 0; i < iov_cnt; i++) {
         len += iov[i].iov_len;
     }
     return len;
diff --git a/iov.h b/iov.h
index 60a8547..110f67a 100644
--- a/iov.h
+++ b/iov.h
@@ -12,8 +12,8 @@
 
 #include "qemu-common.h"
 
-size_t iov_from_buf(struct iovec *iov, unsigned int iovcnt,
-                    const void *buf, size_t size);
-size_t iov_to_buf(const struct iovec *iov, const unsigned int iovcnt,
-                  void *buf, size_t offset, size_t size);
-size_t iov_size(const struct iovec *iov, const unsigned int iovcnt);
+size_t iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
+                    const void *buf, size_t iov_off, size_t size);
+size_t iov_to_buf(const struct iovec *iov, const unsigned int iov_cnt,
+                  void *buf, size_t iov_off, size_t size);
+size_t
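
To see what the new iov_off parameter means in practice, here is a
self-contained sketch modelled on the patched iov_from_buf(). It is renamed
demo_iov_from_buf and uses its own DEMO_MIN macro so that it builds outside
the QEMU tree; struct iovec comes from the POSIX header sys/uio.h. The
offset counts bytes from the start of the whole scatter list, so a copy can
begin in the middle of one element and spill into the next.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Copy size bytes from buf into the iovec, starting iov_off bytes in. */
size_t demo_iov_from_buf(struct iovec *iov, unsigned int iov_cnt,
                         const void *buf, size_t iov_off, size_t size)
{
    const unsigned char *src = buf;
    size_t iovec_off = 0; /* offset of element i within the whole list */
    size_t buf_off = 0;   /* bytes consumed from buf so far */
    unsigned int i;

    for (i = 0; i < iov_cnt && size; i++) {
        if (iov_off < iovec_off + iov[i].iov_len) {
            /* bytes left in this element from iov_off onwards */
            size_t len = DEMO_MIN(iovec_off + iov[i].iov_len - iov_off, size);

            memcpy((unsigned char *)iov[i].iov_base + (iov_off - iovec_off),
                   src + buf_off, len);
            buf_off += len;
            iov_off += len;
            size -= len;
        }
        iovec_off += iov[i].iov_len;
    }
    return buf_off; /* number of bytes actually copied */
}
```

For example, with two 4-byte elements, writing "XYZ" at offset 3 puts 'X' in
the last byte of the first element and "YZ" at the start of the second.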

[Qemu-devel] [PATCH 11/21] VMDK: bugfix, align offset to cluster in get_whole_cluster

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

In get_whole_cluster, the offset is not aligned to the cluster when reading
from backing_hd. When the first write to the child is not at a cluster
boundary, data from the wrong address in the parent is copied to the child.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |8 +---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 3b78583..03a4619 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -514,21 +514,23 @@ static int get_whole_cluster(BlockDriverState *bs,
     /* 128 sectors * 512 bytes each = grain size 64KB */
     uint8_t  whole_grain[extent->cluster_sectors * 512];
 
-    // we will be here if it's first write on non-exist grain(cluster).
-    // try to read from parent image, if exist
+    /* we will be here if it's first write on non-exist grain(cluster).
+     * try to read from parent image, if exist */
     if (bs->backing_hd) {
         int ret;
 
         if (!vmdk_is_cid_valid(bs))
             return -1;
 
+        /* floor offset to cluster */
+        offset -= offset % (extent->cluster_sectors * 512);
         ret = bdrv_read(bs->backing_hd, offset >> 9, whole_grain,
                 extent->cluster_sectors);
         if (ret < 0) {
             return -1;
         }
 
-        /* Write grain only into the active image */
+        /* Write grain only into the active image */
         ret = bdrv_write(extent->file, cluster_offset, whole_grain,
                 extent->cluster_sectors);
         if (ret < 0) {
-- 
1.7.6
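
A worked example of the "floor offset to cluster" step, as a standalone
sketch (demo_floor_to_cluster is an invented name): with 128 sectors of 512
bytes per cluster the cluster size is 65536 bytes, so a write at byte offset
70000 must copy the parent data starting at 65536, not at 70000.

```c
#include <stdint.h>

/* Round a byte offset down to the start of its cluster. */
uint64_t demo_floor_to_cluster(uint64_t offset, uint32_t cluster_sectors)
{
    uint64_t cluster_bytes = (uint64_t)cluster_sectors * 512;

    return offset - offset % cluster_bytes;
}
```

Offsets that already sit on a cluster boundary are left unchanged.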

[Qemu-devel] [PATCH 09/21] qemu-config: Document -drive options

2011-07-19 Thread Kevin Wolf

From: Luiz Capitulino lcapitul...@gmail.com

Signed-off-by: Luiz Capitulino lcapitul...@gmail.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 qemu-config.c |6 ++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/qemu-config.c b/qemu-config.c
index c63741c..93d20c6 100644
--- a/qemu-config.c
+++ b/qemu-config.c
@@ -23,6 +23,7 @@ static QemuOptsList qemu_drive_opts = {
         },{
             .name = "index",
             .type = QEMU_OPT_NUMBER,
+            .help = "index number",
         },{
             .name = "cyls",
             .type = QEMU_OPT_NUMBER,
@@ -46,6 +47,7 @@ static QemuOptsList qemu_drive_opts = {
         },{
             .name = "snapshot",
             .type = QEMU_OPT_BOOL,
+            .help = "enable/disable snapshot mode",
         },{
             .name = "file",
             .type = QEMU_OPT_STRING,
@@ -65,12 +67,15 @@ static QemuOptsList qemu_drive_opts = {
         },{
             .name = "serial",
             .type = QEMU_OPT_STRING,
+            .help = "disk serial number",
         },{
             .name = "rerror",
             .type = QEMU_OPT_STRING,
+            .help = "read error action",
         },{
             .name = "werror",
             .type = QEMU_OPT_STRING,
+            .help = "write error action",
         },{
             .name = "addr",
             .type = QEMU_OPT_STRING,
@@ -78,6 +83,7 @@ static QemuOptsList qemu_drive_opts = {
         },{
             .name = "readonly",
             .type = QEMU_OPT_BOOL,
+            .help = "open drive file as read-only",
         },
         { /* end of list */ }
     },
-- 
1.7.6

[Qemu-devel] [PATCH 06/21] scsi-disk: Fixup debugging statement

2011-07-19 Thread Kevin Wolf

From: Hannes Reinecke h...@suse.de

A debugging statement wasn't converted to the new interface.

Signed-off-by: Hannes Reinecke h...@suse.de
Acked-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/scsi-disk.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index c2a99fe..5804662 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -1007,7 +1007,7 @@ static int32_t scsi_send_command(SCSIRequest *req, uint8_t *buf)
 
     command = buf[0];
     outbuf = (uint8_t *)r->iov.iov_base;
-    DPRINTF("Command: lun=%d tag=0x%x data=0x%02x", lun, tag, buf[0]);
+    DPRINTF("Command: lun=%d tag=0x%x data=0x%02x", req->lun, req->tag, buf[0]);
 
     if (scsi_req_parse(&r->req, buf) != 0) {
         BADF("Unsupported command length, command %x\n", command);
-- 
1.7.6

Re: [Qemu-devel] [Xen-devel] Re: Upstream QEMU and Xen unstable not working

2011-07-19 Thread Ian Campbell

On Mon, 2011-07-18 at 17:17 +0100, Stefano Stabellini wrote:
 On Mon, 18 Jul 2011, Wei Liu wrote:
  Stefano and Anthony, you once said that you were going to setup a
  public QEMU repository for Xen, how is it going now?
 
 We are getting there, but there are still too many xen patches floating
 around qemu-devel at the moment to announce a new qemu xen tree.

Isn't the presence of all those patches floating around qemu-devel, and
the need for people to trawl around collecting them so as to have a working
build, exactly the problem such a tree would be intended to solve? i.e. a
one stop place to pick up pending patches before they hit the main tree.

 However you can try the following branch for now:
 
 git://xenbits.xen.org/people/sstabellini/qemu-dm.git xen-next

Ian.

[Qemu-devel] [PATCH 13/21] VMDK: separate vmdk_open by format version

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Separate vmdk_open by subformats to:
* vmdk_open_vmdk3
* vmdk_open_vmdk4

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |  178 -
 1 files changed, 112 insertions(+), 66 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f8a815c..6d7b497 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -458,67 +458,20 @@ static VmdkExtent *vmdk_add_extent(BlockDriverState *bs,
     return extent;
 }
 
-
-static int vmdk_open(BlockDriverState *bs, int flags)
+static int vmdk_init_tables(BlockDriverState *bs, VmdkExtent *extent)
 {
-    BDRVVmdkState *s = bs->opaque;
-    uint32_t magic;
-    int i;
-    uint32_t l1_size, l1_entry_sectors;
-    VmdkExtent *extent = NULL;
-
-    if (bdrv_pread(bs->file, 0, &magic, sizeof(magic)) != sizeof(magic))
-        goto fail;
-
-    magic = be32_to_cpu(magic);
-    if (magic == VMDK3_MAGIC) {
-        VMDK3Header header;
-        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
-                != sizeof(header)) {
-            goto fail;
-        }
-        extent = vmdk_add_extent(bs, bs->file, false,
-                              le32_to_cpu(header.disk_sectors),
-                              le32_to_cpu(header.l1dir_offset) << 9, 0,
-                              1 << 6, 1 << 9, le32_to_cpu(header.granularity));
-    } else if (magic == VMDK4_MAGIC) {
-        VMDK4Header header;
-        if (bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header))
-                != sizeof(header)) {
-            goto fail;
-        }
-        l1_entry_sectors = le32_to_cpu(header.num_gtes_per_gte)
-                            * le64_to_cpu(header.granularity);
-        l1_size = (le64_to_cpu(header.capacity) + l1_entry_sectors - 1)
-                    / l1_entry_sectors;
-        extent = vmdk_add_extent(bs, bs->file, false,
-                              le64_to_cpu(header.capacity),
-                              le64_to_cpu(header.gd_offset) << 9,
-                              le64_to_cpu(header.rgd_offset) << 9,
-                              l1_size,
-                              le32_to_cpu(header.num_gtes_per_gte),
-                              le64_to_cpu(header.granularity));
-        if (extent->l1_entry_sectors <= 0) {
-            goto fail;
-        }
-        // try to open parent images, if exist
-        if (vmdk_parent_open(bs) != 0)
-            goto fail;
-        // write the CID once after the image creation
-        s->parent_cid = vmdk_read_cid(bs,1);
-    } else {
-        goto fail;
-    }
+    int ret;
+    int l1_size, i;
 
     /* read the L1 table */
     l1_size = extent->l1_size * sizeof(uint32_t);
     extent->l1_table = qemu_malloc(l1_size);
-    if (bdrv_pread(bs->file,
-            extent->l1_table_offset,
-            extent->l1_table,
-            l1_size)
-        != l1_size) {
-        goto fail;
+    ret = bdrv_pread(extent->file,
+                    extent->l1_table_offset,
+                    extent->l1_table,
+                    l1_size);
+    if (ret < 0) {
+        goto fail_l1;
     }
     for (i = 0; i < extent->l1_size; i++) {
         le32_to_cpus(&extent->l1_table[i]);
@@ -526,12 +479,12 @@ static int vmdk_open(BlockDriverState *bs, int flags)
 
     if (extent->l1_backup_table_offset) {
         extent->l1_backup_table = qemu_malloc(l1_size);
-        if (bdrv_pread(bs->file,
-                    extent->l1_backup_table_offset,
-                    extent->l1_backup_table,
-                    l1_size)
-                != l1_size) {
-            goto fail;
+        ret = bdrv_pread(extent->file,
+                        extent->l1_backup_table_offset,
+                        extent->l1_backup_table,
+                        l1_size);
+        if (ret < 0) {
+            goto fail_l1b;
         }
         for (i = 0; i < extent->l1_size; i++) {
             le32_to_cpus(&extent->l1_backup_table[i]);
@@ -541,9 +494,102 @@ static int vmdk_open(BlockDriverState *bs, int flags)
     extent->l2_cache =
         qemu_malloc(extent->l2_size * L2_CACHE_SIZE * sizeof(uint32_t));
     return 0;
+ fail_l1b:
+    qemu_free(extent->l1_backup_table);
+ fail_l1:
+    qemu_free(extent->l1_table);
+    return ret;
+}
+
+static int vmdk_open_vmdk3(BlockDriverState *bs, int flags)
+{
+    int ret;
+    uint32_t magic;
+    VMDK3Header header;
+    VmdkExtent *extent;
+
+    ret = bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header));
+    if (ret < 0) {
+        goto fail;
+    }
+    extent = vmdk_add_extent(bs,
+                             bs->file, false,
+                             le32_to_cpu(header.disk_sectors),
+                             le32_to_cpu(header.l1dir_offset) << 9,
+                             0, 1 << 6, 1 << 9,
+                             le32_to_cpu(header.granularity));
+    ret = vmdk_init_tables(bs, extent);
+    if (ret) {
+        /*

[Qemu-devel] [PATCH 15/21] VMDK: flush multiple extents

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Flush all the files that are referenced by the image.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 529ae90..f6d2986 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1072,7 +1072,17 @@ static void vmdk_close(BlockDriverState *bs)
 
 static int vmdk_flush(BlockDriverState *bs)
 {
-return bdrv_flush(bs->file);
+int i, ret, err;
+BDRVVmdkState *s = bs->opaque;
+
+ret = bdrv_flush(bs->file);
+for (i = 0; i < s->num_extents; i++) {
+err = bdrv_flush(s->extents[i].file);
+if (err < 0) {
+ret = err;
+}
+}
+return ret;
 }
 
 
-- 
1.7.6
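
As a rough illustration of the flush loop in this patch (a standalone sketch, not the QEMU code: `Extent`, `stub_flush` and `flush_all` are hypothetical names), every file is flushed in turn, the loop continues after a failure, and the most recent error is what the caller sees:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a file that can be flushed. */
typedef struct Extent {
    int (*flush_fn)(struct Extent *e); /* returns 0 or a negative errno */
    int result;                        /* value the stub flush returns */
    int flushed;                       /* set once flush_fn has run */
} Extent;

/* Stub flush used for illustration: records the call, returns result. */
static int stub_flush(Extent *e)
{
    e->flushed = 1;
    return e->result;
}

/* Flush the main file, then every extent. Keep going after an error and
 * report the most recent failure, mirroring the loop in the patch. */
static int flush_all(Extent *main_file, Extent *extents, size_t n)
{
    int ret;
    size_t i;

    assert(main_file != NULL);
    ret = main_file->flush_fn(main_file);
    for (i = 0; i < n; i++) {
        int err = extents[i].flush_fn(&extents[i]);
        if (err < 0) {
            ret = err;
        }
    }
    return ret;
}
```

Continuing after a failed flush means the remaining extents still get a chance to reach disk, at the cost of reporting only one of several possible errors.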

[Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Kevin Wolf

From: Hannes Reinecke h...@suse.de

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for drivers that handle several commands
in parallel.
'tag' is still being kept for tracing purposes.

Signed-off-by: Hannes Reinecke h...@suse.de
Acked-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/esp.c  |2 +-
 hw/lsi53c895a.c   |   22 --
 hw/scsi-bus.c |9 ++---
 hw/scsi-disk.c|4 ++--
 hw/scsi-generic.c |5 +++--
 hw/scsi.h |   10 +++---
 hw/spapr_vscsi.c  |   29 +
 hw/usb-msd.c  |9 +
 8 files changed, 37 insertions(+), 53 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index aa50800..9ddd637 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t busid)
 
 DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
 lun = busid & 7;
-s->current_req = scsi_req_new(s->current_dev, 0, lun);
+s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
 datalen = scsi_req_enqueue(s->current_req, buf);
 s->ti_size = datalen;
 if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..69eec1d 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t tag)
 static void lsi_request_cancelled(SCSIRequest *req)
 {
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
-lsi_request *p;
+lsi_request *p = req->hba_private;
 
 if (s->current && req == s->current->req) {
 scsi_req_unref(req);
@@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
 return;
 }
 
-p = lsi_find_by_tag(s, req->tag);
 if (p) {
 QTAILQ_REMOVE(&s->queue, p, next);
 scsi_req_unref(req);
@@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)
 
 /* Record that data is available for a queued command.  Returns zero if
 the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
 {
-lsi_request *p;
-
-p = lsi_find_by_tag(s, tag);
-if (!p) {
-BADF("IO with unknown tag %d\n", tag);
-return 1;
-}
+lsi_request *p = req->hba_private;
 
 if (p->pending) {
-BADF("Multiple IO pending for tag %d\n", tag);
+BADF("Multiple IO pending for request %p\n", p);
 }
 p->pending = len;
 /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t len)
 LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 int out;
 
-if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
 (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
-if (lsi_queue_tag(s, req->tag, len)) {
+if (lsi_queue_req(s, req, len)) {
 return;
 }
 }
@@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
 assert(s->current == NULL);
 s->current = qemu_mallocz(sizeof(lsi_request));
 s->current->tag = s->select_tag;
-s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
+   s->current);
 
 n = scsi_req_enqueue(s->current->req, buf);
 if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..8b1a412 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
 return res;
 }
 
-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
+uint32_t lun, void *hba_private)
 {
 SCSIRequest *req;
 
@@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t l
 req->dev = d;
 req->tag = tag;
 req->lun = lun;
+req->hba_private = hba_private;
 req->status = -1;
 trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
 return req;
 }
 
-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
+  void *hba_private)
 {
-return d->info->alloc_req(d, tag, lun);
+return d->info->alloc_req(d, tag, lun, hba_private);
 }
 
 uint8_t *scsi_req_get_buf(SCSIRequest *req)
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a8c7372..c2a99fe 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -81,13 +81,13 @@ static int scsi_handle_rw_error(SCSIDiskReq *r, int error, 
int type);
 static int
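
The idea behind 'hba_private' can be sketched outside QEMU as follows. This is only an illustration under assumed names (`Request`, `HbaCommand`, `request_init` and `command_from_request` are hypothetical, not the QEMU types): the driver stores a pointer to its own per-command state in the request when it creates it, and later gets that state back directly instead of searching its queue for a matching tag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generic request, loosely modelled on SCSIRequest. */
typedef struct Request {
    uint32_t tag;       /* kept for tracing only */
    void *hba_private;  /* opaque pointer owned by the HBA driver */
} Request;

/* Hypothetical per-command state of one HBA driver. */
typedef struct HbaCommand {
    uint32_t tag;
    uint32_t pending;
    Request req;
} HbaCommand;

/* Attach the driver's own state when the request is created. */
static void request_init(Request *req, uint32_t tag, void *hba_private)
{
    assert(req != NULL);
    req->tag = tag;
    req->hba_private = hba_private;
}

/* Completion path: recover the driver state directly from the request
 * instead of walking a queue looking for a matching tag. */
static HbaCommand *command_from_request(Request *req)
{
    assert(req != NULL);
    return req->hba_private;
}
```

A driver that has no per-command state of its own can simply pass NULL, as the esp.c hunk does.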

[Qemu-devel] [PATCH 16/21] VMDK: move 'static' cid_update flag to bs field

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Cid_update is the flag for updating CID on first write after opening the
image. This should be per image open rather than per program life cycle,
so change it from static var of vmdk_write to a field in BDRVVmdkState.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f6d2986..8dc58a8 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -82,6 +82,7 @@ typedef struct VmdkExtent {
 
 typedef struct BDRVVmdkState {
 int desc_offset;
+bool cid_updated;
 uint32_t parent_cid;
 int num_extents;
 /* Extent array with num_extents entries, ascend ordered by address */
@@ -853,7 +854,6 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
 int n;
 int64_t index_in_cluster;
 uint64_t cluster_offset;
-static int cid_update = 0;
 VmdkMetaData m_data;
 
 if (sector_num > bs->total_sectors) {
@@ -900,9 +900,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
 buf += n * 512;
 
 // update CID on the first write every time the virtual disk is opened
-if (!cid_update) {
+if (!s->cid_updated) {
 vmdk_write_cid(bs, time(NULL));
-cid_update++;
+s->cid_updated = true;
 }
 }
 return 0;
-- 
1.7.6
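
To see why the flag has to live in the per-image state, consider this small sketch (hypothetical names throughout; `ImageState` merely stands in for BDRVVmdkState and the counter for the real CID write). With a function-local static, only the first image written in the whole process would get its CID updated; with a field, each open image does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-image state, standing in for BDRVVmdkState. */
typedef struct ImageState {
    bool cid_updated;   /* has the CID been rewritten since open? */
    int cid_writes;     /* counts simulated CID updates */
} ImageState;

/* Simulated write path: update the CID only on the first write
 * after this particular image was opened. */
static void image_write(ImageState *s)
{
    assert(s != NULL);
    if (!s->cid_updated) {
        s->cid_writes++;        /* stands in for vmdk_write_cid() */
        s->cid_updated = true;
    }
}
```

Writing twice to one image and once to another leaves each with exactly one CID update, which is the per-open behaviour the commit message asks for.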

[Qemu-devel] [PATCH 18/21] VMDK: open/read/write for monolithicFlat image

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Parse the vmdk descriptor file and open a mono flat image.
Read/write the flat extent.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |  171 +-
 1 files changed, 158 insertions(+), 13 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index f637d98..e1fb962 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -65,6 +65,7 @@ typedef struct VmdkExtent {
 bool flat;
 int64_t sectors;
 int64_t end_sector;
+int64_t flat_start_offset;
 int64_t l1_table_offset;
 int64_t l1_backup_table_offset;
 uint32_t *l1_table;
@@ -407,9 +408,10 @@ fail:
 static int vmdk_parent_open(BlockDriverState *bs)
 {
 char *p_name;
-char desc[DESC_SIZE];
+char desc[DESC_SIZE + 1];
 BDRVVmdkState *s = bs->opaque;
 
+desc[DESC_SIZE] = '\0';
 if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
 return -1;
 }
@@ -584,6 +586,144 @@ static int vmdk_open_vmdk4(BlockDriverState *bs, int flags)
 return ret;
 }
 
+/* find an option value out of descriptor file */
+static int vmdk_parse_description(const char *desc, const char *opt_name,
+char *buf, int buf_size)
+{
+char *opt_pos, *opt_end;
+const char *end = desc + strlen(desc);
+
+opt_pos = strstr(desc, opt_name);
+if (!opt_pos) {
+return -1;
+}
+/* Skip "=\" following opt_name */
+opt_pos += strlen(opt_name) + 2;
+if (opt_pos >= end) {
+return -1;
+}
+opt_end = opt_pos;
+while (opt_end < end && *opt_end != '"') {
+opt_end++;
+}
+if (opt_end == end || buf_size < opt_end - opt_pos + 1) {
+return -1;
+}
+pstrcpy(buf, opt_end - opt_pos + 1, opt_pos);
+return 0;
+}
+
+static int vmdk_parse_extents(const char *desc, BlockDriverState *bs,
+const char *desc_file_path)
+{
+int ret;
+char access[11];
+char type[11];
+char fname[512];
+const char *p = desc;
+int64_t sectors = 0;
+int64_t flat_offset;
+
+while (*p) {
+/* parse extent line:
+ * RW [size in sectors] FLAT "file-name.vmdk" OFFSET
+ * or
+ * RW [size in sectors] SPARSE "file-name.vmdk"
+ */
+flat_offset = -1;
+ret = sscanf(p, "%10s %" SCNd64 " %10s %511s %" SCNd64,
+access, &sectors, type, fname, &flat_offset);
+if (ret < 4 || strcmp(access, "RW")) {
+goto next_line;
+} else if (!strcmp(type, "FLAT")) {
+if (ret != 5 || flat_offset < 0) {
+return -EINVAL;
+}
+} else if (ret != 4) {
+return -EINVAL;
+}
+
+/* trim the quotation marks around */
+if (fname[0] == '"') {
+memmove(fname, fname + 1, strlen(fname));
+if (strlen(fname) <= 1 || fname[strlen(fname) - 1] != '"') {
+return -EINVAL;
+}
+fname[strlen(fname) - 1] = '\0';
+}
+if (sectors <= 0 ||
+(strcmp(type, "FLAT") && strcmp(type, "SPARSE")) ||
+(strcmp(access, "RW"))) {
+goto next_line;
+}
+
+/* save to extents array */
+if (!strcmp(type, "FLAT")) {
+/* FLAT extent */
+char extent_path[PATH_MAX];
+BlockDriverState *extent_file;
+VmdkExtent *extent;
+
+path_combine(extent_path, sizeof(extent_path),
+desc_file_path, fname);
+ret = bdrv_file_open(&extent_file, extent_path, bs->open_flags);
+if (ret) {
+return ret;
+}
+extent = vmdk_add_extent(bs, extent_file, true, sectors,
+0, 0, 0, 0, sectors);
+extent->flat_start_offset = flat_offset;
+} else {
+/* SPARSE extent, not supported for now */
+fprintf(stderr,
+"VMDK: Not supported extent type \"%s\".\n", type);
+return -ENOTSUP;
+}
+next_line:
+/* move to next line */
+while (*p && *p != '\n') {
+p++;
+}
+p++;
+}
+return 0;
+}
+
+static int vmdk_open_desc_file(BlockDriverState *bs, int flags)
+{
+int ret;
+char buf[2048];
+char ct[128];
+BDRVVmdkState *s = bs->opaque;
+
+ret = bdrv_pread(bs->file, 0, buf, sizeof(buf));
+if (ret < 0) {
+return ret;
+}
+buf[2047] = '\0';
+if (vmdk_parse_description(buf, "createType", ct, sizeof(ct))) {
+return -EINVAL;
+}
+if (strcmp(ct, "monolithicFlat")) {
+fprintf(stderr,
+"VMDK: Not supported image type \"%s\".\n", ct);
+return -ENOTSUP;
+}
+s->desc_offset = 0;
+ret = vmdk_parse_extents(buf, bs, bs->file->filename);
+if (ret) {
+return ret;
+}
+
+/* try to open parent images, if exist */
+if
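
The descriptor lookup done by vmdk_parse_description() can be tried outside QEMU with a sketch like the one below. It is not the patch's code: `parse_option` is a hypothetical name, it uses plain libc instead of pstrcpy(), and unlike the patch it also checks that the two characters after the option name really are `="`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the value of name="value" from desc into buf.
 * Returns 0 on success, -1 if the option is missing, malformed,
 * unterminated, or does not fit into buf (including the NUL). */
static int parse_option(const char *desc, const char *name,
                        char *buf, size_t buf_size)
{
    const char *pos, *end;
    size_t len;

    assert(desc != NULL && name != NULL && buf != NULL);
    pos = strstr(desc, name);
    if (!pos) {
        return -1;
    }
    pos += strlen(name);
    /* Expect the two characters =" right after the option name. */
    if (pos[0] != '=' || pos[1] != '"') {
        return -1;
    }
    pos += 2;
    end = strchr(pos, '"');
    if (!end) {
        return -1;
    }
    len = (size_t)(end - pos);
    if (len + 1 > buf_size) {
        return -1;
    }
    memcpy(buf, pos, len);
    buf[len] = '\0';
    return 0;
}
```

For a descriptor containing the line `createType="monolithicFlat"`, looking up `createType` yields `monolithicFlat`; a value with no closing quote or one longer than the buffer is rejected rather than truncated.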

[Qemu-devel] [PATCH] qcow2: Use Qcow2Cache in writeback mode during loadvm/savevm

2011-07-19 Thread Kevin Wolf

In snapshotting there is no guest involved, so we can safely use a writeback
mode and do the flushes in the right place (i.e. at the very end). This
improves the time that creating/restoring an internal snapshot takes with an
image in writethrough mode.

Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/qcow2-cache.c|   12 
 block/qcow2-refcount.c |   38 +++---
 block/qcow2.h  |2 ++
 3 files changed, 41 insertions(+), 11 deletions(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 3824739..8408847 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -312,3 +312,15 @@ found:
 c->entries[i].dirty = true;
 }
 
+bool qcow2_cache_set_writethrough(BlockDriverState *bs, Qcow2Cache *c,
+bool enable)
+{
+bool old = c->writethrough;
+
+if (!old && enable) {
+qcow2_cache_flush(bs, c);
+}
+
+c->writethrough = enable;
+return old;
+}
diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
index ac95b88..14b2f67 100644
--- a/block/qcow2-refcount.c
+++ b/block/qcow2-refcount.c
@@ -705,8 +705,15 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 BDRVQcowState *s = bs->opaque;
 uint64_t *l1_table, *l2_table, l2_offset, offset, l1_size2, l1_allocated;
 int64_t old_offset, old_l2_offset;
-int i, j, l1_modified, nb_csectors, refcount;
+int i, j, l1_modified = 0, nb_csectors, refcount;
 int ret;
+bool old_l2_writethrough, old_refcount_writethrough;
+
+/* Switch caches to writeback mode during update */
+old_l2_writethrough =
+qcow2_cache_set_writethrough(bs, s->l2_table_cache, false);
+old_refcount_writethrough =
+qcow2_cache_set_writethrough(bs, s->refcount_block_cache, false);
 
 l2_table = NULL;
 l1_table = NULL;
@@ -720,7 +727,11 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 l1_allocated = 1;
 if (bdrv_pread(bs->file, l1_table_offset,
 l1_table, l1_size2) != l1_size2)
+{
+ret = -EIO;
 goto fail;
+}
+
 for(i = 0;i < l1_size; i++)
 be64_to_cpus(&l1_table[i]);
 } else {
@@ -729,7 +740,6 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 l1_allocated = 0;
 }
 
-l1_modified = 0;
 for(i = 0; i < l1_size; i++) {
 l2_offset = l1_table[i];
 if (l2_offset) {
@@ -773,6 +783,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 }
 
 if (refcount < 0) {
+ret = -EIO;
 goto fail;
 }
 }
@@ -803,6 +814,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 refcount = get_refcount(bs, l2_offset >> s->cluster_bits);
 }
 if (refcount < 0) {
+ret = -EIO;
 goto fail;
 } else if (refcount == 1) {
 l2_offset |= QCOW_OFLAG_COPIED;
@@ -813,6 +825,18 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 }
 }
 }
+
+ret = 0;
+fail:
+if (l2_table) {
+qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
+}
+
+/* Enable writethrough cache mode again */
+qcow2_cache_set_writethrough(bs, s->l2_table_cache, old_l2_writethrough);
+qcow2_cache_set_writethrough(bs, s->refcount_block_cache,
+old_refcount_writethrough);
+
 if (l1_modified) {
 for(i = 0; i < l1_size; i++)
 cpu_to_be64s(&l1_table[i]);
@@ -824,15 +848,7 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs,
 }
 if (l1_allocated)
 qemu_free(l1_table);
-return 0;
- fail:
-if (l2_table) {
-qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
-}
-
-if (l1_allocated)
-qemu_free(l1_table);
-return -EIO;
+return ret;
 }
 
 
diff --git a/block/qcow2.h b/block/qcow2.h
index e1ae3e8..6a0a21b 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -228,6 +228,8 @@ int qcow2_read_snapshots(BlockDriverState *bs);
 Qcow2Cache *qcow2_cache_create(BlockDriverState *bs, int num_tables,
 bool writethrough);
 int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
+bool qcow2_cache_set_writethrough(BlockDriverState *bs, Qcow2Cache *c,
+bool enable);
 
 void qcow2_cache_entry_mark_dirty(Qcow2Cache *c, void *table);
 int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
-- 
1.7.6
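
The save/switch/restore pattern used around the snapshot update can be sketched in isolation like this. All names here (`Cache`, `cache_flush`, `cache_set_writethrough`, `batch_update`) are hypothetical stand-ins, not the Qcow2Cache API; the point is only that the setter returns the previous mode and flushes when going from writeback back to writethrough.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cache with a write mode and a flush counter. */
typedef struct Cache {
    bool writethrough;
    int dirty;      /* number of unwritten entries */
    int flushes;    /* number of flushes performed */
} Cache;

static void cache_flush(Cache *c)
{
    c->dirty = 0;
    c->flushes++;
}

/* Switch the mode and return the previous one. Going from writeback
 * to writethrough flushes first so no dirty entry is left behind. */
static bool cache_set_writethrough(Cache *c, bool enable)
{
    bool old = c->writethrough;

    if (!old && enable) {
        cache_flush(c);
    }
    c->writethrough = enable;
    return old;
}

/* Run a batch of updates in writeback mode, then restore the mode. */
static void batch_update(Cache *c, int updates)
{
    bool old = cache_set_writethrough(c, false);

    assert(updates >= 0);
    c->dirty += updates;    /* stands in for the refcount/L2 updates */
    cache_set_writethrough(c, old);
}
```

A cache that started in writethrough mode ends the batch with a single flush and no dirty entries; one that was already in writeback mode is left untouched, which is why the old value is saved rather than unconditionally re-enabling writethrough.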

[Qemu-devel] [PATCH 14/21] VMDK: add field BDRVVmdkState.desc_offset

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

There are several occurrences of the magic number 0x200 as the descriptor
offset within a mono sparse image file. This is not the case for images
with a separate descriptor file. So a field is added to BDRVVmdkState to
hold the correct value.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |   27 ++-
 1 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 6d7b497..529ae90 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -81,6 +81,7 @@ typedef struct VmdkExtent {
 } VmdkExtent;
 
 typedef struct BDRVVmdkState {
+int desc_offset;
 uint32_t parent_cid;
 int num_extents;
 /* Extent array with num_extents entries, ascend ordered by address */
@@ -175,10 +176,11 @@ static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 uint32_t cid;
 const char *p_name, *cid_str;
 size_t cid_str_size;
+BDRVVmdkState *s = bs->opaque;
 
-/* the descriptor offset = 0x200 */
-if (bdrv_pread(bs->file, 0x200, desc, DESC_SIZE) != DESC_SIZE)
+if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
 return 0;
+}
 
 if (parent) {
 cid_str = "parentCID";
@@ -200,10 +202,12 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
 {
 char desc[DESC_SIZE], tmp_desc[DESC_SIZE];
 char *p_name, *tmp_str;
+BDRVVmdkState *s = bs->opaque;
 
-/* the descriptor offset = 0x200 */
-if (bdrv_pread(bs->file, 0x200, desc, DESC_SIZE) != DESC_SIZE)
-return -1;
+memset(desc, 0, sizeof(desc));
+if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
+return -EIO;
+}
 
 tmp_str = strstr(desc,"parentCID");
 pstrcpy(tmp_desc, sizeof(tmp_desc), tmp_str);
@@ -213,8 +217,9 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
 pstrcat(desc, sizeof(desc), tmp_desc);
 }
 
-if (bdrv_pwrite_sync(bs->file, 0x200, desc, DESC_SIZE) < 0)
-return -1;
+if (bdrv_pwrite_sync(bs->file, s->desc_offset, desc, DESC_SIZE) < 0) {
+return -EIO;
+}
 return 0;
 }
 
@@ -402,10 +407,11 @@ static int vmdk_parent_open(BlockDriverState *bs)
 {
 char *p_name;
 char desc[DESC_SIZE];
+BDRVVmdkState *s = bs->opaque;
 
-/* the descriptor offset = 0x200 */
-if (bdrv_pread(bs->file, 0x200, desc, DESC_SIZE) != DESC_SIZE)
+if (bdrv_pread(bs->file, s->desc_offset, desc, DESC_SIZE) != DESC_SIZE) {
 return -1;
+}
 
 if ((p_name = strstr(desc,"parentFileNameHint")) != NULL) {
 char *end_name;
@@ -506,8 +512,10 @@ static int vmdk_open_vmdk3(BlockDriverState *bs, int flags)
 int ret;
 uint32_t magic;
 VMDK3Header header;
+BDRVVmdkState *s = bs->opaque;
 VmdkExtent *extent;
 
+s->desc_offset = 0x200;
 ret = bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header));
 if (ret < 0) {
 goto fail;
@@ -539,6 +547,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs, int flags)
 BDRVVmdkState *s = bs->opaque;
 VmdkExtent *extent;
 
+s->desc_offset = 0x200;
 ret = bdrv_pread(bs->file, sizeof(magic), &header, sizeof(header));
 if (ret < 0) {
 goto fail;
-- 
1.7.6

[Qemu-devel] [PATCH 07/21] scsi-disk: Mask out serial number EVPD

2011-07-19 Thread Kevin Wolf

From: Hannes Reinecke h...@suse.de

If the serial number is not set we should mask it out in the
list of supported VPD pages and mark it as not supported.

Signed-off-by: Hannes Reinecke h...@suse.de
Acked-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 hw/scsi-disk.c |   15 ---
 1 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index 5804662..05d14ab 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -398,7 +398,8 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
 "buffer size %zd\n", req->cmd.xfer);
 pages = buflen++;
 outbuf[buflen++] = 0x00; // list of supported pages (this page)
-outbuf[buflen++] = 0x80; // unit serial number
+if (s->serial)
+outbuf[buflen++] = 0x80; // unit serial number
 outbuf[buflen++] = 0x83; // device identification
 if (s->drive_kind == SCSI_HD) {
 outbuf[buflen++] = 0xb0; // block limits
@@ -409,8 +410,14 @@ static int scsi_disk_emulate_inquiry(SCSIRequest *req, uint8_t *outbuf)
 }
 case 0x80: /* Device serial number, optional */
 {
-int l = strlen(s->serial);
+int l;
 
+if (!s->serial) {
+DPRINTF("Inquiry (EVPD[Serial number] not supported\n");
+return -1;
+}
+
+l = strlen(s->serial);
 if (l > req->cmd.xfer)
 l = req->cmd.xfer;
 if (l > 20)
@@ -1203,7 +1210,9 @@ static int scsi_initfn(SCSIDevice *dev, SCSIDriveKind kind)
 if (!s->serial) {
 /* try to fall back to value set with legacy -drive serial=... */
 dinfo = drive_get_by_blockdev(s->bs);
-s->serial = qemu_strdup(*dinfo->serial ? dinfo->serial : "0");
+if (*dinfo->serial) {
+s->serial = qemu_strdup(dinfo->serial);
+}
 }
 
 if (!s->version) {
-- 
1.7.6
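
The effect on the "supported VPD pages" list can be shown with a small standalone sketch. `build_vpd_page_list` is a hypothetical helper, not a function in hw/scsi-disk.c; the page codes are the ones that appear in the hunk above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill out[] with the supported VPD page codes and return how many
 * were written. serial may be NULL when no serial number is set. */
static size_t build_vpd_page_list(uint8_t *out, size_t out_size,
                                  const char *serial, int is_hard_disk)
{
    size_t n = 0;

    assert(out != NULL && out_size >= 4);
    out[n++] = 0x00;            /* list of supported pages (this page) */
    if (serial) {
        out[n++] = 0x80;        /* unit serial number */
    }
    out[n++] = 0x83;            /* device identification */
    if (is_hard_disk) {
        out[n++] = 0xb0;        /* block limits */
    }
    return n;
}
```

Without a serial number the list goes straight from 0x00 to 0x83, so an initiator never sees page 0x80 advertised and then rejected.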

[Qemu-devel] [PATCH 20/21] VMDK: fix coding style

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Conform coding style in vmdk.c to pass scripts/checkpatch.pl checks.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |   76 +++---
 1 files changed, 46 insertions(+), 30 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index b53c5f5..de08d0c 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -102,8 +102,9 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
 {
 uint32_t magic;
 
-if (buf_size < 4)
+if (buf_size < 4) {
 return 0;
+}
 magic = be32_to_cpu(*(uint32_t *)buf);
 if (magic == VMDK3_MAGIC ||
 magic == VMDK4_MAGIC) {
@@ -193,9 +194,10 @@ static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
 cid_str_size = sizeof("CID");
 }
 
-if ((p_name = strstr(desc,cid_str)) != NULL) {
+p_name = strstr(desc, cid_str);
+if (p_name != NULL) {
 p_name += cid_str_size;
-sscanf(p_name,"%x",&cid);
+sscanf(p_name, "%x", &cid);
 }
 
 return cid;
@@ -212,9 +214,10 @@ static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
 return -EIO;
 }
 
-tmp_str = strstr(desc,"parentCID");
+tmp_str = strstr(desc, "parentCID");
 pstrcpy(tmp_desc, sizeof(tmp_desc), tmp_str);
-if ((p_name = strstr(desc,"CID")) != NULL) {
+p_name = strstr(desc, "CID");
+if (p_name != NULL) {
 p_name += sizeof("CID");
 snprintf(p_name, sizeof(desc) - (p_name - desc), "%x\n", cid);
 pstrcat(desc, sizeof(desc), tmp_desc);
@@ -234,13 +237,14 @@ static int vmdk_is_cid_valid(BlockDriverState *bs)
 uint32_t cur_pcid;
 
 if (p_bs) {
-cur_pcid = vmdk_read_cid(p_bs,0);
-if (s->parent_cid != cur_pcid)
-// CID not valid
+cur_pcid = vmdk_read_cid(p_bs, 0);
+if (s->parent_cid != cur_pcid) {
+/* CID not valid */
 return 0;
+}
 }
 #endif
-// CID valid
+/* CID valid */
 return 1;
 }
 
@@ -255,14 +259,18 @@ static int vmdk_parent_open(BlockDriverState *bs)
 return -1;
 }
 
-if ((p_name = strstr(desc,"parentFileNameHint")) != NULL) {
+p_name = strstr(desc, "parentFileNameHint");
+if (p_name != NULL) {
 char *end_name;
 
 p_name += sizeof("parentFileNameHint") + 1;
-if ((end_name = strchr(p_name,'\"')) == NULL)
+end_name = strchr(p_name, '\"');
+if (end_name == NULL) {
 return -1;
-if ((end_name - p_name) > sizeof (bs->backing_file) - 1)
+}
+if ((end_name - p_name) > sizeof(bs->backing_file) - 1) {
 return -1;
+}
 
 pstrcpy(bs->backing_file, end_name - p_name + 1, p_name);
 }
@@ -595,8 +603,9 @@ static int get_whole_cluster(BlockDriverState *bs,
 if (bs->backing_hd) {
 int ret;
 
-if (!vmdk_is_cid_valid(bs))
+if (!vmdk_is_cid_valid(bs)) {
 return -1;
+}
 
 /* floor offset to cluster */
 offset -= offset % (extent->cluster_sectors * 512);
@@ -655,8 +664,9 @@ static int get_cluster_offset(BlockDriverState *bs,
 int min_index, i, j;
 uint32_t min_count, *l2_table, tmp = 0;
 
-if (m_data)
+if (m_data) {
 m_data->valid = 0;
+}
 if (extent->flat) {
 *cluster_offset = extent->flat_start_offset;
 return 0;
@@ -712,7 +722,7 @@ static int get_cluster_offset(BlockDriverState *bs,
 return -1;
 }
 
-// Avoid the L2 tables update for the images that have snapshots.
+/* Avoid the L2 tables update for the images that have snapshots. */
 *cluster_offset = bdrv_getlength(extent->file);
 bdrv_truncate(
 extent->file,
@@ -729,8 +739,9 @@ static int get_cluster_offset(BlockDriverState *bs,
  * or inappropriate VM shutdown.
  */
 if (get_whole_cluster(
-bs, extent, *cluster_offset, offset, allocate) == -1)
+bs, extent, *cluster_offset, offset, allocate) == -1) {
 return -1;
+}
 
 if (m_data) {
 m_data->offset = tmp;
@@ -780,8 +791,9 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num,
 
 index_in_cluster = sector_num % extent->cluster_sectors;
 n = extent->cluster_sectors - index_in_cluster;
-if (n > nb_sectors)
+if (n > nb_sectors) {
 n = nb_sectors;
+}
 *pnum = n;
 return ret;
 }
@@ -805,16 +817,19 @@ static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
 sector_num << 9, 0, &cluster_offset);
 index_in_cluster = sector_num % extent->cluster_sectors;
 n = extent->cluster_sectors - index_in_cluster;
-if (n > nb_sectors)
+if (n > nb_sectors) {
 n = nb_sectors;
+}
 if (ret) {
 /* if not

Re: [Qemu-devel] [Xen-devel] Re: Upstream QEMU and Xen unstable not working

2011-07-19 Thread Stefano Stabellini

On Tue, 19 Jul 2011, Ian Campbell wrote:
> On Mon, 2011-07-18 at 17:17 +0100, Stefano Stabellini wrote:
> > On Mon, 18 Jul 2011, Wei Liu wrote:
> > > Stefano and Anthony, you once said that you were going to setup a
> > > public QEMU repository for Xen, how is it going now?
> >
> > We are getting there, but there are still too many xen patches floating
> > around qemu-devel at the moment to announce a new qemu xen tree.
>
> Isn't the presence of all those patches floating around qemu-devel and
> the need for people to trawl around collecting so as to have a working
> build exactly the problem such a tree would be intended to solve? i.e. a
> one stop place to pick up pending patches before they hit the main tree.
>

Yes, however the base hasn't been stable enough so far even collecting
all the patches together.
With Anthony's latest patch series we are almost there.

[Qemu-devel] [PATCH 21/21] block: add bdrv_get_allocated_file_size() operation

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

qemu-img.c wants to count the allocated file size of an image. Previously
it counted a single bs->file using 'stat' or the Windows API. As VMDK
introduces multiple-file support, the operation becomes format-specific
as well as platform-specific.

The functions are moved to block/raw-{posix,win32}.c, and qemu-img.c calls
bdrv_get_allocated_file_size to count the bs. VMDK code is also added
to count its own extents.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block.c   |   19 +++
 block.h   |1 +
 block/raw-posix.c |   21 +
 block/raw-win32.c |   29 +
 block/vmdk.c  |   24 
 block_int.h   |1 +
 qemu-img.c|   31 +--
 7 files changed, 96 insertions(+), 30 deletions(-)

diff --git a/block.c b/block.c
index 24a25d5..9549b9e 100644
--- a/block.c
+++ b/block.c
@@ -1147,6 +1147,25 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 }
 
 /**
+ * Length of a allocated file in bytes. Sparse files are counted by actual
+ * allocated space. Return < 0 if error or unknown.
+ */
+int64_t bdrv_get_allocated_file_size(BlockDriverState *bs)
+{
+BlockDriver *drv = bs->drv;
+if (!drv) {
+return -ENOMEDIUM;
+}
+if (drv->bdrv_get_allocated_file_size) {
+return drv->bdrv_get_allocated_file_size(bs);
+}
+if (bs->file) {
+return bdrv_get_allocated_file_size(bs->file);
+}
+return -ENOTSUP;
+}
+
+/**
  * Length of a file in bytes. Return < 0 if error or unknown.
  */
 int64_t bdrv_getlength(BlockDriverState *bs)
diff --git a/block.h b/block.h
index 859d1d9..59cc410 100644
--- a/block.h
+++ b/block.h
@@ -89,6 +89,7 @@ int bdrv_write_sync(BlockDriverState *bs, int64_t sector_num,
 const uint8_t *buf, int nb_sectors);
 int bdrv_truncate(BlockDriverState *bs, int64_t offset);
 int64_t bdrv_getlength(BlockDriverState *bs);
+int64_t bdrv_get_allocated_file_size(BlockDriverState *bs);
 void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr);
 void bdrv_guess_geometry(BlockDriverState *bs, int *pcyls, int *pheads, int *psecs);
 int bdrv_commit(BlockDriverState *bs);
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 34b64aa..cd89c83 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -793,6 +793,17 @@ static int64_t raw_getlength(BlockDriverState *bs)
 }
 #endif
 
+static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
+{
+struct stat st;
+BDRVRawState *s = bs->opaque;
+
+if (fstat(s->fd, &st) < 0) {
+return -errno;
+}
+return (int64_t)st.st_blocks * 512;
+}
+
 static int raw_create(const char *filename, QEMUOptionParameter *options)
 {
 int fd;
@@ -888,6 +899,8 @@ static BlockDriver bdrv_file = {
 
 .bdrv_truncate = raw_truncate,
 .bdrv_getlength = raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 .create_options = raw_create_options,
 };
@@ -1156,6 +1169,8 @@ static BlockDriver bdrv_host_device = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength= raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* generic scsi device */
 #ifdef __linux__
@@ -1277,6 +1292,8 @@ static BlockDriver bdrv_host_floppy = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength= raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* removable device support */
 .bdrv_is_inserted   = floppy_is_inserted,
@@ -1380,6 +1397,8 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength = raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* removable device support */
 .bdrv_is_inserted   = cdrom_is_inserted,
@@ -1503,6 +1522,8 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_read  = raw_read,
 .bdrv_write = raw_write,
 .bdrv_getlength = raw_getlength,
+.bdrv_get_allocated_file_size
+= raw_get_allocated_file_size,
 
 /* removable device support */
 .bdrv_is_inserted   = cdrom_is_inserted,
diff --git a/block/raw-win32.c b/block/raw-win32.c
index 56bd719..91067e7 100644
--- a/block/raw-win32.c
+++ b/block/raw-win32.c
@@ -213,6 +213,31 @@ static int64_t raw_getlength(BlockDriverState *bs)
 return l.QuadPart;
 }
 
+static int64_t raw_get_allocated_file_size(BlockDriverState *bs)
+{
+typedef DWORD (WINAPI * get_compressed_t)(const char *filename,
+  DWORD * high);
+get_compressed_t get_compressed;
+struct
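
The POSIX half of this change boils down to one fstat() call. The sketch below mirrors it outside QEMU (`allocated_file_size` is a hypothetical name taking a bare file descriptor instead of a BlockDriverState); it assumes, as the patch does, that st_blocks is counted in 512-byte units, which holds on Linux but is worth checking on other systems.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Return the space actually allocated for the open file fd, in bytes,
 * or a negative errno value on failure. A sparse file reports less
 * here than its st_size would suggest. */
static int64_t allocated_file_size(int fd)
{
    struct stat st;

    if (fstat(fd, &st) < 0) {
        return -errno;
    }
    /* Assumption: st_blocks uses 512-byte units (true on Linux). */
    return (int64_t)st.st_blocks * 512;
}
```

Returning -errno keeps the convention of the surrounding block layer, where a negative value means "error or unknown".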

[Qemu-devel] [PATCH 17/21] VMDK: change get_cluster_offset return type

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

The return type of get_cluster_offset was an offset that used 0 to denote
'not allocated'. This no longer holds for flat extents, because we treat
a flat extent file as a single huge cluster whose offset is 0 and whose
length is the whole file length.
So now we use an int return value: 0 means success, anything else means
the offset is invalid.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |   79 ++---
 1 files changed, 42 insertions(+), 37 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 8dc58a8..f637d98 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -665,26 +665,31 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData 
*m_data)
 return 0;
 }
 
-static uint64_t get_cluster_offset(BlockDriverState *bs,
+static int get_cluster_offset(BlockDriverState *bs,
 VmdkExtent *extent,
 VmdkMetaData *m_data,
-uint64_t offset, int allocate)
+uint64_t offset,
+int allocate,
+uint64_t *cluster_offset)
 {
 unsigned int l1_index, l2_offset, l2_index;
 int min_index, i, j;
 uint32_t min_count, *l2_table, tmp = 0;
-uint64_t cluster_offset;
 
 if (m_data)
 m_data-valid = 0;
+if (extent-flat) {
+*cluster_offset = 0;
+return 0;
+}
 
 l1_index = (offset  9) / extent-l1_entry_sectors;
 if (l1_index = extent-l1_size) {
-return 0;
+return -1;
 }
 l2_offset = extent-l1_table[l1_index];
 if (!l2_offset) {
-return 0;
+return -1;
 }
 for (i = 0; i  L2_CACHE_SIZE; i++) {
 if (l2_offset == extent-l2_cache_offsets[i]) {
@@ -714,28 +719,29 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
 l2_table,
 extent-l2_size * sizeof(uint32_t)
 ) != extent-l2_size * sizeof(uint32_t)) {
-return 0;
+return -1;
 }
 
 extent-l2_cache_offsets[min_index] = l2_offset;
 extent-l2_cache_counts[min_index] = 1;
  found:
 l2_index = ((offset  9) / extent-cluster_sectors) % extent-l2_size;
-cluster_offset = le32_to_cpu(l2_table[l2_index]);
+*cluster_offset = le32_to_cpu(l2_table[l2_index]);
 
-if (!cluster_offset) {
-if (!allocate)
-return 0;
+if (!*cluster_offset) {
+if (!allocate) {
+return -1;
+}
 
 // Avoid the L2 tables update for the images that have snapshots.
-cluster_offset = bdrv_getlength(extent-file);
+*cluster_offset = bdrv_getlength(extent-file);
 bdrv_truncate(
 extent-file,
-cluster_offset + (extent-cluster_sectors  9)
+*cluster_offset + (extent-cluster_sectors  9)
 );
 
-cluster_offset = 9;
-tmp = cpu_to_le32(cluster_offset);
+*cluster_offset = 9;
+tmp = cpu_to_le32(*cluster_offset);
 l2_table[l2_index] = tmp;
 
 /* First of all we write grain itself, to avoid race condition
@@ -744,8 +750,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
          * or inappropriate VM shutdown.
          */
         if (get_whole_cluster(
-                bs, extent, cluster_offset, offset, allocate) == -1)
-            return 0;
+                bs, extent, *cluster_offset, offset, allocate) == -1)
+            return -1;
 
         if (m_data) {
             m_data->offset = tmp;
@@ -755,8 +761,8 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
             m_data->valid = 1;
         }
     }
-    cluster_offset <<= 9;
-    return cluster_offset;
+    *cluster_offset <<= 9;
+    return 0;
 }
 
 static VmdkExtent *find_extent(BDRVVmdkState *s,
@@ -780,7 +786,6 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num,
                              int nb_sectors, int *pnum)
 {
     BDRVVmdkState *s = bs->opaque;
-
     int64_t index_in_cluster, n, ret;
     uint64_t offset;
     VmdkExtent *extent;
@@ -789,15 +794,13 @@ static int vmdk_is_allocated(BlockDriverState *bs, int64_t sector_num,
     if (!extent) {
         return 0;
     }
-    if (extent->flat) {
-        n = extent->end_sector - sector_num;
-        ret = 1;
-    } else {
-        offset = get_cluster_offset(bs, extent, NULL, sector_num * 512, 0);
-        index_in_cluster = sector_num % extent->cluster_sectors;
-        n = extent->cluster_sectors - index_in_cluster;
-        ret = offset ? 1 : 0;
-    }
+    ret = get_cluster_offset(bs, extent, NULL,
+                            sector_num * 512, 0, &offset);
+    /* get_cluster_offset returning 0 means success */
+    ret = !ret;
+
+    index_in_cluster = sector_num % extent->cluster_sectors;
+n =
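
The change of calling convention in this patch can be illustrated in isolation. The following is only a minimal sketch, not the vmdk.c code: the names toy_get_cluster_offset and toy_is_allocated and the flat lookup table are hypothetical. It shows why the offset moves to an out-parameter: a flat extent legitimately yields offset 0, so 0 can no longer double as the error value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical lookup: table[i] holds a cluster position in sectors,
 * 0 meaning "not allocated". Returns 0 on success, -1 on failure, and
 * stores the byte offset through cluster_offset on success. */
int toy_get_cluster_offset(int flat, const uint32_t *table, unsigned size,
                           unsigned index, uint64_t *cluster_offset)
{
    if (flat) {
        /* Flat extents have no tables; offset 0 is a valid result. */
        *cluster_offset = 0;
        return 0;
    }
    if (index >= size || table[index] == 0) {
        return -1;
    }
    *cluster_offset = (uint64_t)table[index] << 9; /* sectors to bytes */
    return 0;
}

/* Caller side: success is reported by the return value alone. */
int toy_is_allocated(int flat, const uint32_t *table, unsigned size,
                     unsigned index)
{
    uint64_t offset;

    return toy_get_cluster_offset(flat, table, size, index, &offset) == 0;
}
```

With the old convention a caller testing "offset ? 1 : 0" would have reported a flat extent as unallocated; here the status and the value are independent.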

[Qemu-devel] [PATCH 19/21] VMDK: create different subformats

2011-07-19 Thread Kevin Wolf

From: Fam Zheng famc...@gmail.com

Add create option 'format', with enums:
monolithicSparse
monolithicFlat
twoGbMaxExtentSparse
twoGbMaxExtentFlat
Each creates a subformat image file. The default is monolithicSparse.

Signed-off-by: Fam Zheng famc...@gmail.com
Reviewed-by: Stefan Hajnoczi stefa...@linux.vnet.ibm.com
Signed-off-by: Kevin Wolf kw...@redhat.com
---
 block/vmdk.c |  503 +++--
 block_int.h  |1 +
 2 files changed, 275 insertions(+), 229 deletions(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index e1fb962..b53c5f5 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -156,8 +156,9 @@ static int vmdk_probe(const uint8_t *buf, int buf_size, const char *filename)
 #define CHECK_CID 1
 
 #define SECTOR_SIZE 512
-#define DESC_SIZE 20*SECTOR_SIZE   // 20 sectors of 512 bytes each
-#define HEADER_SIZE 512            // first sector of 512 bytes
+#define DESC_SIZE (20 * SECTOR_SIZE)    /* 20 sectors of 512 bytes each */
+#define BUF_SIZE 4096
+#define HEADER_SIZE 512                 /* first sector of 512 bytes */
 
 static void vmdk_free_extents(BlockDriverState *bs)
 {
@@ -243,168 +244,6 @@ static int vmdk_is_cid_valid(BlockDriverState *bs)
     return 1;
 }
 
-static int vmdk_snapshot_create(const char *filename, const char *backing_file)
-{
-    int snp_fd, p_fd;
-    int ret;
-    uint32_t p_cid;
-    char *p_name, *gd_buf, *rgd_buf;
-    const char *real_filename, *temp_str;
-    VMDK4Header header;
-    uint32_t gde_entries, gd_size;
-    int64_t gd_offset, rgd_offset, capacity, gt_size;
-    char p_desc[DESC_SIZE], s_desc[DESC_SIZE], hdr[HEADER_SIZE];
-    static const char desc_template[] =
-    "# Disk DescriptorFile\n"
-    "version=1\n"
-    "CID=%x\n"
-    "parentCID=%x\n"
-    "createType=\"monolithicSparse\"\n"
-    "parentFileNameHint=\"%s\"\n"
-    "\n"
-    "# Extent description\n"
-    "RW %u SPARSE \"%s\"\n"
-    "\n"
-    "# The Disk Data Base \n"
-    "#DDB\n"
-    "\n";
-
-    snp_fd = open(filename, O_RDWR | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE, 0644);
-    if (snp_fd < 0)
-        return -errno;
-    p_fd = open(backing_file, O_RDONLY | O_BINARY | O_LARGEFILE);
-    if (p_fd < 0) {
-        close(snp_fd);
-        return -errno;
-    }
-
-    /* read the header */
-    if (lseek(p_fd, 0x0, SEEK_SET) == -1) {
-        ret = -errno;
-        goto fail;
-    }
-    if (read(p_fd, hdr, HEADER_SIZE) != HEADER_SIZE) {
-        ret = -errno;
-        goto fail;
-    }
-
-    /* write the header */
-    if (lseek(snp_fd, 0x0, SEEK_SET) == -1) {
-        ret = -errno;
-        goto fail;
-    }
-    if (write(snp_fd, hdr, HEADER_SIZE) == -1) {
-        ret = -errno;
-        goto fail;
-    }
-
-    memset(&header, 0, sizeof(header));
-    memcpy(&header,&hdr[4], sizeof(header)); // skip the VMDK4_MAGIC
-
-    if (ftruncate(snp_fd, header.grain_offset << 9)) {
-        ret = -errno;
-        goto fail;
-    }
-    /* the descriptor offset = 0x200 */
-    if (lseek(p_fd, 0x200, SEEK_SET) == -1) {
-        ret = -errno;
-        goto fail;
-    }
-    if (read(p_fd, p_desc, DESC_SIZE) != DESC_SIZE) {
-        ret = -errno;
-        goto fail;
-    }
-
-    if ((p_name = strstr(p_desc,"CID")) != NULL) {
-        p_name += sizeof("CID");
-        sscanf(p_name,"%x",&p_cid);
-    }
-
-    real_filename = filename;
-    if ((temp_str = strrchr(real_filename, '\\')) != NULL)
-        real_filename = temp_str + 1;
-    if ((temp_str = strrchr(real_filename, '/')) != NULL)
-        real_filename = temp_str + 1;
-    if ((temp_str = strrchr(real_filename, ':')) != NULL)
-        real_filename = temp_str + 1;
-
-    snprintf(s_desc, sizeof(s_desc), desc_template, p_cid, p_cid, backing_file,
-             (uint32_t)header.capacity, real_filename);
-
-    /* write the descriptor */
-    if (lseek(snp_fd, 0x200, SEEK_SET) == -1) {
-        ret = -errno;
-        goto fail;
-    }
-    if (write(snp_fd, s_desc, strlen(s_desc)) == -1) {
-        ret = -errno;
-        goto fail;
-    }
-
-    gd_offset = header.gd_offset * SECTOR_SIZE;     // offset of GD table
-    rgd_offset = header.rgd_offset * SECTOR_SIZE;   // offset of RGD table
-    capacity = header.capacity * SECTOR_SIZE;       // Extent size
-    /*
-     * Each GDE span 32M disk, means:
-     * 512 GTE per GT, each GTE points to grain
-     */
-    gt_size = (int64_t)header.num_gtes_per_gte * header.granularity * SECTOR_SIZE;
-    if (!gt_size) {
-        ret = -EINVAL;
-        goto fail;
-    }
-    gde_entries = (uint32_t)(capacity / gt_size);  // number of gde/rgde
-    gd_size = gde_entries * sizeof(uint32_t);
-
-    /* write RGD */
-    rgd_buf = qemu_malloc(gd_size);
-    if (lseek(p_fd, rgd_offset, SEEK_SET) == -1) {
-        ret = -errno;
-        goto fail_rgd;
-    }
-    if (read(p_fd, rgd_buf, gd_size) != gd_size) {
-        ret = -errno;
-        goto fail_rgd;
-    }
-    if (lseek(snp_fd, rgd_offset, SEEK_SET) == -1) {
-        ret = -errno;
-        goto fail_rgd;
-    }
-if

Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Stefano Stabellini

On Tue, 19 Jul 2011, Wei Liu wrote:
 Good, this is it.
 
 But this patch is not yet pulled in the tree.
 
I pushed a few commits that I had in my local tree; they should be in
xen-next now.

Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Kevin Wolf

On 19.07.2011 11:25, Robert Wang wrote:
 As you know, the raw image format is very popular, but it does NOT
 support copy-on-write. A raw image file can NOT be used as a copy
 destination, so image streaming/Live Block Copy will NOT work.
 
 To fix this, we need to add a new block driver, raw-cow, to QEMU. Once
 finished, we can use qemu-img like this:
 qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img
 my_vm.raw-cow

Just one comment for the start: This is not only useful for raw (while
certainly being the most important case), but also for every other image
format for which qemu doesn't support backing files.

This means that we should look for a better name than raw-cow. I know,
it was me who introduced this, but only for lack of a better name.

 1) ubuntu.img is the backing file, my_vm.img is a raw file,
 my_vm.raw-cow stores a COW bitmap related to my_vm.img.
 
 2) Once every bit in the COW bitmap is marked dirty, we can get all
 information from my_vm.img and can ignore ubuntu.img and my_vm.raw-cow
 from then on.
 
 To implement this, I think I can follow these steps:
 1) Add a new member to BlockDriverState struct:
 char raw_file[1024];
 This member will track raw_file parameter related to raw-cow file from
 command line.

Can't this be private to the COW driver? It certainly will have a
BDRVRawCowState or something like that.

 2)* Create a new file block/raw-cow.c. It will be much more like the
 mixture of block/cow.c and block/raw.c.
 
 So I will change some functions in cow.c and raw.c to non-static, so
 that raw-cow.c can reuse them.

I think it's better to keep drivers cleanly separated. If we really need
to share code, we should provide some sort of a library that is used by
both, but I doubt that it's required in this case.

What the driver should probably do is open the raw file internally
and keep a BlockDriverState of the raw file in its private structure.
For all accesses to the raw file, use the official interfaces of the
raw driver, like bdrv_aio_readv/writev.

  When a read operation occurs, determine whether the
 dirty flag in the raw-cow image is set. If it is, read directly from the raw
 file. After a write operation, set the related dirty flag in the raw-cow image.
 Other functions might also need to be modified.
 
   * Of course, the format_name member of the BlockDriver struct will be raw-cow.
 And in order to keep the relationship with the raw file (like my_vm.img),
 the raw_cow_header struct should be
 struct raw_cow_header {
 uint32_t magic;
 uint32_t version;
 char backing_file[1024];
 char raw_file[1024];/* added*/
 int32_t mtime;
 uint64_t size;
 uint32_t sectorsize;

I don't think any of mtime, size and sectorsize are necessary. They will
just be taken from the raw file (if needed at all).

 };
   * Struct raw_cow_create_options should be cow_create_options plus
 one additional member:
 {
 .name = BLOCK_OPT_RAW_FILE,
 .type = OPT_STRING,
 .help = "Raw file name"
 },
 
 3) Add bdrv_get_raw_filename in img_info function of qemu-img.c. In
 bdrv_get_raw_filename, if the format of the image file is raw-cow,
 print the related raw file.

Hm... Won't be implemented by any other driver, but I guess it makes
some sense to provide this information.

Kevin
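
The read/write rule discussed in this thread (read from the raw file when the sector's dirty bit is set, otherwise from the backing file; set the bit after a write) can be sketched as follows. This is only an illustration over in-memory buffers, assuming one bitmap bit per 512-byte sector; ToyCowState and the toy_cow_* functions are hypothetical names, not QEMU interfaces, and a real driver would go through the block layer instead of memcpy().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SECTOR_SIZE 512

/* Hypothetical in-memory stand-in for the driver's private state. */
typedef struct ToyCowState {
    const uint8_t *backing;  /* read-only backing image */
    uint8_t *raw;            /* raw file that receives all writes */
    uint8_t *bitmap;         /* one bit per sector, 1 = data lives in raw */
    uint64_t nb_sectors;
} ToyCowState;

static int toy_sector_is_dirty(const ToyCowState *s, uint64_t sector)
{
    return (s->bitmap[sector / 8] >> (sector % 8)) & 1;
}

/* Read one sector; returns 0 on success, -1 if sector is out of range. */
int toy_cow_read(const ToyCowState *s, uint64_t sector, uint8_t *buf)
{
    const uint8_t *src;

    if (sector >= s->nb_sectors) {
        return -1;
    }
    src = toy_sector_is_dirty(s, sector) ? s->raw : s->backing;
    memcpy(buf, src + sector * TOY_SECTOR_SIZE, TOY_SECTOR_SIZE);
    return 0;
}

/* Write one sector to the raw file, then mark it dirty. */
int toy_cow_write(ToyCowState *s, uint64_t sector, const uint8_t *buf)
{
    if (sector >= s->nb_sectors) {
        return -1;
    }
    memcpy(s->raw + sector * TOY_SECTOR_SIZE, buf, TOY_SECTOR_SIZE);
    s->bitmap[sector / 8] |= (uint8_t)(1u << (sector % 8));
    return 0;
}
```

Writing the data before setting the bit mirrors the order described above; whether a real implementation needs a flush between the two steps is not addressed by this sketch.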

Re: [Qemu-devel] [PATCH 3/3] qemu-x86: Set tsc_khz in kvm when supported

2011-07-19 Thread Marcelo Tosatti

On Thu, Jul 07, 2011 at 04:13:13PM +0200, Joerg Roedel wrote:
 Make use of the KVM_TSC_CONTROL feature if available.
 
 Signed-off-by: Joerg Roedel joerg.roe...@amd.com
 ---
  target-i386/kvm.c |   18 +-
  1 files changed, 17 insertions(+), 1 deletions(-)
 
 diff --git a/target-i386/kvm.c b/target-i386/kvm.c
 index 10fb2c4..923d2d5 100644
 --- a/target-i386/kvm.c
 +++ b/target-i386/kvm.c
 @@ -354,6 +354,7 @@ int kvm_arch_init_vcpu(CPUState *env)
      uint32_t unused;
      struct kvm_cpuid_entry2 *c;
      uint32_t signature[3];
 +    int r;
  
      env->cpuid_features &= kvm_arch_get_supported_cpuid(s, 1, 0, R_EDX);
  
 @@ -499,7 +500,22 @@ int kvm_arch_init_vcpu(CPUState *env)
  
      qemu_add_vm_change_state_handler(cpu_update_state, env);
  
 -    return kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
 +    r = kvm_vcpu_ioctl(env, KVM_SET_CPUID2, &cpuid_data);
 +    if (r)
 +        return r;
 +
 +#ifdef KVM_CAP_TSC_CONTROL
 +    r = kvm_check_extension(env->kvm_state, KVM_CAP_TSC_CONTROL);
 +    if (r && env->tsc_khz) {
 +        r = kvm_vcpu_ioctl(env, KVM_SET_TSC_KHZ, env->tsc_khz);
 +        if (r < 0) {
 +            fprintf(stderr, "KVM_SET_TSC_KHZ failed\n");
 +            return r;
 +        }
 +    }
 +#endif

And this should be moved to kvm_arch_put_registers, in case 
level == KVM_PUT_FULL_STATE.

Re: [Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Stefan Hajnoczi

On Tue, Jul 19, 2011 at 11:10 AM, Kevin Wolf kw...@redhat.com wrote:
 On 19.07.2011 10:06, Frediano Ziglio wrote:
 2- Memory considerations on coroutines. Besides coroutines allowing more
 readable code, I wonder if somebody has considered memory. Every
 coroutine needs its own stack to be allocated. For instance, the
 ucontext and win32 implementations use 4 MB. Assuming 128 concurrent AIO
 requests, this requires about 512 MB of RAM (mostly only committed but not
 used, and coroutines are reused).

 128 concurrent requests is a lot. And even then, it's only virtual
 memory. I doubt that we're actually using much more than we do in the
 old code with the AIOCBs (which will disappear and become local
 variables when we complete the conversion).

From what I understand, "committed" on Windows means that physical
pages have been allocated and pagefile space has been set aside:
http://msdn.microsoft.com/en-us/library/ms810627.aspx

On Linux memory is overcommitted and will not require swap space or
any actual pages.  This behavior can be configured differently IIRC
but the default is to be lazy about claiming memory resources so that
even 4 MB thread/coroutine stacks are not an issue.

The question is how can we get the same effect on Windows and does the
current Fibers implementation not already work?

Stefan
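
For reference, the 512 MB figure quoted above is just the product of the number of concurrent requests and the per-coroutine stack size. A trivial sketch of that arithmetic, with toy_stack_reservation_bytes being a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Address space reserved for coroutine stacks, in bytes. This counts
 * reserved (virtual) memory only; how much of it is backed by physical
 * pages or pagefile space depends on the host, as discussed above. */
uint64_t toy_stack_reservation_bytes(unsigned nb_coroutines,
                                     uint64_t stack_size)
{
    return (uint64_t)nb_coroutines * stack_size;
}
```

With 128 coroutines and 4 MiB stacks this gives 512 MiB of reserved address space.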

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Marcelo Tosatti

On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
 To let the user configure the desired tsc frequency for the
 guest if running in KVM.
 
 Signed-off-by: Joerg Roedel joerg.roe...@amd.com
 ---
  target-i386/cpu.h   |1 +
  target-i386/cpuid.c |   13 +
  2 files changed, 14 insertions(+), 0 deletions(-)
 
 diff --git a/target-i386/cpu.h b/target-i386/cpu.h
 index cdf68ff..399e124 100644
 --- a/target-i386/cpu.h
 +++ b/target-i386/cpu.h
 @@ -743,6 +743,7 @@ typedef struct CPUX86State {
  uint32_t cpuid_kvm_features;
  uint32_t cpuid_svm_features;
  bool tsc_valid;
 +int tsc_khz;

This should be saved/restored in migration data (missing VMSTATE entry).

  /* in order to simplify APIC support, we leave this pointer to the
 user */
 diff --git a/target-i386/cpuid.c b/target-i386/cpuid.c
 index e1ae3af..89e9623 100644
 --- a/target-i386/cpuid.c
 +++ b/target-i386/cpuid.c
 @@ -224,6 +224,7 @@ typedef struct x86_def_t {
  int family;
  int model;
  int stepping;
 +int tsc_khz;
  uint32_t features, ext_features, ext2_features, ext3_features;
  uint32_t kvm_features, svm_features;
  uint32_t xlevel;
 @@ -704,6 +705,17 @@ static int cpu_x86_find_by_name(x86_def_t *x86_cpu_def, const char *cpu_model)
              } else if (!strcmp(featurestr, "model_id")) {
                  pstrcpy(x86_cpu_def->model_id, sizeof(x86_cpu_def->model_id),
                          val);
 +            } else if (!strcmp(featurestr, "tsc_freq")) {
 +                int64_t tsc_freq;
 +                char *err;
 +
 +                tsc_freq = strtosz_suffix_unit(val, &err,
 +                                               STRTOSZ_DEFSUFFIX_B, 1000);
 +                if (!*val || *err) {
 +                    fprintf(stderr, "bad numerical value %s\n", val);
 +                    goto error;
 +                }
 +                x86_cpu_def->tsc_khz = tsc_freq / 1000;
              } else {
                  fprintf(stderr, "unrecognized feature %s\n", featurestr);
                  goto error;
 @@ -872,6 +884,7 @@ int cpu_x86_register (CPUX86State *env, const char *cpu_model)
      env->cpuid_svm_features = def->svm_features;
      env->cpuid_ext4_features = def->ext4_features;
      env->cpuid_xlevel2 = def->xlevel2;
 +    env->tsc_khz = def->tsc_khz;
      if (!kvm_enabled()) {
          env->cpuid_features &= TCG_FEATURES;
          env->cpuid_ext_features &= TCG_EXT_FEATURES;
 -- 
 1.7.4.1
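
The tsc_freq handling in the quoted patch parses a value with a base-1000 unit suffix and stores the result in kHz. The sketch below illustrates that conversion only; it does not use or reproduce QEMU's strtosz_suffix_unit(), and toy_parse_freq_hz, toy_tsc_khz and the accepted suffix set (k/K, M, G) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse strings such as "2500M" or "3G" into Hz using decimal (base-1000)
 * suffixes. Returns -1 on malformed or out-of-range input. */
int64_t toy_parse_freq_hz(const char *val)
{
    char *end;
    double num;
    double mul = 1.0;

    if (*val == '\0') {
        return -1;
    }
    num = strtod(val, &end);
    if (end == val) {
        return -1;
    }
    switch (*end) {
    case '\0':
        break;
    case 'k':
    case 'K':
        mul = 1e3;
        end++;
        break;
    case 'M':
        mul = 1e6;
        end++;
        break;
    case 'G':
        mul = 1e9;
        end++;
        break;
    default:
        return -1;
    }
    /* Reject trailing garbage, negative values, NaN and huge values. */
    if (*end != '\0' || !(num >= 0.0) || num * mul > 9.0e18) {
        return -1;
    }
    return (int64_t)(num * mul);
}

/* Same division by 1000 as in the patch: Hz to kHz. */
int64_t toy_tsc_khz(const char *val)
{
    int64_t hz = toy_parse_freq_hz(val);

    return hz < 0 ? -1 : hz / 1000;
}
```

Under these assumptions, "2500M" yields 2500000 kHz, and a bare number is taken as Hz.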
 
 

Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Wei Liu

On Tue, 2011-07-19 at 12:09 +0100, Stefano Stabellini wrote:
 On Tue, 19 Jul 2011, Wei Liu wrote:
  Good, this is it.
  
  But this patch is not yet pulled in the tree.
  
 I pushed few commits that I had in my local tree, they should be in
 xen-next now.

This is commit 2aa8f492c85604b91b263350560042d632fcdeb2 in your xen-next
tree:

Author: Anthony PERARD anthony.per...@citrix.com
Date:   Wed Jul 6 18:58:14 2011 +0100

hw/piix_pci.c: Fix PIIX3-xen to initialize ids

Signed-off-by: Anthony PERARD anthony.per...@citrix.com

The content doesn't match the description.

It patches hw/ide/piix.c, which doesn't solve the problem. It should
patch hw/piix_pci.c.

It seems that you picked up the wrong patch. The right one is here:

http://marc.info/?l=qemu-devel&m=130876651402847&w=2

Wei.

Re: [Qemu-devel] [PATCH 0/2] netdev fixes

2011-07-19 Thread Michael S. Tsirkin

On Thu, Jun 16, 2011 at 06:45:35PM +0200, Markus Armbruster wrote:
 Markus Armbruster (2):
   Fix automatically assigned network names for netdev
   Fix netdev name lookup in -device, device_add, netdev_del
 
  net.c |   19 +++
  1 files changed, 15 insertions(+), 4 deletions(-)

Thanks, applied.
I think going forward we should bring more order into the ways we
assign IDs.

 -- 
 1.7.2.3

[Qemu-devel] [PATCH] do not call monitor_resume() from migrate_fd_put_buffer() error path

2011-07-19 Thread Michael Tokarev

If we do, it results in a double monitor_resume() (the second being called
from migrate_fd_cleanup() anyway) and the monitor suspend count becoming
negative.

Cc'ing people from the `git blame' list for the lines in question: the
change fixes the problem, but I'm not sure what the original intention
of this code was in this place.  Unfortunately no one replied to my two
attempts to raise this issue.

Signed-Off-By: Michael Tokarev m...@tls.msk.ru
---
 migration.c |3 ---
 1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/migration.c b/migration.c
index af3a1f2..115588c 100644
--- a/migration.c
+++ b/migration.c
@@ -330,9 +330,6 @@ ssize_t migrate_fd_put_buffer(void *opaque, const void *data, size_t size)
     if (ret == -EAGAIN) {
         qemu_set_fd_handler2(s->fd, NULL, NULL, migrate_fd_put_notify, s);
     } else if (ret < 0) {
-        if (s->mon) {
-            monitor_resume(s->mon);
-        }
         s->state = MIG_STATE_ERROR;
         notifier_list_notify(&migration_state_notifiers);
     }
-- 
1.7.2.5
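
The counter imbalance this patch removes can be modelled in a few lines. This is a toy model, assuming only what the thread states (suspend increments a counter, resume decrements it, cleanup always resumes); ToyMonitor and the toy_* functions are hypothetical and unrelated to the real monitor code.

```c
#include <assert.h>

/* Toy stand-in for the monitor; only the suspend counter matters here. */
typedef struct ToyMonitor {
    int suspend_cnt;
} ToyMonitor;

void toy_monitor_suspend(ToyMonitor *mon)
{
    mon->suspend_cnt++;
}

void toy_monitor_resume(ToyMonitor *mon)
{
    mon->suspend_cnt--;
}

/* Simulate a migration whose write fails. resume_on_write_error selects
 * the old behaviour (1) or the patched behaviour (0). Returns the final
 * suspend count. */
int toy_failed_migration(ToyMonitor *mon, int resume_on_write_error)
{
    toy_monitor_suspend(mon);        /* migration starts */
    if (resume_on_write_error) {
        toy_monitor_resume(mon);     /* extra resume in the error path */
    }
    toy_monitor_resume(mon);         /* cleanup resumes unconditionally */
    return mon->suspend_cnt;
}
```

Starting from a count of zero, the old path ends at -1 and the patched path at 0.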

Re: [Qemu-devel] [Xen-devel] Upstream QEMU and Xen unstable not working

2011-07-19 Thread Stefano Stabellini

On Tue, 19 Jul 2011, Wei Liu wrote:
 On Tue, 2011-07-19 at 12:09 +0100, Stefano Stabellini wrote:
  On Tue, 19 Jul 2011, Wei Liu wrote:
   Good, this is it.
   
   But this patch is not yet pulled in the tree.
   
  I pushed few commits that I had in my local tree, they should be in
  xen-next now.
 
 The commit 2aa8f492c85604b91b263350560042d632fcdeb2 in your xen-next
 tree.
 
 Author: Anthony PERARD anthony.per...@citrix.com
 Date:   Wed Jul 6 18:58:14 2011 +0100
 
 hw/piix_pci.c: Fix PIIX3-xen to initialize ids
 
 Signed-off-by: Anthony PERARD anthony.per...@citrix.com
 
 The content mismatch with description.
 
 It patches hw/ide/piix.c, which doesn't solve the problem. It should be
 hw/piix_pci.c .
 
 Seems that you got the wrong patch. The right one is here.
 
 http://marc.info/?l=qemu-devel&m=130876651402847&w=2
 

Yeah, I realized it right after sending the email.

Unfortunately there is also a regression in Xen, similar to the one
fixed by 23550, that causes PV on HVM guests to hang during boot at the
moment.
The offending commit is CS 23573:

replace d->nr_pirqs sized arrays with radix tree

[Qemu-devel] [PATCHv2] target-arm: support for ARM1176JZF-s cores

2011-07-19 Thread Jamie Iles

Add support for v6K ARM1176JZF-S.  This core includes the VA<->PA
translation capability and security extensions.

v2: Model the version with the VFP

Cc: Peter Maydell peter.mayd...@linaro.org
Cc: Paul Brook p...@codesourcery.com
Cc: Aurelien Jarno aurel...@aurel32.net
Signed-off-by: Jamie Iles ja...@jamieiles.com
---
 target-arm/cpu.h|1 +
 target-arm/helper.c |   23 +++
 2 files changed, 24 insertions(+), 0 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 01f5b57..8708f9e 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -414,6 +414,7 @@ void cpu_arm_set_cp_io(CPUARMState *env, int cpnum,
 #define ARM_CPUID_PXA270_C5   0x69054117
 #define ARM_CPUID_ARM1136 0x4117b363
 #define ARM_CPUID_ARM1136_R2  0x4107b362
+#define ARM_CPUID_ARM1176 0x410fb767
 #define ARM_CPUID_ARM11MPCORE 0x410fb022
 #define ARM_CPUID_CORTEXA8    0x410fc080
 #define ARM_CPUID_CORTEXA9    0x410fc090
diff --git a/target-arm/helper.c b/target-arm/helper.c
index eda881b..c5ba5a6 100644
--- a/target-arm/helper.c
+++ b/target-arm/helper.c
@@ -36,6 +36,12 @@ static uint32_t arm1136_cp15_c0_c1[8] =
 static uint32_t arm1136_cp15_c0_c2[8] =
 { 0x00140011, 0x12002111, 0x1123, 0x01102131, 0x141, 0, 0, 0 };
 
+static uint32_t arm1176_cp15_c0_c1[8] =
+{ 0x111, 0x11, 0x33, 0x01130003, 0x01130003, 0x10030302, 0x01222100, 0 };
+
+static uint32_t arm1176_cp15_c0_c2[8] =
+{ 0x0140011, 0x12002111, 0x11231121, 0x01102131, 0x01141, 0, 0, 0 };
+
 static uint32_t cpu_arm_find_by_name(const char *name);
 
 static inline void set_feature(CPUARMState *env, int feature)
@@ -86,6 +92,21 @@ static void cpu_reset_model_id(CPUARMState *env, uint32_t id)
         env->cp15.c0_cachetype = 0x1dd20d2;
         env->cp15.c1_sys = 0x00050078;
         break;
+    case ARM_CPUID_ARM1176:
+        set_feature(env, ARM_FEATURE_V4T);
+        set_feature(env, ARM_FEATURE_V5);
+        set_feature(env, ARM_FEATURE_V6);
+        set_feature(env, ARM_FEATURE_V6K);
+        set_feature(env, ARM_FEATURE_VFP);
+        set_feature(env, ARM_FEATURE_AUXCR);
+        env->vfp.xregs[ARM_VFP_FPSID] = 0x410120b5;
+        env->vfp.xregs[ARM_VFP_MVFR0] = 0x;
+        env->vfp.xregs[ARM_VFP_MVFR1] = 0x;
+        memcpy(env->cp15.c0_c1, arm1176_cp15_c0_c1, 8 * sizeof(uint32_t));
+        memcpy(env->cp15.c0_c2, arm1176_cp15_c0_c2, 8 * sizeof(uint32_t));
+        env->cp15.c0_cachetype = 0x1dd20d2;
+        env->cp15.c1_sys = 0x00050078;
+        break;
     case ARM_CPUID_ARM11MPCORE:
         set_feature(env, ARM_FEATURE_V4T);
         set_feature(env, ARM_FEATURE_V5);
@@ -377,6 +398,7 @@ static const struct arm_cpu_t arm_cpu_names[] = {
     { ARM_CPUID_ARM1026, "arm1026"},
     { ARM_CPUID_ARM1136, "arm1136"},
     { ARM_CPUID_ARM1136_R2, "arm1136-r2"},
+    { ARM_CPUID_ARM1176, "arm1176"},
     { ARM_CPUID_ARM11MPCORE, "arm11mpcore"},
     { ARM_CPUID_CORTEXM3, "cortex-m3"},
     { ARM_CPUID_CORTEXA8, "cortex-a8"},
@@ -1770,6 +1792,7 @@ uint32_t HELPER(get_cp15)(CPUState *env, uint32_t insn)
                 return 1;
             case ARM_CPUID_ARM1136:
             case ARM_CPUID_ARM1136_R2:
+            case ARM_CPUID_ARM1176:
                 return 7;
             case ARM_CPUID_ARM11MPCORE:
                 return 1;
-- 
1.7.4.1

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity


On 07/19/2011 02:46 PM, Marcelo Tosatti wrote:

On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
  To let the user configure the desired tsc frequency for the
  guest if running in KVM.

  Signed-off-by: Joerg Roedeljoerg.roe...@amd.com
  ---
   target-i386/cpu.h   |1 +
   target-i386/cpuid.c |   13 +
   2 files changed, 14 insertions(+), 0 deletions(-)

  diff --git a/target-i386/cpu.h b/target-i386/cpu.h
  index cdf68ff..399e124 100644
  --- a/target-i386/cpu.h
  +++ b/target-i386/cpu.h
  @@ -743,6 +743,7 @@ typedef struct CPUX86State {
   uint32_t cpuid_kvm_features;
   uint32_t cpuid_svm_features;
   bool tsc_valid;
  +int tsc_khz;

This should be saved/restore in migration data (missing VMSTATE entry).


Why?  It's static data.  Traditionally we only migrate runtime data.

(although we've been talking about starting a naked qemu and pushing all 
of the configuration from the source).


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Anthony Liguori


On 07/19/2011 05:15 AM, Kevin Wolf wrote:

From: Hannes Reineckeh...@suse.de

'tag' is just an abstraction to identify the command
from the driver. So we should make that explicit by
replacing 'tag' with a driver-defined pointer 'hba_private'.
This saves the lookup for driver handling several commands
in parallel.
'tag' is still being kept for tracing purposes.

Signed-off-by: Hannes Reineckeh...@suse.de
Acked-by: Paolo Bonzinipbonz...@redhat.com
Signed-off-by: Kevin Wolfkw...@redhat.com
---
  hw/esp.c  |2 +-
  hw/lsi53c895a.c   |   22 --
  hw/scsi-bus.c |9 ++---
  hw/scsi-disk.c|4 ++--
  hw/scsi-generic.c |5 +++--
  hw/scsi.h |   10 +++---
  hw/spapr_vscsi.c  |   29 +
  hw/usb-msd.c  |9 +
  8 files changed, 37 insertions(+), 53 deletions(-)

diff --git a/hw/esp.c b/hw/esp.c
index aa50800..9ddd637 100644
--- a/hw/esp.c
+++ b/hw/esp.c
@@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t busid)
 
     DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
     lun = busid & 7;
-    s->current_req = scsi_req_new(s->current_dev, 0, lun);
+    s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
     datalen = scsi_req_enqueue(s->current_req, buf);
     s->ti_size = datalen;
     if (datalen != 0) {
diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
index 940b43a..69eec1d 100644
--- a/hw/lsi53c895a.c
+++ b/hw/lsi53c895a.c
@@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t tag)
 static void lsi_request_cancelled(SCSIRequest *req)
 {
     LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
-    lsi_request *p;
+    lsi_request *p = req->hba_private;
 
     if (s->current && req == s->current->req) {
         scsi_req_unref(req);
@@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
         return;
     }
 
-    p = lsi_find_by_tag(s, req->tag);
     if (p) {
         QTAILQ_REMOVE(&s->queue, p, next);
         scsi_req_unref(req);
@@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)
 
 /* Record that data is available for a queued command.  Returns zero if
    the device was reselected, nonzero if the IO is deferred.  */
-static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
+static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
 {
-    lsi_request *p;
-
-    p = lsi_find_by_tag(s, tag);
-    if (!p) {
-        BADF("IO with unknown tag %d\n", tag);
-        return 1;
-    }
+    lsi_request *p = req->hba_private;
 
     if (p->pending) {
-        BADF("Multiple IO pending for tag %d\n", tag);
+        BADF("Multiple IO pending for request %p\n", p);
     }
     p->pending = len;
     /* Reselect if waiting for it, or if reselection triggers an IRQ
@@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t len)
     LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
     int out;
 
-    if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
+    if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
         (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
-        if (lsi_queue_tag(s, req->tag, len)) {
+        if (lsi_queue_req(s, req, len)) {
             return;
         }
     }
@@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
     assert(s->current == NULL);
     s->current = qemu_mallocz(sizeof(lsi_request));
     s->current->tag = s->select_tag;
-    s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
+    s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
+                                   s->current);
 
     n = scsi_req_enqueue(s->current->req, buf);
     if (n) {
diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
index ad6a730..8b1a412 100644
--- a/hw/scsi-bus.c
+++ b/hw/scsi-bus.c
@@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
     return res;
 }
 
-SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
+                            uint32_t lun, void *hba_private)
 {
     SCSIRequest *req;
 
@@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t l
     req->dev = d;
     req->tag = tag;
     req->lun = lun;
+    req->hba_private = hba_private;
     req->status = -1;
     trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
     return req;
 }
 
-SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
+SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
+                          void *hba_private)
 {
-    return d->info->alloc_req(d, tag, lun);
+    return d->info->alloc_req(d, tag, lun, hba_private);
 }
 
 uint8_t *scsi_req_get_buf(SCSIRequest *req)
diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c
index a8c7372..c2a99fe 100644
--- a/hw/scsi-disk.c
+++ b/hw/scsi-disk.c
@@ -81,13 +81,13 @@
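
The idea behind this patch, replacing a search by tag with an opaque pointer stored in the request, can be sketched outside QEMU as follows. ToyHbaRequest, ToyScsiRequest and both helper functions are hypothetical names used only for this illustration; the real structures carry many more fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical HBA-side bookkeeping for one command. */
typedef struct ToyHbaRequest {
    uint32_t tag;
    uint32_t pending;
} ToyHbaRequest;

/* Hypothetical generic request: the tag stays for tracing, while
 * hba_private points straight at the HBA's own bookkeeping. */
typedef struct ToyScsiRequest {
    uint32_t tag;
    void *hba_private;
} ToyScsiRequest;

/* Old approach: linear search of the HBA queue for a matching tag. */
ToyHbaRequest *toy_find_by_tag(ToyHbaRequest *queue, size_t n, uint32_t tag)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (queue[i].tag == tag) {
            return &queue[i];
        }
    }
    return NULL;
}

/* New approach: no lookup, the request already carries the pointer. */
ToyHbaRequest *toy_from_req(const ToyScsiRequest *req)
{
    return req->hba_private;
}
```

Both functions return the same entry when the pointer was set at allocation time; the second does so in constant time and cannot fail because of an unknown tag.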

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Marcelo Tosatti

On Tue, Jul 19, 2011 at 03:20:37PM +0300, Avi Kivity wrote:
 On 07/19/2011 02:46 PM, Marcelo Tosatti wrote:
 On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
   To let the user configure the desired tsc frequency for the
   guest if running in KVM.
 
   Signed-off-by: Joerg Roedeljoerg.roe...@amd.com
   ---
target-i386/cpu.h   |1 +
target-i386/cpuid.c |   13 +
2 files changed, 14 insertions(+), 0 deletions(-)
 
   diff --git a/target-i386/cpu.h b/target-i386/cpu.h
   index cdf68ff..399e124 100644
   --- a/target-i386/cpu.h
   +++ b/target-i386/cpu.h
   @@ -743,6 +743,7 @@ typedef struct CPUX86State {
uint32_t cpuid_kvm_features;
uint32_t cpuid_svm_features;
bool tsc_valid;
   +int tsc_khz;
 
 This should be saved/restore in migration data (missing VMSTATE entry).
 
 Why?  It's static data.  Traditionally we only migrate runtime data.
 
 (although we've been talking about starting a naked qemu and pushing
 all of the configuration from the source).

Right.

[Qemu-devel] [PATCH] Add missing documentation for qemu-img -p

2011-07-19 Thread Jes . Sorensen

From: Jes Sorensen jes.soren...@redhat.com

Signed-off-by: Jes Sorensen jes.soren...@redhat.com
---
 qemu-img-cmds.hx |4 ++--
 qemu-img.texi|6 --
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/qemu-img-cmds.hx b/qemu-img-cmds.hx
index 2b70618..1299e83 100644
--- a/qemu-img-cmds.hx
+++ b/qemu-img-cmds.hx
@@ -30,7 +30,7 @@ ETEXI
 DEF("convert", img_convert,
     "convert [-c] [-p] [-f fmt] [-t cache] [-O output_fmt] [-o options] [-s snapshot_name] filename [filename2 [...]] output_filename")
 STEXI
-@item convert [-c] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] @var{output_filename}
+@item convert [-c] [-p] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] @var{output_filename}
 ETEXI
 
 DEF("info", img_info,
@@ -48,7 +48,7 @@ ETEXI
 DEF("rebase", img_rebase,
     "rebase [-f fmt] [-t cache] [-p] [-u] -b backing_file [-F backing_fmt] filename")
 STEXI
-@item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
+@item rebase [-f @var{fmt}] [-p] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
 ETEXI
 
 DEF("resize", img_resize,
diff --git a/qemu-img.texi b/qemu-img.texi
index 526474c..495a1b6 100644
--- a/qemu-img.texi
+++ b/qemu-img.texi
@@ -38,6 +38,8 @@ by the used format or see the format descriptions below for details.
 indicates that target image must be compressed (qcow format only)
 @item -h
 with or without a command shows help and lists the supported formats
+@item -p
+display progress bar (convert and rebase commands only)
 @end table
 
 Parameters to snapshot subcommand:
@@ -84,7 +86,7 @@ it doesn't need to be specified separately in this case.
 
 Commit the changes recorded in @var{filename} in its base image.
 
-@item convert [-c] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] @var{output_filename}
+@item convert [-c] [-p] [-f @var{fmt}] [-O @var{output_fmt}] [-o @var{options}] [-s @var{snapshot_name}] @var{filename} [@var{filename2} [...]] @var{output_filename}
 
 Convert the disk image @var{filename} or a snapshot @var{snapshot_name} to disk image @var{output_filename}
 using format @var{output_fmt}. It can be optionally compressed (@code{-c}
@@ -114,7 +116,7 @@ they are displayed too.
 
 List, apply, create or delete snapshots in image @var{filename}.
 
-@item rebase [-f @var{fmt}] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
+@item rebase [-f @var{fmt}] [-p] [-u] -b @var{backing_file} [-F @var{backing_fmt}] @var{filename}
 
 Changes the backing file of an image. Only the formats @code{qcow2} and
 @code{qed} support changing the backing file.
-- 
1.7.4.4

Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Kevin Wolf

On 19.07.2011 14:43, Anthony Liguori wrote:
 On 07/19/2011 05:15 AM, Kevin Wolf wrote:
 From: Hannes Reineckeh...@suse.de

 'tag' is just an abstraction to identify the command
 from the driver. So we should make that explicit by
 replacing 'tag' with a driver-defined pointer 'hba_private'.
 This saves the lookup for driver handling several commands
 in parallel.
 'tag' is still being kept for tracing purposes.

 Signed-off-by: Hannes Reineckeh...@suse.de
 Acked-by: Paolo Bonzinipbonz...@redhat.com
 Signed-off-by: Kevin Wolfkw...@redhat.com
 ---
   hw/esp.c  |2 +-
   hw/lsi53c895a.c   |   22 --
   hw/scsi-bus.c |9 ++---
   hw/scsi-disk.c|4 ++--
   hw/scsi-generic.c |5 +++--
   hw/scsi.h |   10 +++---
   hw/spapr_vscsi.c  |   29 +
   hw/usb-msd.c  |9 +
   8 files changed, 37 insertions(+), 53 deletions(-)

 diff --git a/hw/esp.c b/hw/esp.c
 index aa50800..9ddd637 100644
 --- a/hw/esp.c
 +++ b/hw/esp.c
 @@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t busid)

      DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
      lun = busid & 7;
 -    s->current_req = scsi_req_new(s->current_dev, 0, lun);
 +    s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
      datalen = scsi_req_enqueue(s->current_req, buf);
      s->ti_size = datalen;
      if (datalen != 0) {
 diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
 index 940b43a..69eec1d 100644
 --- a/hw/lsi53c895a.c
 +++ b/hw/lsi53c895a.c
 @@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t tag)
  static void lsi_request_cancelled(SCSIRequest *req)
  {
      LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 -    lsi_request *p;
 +    lsi_request *p = req->hba_private;

      if (s->current && req == s->current->req) {
          scsi_req_unref(req);
 @@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
          return;
      }

 -    p = lsi_find_by_tag(s, req->tag);
      if (p) {
          QTAILQ_REMOVE(&s->queue, p, next);
          scsi_req_unref(req);
 @@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)

  /* Record that data is available for a queued command.  Returns zero if
     the device was reselected, nonzero if the IO is deferred.  */
 -static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
 +static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
  {
 -    lsi_request *p;
 -
 -    p = lsi_find_by_tag(s, tag);
 -    if (!p) {
 -        BADF("IO with unknown tag %d\n", tag);
 -        return 1;
 -    }
 +    lsi_request *p = req->hba_private;

      if (p->pending) {
 -        BADF("Multiple IO pending for tag %d\n", tag);
 +        BADF("Multiple IO pending for request %p\n", p);
      }
      p->pending = len;
      /* Reselect if waiting for it, or if reselection triggers an IRQ
 @@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t len)
      LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
      int out;

 -    if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
 +    if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
          (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
 -        if (lsi_queue_tag(s, req->tag, len)) {
 +        if (lsi_queue_req(s, req, len)) {
              return;
          }
      }
 @@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
      assert(s->current == NULL);
      s->current = qemu_mallocz(sizeof(lsi_request));
      s->current->tag = s->select_tag;
 -    s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
 +    s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
 +                                   s->current);

      n = scsi_req_enqueue(s->current->req, buf);
      if (n) {
 diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
 index ad6a730..8b1a412 100644
 --- a/hw/scsi-bus.c
 +++ b/hw/scsi-bus.c
 @@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
      return res;
  }

 -SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t lun)
 +SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
 +                            uint32_t lun, void *hba_private)
  {
      SCSIRequest *req;

 @@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t l
      req->dev = d;
      req->tag = tag;
      req->lun = lun;
 +    req->hba_private = hba_private;
      req->status = -1;
      trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
      return req;
  }

 -SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
 +SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
 +                          void *hba_private)
  {
 -    return d->info->alloc_req(d, tag, lun);
 +    return d->info->alloc_req(d, tag, lun, hba_private);
  }

  uint8_t
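
The idea in the quoted commit message, replacing a tag lookup with a driver-owned pointer stored in the request, can be sketched in isolation. This is only a toy model under stated assumptions: every Toy* name below is hypothetical and none of it is the real SCSIRequest, lsi_request or scsi_req_new() code.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_QUEUE_LEN 4

/* Toy stand-in for a bus-level request (hypothetical, not SCSIRequest). */
typedef struct ToyRequest {
    uint32_t tag;        /* kept only for tracing */
    void *hba_private;   /* opaque pointer owned by the HBA driver */
} ToyRequest;

/* Toy stand-in for per-command driver state (hypothetical). */
typedef struct ToyHbaSlot {
    int in_use;
    uint32_t tag;
    uint32_t pending;
    ToyRequest req;
} ToyHbaSlot;

typedef struct ToyHba {
    ToyHbaSlot slots[TOY_QUEUE_LEN];
} ToyHba;

static void toy_hba_init(ToyHba *hba)
{
    for (size_t i = 0; i < TOY_QUEUE_LEN; i++) {
        hba->slots[i].in_use = 0;
    }
}

/* Old style: the driver searches its queue for a matching tag. */
static ToyHbaSlot *toy_find_by_tag(ToyHba *hba, uint32_t tag)
{
    for (size_t i = 0; i < TOY_QUEUE_LEN; i++) {
        if (hba->slots[i].in_use && hba->slots[i].tag == tag) {
            return &hba->slots[i];
        }
    }
    return NULL;
}

/* New style: the request carries the driver state, so no search is needed. */
static ToyHbaSlot *toy_slot_from_req(ToyRequest *req)
{
    return req->hba_private;
}

/* Create a request in slot idx and remember the slot in hba_private. */
static ToyRequest *toy_req_new(ToyHba *hba, size_t idx, uint32_t tag)
{
    ToyHbaSlot *slot;

    if (idx >= TOY_QUEUE_LEN) {
        return NULL;
    }
    slot = &hba->slots[idx];
    slot->in_use = 1;
    slot->tag = tag;
    slot->pending = 0;
    slot->req.tag = tag;
    slot->req.hba_private = slot;
    return &slot->req;
}
```

A callback that receives a ToyRequest can call toy_slot_from_req() directly. A driver that has no per-command state can store NULL, as the esp.c hunk above does.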

Re: [Qemu-devel] [PULL] v2: pending linux-user patches

2011-07-19 Thread Anthony Liguori


On 07/18/2011 02:37 AM, Riku Voipio wrote:

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

   Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 
19:43:00 +)


Pulled.  Thanks.

Regards,

Anthony Liguori



are available in the git repository at:
   git://git.linaro.org/people/rikuvoipio/qemu.git linux-user-for-upstream

Cédric VINCENT (4):
   arm-semi: Provide access to CLI arguments passed through the -append 
option
   linux-user: Add support for KD...LED ioctls
   linux-user: Add support for more VT ioctls
   linux-user: Add support for even more FB ioctls

Peter Maydell (4):
   linux-user: Add syscall numbers from kernel 2.6.39.2
   linux-user: Implement prlimit64 syscall
   linux-user/syscall.c: Enforce pselect6 sigset size restrictions
   linux-user/signal.c: Rename s390 target_ucontext fields to fix ia64

Riku Voipio (2):
   linux-user: correct syscall 123 on sh4
   linux-user: make MIPS and ARM eabi use same argument reordering

Wesley W. Terpstra (5):
   mips: sigaltstack args
   mips: missing syscall returns wrong errno
   mips: null pointer deref should segfault
   mips: rlimit incorrectly converts values
   mips: rlimit codes are not the same

  arm-semi.c |  113 ---
  linux-user/alpha/syscall_nr.h  |   23 +-
  linux-user/arm/syscall_nr.h|   13 +++
  linux-user/cris/syscall_nr.h   |2 +
  linux-user/i386/syscall_nr.h   |   12 +++
  linux-user/ioctls.h|   13 +++
  linux-user/m68k/syscall_nr.h   |   16 
  linux-user/main.c  |   33 +++-
  linux-user/microblaze/syscall_nr.h |   14 +++-
  linux-user/mips/syscall_nr.h   |   13 +++
  linux-user/mips64/syscall_nr.h |   13 +++
  linux-user/mipsn32/syscall_nr.h|   14 +++
  linux-user/ppc/syscall_nr.h|   30 +++
  linux-user/s390x/syscall_nr.h  |   13 +++-
  linux-user/sh4/syscall_nr.h|   34 -
  linux-user/signal.c|   30 
  linux-user/sparc/syscall_nr.h  |   12 +++
  linux-user/sparc64/syscall_nr.h|   12 +++
  linux-user/syscall.c   |  153 +---
  linux-user/syscall_defs.h  |   51 
  linux-user/syscall_types.h |   20 +
  linux-user/x86_64/syscall_nr.h |   12 +++
  22 files changed, 549 insertions(+), 97 deletions(-)

Re: [Qemu-devel] [PULL] virtio-serial: Fixes, trace points

2011-07-19 Thread Anthony Liguori


On 07/19/2011 03:00 AM, Amit Shah wrote:

Hi Anthony,

Please pull for trace points for virtio-serial/console code and a fix
for a host process closing chardev connection causing an abort().

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

   Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 
19:43:00 +)

are available in the git repository at:
   git://git.kernel.org/pub/scm/qemu/amit/virtio-serial.git for-anthony


Pulled.  Thanks.

Regards,

Anthony Liguori



Amit Shah (4):
   virtio-serial-bus: Add trace events
   virtio-console: Add some trace events
   virtio-serial-bus: Fix trailing \n in error_report string
   virtio-console: Prevent abort()s in case of host chardev close

  hw/virtio-console.c|   25 +++--
  hw/virtio-serial-bus.c |9 -
  trace-events   |   11 +++
  3 files changed, 42 insertions(+), 3 deletions(-)


Amit

Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Anthony Liguori


On 07/17/2011 06:13 AM, Avi Kivity wrote:

New in this version:
   MemoryRegionOps gained .old_mmio and .old_portio members, which allow
   reusing old-style callbacks with the new API.  All uses were converted,
   except for eepro100.c, which uses the same MemoryRegionOps for both
   portio and mmio.  Some intermediate patches do introduce dispatching
   callbacks, but they are removed later.

Caveats:
- some devices still grab a global memory region instead of inheriting
   it from their bus.  Seen in the code as #include "exec-memory.h"
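
To illustrate what "reusing old-style callbacks with the new API" could look like, here is a toy sketch. It is not the MemoryRegionOps definition from the series: all Toy* and toy_* names are hypothetical, and returning 0 for an unhandled access is an arbitrary choice made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Old-style callback: one function per access width (hypothetical type). */
typedef uint32_t (*ToyOldReadFn)(void *opaque, uint64_t addr);

typedef struct ToyRegionOps {
    /* New style: a single callback that receives the access size. */
    uint64_t (*read)(void *opaque, uint64_t addr, unsigned size);
    /* Old style: callbacks for 1-, 2- and 4-byte accesses, in that order. */
    ToyOldReadFn old_read[3];
} ToyRegionOps;

/* Map an access size in bytes to an index into old_read, or -1. */
static int toy_width_index(unsigned size)
{
    switch (size) {
    case 1: return 0;
    case 2: return 1;
    case 4: return 2;
    default: return -1;
    }
}

/* Dispatch a read: prefer the new-style callback, else fall back. */
static uint64_t toy_region_read(const ToyRegionOps *ops, void *opaque,
                                uint64_t addr, unsigned size)
{
    int idx;

    if (ops->read) {
        return ops->read(opaque, addr, size);
    }
    idx = toy_width_index(size);
    if (idx < 0 || !ops->old_read[idx]) {
        return 0; /* unhandled access: arbitrary value for this sketch */
    }
    return ops->old_read[idx](opaque, addr);
}

/* Example legacy device: three width-specific callbacks. */
static uint32_t toy_dev_readb(void *opaque, uint64_t addr)
{
    (void)opaque; (void)addr;
    return 0x11;
}

static uint32_t toy_dev_readw(void *opaque, uint64_t addr)
{
    (void)opaque; (void)addr;
    return 0x2222;
}

static uint32_t toy_dev_readl(void *opaque, uint64_t addr)
{
    (void)opaque; (void)addr;
    return 0x33333333;
}

static const ToyRegionOps toy_legacy_ops = {
    .read = NULL,
    .old_read = { toy_dev_readb, toy_dev_readw, toy_dev_readl },
};

/* Example converted device: one size-aware callback. */
static uint64_t toy_new_read(void *opaque, uint64_t addr, unsigned size)
{
    (void)opaque;
    return addr + size;
}

static const ToyRegionOps toy_new_ops = {
    .read = toy_new_read,
    .old_read = { NULL, NULL, NULL },
};
```

The point of the fallback is that a device can be registered through the new interface before its callbacks are rewritten.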


Could you write up a quick document on how to use this new api for docs/?

There's bits I don't like about the interface but I think it's a huge 
improvement over what we have now so I'm inclined to commit once it 
includes documentation.


Regards,

Anthony Liguori



Avi Kivity (58):
   Hierarchical memory region API
   memory: implement dirty tracking
   memory: merge adjacent segments of a single memory region
   Internal interfaces for memory API
   memory: abstract address space operations
   memory: rename MemoryRegion::has_ram_addr to ::terminates
   memory: late initialization of ram_addr
   memory:  I/O address space support
   memory: add backward compatibility for old portio registration
   memory: add backward compatibility for old mmio registration
   memory: add ioeventfd support
   exec.c: initialize memory map
   ioport: register ranges by byte aligned addresses always
   pc: grab system_memory
   pc: convert pc_memory_init() to memory API
   pc: move global memory map out of pc_init1() and into its callers
   pci: pass address space to pci bus when created
   pci: add MemoryRegion based BAR management API
   sysbus: add MemoryRegion based memory management API
   usb-ohci: convert to MemoryRegion
   pci: add API to get a BAR's mapped address
   vmsvga: don't remember pci BAR address in callback any more
   vga: convert vga and its derivatives to the memory API
   cirrus: simplify mmio BAR access functions
   cirrus: simplify bitblt BAR access functions
   cirrus: simplify vga window mmio access functions
   vga: simplify vga window mmio access functions
   cirrus: simplify linear framebuffer access functions
   Integrate I/O memory regions into qemu
   exec.c: fix initialization of system I/O memory region
   pci: pass I/O address space to new PCI bus
   pci: allow I/O BARs to be registered with pci_register_bar_region()
   rtl8139: convert to memory API
   ac97: convert to memory API
   e1000: convert to memory API
   eepro100: convert to memory API
   es1370: convert to memory API
   ide: convert to memory API
   ivshmem: convert to memory API
   virtio-pci: convert to memory API
   ahci: convert to memory API
   intel-hda: convert to memory API
   lsi53c895a: convert to memory API
   ppc: convert to memory API
   ne2000: convert to memory API
   pcnet: convert to memory API
   i6300esb: convert to memory API
   isa-mmio: concert to memory API
   sun4u: convert to memory API
   ehci: convert to memory API
   uhci: convert to memory API
   xen-platform: convert to memory API
   msix: convert to memory API
   pci: remove pci_register_bar_simple()
   pci: convert pci rom to memory API
   pci: remove pci_register_bar()
   pci: fold BAR mapping function into its caller
   pci: rename pci_register_bar_region() to pci_register_bar()

  Makefile.target|1 +
  exec-memory.h  |   28 ++
  exec.c |   29 ++
  hw/ac97.c  |   88 +++--
  hw/apb_pci.c   |3 +
  hw/bonito.c|5 +-
  hw/cirrus_vga.c|  460 +++---
  hw/cuda.c  |6 +-
  hw/e1000.c |  113 +++
  hw/eepro100.c  |  181 ++---
  hw/es1370.c|   43 ++-
  hw/escc.c  |   42 +-
  hw/escc.h  |2 +-
  hw/grackle_pci.c   |9 +-
  hw/gt64xxx.c   |6 +-
  hw/heathrow_pic.c  |   29 +-
  hw/ide.h   |2 +-
  hw/ide/ahci.c  |   31 +-
  hw/ide/ahci.h  |2 +-
  hw/ide/cmd646.c|  204 +++
  hw/ide/ich.c   |3 +-
  hw/ide/macio.c |   36 +-
  hw/ide/pci.c   |   25 +-
  hw/ide/pci.h   |   19 +-
  hw/ide/piix.c  |   63 +++-
  hw/ide/via.c   |   64 +++-
  hw/intel-hda.c |   35 +-
  hw/isa.h   |2 +
  hw/isa_mmio.c  |   30 +-
  hw/ivshmem.c   |  158 +++-
  hw/lance.c |   31 +-
  hw/lsi53c895a.c|  257 +++--
  hw/mac_dbdma.c |   32 +-
  hw/mac_dbdma.h |4 +-
  hw/mac_nvram.c |   39 +--
  hw/macio.c |   73 ++--
  hw/msix.c  |   64 +--
  hw/msix.h  |6 +-
  hw/ne2000-isa.c|   14 +-
  hw/ne2000.c|   77 +++--
  hw/ne2000.h|8 +-
  hw/openpic.c   |   81 ++--
  hw/openpic.h   |2 +-
  hw/pc.c|   62 ++-
  hw/pc.h|   11 +-
  hw/pc_piix.c   |   24 +-
  hw/pci.c   |  104 +++---
  hw/pci.h   |   30 +-
  hw/pci_host.h  |1 +
  hw/pci_internals.h |2 +
  hw/pcnet-pci.c |   74

Re: [Qemu-devel] [PULL] pci, virtio, vhost, xen

2011-07-19 Thread Anthony Liguori


On 07/17/2011 11:29 AM, Michael S. Tsirkin wrote:

The following changes since commit 89b9ba661bd2d6155308f895ec075d813f0e129b:

   Fix signal handling of SIG_IPI when io-thread is enabled (2011-07-16 
19:43:00 +)

are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony


Pulled.  Thanks.

Regards,

Anthony Liguori



Anthony PERARD (1):
   hw/piix_pci.c: Fix PIIX3-xen to initialize ids

Michael S. Tsirkin (4):
   vhost: fix double free on device stop
   pci_ids: tweak names to match linux/pci_ids.h
   xen: move to new pci initializers
   virtio: fix indirect descriptor buffer overflow

  hw/pci_ids.h  |3 ++-
  hw/piix_pci.c |3 +++
  hw/vhost.c|1 +
  hw/virtio.c   |8 
  hw/xen_platform.c |   15 +++
  5 files changed, 21 insertions(+), 9 deletions(-)

Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Benjamin Herrenschmidt

On Tue, 2011-07-19 at 15:06 +0200, Kevin Wolf wrote:
 On 19.07.2011 14:43, Anthony Liguori wrote:
  On 07/19/2011 05:15 AM, Kevin Wolf wrote:

Re: [Qemu-devel] coroutines and block I/O considerations

2011-07-19 Thread Anthony Liguori


On 07/19/2011 05:10 AM, Kevin Wolf wrote:

On 19.07.2011 10:06, Frediano Ziglio wrote:
They are still all running in the same thread.


2- memory considerations on coroutines. Besides coroutines allowing more
readable code, I wonder if somebody has considered memory. For every
coroutine a separate stack has to be allocated. For instance, the
ucontext and win32 implementations use 4 MB. Assuming 128 concurrent AIO
requests, this requires about 512 MB of RAM (mostly only committed but not
used, and coroutines are reused).


128 concurrent requests is a lot. And even then, it's only virtual
memory. I doubt that we're actually using much more than we do in the
old code with the AIOCBs (which will disappear and become local
variables when we complete the conversion).


A 4mb stack is probably overkill anyway.  It's easiest to just start 
with a large stack and then once all of the functionality is worked out, 
optimize to a smaller stack.


The same problem exists with using threads FWIW since the default thread 
stack is usually quite large.
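
The arithmetic behind the 512 MB figure, and why it is mostly virtual memory, can be written down as a small sketch. The function names are made up for illustration, and the resident estimate assumes stack pages are only backed by RAM once touched, which depends on the host OS and on how the stacks are allocated.

```c
#include <stdint.h>

/* Address space reserved when every coroutine gets its own fixed stack. */
static uint64_t coroutine_stack_reservation(uint64_t n_coroutines,
                                            uint64_t stack_bytes)
{
    return n_coroutines * stack_bytes;
}

/*
 * Rough resident estimate: only the pages a coroutine actually touched
 * count, rounded up to whole pages per stack.  used_bytes is a guess the
 * caller supplies, not something measured here.
 */
static uint64_t coroutine_stack_resident(uint64_t n_coroutines,
                                         uint64_t used_bytes,
                                         uint64_t page_bytes)
{
    uint64_t pages = (used_bytes + page_bytes - 1) / page_bytes;

    return n_coroutines * pages * page_bytes;
}
```

With 128 coroutines and 4 MB stacks the reservation is 512 MB. If each coroutine touched, say, 10000 bytes of stack with 4096-byte pages, the resident estimate would be 128 * 3 * 4096 bytes, about 1.5 MB.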


Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Hannes Reinecke


On 07/19/2011 03:06 PM, Kevin Wolf wrote:

On 19.07.2011 14:43, Anthony Liguori wrote:

On 07/19/2011 05:15 AM, Kevin Wolf wrote:


Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Stefan Hajnoczi

On Tue, Jul 19, 2011 at 8:24 AM, Jes Sorensen <jes.soren...@redhat.com> wrote:
 On 07/18/11 16:08, Stefan Hajnoczi wrote:
 On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensen <jes.soren...@redhat.com> wrote:
 I have been updating the live snapshot wiki for qemu to try and cover
 the commands we will want for async snapshot handling too.

 http://wiki.qemu.org/Features/Snapshots

 Regarding fd passing, do we even support SELinux today with backing files?

 Not sure I understand what you mean. The current code should be happy to
 take an existing file or a raw device for the snapshot.

Sorry, I was off on a tangent.

I think today QEMU does not support opening image files with a backing
file purely using file descriptors.  We currently require the ability
to open files.

Stefan

Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Benjamin Herrenschmidt

On Tue, 2011-07-19 at 07:43 -0500, Anthony Liguori wrote:
 
 This breaks the build:
 
 make[1]: Nothing to be done for `all'.
   CC    ppc64-softmmu/spapr_vscsi.o
 /home/anthony/git/qemu/hw/spapr_vscsi.c: In function ‘vscsi_command_complete’:
 /home/anthony/git/qemu/hw/spapr_vscsi.c:535:34: error: ‘s’ undeclared (first use in this function)
 /home/anthony/git/qemu/hw/spapr_vscsi.c:535:34: note: each undeclared identifier is reported only once for each function it appears in
 
 This file is only built when libfdt is installed which is probably why
 you didn't catch it.
 
 Ben/David, is there a way we can still build most of this stuff without
 libfdt?  libfdt is still not commonly packaged by some distros.

That would be hard ... the DT stuff is pretty deeply involved. Might be
easier to try to fix the distro :-)

Which ones ?

Cheers,
Ben.

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen

On 07/19/11 15:23, Stefan Hajnoczi wrote:
 On Tue, Jul 19, 2011 at 8:24 AM, Jes Sorensen <jes.soren...@redhat.com> wrote:
 On 07/18/11 16:08, Stefan Hajnoczi wrote:
 On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensen <jes.soren...@redhat.com> wrote:
 I have been updating the live snapshot wiki for qemu to try and cover
 the commands we will want for async snapshot handling too.

 http://wiki.qemu.org/Features/Snapshots

 Regarding fd passing, do we even support SELinux today with backing files?

 Not sure I understand what you mean. The current code should be happy to
 take an existing file or a raw device for the snapshot.
 
 Sorry, I was off on a tangent.
 
 I think today QEMU does not support opening image files with a backing
 file purely using file descriptors.  We currently require the ability
 to open files.

I see what you mean - I don't actually know how that would work, since
the backing file specified in the front image will be a file name.

Eric, what happens if libvirt in an selinux environment tells QEMU to
launch using an image file that is backed by backing file(s)?

Cheers,
Jes

Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Anthony Liguori

On 07/19/2011 04:25 AM, Robert Wang wrote:
 As you know, the raw image format is very popular, but it does NOT
 support Copy-On-Write. A raw image file can NOT be used as a copy
 destination, so image streaming/Live Block Copy will NOT work.
 
 To fix this, we need to add a new block driver, raw-cow, to QEMU. Once
 finished, we can use qemu-img like this:
 qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img my_vm.raw-cow
 
 1) ubuntu.img is the backing file, my_vm.img is a raw file, and
 my_vm.raw-cow stores a COW bitmap related to my_vm.img.
 
 2) If every bit in the COW bitmap is set to dirty, then we can get all
 information from my_vm.img and can ignore ubuntu.img and my_vm.raw-cow
 from then on.
 
 To implement this, I think I can follow these steps:
 1) Add a new member to the BlockDriverState struct:
 char raw_file[1024];
 This member will track the raw_file parameter given for the raw-cow file
 on the command line.
 
 2)* Create a new file block/raw-cow.c. It will be very much like a
 mixture of block/cow.c and block/raw.c.
 
 So I will change some functions in cow.c and raw.c to non-static, so that
 raw-cow.c can re-use them. When a read operation occurs, determine whether
 the dirty flag in the raw-cow image is set. If true, read directly from the
 raw file. After a write operation, set the related dirty flag in the
 raw-cow image. Other functions might also need to be modified.
 
   * Of course, the format_name member of the BlockDriver struct will be
 raw-cow. And in order to keep the relationship with the raw file (like
 my_vm.img), the raw_cow_header struct should be
 struct raw_cow_header {
     uint32_t magic;
     uint32_t version;
     char backing_file[1024];
     char raw_file[1024]; /* added */
     int32_t mtime;
     uint64_t size;
     uint32_t sectorsize;
 };

I'd suggest that doing an image format is the wrong approach here.  Why
not just have an image format where you can pass it the location of a
bitmap?  That lets you compose arbitrarily complex backing file chains
and avoids the introduction of a new bitmap.

The bitmap format is also useful for implementing things like dirty
tracking.

Regards,

Anthony Liguori

   * Struct raw_cow_create_options should be cow_create_options plus
 one more member:
 {
     .name = BLOCK_OPT_RAW_FILE,
     .type = OPT_STRING,
     .help = "Raw file name"
 },
 
 3) Add bdrv_get_raw_filename to the img_info function of qemu-img.c. In
 bdrv_get_raw_filename, if the format of the image file is raw-cow,
 print the related raw file.
 
 Do you think my approach is right?
 Thank you.
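
The per-sector dirty bitmap that both the proposal and the reply rely on can be sketched as follows. This is an illustration only: the toy_* names are hypothetical, one bit per sector with the least significant bit first is an assumed layout, and nothing here is taken from block/cow.c.

```c
#include <stdint.h>

/* Where a read of a given sector should be served from. */
typedef enum {
    TOY_SRC_BACKING,  /* sector never written: read the backing file */
    TOY_SRC_RAW       /* sector written: read the raw file */
} ToyReadSource;

/* Return nonzero if the sector's dirty bit is set. */
static int toy_bitmap_test(const uint8_t *bitmap, uint64_t sector)
{
    return (bitmap[sector / 8] >> (sector % 8)) & 1;
}

/* Mark a sector dirty; called after a write reaches the raw file. */
static void toy_bitmap_set(uint8_t *bitmap, uint64_t sector)
{
    bitmap[sector / 8] |= (uint8_t)(1u << (sector % 8));
}

/* Read path: dirty sectors come from the raw file, others from backing. */
static ToyReadSource toy_read_source(const uint8_t *bitmap, uint64_t sector)
{
    return toy_bitmap_test(bitmap, sector) ? TOY_SRC_RAW : TOY_SRC_BACKING;
}

/* Once every sector is dirty, the backing file and bitmap are not needed. */
static int toy_bitmap_all_set(const uint8_t *bitmap, uint64_t nb_sectors)
{
    for (uint64_t s = 0; s < nb_sectors; s++) {
        if (!toy_bitmap_test(bitmap, s)) {
            return 0;
        }
    }
    return 1;
}
```

Keeping the bitmap separate from any particular image format, as the reply suggests, means the same helpers could also serve dirty tracking.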

Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Avi Kivity


On 07/19/2011 04:09 PM, Anthony Liguori wrote:

On 07/17/2011 06:13 AM, Avi Kivity wrote:

New in this version:
   MemoryRegionOps gained .old_mmio and .old_portio members, which allow
   reusing old-style callbacks with the new API.  All uses were converted,
   except for eepro100.c, which uses the same MemoryRegionOps for both
   portio and mmio.  Some intermediate patches do introduce dispatching
   callbacks, but they are removed later.

Caveats:
- some devices still grab a global memory region instead of inheriting
   it from their bus.  Seen in the code as #include "exec-memory.h"


Could you write up a quick document on how to use this new api for docs/?


Sure.  It's pretty simple.



There's bits I don't like about the interface 


Which bits are these?

but I think it's a huge improvement over what we have now so I'm 
inclined to commit once it includes documentation.




My problem is that to start leveraging it, everything must flow through 
it.  There are still several hundred call sites that are unconverted.


One option is to invert the relationship between ram_addr_t and 
MemoryRegion - implement the former in terms of the latter.  That only 
works for uses which don't invoke IO_MEM_UNASSIGNED or address arithmetic.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Joerg Roedel

On Tue, Jul 19, 2011 at 03:20:37PM +0300, Avi Kivity wrote:
 On 07/19/2011 02:46 PM, Marcelo Tosatti wrote:
 On Thu, Jul 07, 2011 at 04:13:12PM +0200, Joerg Roedel wrote:
   To let the user configure the desired tsc frequency for the
   guest if running in KVM.
 
   Signed-off-by: Joerg Roedel <joerg.roe...@amd.com>
   ---
target-i386/cpu.h   |1 +
target-i386/cpuid.c |   13 +
2 files changed, 14 insertions(+), 0 deletions(-)
 
   diff --git a/target-i386/cpu.h b/target-i386/cpu.h
   index cdf68ff..399e124 100644
   --- a/target-i386/cpu.h
   +++ b/target-i386/cpu.h
   @@ -743,6 +743,7 @@ typedef struct CPUX86State {
uint32_t cpuid_kvm_features;
uint32_t cpuid_svm_features;
bool tsc_valid;
   +int tsc_khz;

 This should be saved/restored in migration data (missing VMSTATE entry).

 Why?  It's static data.  Traditionally we only migrate runtime data.

 (although we've been talking about starting a naked qemu and pushing all  
 of the configuration from the source).

Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
hosts and migrate it over so that the destination host can set the
tsc-freq if it supports tsc-scaling.

Joerg

Re: [Qemu-devel] [PATCH 05/21] scsi: Add 'hba_private' to SCSIRequest

2011-07-19 Thread Kevin Wolf

On 19.07.2011 15:26, Hannes Reinecke wrote:
 On 07/19/2011 03:06 PM, Kevin Wolf wrote:
 On 19.07.2011 14:43, Anthony Liguori wrote:
 On 07/19/2011 05:15 AM, Kevin Wolf wrote:
 From: Hannes Reinecke <h...@suse.de>

 'tag' is just an abstraction to identify the command
 from the driver. So we should make that explicit by
 replacing 'tag' with a driver-defined pointer 'hba_private'.
 This saves the lookup for driver handling several commands
 in parallel.
 'tag' is still being kept for tracing purposes.

 Signed-off-by: Hannes Reinecke <h...@suse.de>
 Acked-by: Paolo Bonzini <pbonz...@redhat.com>
 Signed-off-by: Kevin Wolf <kw...@redhat.com>
 ---
hw/esp.c  |2 +-
hw/lsi53c895a.c   |   22 --
hw/scsi-bus.c |9 ++---
hw/scsi-disk.c|4 ++--
hw/scsi-generic.c |5 +++--
hw/scsi.h |   10 +++---
hw/spapr_vscsi.c  |   29 +
hw/usb-msd.c  |9 +
8 files changed, 37 insertions(+), 53 deletions(-)

 diff --git a/hw/esp.c b/hw/esp.c
 index aa50800..9ddd637 100644
 --- a/hw/esp.c
 +++ b/hw/esp.c
 @@ -244,7 +244,7 @@ static void do_busid_cmd(ESPState *s, uint8_t *buf, uint8_t busid)

      DPRINTF("do_busid_cmd: busid 0x%x\n", busid);
      lun = busid & 7;
 -    s->current_req = scsi_req_new(s->current_dev, 0, lun);
 +    s->current_req = scsi_req_new(s->current_dev, 0, lun, NULL);
      datalen = scsi_req_enqueue(s->current_req, buf);
      s->ti_size = datalen;
      if (datalen != 0) {
 diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c
 index 940b43a..69eec1d 100644
 --- a/hw/lsi53c895a.c
 +++ b/hw/lsi53c895a.c
 @@ -661,7 +661,7 @@ static lsi_request *lsi_find_by_tag(LSIState *s, uint32_t tag)
  static void lsi_request_cancelled(SCSIRequest *req)
  {
      LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
 -    lsi_request *p;
 +    lsi_request *p = req->hba_private;

      if (s->current && req == s->current->req) {
          scsi_req_unref(req);
 @@ -670,7 +670,6 @@ static void lsi_request_cancelled(SCSIRequest *req)
          return;
      }

 -    p = lsi_find_by_tag(s, req->tag);
      if (p) {
          QTAILQ_REMOVE(&s->queue, p, next);
          scsi_req_unref(req);
 @@ -680,18 +679,12 @@ static void lsi_request_cancelled(SCSIRequest *req)

  /* Record that data is available for a queued command.  Returns zero if
     the device was reselected, nonzero if the IO is deferred.  */
 -static int lsi_queue_tag(LSIState *s, uint32_t tag, uint32_t len)
 +static int lsi_queue_req(LSIState *s, SCSIRequest *req, uint32_t len)
  {
 -    lsi_request *p;
 -
 -    p = lsi_find_by_tag(s, tag);
 -    if (!p) {
 -        BADF("IO with unknown tag %d\n", tag);
 -        return 1;
 -    }
 +    lsi_request *p = req->hba_private;

      if (p->pending) {
 -        BADF("Multiple IO pending for tag %d\n", tag);
 +        BADF("Multiple IO pending for request %p\n", p);
      }
      p->pending = len;
      /* Reselect if waiting for it, or if reselection triggers an IRQ
 @@ -743,9 +736,9 @@ static void lsi_transfer_data(SCSIRequest *req, uint32_t len)
      LSIState *s = DO_UPCAST(LSIState, dev.qdev, req->bus->qbus.parent);
      int out;

 -    if (s->waiting == 1 || !s->current || req->tag != s->current->tag ||
 +    if (s->waiting == 1 || !s->current || req->hba_private != s->current ||
          (lsi_irq_on_rsl(s) && !(s->scntl1 & LSI_SCNTL1_CON))) {
 -        if (lsi_queue_tag(s, req->tag, len)) {
 +        if (lsi_queue_req(s, req, len)) {
              return;
          }
      }
 @@ -789,7 +782,8 @@ static void lsi_do_command(LSIState *s)
      assert(s->current == NULL);
      s->current = qemu_mallocz(sizeof(lsi_request));
      s->current->tag = s->select_tag;
 -    s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun);
 +    s->current->req = scsi_req_new(dev, s->current->tag, s->current_lun,
 +                                   s->current);

      n = scsi_req_enqueue(s->current->req, buf);
      if (n) {
 diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c
 index ad6a730..8b1a412 100644
 --- a/hw/scsi-bus.c
 +++ b/hw/scsi-bus.c
 @@ -131,7 +131,8 @@ int scsi_bus_legacy_handle_cmdline(SCSIBus *bus)
      return res;
  }

 -SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t lun)
 +SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag,
 +                            uint32_t lun, void *hba_private)
  {
      SCSIRequest *req;

 @@ -141,14 +142,16 @@ SCSIRequest *scsi_req_alloc(size_t size, SCSIDevice *d, uint32_t tag, uint32_t l
      req->dev = d;
      req->tag = tag;
      req->lun = lun;
 +    req->hba_private = hba_private;
      req->status = -1;
      trace_scsi_req_alloc(req->dev->id, req->lun, req->tag);
      return req;
  }

 -SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun)
 +SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun,
 +
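
[Illustration, not part of the patch] The idea in the commit message can be sketched as follows: instead of searching the driver's queue by tag, the request carries a driver-defined pointer back to the driver's own command state. This is a minimal sketch; `demo_request`, `demo_hba_cmd`, `find_by_tag` and `cmd_from_request` are hypothetical stand-ins and are not QEMU's types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-command state kept by an HBA driver. */
struct demo_hba_cmd {
    uint32_t tag;
    uint32_t pending;
};

/* Hypothetical stand-in for a SCSI request: the tag is kept for tracing,
 * hba_private points back at the driver's own command state. */
struct demo_request {
    uint32_t tag;
    void *hba_private;
};

/* Old style: linear search of the driver's queue by tag. */
static struct demo_hba_cmd *find_by_tag(struct demo_hba_cmd *queue,
                                        size_t n, uint32_t tag)
{
    for (size_t i = 0; i < n; i++) {
        if (queue[i].tag == tag) {
            return &queue[i];
        }
    }
    return NULL;
}

/* New style: the request already carries the pointer, so no lookup. */
static struct demo_hba_cmd *cmd_from_request(const struct demo_request *req)
{
    assert(req != NULL);
    return req->hba_private;
}
```

With several commands in flight, the second form avoids the per-completion search and cannot fail with an "unknown tag" case, which is why the patch can drop that error path.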

Re: [Qemu-devel] [PATCH V3] e1000: Handle IO Port.

2011-07-19 Thread Juan Quintela

Anthony PERARD anthony.per...@citrix.com wrote:
 This patch introduces the two IOPorts on e1000, IOADDR and IODATA. The
 IOADDR is used to specify which register we want to access when we read
 or write on IODATA.

 This patch fixes some weird behavior that I see when I use e1000 with
 QEMU/Xen: the guest memory can be corrupted by this NIC because it will
 write to memory that it doesn't own anymore after a reset. This is because
 the Linux kernel uses the IOPort to reset the network card instead of the
 MMIO.

 Signed-off-by: Anthony PERARD anthony.per...@citrix.com

This used to work, so the question is:
- does ioport_addr normally have a value of 0, so that migration works?
- is it very rare that we are in the middle of an io cycle?

To be able to use a subsection, we have to have a way to decide that the
old default value will do.  My understanding is that testing for
s->ioport_addr == 0 should be the test for a subsection, but the code
never puts ioport_addr back to zero.

Am I missing anything obvious?  Or is there any easy way to know if we
are in the middle of a pair of io operations?  From my reading of
e1000_ioport_readl/writel() it looks like it should be used as:

write(base+IOADDR)
write(base+IODATA)

but should this always be paired, so that we can reset ioport_addr after
the second?  Then just setting ioport_addr to zero after the second
would make the subsection work in the normal case.

Any other clue about _when_ we should send ioport_addr?

Thanks, Juan.
 @@ -202,8 +201,12 @@ rxbufsize(uint32_t v)
  static void
  set_ctrl(E1000State *s, int index, uint32_t val)
  {
 -    /* RST is self clearing */
 -    s->mac_reg[CTRL] = val & ~E1000_CTRL_RST;
 +    DBGOUT(IO, "set ctrl = %08x\n", val);
 +    if (val & E1000_CTRL_RST) {
 +        e1000_reset(s);
 +        return;
 +    }
 +    s->mac_reg[CTRL] = val;
  }


This looks to me like a separate fix that can go in a separate patch.

 +/* Writes that are less than 32 bits are ignored on IOADDR.
 + * For the Flash access, a write can be less than 32 bits for
 + * IODATA register, but is not handled.
 + */

Code to implement it is almost the same length as this comment O:-)

 +
 +register_ioport_read(addr, size, 1, e1000_ioport_readl, d);
 +
 +register_ioport_read(addr, size, 2, e1000_ioport_readl, d);
 +
 +register_ioport_write(addr, size, 4, e1000_ioport_writel, d);
 +register_ioport_read(addr, size, 4, e1000_ioport_readl, d);

This is curiosity on my part.  Are we returning 32-bit reads for 1-, 2- and
4-byte reads, or is there code at some other level that drops the bits
that we are not interested in?  My understanding of ioport.c is that
this is not checked (there is more to it, but I don't claim to fully
understand it, or whether it matters at all).

Later, Juan.

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Eric Blake


On 07/19/2011 07:27 AM, Jes Sorensen wrote:

On 07/19/11 15:23, Stefan Hajnoczi wrote:

On Tue, Jul 19, 2011 at 8:24 AM, Jes Sorensenjes.soren...@redhat.com  wrote:

On 07/18/11 16:08, Stefan Hajnoczi wrote:

On Fri, Jul 15, 2011 at 3:58 PM, Jes Sorensenjes.soren...@redhat.com  wrote:

I have been updating the live snapshot wiki for qemu to try and cover
the commands we will want for async snapshot handling too.

http://wiki.qemu.org/Features/Snapshots


Regarding fd passing, do we even support SELinux today with backing files?


Not sure I understand what you mean. The current code should be happy to
take an existing file or a raw device for the snapshot.


Sorry, I was off on a tangent.

I think today QEMU does not support opening image files with a backing
file purely using file descriptors.  We currently require the ability
to open files.


I see what you mean - I don't actually know how that would work, since
the backing file specified in the front image will be a file name.

Eric, what happens if libvirt in an selinux environment tells QEMU to
launch using an image file that is backed by backing file(s)?


Before starting qemu, libvirt first parses all the image files, to see 
if any of them have backing images.  For every qcow2 or qed image with a 
backing file, libvirt sets the SELinux context of both the qcow2 image 
and its backing file so that qemu will be able to successfully open() 
them.  But if any of those files reside on NFS, then it is not possible 
to label individual files, so it requires setting the SELinux bool 
virt_use_nfs, which thus gives qemu the power to open() arbitrary files 
on NFS, and you've lost security.


It would be nice if libvirt had a way to pass fds for every disk and 
backing file up front; then, SELinux can work around the lack of NFS 
per-file labelling by blocking open() in qemu.  In fact, this has 
already been proposed:


http://lists.gnu.org/archive/html/qemu-devel/2011-06/msg02072.html
http://lists.gnu.org/archive/html/qemu-devel/2011-06/msg01992.html

That thread mentioned both a command-line syntax for passing in fds for 
backing files, as well as an extension to the getfd monitor command to 
allow association of a runtime fd with a filename.


--
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity


On 07/19/2011 04:30 PM, Joerg Roedel wrote:


  (although we've been talking about starting a naked qemu and pushing all
  of the configuration from the source).

Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
hosts and migrate it over so that the destination host can set the
tsc-freq if it supports tsc-scaling.


This can be done by a management tool if desired.

--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen

On 07/19/11 15:58, Eric Blake wrote:
 On 07/19/2011 07:27 AM, Jes Sorensen wrote:
 Eric, what happens if libvirt in an selinux environment tells QEMU to
 launch using an image file that is backed by backing file(s)?
 
 Before starting qemu, libvirt first parses all the image files, to see
 if any of them have backing images.  For every qcow2 or qed image with a
 backing file, libvirt sets the SELinux context of both the qcow2 image
 and its backing file so that qemu will be able to successfully open()
 them.  But if any of those files reside on NFS, then it is not possible
 to label individual files, so it requires setting the SELinux bool
 virt_use_nfs, which thus gives qemu the power to open() arbitrary files
 on NFS, and you've lost security.

Urgh, libvirt parsing image files is really unfortunate, it really
doesn't give me warm fuzzy feelings :( libvirt really should not know
about internals of image formats.

 It would be nice if libvirt had a way to pass fds for every disk and
 backing file up front; then, SELinux can work around the lack of NFS
 per-file labelling by blocking open() in qemu.  In fact, this has
 already been proposed:

A cleaner solution seems to be to have libvirt provide a callback allowing
QEMU to call out and have libvirt open a file descriptor instead. This
way libvirt can validate it and open it for QEMU and pass it back.

If we cannot do something like this, I would prefer that backing
files on NFS simply not be supported when running in an selinux
setup.

Cheers,
Jes

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity


On 07/19/2011 04:54 PM, Avi Kivity wrote:

On 07/19/2011 04:30 PM, Joerg Roedel wrote:


  (although we've been talking about starting a naked qemu and pushing all
  of the configuration from the source).

Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
hosts and migrate it over so that the destination host can set the
tsc-freq if it supports tsc-scaling.


This can be done by a management tool if desired.



Although, if we do this unconditionally (that is, also for tsc-scale 
hosts) then we get stable tsc even without supplying a tsc frequency 
argument... need to think about this.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Avi Kivity


On 07/19/2011 04:56 PM, Michael S. Tsirkin wrote:

On Sun, Jul 17, 2011 at 02:13:27PM +0300, Avi Kivity wrote:
  New in this version:
MemoryRegionOps gained .old_mmio and .old_portio members, which allow
reusing old-style callbacks with the new API.  All uses were converted,
except for eepro100.c, which uses the same MemoryRegionOps for both
portio and mmio.  Some intermediate patches do introduce dispatching
callbacks, but they are removed later.

  Caveats:
  - some devices still grab a global memory region instead of inheriting
it from their bus.  Seen in the code as #include "exec-memory.h"

Looks good to me.

It looks like with this, users of vga_dirty_log_stop
like qxl_write_config can go away because the region can
stay registered with dirty logging enabled?


Yes.  You set the property once on the framebuffer, and it and all 
aliases are tracked whenever they or a subregion are exposed.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Michael S. Tsirkin

On Sun, Jul 17, 2011 at 02:13:27PM +0300, Avi Kivity wrote:
 New in this version:
   MemoryRegionOps gained .old_mmio and .old_portio members, which allow
   reusing old-style callbacks with the new API.  All uses were converted,
   except for eepro100.c, which uses the same MemoryRegionOps for both
   portio and mmio.  Some intermediate patches do introduce dispatching
   callbacks, but they are removed later.
 
 Caveats:
 - some devices still grab a global memory region instead of inheriting
   it from their bus.  Seen in the code as #include "exec-memory.h"

Looks good to me.

It looks like with this, users of vga_dirty_log_stop
like qxl_write_config can go away because the region can
stay registered with dirty logging enabled?

-- 
MST

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Jes Sorensen

On 07/19/11 16:24, Eric Blake wrote:
 [adding the libvir-list]
 On 07/19/2011 08:09 AM, Jes Sorensen wrote:
 Urgh, libvirt parsing image files is really unfortunate, it really
 doesn't give me warm fuzzy feelings :( libvirt really should not know
 about internals of image formats.
 
 But even if you add new features to qemu to avoid needing this in the
 future, it doesn't change the past - libvirt will always have to know
 how to parse image files understood by older qemu, and so as long as
 libvirt already knows how to do that parsing, we might as well take
 advantage of it.

What has been done here in the past is plain wrong. Continuing to do it
isn't the right thing to do here.

 Besides, I feel that having a well-documented file format, so that
 independent applications can both parse the same file with the same
 semantics by obeying the file format specification, is a good design goal.

We all know that documentation is rarely up to date, new features may not
get added and libvirt will never be able to keep up. The driver for a
file format belongs in QEMU and nowhere else.


 It would be nice if libvirt had a way to pass fds for every disk and
 backing file up front; then, SELinux can work around the lack of NFS
 per-file labelling by blocking open() in qemu.  In fact, this has
 already been proposed:

 A cleaner solution seems to be to have libvirt provide a callback allowing
 QEMU to call out and have libvirt open a file descriptor instead. This
 way libvirt can validate it and open it for QEMU and pass it back.
 
 Yes, that could probably be made to work with libvirt.

I am a little frustrated this approach wasn't taken up front instead of
the evil hack of having libvirt attempt to parse image files.

 If we cannot do something like this, I would prefer that backing
 files on NFS simply not be supported when running in an selinux
 setup.
 
 As nice as that sentiment is, it will never fly, because it would be a
 regression in current behavior.  The whole reason that the virt_use_nfs
 SELinux bool exists is that some people are willing to make the partial
 security tradeoff.  Besides, the use of sVirt via SELinux is more than
 just open() protection - while the current virt_use_nfs bool makes NFS
 less secure than otherwise possible, it still gives some nice guarantees
 to the rest of the qemu process such as passthrough accesses to local
 pci devices.

Well leaving things at status quo is not making it worse, it just leaves
an evil in place.

 Just because it is currently not as secure to mix NFS shared storage
 with backing files doesn't stop some people from wanting to do it [in
 fact, that's my current development setup - I use qcow2 images on NFS
 shared storage, keep SELinux enabled, and enable the virt_use_nfs bool].
  This discussion is about adding enhancements that make SELinux even
 more powerful when using NFS shared storage, by adding fd passing
 (whether libvirt parses in advance, or whether qemu raises an event and
 requires feedback from libvirt), and not about crippling the existing
 capability to use the virt_use_nfs selinux bool.

I do not believe we should try and add extra interfaces to support
something which is inherently broken. This really boils down to whether
we should support fd passing for snapshots in the first place. If it is
to support the broken setup of libvirt parsing image files, then I am
totally against it, if we work on a proper solution that involves this
in some way, then we can discuss it.

Cheers,
Jes

[Qemu-devel] Updated 0.15 release schedule

2011-07-19 Thread Anthony Liguori

Here's my proposal for an updated 0.15 schedule.  Please note that
stable-0.15 will fork off this Friday.


| 2011-02-01
| Begin of 0.15 development phase
|-
| 2011-05-16
| Soft feature freeze.  Major features should have initial code committed by this date.
|-
| 2011-06-15; Now 2011-07-22
| Fork off stable-0.15, development of 0.16 begins. Tag qemu-0.15.0-rc0
|-
| 2011-06-24; Now 2011-07-29
| Tag qemu-0.15.0-rc1
|-
| 2011-06-28; Now 2011-08-02
| Tag qemu-0.15.0-rc2
|-
| 2011-07-01; Now 2011-08-05
| Tag qemu-0.15.0

Regards,

Anthony Liguori

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Joerg Roedel

On Tue, Jul 19, 2011 at 04:55:53PM +0300, Avi Kivity wrote:
 On 07/19/2011 04:54 PM, Avi Kivity wrote:
 On 07/19/2011 04:30 PM, Joerg Roedel wrote:

 Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
 plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
 hosts and migrate it over so that the destination host can set the
 tsc-freq if it supports tsc-scaling.

 This can be done by a management tool if desired.


 Although, if we do this unconditionally (that is, also for tsc-scale  
 hosts) then we get stable tsc even without supplying a tsc frequency  
 argument... need to think about this.

It has the advantage that it just works, without the need to extend
management tools and the like. And it makes migration more transparent
to the guests.

Joerg
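
[Illustration] The plan discussed in this thread (migrate the source TSC frequency and let the destination apply it if it supports TSC scaling) can be sketched as a small decision function. This is a sketch under stated assumptions: `demo_dest_tsc_khz` is a hypothetical helper, and treating a migrated value of 0 as "not sent" is an assumption, not something stated in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns the TSC frequency (in kHz) the guest would observe on the
 * destination host.
 *   migrated_khz         frequency recorded on the source, 0 if not sent
 *   host_khz             native TSC frequency of the destination
 *   host_has_tsc_scaling whether the destination can scale the guest TSC */
static int32_t demo_dest_tsc_khz(int32_t migrated_khz, int32_t host_khz,
                                 bool host_has_tsc_scaling)
{
    assert(host_khz > 0);
    if (migrated_khz > 0 && host_has_tsc_scaling) {
        return migrated_khz;    /* keep the rate the guest started with */
    }
    return host_khz;            /* no scaling: guest sees the host rate */
}
```

For example, a guest started at 2400000 kHz keeps that rate on a 3000000 kHz destination with scaling, and silently changes to 3000000 kHz on one without.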

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Eric Blake


[adding the libvir-list]

On 07/19/2011 08:09 AM, Jes Sorensen wrote:

On 07/19/11 15:58, Eric Blake wrote:

On 07/19/2011 07:27 AM, Jes Sorensen wrote:

Eric, what happens if libvirt in an selinux environment tells QEMU to
launch using an image file that is backed by backing file(s)?


Before starting qemu, libvirt first parses all the image files, to see
if any of them have backing images.  For every qcow2 or qed image with a
backing file, libvirt sets the SELinux context of both the qcow2 image
and its backing file so that qemu will be able to successfully open()
them.  But if any of those files reside on NFS, then it is not possible
to label individual files, so it requires setting the SELinux bool
virt_use_nfs, which thus gives qemu the power to open() arbitrary files
on NFS, and you've lost security.


Urgh, libvirt parsing image files is really unfortunate, it really
doesn't give me warm fuzzy feelings :( libvirt really should not know
about internals of image formats.


But even if you add new features to qemu to avoid needing this in the 
future, it doesn't change the past - libvirt will always have to know 
how to parse image files understood by older qemu, and so as long as 
libvirt already knows how to do that parsing, we might as well take 
advantage of it.


Besides, I feel that having a well-documented file format, so that 
independent applications can both parse the same file with the same 
semantics by obeying the file format specification, is a good design goal.





It would be nice if libvirt had a way to pass fds for every disk and
backing file up front; then, SELinux can work around the lack of NFS
per-file labelling by blocking open() in qemu.  In fact, this has
already been proposed:


A cleaner solution seems to be to have libvirt provide a callback allowing
QEMU to call out and have libvirt open a file descriptor instead. This
way libvirt can validate it and open it for QEMU and pass it back.


Yes, that could probably be made to work with libvirt.



If we cannot do something like this, I would prefer that backing
files on NFS simply not be supported when running in an selinux
setup.


As nice as that sentiment is, it will never fly, because it would be a 
regression in current behavior.  The whole reason that the virt_use_nfs 
SELinux bool exists is that some people are willing to make the partial 
security tradeoff.  Besides, the use of sVirt via SELinux is more than 
just open() protection - while the current virt_use_nfs bool makes NFS 
less secure than otherwise possible, it still gives some nice guarantees 
to the rest of the qemu process such as passthrough accesses to local 
pci devices.


Just because it is currently not as secure to mix NFS shared storage 
with backing files doesn't stop some people from wanting to do it [in 
fact, that's my current development setup - I use qcow2 images on NFS 
shared storage, keep SELinux enabled, and enable the virt_use_nfs bool]. 
 This discussion is about adding enhancements that make SELinux even 
more powerful when using NFS shared storage, by adding fd passing 
(whether libvirt parses in advance, or whether qemu raises an event and 
requires feedback from libvirt), and not about crippling the existing 
capability to use the virt_use_nfs selinux bool.


--
Eric Blake   ebl...@redhat.com+1-801-349-2682
Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] External COW format for raw images

2011-07-19 Thread Frediano Ziglio

2011/7/19 Robert Wang wdon...@linux.vnet.ibm.com:
 As you know, the raw image format is very popular, but it does NOT
 support Copy-On-Write, so a raw image file can NOT be used as a copy
 destination and image streaming/Live Block Copy will NOT work.

 To fix this, we need to add a new block driver raw-cow to QEMU. If
 finished, we can use qemu-img like this:
 qemu-img create -f raw-cow -o backing_file=ubuntu.img,raw_file=my_vm.img
 my_vm.raw-cow

 1) ubuntu.img is the backing file, my_vm.img is a raw file,
 my_vm.raw-cow stores a COW bitmap related to my_vm.img.

 2) If every bit in the COW bitmap is set to dirty, then we can get all
 information from my_vm.img and can ignore ubuntu.img and my_vm.raw-cow
 from then on.

 To implement this, I think I can follow these steps:
 1) Add a new member to BlockDriverState struct:
 char raw_file[1024];
 This member will track raw_file parameter related to raw-cow file from
 command line.

 2)  * Create a new file block/raw-cow.c. It will be much like a
 mixture of block/cow.c and block/raw.c.

 So I will change some functions in cow.c and raw.c to non-static, then
 raw-cow.c can re-use them. When a read operation occurs, determine whether
 the dirty flag in the raw-cow image is set. If true, read directly from the
 raw file. After a write operation, set the related dirty flag in the
 raw-cow image. Other functions might also need to be modified.

     * Of course, the format_name member of the BlockDriver struct will be
 "raw-cow".
 To keep the relationship with the raw file (like my_vm.img), the
 raw_cow_header struct should be
 struct raw_cow_header {
 uint32_t magic;
 uint32_t version;
 char backing_file[1024];
 char raw_file[1024];/* added*/
 int32_t mtime;
 uint64_t size;
 uint32_t sectorsize;
 };
     * Struct raw_cow_create_options should be cow_create_options plus
 one member:
 {
 .name = BLOCK_OPT_RAW_FILE,
 .type = OPT_STRING,
 .help = "Raw file name"
 },

 3) Add bdrv_get_raw_filename in img_info function of qemu-img.c. In
 bdrv_get_raw_filename, if the format of the image file is raw-cow,
 print the related raw file.

 Do you think my approach is right?
 Thank you.


I don't understand whether you mean just a way to track changed
clusters/sectors or a new way to implement snapshotting, something that,
when writing data, just stores the original in the cow file, like:


normal backfile

is allocated on image
  write to image
else
  allocate on image
  copy from backing to image
  write to image (patch before previous write)

cow-raw (inverse backfile)

is allocated on image
  write to backing
else
  allocate on image
  copy from backing to image
  write to backing


that is


is not allocated on image
  allocate on image
  copy from backing to image
is normal backing
  write to image
else
  write to backing


Frediano
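
[Illustration] The combined pseudocode above can be sketched in C with an in-memory model where one byte stands in for one sector. The names (`demo_cow`, `demo_cow_write`) and the in-memory layout are assumptions made for the example; this is not a proposed block driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { DEMO_SECTORS = 4 };

struct demo_cow {
    uint8_t image[DEMO_SECTORS];     /* one byte stands in for a sector */
    uint8_t backing[DEMO_SECTORS];
    bool allocated[DEMO_SECTORS];    /* the COW bitmap */
};

/* Write one "sector".
 * inverse == false: normal backing file, new data lands in the image.
 * inverse == true:  new data lands in the backing file and the image
 *                   keeps the original contents. */
static void demo_cow_write(struct demo_cow *c, unsigned sector,
                           uint8_t data, bool inverse)
{
    assert(sector < DEMO_SECTORS);
    if (!c->allocated[sector]) {
        c->allocated[sector] = true;             /* allocate on image */
        c->image[sector] = c->backing[sector];   /* copy from backing */
    }
    if (inverse) {
        c->backing[sector] = data;
    } else {
        c->image[sector] = data;
    }
}
```

In the inverse case the copy happens only on the first write to a sector, so the image ends up holding the contents as they were when the snapshot was taken.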

Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Anthony Liguori


On 07/19/2011 08:27 AM, Avi Kivity wrote:

On 07/19/2011 04:09 PM, Anthony Liguori wrote:

On 07/17/2011 06:13 AM, Avi Kivity wrote:

New in this version:
MemoryRegionOps gained .old_mmio and .old_portio members, which allow
reusing old-style callbacks with the new API. All uses were converted,
except for eepro100.c, which uses the same MemoryRegionOps for both
portio and mmio. Some intermediate patches do introduce dispatching
callbacks, but they are removed later.

Caveats:
- some devices still grab a global memory region instead of inheriting
it from their bus. Seen in the code as #include "exec-memory.h"


Could you write up a quick document on how to use this new api for docs/?


Sure. It's pretty simple.


Thanks.



There's bits I don't like about the interface


Which bits are these?


Nothing I haven't already commented on.  I think there's too much in the 
generic level.  I don't think coalesced I/O belongs here.  It's a 
concept that doesn't fit.  I think a side-band API would be nicer.


Endianness also seems out of place.  There are many layers that can 
affect final endianness.  It depends on how devices handle endianness 
and also whether the bus modifies endianness.


There are numerous devices that have a register that allows endianness 
to be toggled for the device.  That makes the actual endianness of the 
device dynamic which doesn't fit the memory region API very well IMHO.





but I think it's a huge improvement over what we have now so I'm
inclined to commit once it includes documentation.



My problem is that to start leveraging it, everything must flow through
it. There are still several hundred call sites that are unconverted.


Really several hundred?  That surprises me.

Regards,

Anthony Liguori



One option is to invert the relationship between ram_addr_t and
MemoryRegion - implement the former in terms of the latter. That only
works for uses which don't invoke IO_MEM_UNASSIGNED or address arithmetic.

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Stefan Hajnoczi

On Tue, Jul 19, 2011 at 3:30 PM, Jes Sorensen jes.soren...@redhat.com wrote:
 On 07/19/11 16:24, Eric Blake wrote:
 [adding the libvir-list]
 On 07/19/2011 08:09 AM, Jes Sorensen wrote:
 Urgh, libvirt parsing image files is really unfortunate, it really
 doesn't give me warm fuzzy feelings :( libvirt really should not know
 about internals of image formats.

 But even if you add new features to qemu to avoid needing this in the
 future, it doesn't change the past - libvirt will always have to know
 how to parse image files understood by older qemu, and so as long as
 libvirt already knows how to do that parsing, we might as well take
 advantage of it.

 What has been done here in the past is plain wrong. Continuing to do it
 isn't the right thing to do here.

 Besides, I feel that having a well-documented file format, so that
 independent applications can both parse the same file with the same
 semantics by obeying the file format specification, is a good design goal.

 We all know that documentation is rarely up to date, new features may not
 get added and libvirt will never be able to keep up. The driver for a
 file format belongs in QEMU and nowhere else.

It should be a goal to avoid dependencies in multiple layers of the
stack because it becomes hard to add new features - they require
coordinated changes in multiple layers.  Having both QEMU and libvirt
know the internals of image files is such a multi-dependency.  If I
want to add a new format or change an existing format I have to touch
both layers.

For fd-passing perhaps we have an opportunity to use a callback
mechanism (QEMU request: filename -> libvirt response: fd) and do all
the image format parsing in QEMU.

Stefan
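
[Illustration] The callback mechanism mentioned above (request: filename, response: fd) might look roughly like this. It is a sketch only: `demo_open_cb`, `demo_opener`, `demo_image_open` and `demo_allowlist_open` are hypothetical names, and both sides run in one process here, whereas a real setup would pass the descriptor between processes (for example with SCM_RIGHTS over a UNIX socket).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: the management layer receives a file name and
 * answers with an open fd, or -1 to refuse. */
typedef int (*demo_open_cb)(const char *filename, void *opaque);

struct demo_opener {
    demo_open_cb cb;
    void *opaque;
};

/* Emulator side: never calls open() itself, only asks the callback. */
static int demo_image_open(const struct demo_opener *o, const char *filename)
{
    assert(o != NULL && filename != NULL);
    if (o->cb == NULL) {
        return -1;
    }
    return o->cb(filename, o->opaque);
}

/* Management side: validate the name against a NULL-terminated allow
 * list, then open the file read-only on the emulator's behalf. */
static int demo_allowlist_open(const char *filename, void *opaque)
{
    const char *const *allowed = opaque;

    for (size_t i = 0; allowed[i] != NULL; i++) {
        if (strcmp(allowed[i], filename) == 0) {
            return open(filename, O_RDONLY);
        }
    }
    return -1;
}
```

The point of the design is that the emulator keeps all knowledge of image formats (it discovers backing file names itself), while the management layer keeps the policy decision about which files may be opened.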

[Qemu-devel] [RFC] QEMU Object Model

2011-07-19 Thread Anthony Liguori


Hi,

I've started an effort to introduce a consistent object model to QEMU. 
Today, every subsystem implements an ad-hoc object model.  These object 
models all have the same basic properties but do things in arbitrarily 
different ways:


1) Factory interface for object creation
 - Objects usually have names
 - Construction properties for objects

2) Object properties
 - Some have converged around QemuOpts
 - Some support properties only at construction time
 - Most objects don't support introspection of properties

3) Inheritance and Polymorphism
 - Most use a vtable to implement inheritance and polymorphism
 - Only works effectively for one level of inheritance
 - Inconsistency around semantics of overloaded functions
   - Sometimes the base object invokes the overloaded function and 
implements additional behavior


netdev, block, chardev, fsdev, qdev, and displaystate are all examples 
of ad-hoc object models.  They all have their own implementations of the 
above and their own command line/monitor syntaxes.


QOM is a unifying object model inspired by the GObject/GType system.

Here is a short description of the features it supports or is intended to
support:


1) All objects derive from a common base object (Plug).  Plugs support
properties that can be set/get with a Visitor.  This means QMP can be
natively used to read/write object properties.


2) Properties have a type and flags associated with them.  Properties 
can be read-only, read-write, and locked.


3) Locked properties are read-only after realize.

4) Two special types of properties, plug and socket allow for a 
type-safe way to create a directed graph of objects at run time.  This 
provides a consistent mechanism to create a tree of devices and to 
associate backends with devices.


5) Single inheritance is supported with full polymorphism.  Interfaces 
are also supported which allows a restricted (Java-style) form of MI.


6) All types are registered through the same interface.  Type modules 
can be used to implement anything from new devices, buses, block/net 
backends, or even entirely new types of backends.  In the future, I 
would like to support demand loading of modules to allow a small core of 
QEMU to be loaded which then loads only the bits of code required to run 
a guest.


It has a few key differences from GObject:

1) GObject properties are implemented as GValues.  GValues are variants 
that are assumed to be immutable.  A key requirement of QOM is that we 
can use the Visitor framework for interacting with properties.  This 
allows for a richer expression of properties (specifically, complex 
device state to be serialized as a property).


2) GObject properties are installed in the class.  In order to support 
things like multifunction devices, we really need to install properties 
with the object so that we can have arrays of properties that are sized 
from another property.


3) GTypes/GObjects are always heap allocated as discrete objects. 
GObjects are also reference counted.  In order to support object 
composition, it's necessary to be able to initialize an object from an 
existing piece of memory.
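
[Illustration] Point 3 (initializing an object from an existing piece of memory so that objects can be composed) can be sketched like this. The types and functions (`demo_obj`, `demo_uart`, `demo_board`, `demo_uart_init`, `demo_board_init`) are hypothetical and only show the shape of in-place initialization, not the actual type system.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct demo_obj {
    const char *type_name;
};

struct demo_uart {
    struct demo_obj parent;
    int baud;
};

/* A composite device embedding its children by value: one block of
 * memory holds the parent and both children. */
struct demo_board {
    struct demo_obj parent;
    struct demo_uart uart[2];
};

/* Initialize an object in caller-provided memory; nothing is allocated. */
static void demo_uart_init(struct demo_uart *u)
{
    assert(u != NULL);
    memset(u, 0, sizeof(*u));
    u->parent.type_name = "demo-uart";
    u->baud = 115200;
}

static void demo_board_init(struct demo_board *b)
{
    assert(b != NULL);
    memset(b, 0, sizeof(*b));
    b->parent.type_name = "demo-board";
    for (size_t i = 0; i < 2; i++) {
        demo_uart_init(&b->uart[i]);   /* children live inside the parent */
    }
}
```

With heap-only, reference-counted objects each child would need its own allocation and lifetime; in-place initialization lets the children share the parent's storage and lifetime.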


I'll follow up later in the week with some documentation on how the type 
system works in more detail.  A tree is available below that has the 
current implementation:


http://repo.or.cz/w/qemu/aliguori.git/tree/qdev2

I'll be documenting the type system at:

http://wiki.qemu.org/Features/QOM

Regards,

Anthony Liguori

Re: [Qemu-devel] [PULL 00/12] Xen patch queue 2011-07-05

2011-07-19 Thread Anthony Liguori


On 07/05/2011 11:51 AM, Alexander Graf wrote:

Hi Anthony,

This is my current patch queue for Xen stuff that accumulated over
the past few weeks.

Please pull.


Pulled.  Thanks.

Regards,

Anthony Liguori



Alex

The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:
   Vasily Khoruzhick (1):
 pxa2xx_lcd: add proper rotation support

are available in the git repository at:

   git://repo.or.cz/qemu/agraf.git xen-next

Alexander Graf (2):
   checkpatch: don't error out on },{ lines
   xen_console: fall back to qemu serial device

Jan Kiszka (3):
   xen: Clean up build system
   xen: Clean up map cache API naming
   xen: Fold CONFIG_XEN_MAPCACHE into CONFIG_XEN

Stefano Stabellini (7):
   xen: enable console and disk backend in HVM mode
   xen_console: fix memory leak
   xen: add vkbd support for PV on HVM guests
   xen_disk: cope with missing xenstore params node
   qemu_ram_ptr_length: take ram_addr_t as arguments
   xen_disk: treat aio as raw
   xen_console: support the new extended xenstore protocol

  Makefile.objs |4 +-
  Makefile.target   |   14 +
  configure |2 +-
  cpu-common.h  |2 +-
  exec.c|   55 +---
  hw/xen.h  |   10 +--
  hw/xen_common.h   |   12 
  hw/xen_console.c  |   25 -
  hw/xen_disk.c |   37 -
  hw/xenfb.c|   19 -
  scripts/checkpatch.pl |4 ++-
  trace-events  |6 ++--
  xen-all.c |   73 +++-
  xen-mapcache-stub.c   |   36 
  xen-mapcache.c|   41 +++
  xen-mapcache.h|   14 -
  16 files changed, 217 insertions(+), 137 deletions(-)
  delete mode 100644 xen-mapcache-stub.c

Re: [Qemu-devel] [PATCH 2/3] qemu-x86: Add tsc_freq option to -cpu

2011-07-19 Thread Avi Kivity


On 07/19/2011 05:14 PM, Joerg Roedel wrote:

On Tue, Jul 19, 2011 at 04:55:53PM +0300, Avi Kivity wrote:
  On 07/19/2011 04:54 PM, Avi Kivity wrote:
  On 07/19/2011 04:30 PM, Joerg Roedel wrote:

  Hmm, I planned to do the VMSTATE thing in a follow-on patch-set. The
  plan is to read the VCPU tsc_freq at guest start time on !tsc-scale
  hosts and migrate it over so that the destination host can set the
  tsc-freq if it supports tsc-scaling.

  This can be done by a management tool if desired.


  Although, if we do this unconditionally (that is, also for tsc-scale
  hosts) then we get stable tsc even without supplying a tsc frequency
  argument... need to think about this.

It has the advantage that it just works, without the need to extend
management tools and the like. And it makes migration more transparent
to the guests.



Yes, exactly.  The flip side is that automagic stuff is sometimes 
unexpected and leads to breakage.  I'm not sure what the right thing is.


--
error compiling committee.c: too many arguments to function

Re: [Qemu-devel] [PULL] pci, vhost

2011-07-19 Thread Anthony Liguori


On 07/04/2011 12:01 PM, Michael S. Tsirkin wrote:

The following changes since commit 1dfdcaa83f9ce34aded8bc0669e81753d94f1b7d:

   user: Fix -d debug logging for usermode emulation (2011-06-28 20:57:09 +0200)

are available in the git repository at:
   git://git.kernel.org/pub/scm/linux/kernel/git/mst/qemu.git for_anthony


Pulled.  Thanks.

Regards,

Anthony Liguori



Anthony PERARD (1):
   hw/piix_pci.c: Fix PIIX3-xen to initialize ids

Michael S. Tsirkin (3):
   vhost: fix double free on device stop
   pci_ids: tweak names to match linux/pci_ids.h
   xen: move to new pci initializers

  hw/pci_ids.h  |3 ++-
  hw/piix_pci.c |3 +++
  hw/vhost.c|1 +
  hw/xen_platform.c |   15 +++
  4 files changed, 13 insertions(+), 9 deletions(-)

Re: [Qemu-devel] [PATCH 0/3][uq/master] Basic TSC-Scaling support v2

2011-07-19 Thread Marcelo Tosatti

On Thu, Jul 07, 2011 at 04:13:10PM +0200, Joerg Roedel wrote:
 Hi Avi, Marcelo,
 
 here is v2 of the patches to support setting the guests tsc-frequency
 from the qemu command line. This version addresses the comment from Avi
 on the first version. To reflect that units can be given to the
 frequency, the parameter was renamed from tsc_khz to tsc_freq.
 
 Thanks,
 
   Joerg

Applied to uq/master, thanks.

Re: [Qemu-devel] [PULL] virtio-serial: trace events, trivial fix

2011-07-19 Thread Anthony Liguori


On 07/07/2011 08:13 AM, Amit Shah wrote:

Hello,

This series adds some trace events to virtio-serial-bus.c and
virtio-console.c.  There's also one trivial patch to remove a trailing
\n from an error_report() string.

Note: some mirrors may not yet have received the update.


Pulled.  Thanks.

Regards,

Anthony Liguori




The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:

   pxa2xx_lcd: add proper rotation support (2011-07-04 22:12:21 +0200)

are available in the git repository at:
   git://git.kernel.org/pub/scm/virt/qemu/amit/virtio-serial.git for-anthony


Amit Shah (3):
   virtio-serial-bus: Add trace events
   virtio-console: Add some trace events
   virtio-serial-bus: Fix trailing \n in error_report string

  hw/virtio-console.c|9 -
  hw/virtio-serial-bus.c |9 -
  trace-events   |   11 +++
  3 files changed, 27 insertions(+), 2 deletions(-)

Re: [Qemu-devel] live snapshot wiki updated

2011-07-19 Thread Anthony Liguori


On 07/19/2011 09:30 AM, Jes Sorensen wrote:

On 07/19/11 16:24, Eric Blake wrote:

[adding the libvir-list]
On 07/19/2011 08:09 AM, Jes Sorensen wrote:

Urgh, libvirt parsing image files is really unfortunate, it really
doesn't give me warm fuzzy feelings :( libvirt really should not know
about internals of image formats.


But even if you add new features to qemu to avoid needing this in the
future, it doesn't change the past - libvirt will always have to know
how to parse image files understood by older qemu, and so as long as
libvirt already knows how to do that parsing, we might as well take
advantage of it.


What has been done here in the past is plain wrong. Continuing to do it
isn't the right thing to do here.


Besides, I feel that having a well-documented file format, so that
independent applications can both parse the same file with the same
semantics by obeying the file format specification, is a good design goal.


We all know that documentation is rarely up to date, new features may not
get added and libvirt will never be able to keep up. The driver for a
file format belongs in QEMU and nowhere else.



It would be nice if libvirt had a way to pass fds for every disk and
backing file up front; then, SELinux can work around the lack of NFS
per-file labelling by blocking open() in qemu.  In fact, this has
already been proposed:


A cleaner solution seems to be to have libvirt provide a callback allowing
QEMU to call out and have libvirt open a file descriptor instead. This
way libvirt can validate it and open it for QEMU and pass it back.
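
For reference, handing an already-open file descriptor from one process
to another is normally done with SCM_RIGHTS ancillary data on an AF_UNIX
socket.  The sketch below shows only that generic POSIX mechanism; the
helper names send_fd() and recv_fd() are made up for this example, and
nothing here is an existing libvirt or QEMU interface.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send one open descriptor over a connected AF_UNIX socket. */
static int send_fd(int sock, int fd)
{
    char byte = 0;                       /* at least one data byte is sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;            /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor; returns the new fd or -1 on failure. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In such a scheme the management side would do the open() under its own
policy and call something like send_fd(), while the emulator would only
ever call recv_fd(), so open() itself could stay blocked in the confined
process.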


Yes, that could probably be made to work with libvirt.


I am a little frustrated this approach wasn't taken up front instead of
the evil hack of having libvirt attempt to parse image files.


If we cannot do something like this, I would prefer that backing
files on NFS simply not be supported when running in an SELinux
setup.


As nice as that sentiment is, it will never fly, because it would be a
regression in current behavior.  The whole reason that the virt_use_nfs
SELinux bool exists is that some people are willing to make the partial
security tradeoff.  Besides, the use of sVirt via SELinux is more than
just open() protection - while the current virt_use_nfs bool makes NFS
less secure than otherwise possible, it still gives some nice guarantees
to the rest of the qemu process such as passthrough accesses to local
pci devices.


Well leaving things at status quo is not making it worse, it just leaves
an evil in place.


NFS and SELinux is a fundamental problem with SELinux and NFS.  We can 
piss and moan as much as we want about it but it's reality.  SELinux 
fundamentally requires extended attributes.  By the time NFS adds 
extended attribute support, we'll all be flying around in hover cars.


As terrible as NFS is, people use it all of the time.

It would be nice if libvirt had the ability to make better use of DAC to 
support isolation.  The fact that MAC is the only way you can do 
isolation between guests is pretty unfortunate.  If I could assign 
specific UIDs to a guest and use that to enforce isolation, it would go 
a long way toward solving this problem.


Regards,

Anthony Liguori

Re: [Qemu-devel] [PULL] usb patch queue

2011-07-19 Thread Anthony Liguori


On 07/08/2011 04:50 AM, Gerd Hoffmann wrote:

   Hi,

Here is the current usb patch queue.  Most noteworthy is the usb
companion controller support added.  There are also a bunch of bug
fixes, some from Hans which he found while doing the companion
controller work and some have been found in patch review.


Pulled.  Thanks.

Regards,

Anthony Liguori



please pull,
   Gerd

The following changes since commit 9312805d33e8b106bae356d13a8071fb37d75554:

   pxa2xx_lcd: add proper rotation support (2011-07-04 22:12:21 +0200)

are available in the git repository at:
   git://git.kraxel.org/qemu usb.19

Gerd Hoffmann (8):
   pci: add ich9 usb controller ids
   uhci: add ich9 controllers
   ehci: fix port count.
   ehci: add ich9 controller.
   usb: update documentation
   usb: fixup bluetooth descriptors
   usb-hub: remove unused descriptor arrays
   usb-ohci: raise interrupt on attach

Hans de Goede (13):
   usb: Add a usb_fill_port helper function
   usb: Move (initial) call of usb_port_location to usb_fill_port
   usb: Add a register_companion USB bus op.
   usb: Make port wakeup and complete ops take a USBPort instead of a Device
   usb: Replace device_destroy bus op with a child_detach port op
   usb-ehci: drop unused num-ports state member
   usb-ehci: Connect Status bit is read only, don't allow changing it by the guest
   usb-ehci: cleanup port reset handling
   usb: assert on calling usb_attach(port, NULL) on a port without a dev
   usb-ehci: Fix handling of PED and PEDC port status bits
   usb-ehci: Add support for registering companion controllers
   usb-uhci: Add support for being a companion controller
   usb-ohci: Add support for being a companion controller

Jes Sorensen (1):
   usb_register_port(): do not set port-opaque and port-index twice

Peter Maydell (1):
   hw/usb-musb.c: Don't misuse usb_packet_complete()

  docs/ich9-ehci-uhci.cfg |   37 +++
  docs/usb2.txt   |   33 +-
  hw/milkymist-softusb.c  |9 ++-
  hw/pci_ids.h|8 ++
  hw/usb-bt.c |   24 ++--
  hw/usb-bus.c|   46 +++-
  hw/usb-ehci.c   |  270 ++-
  hw/usb-hub.c|   90 +++-
  hw/usb-musb.c   |   24 +++--
  hw/usb-ohci.c   |   89 +++-
  hw/usb-uhci.c   |   95 +
  hw/usb.c|   13 +--
  hw/usb.h|   20 +++-
  13 files changed, 523 insertions(+), 235 deletions(-)
  create mode 100644 docs/ich9-ehci-uhci.cfg

Re: [Qemu-devel] [PULL] spice patch queue

2011-07-19 Thread Anthony Liguori


On 07/04/2011 10:14 AM, Gerd Hoffmann wrote:

   Hi,

Here is the spice patch queue with a bunch of small fixes and
improvements collected over time.  No major changes.

please pull,
   Gerd


Pulled.  Thanks.

Regards,

Anthony Liguori



Alon Levy (5):
   qxl: set mm_time in vga update
   qxl: interface_get_command: fix reported mode
   qxl-logger: add timestamp to command log
   qxl: add dev id to guest prints
   qxl: allow QXL_IO_LOG also in vga

Gerd Hoffmann (3):
   qxl: device id fixup
   spice: catch spice server initialization failures.
   qxl: put QXL_IO_UPDATE_IRQ into vgamode whitelist

Yonit Halperin (1):
   qxl: make sure primary surface is saved on migration

  hw/qxl-logger.c|4 +++-
  hw/qxl.c   |   50 ++
  ui/spice-core.c|5 -
  ui/spice-display.c |5 +
  4 files changed, 46 insertions(+), 18 deletions(-)

The following changes since commit 75ef849696830fc2ddeff8bb90eea5887ff50df6:

   esp: correctly fill bus id with requested lun (2011-07-02 18:50:19 +)

are available in the git repository at:
   git://anongit.freedesktop.org/spice/qemu spice.v38


Re: [Qemu-devel] [RFC v4 00/58] Memory API

2011-07-19 Thread Avi Kivity


On 07/19/2011 05:50 PM, Anthony Liguori wrote:




There's bits I don't like about the interface


Which bits are these?


Nothing I haven't already commented on.  I think there's too much in 
the generic level.  I don't think coalesced I/O belongs here.  It's a 
concept that doesn't fit.  I think a side-band API would be nicer.


Well, it's impossible to do it in a side band.  The point at which a 
range with coalesced mmio becomes exposed is completely orthogonal to 
programming the BAR register - it can happen, for example, due to another 
BAR being removed or the bridge window being programmed.  You can also 
have a coalesced mmio region being partially clipped.




Endianness also seems out of place.  There are many layers that can 
affect final endianness.  It depends on how devices handle endianness 
and also whether the bus modifies endianness.


That is handled naturally by the API.  Currently only leaves specify 
endianness, but in the future containers (=buses) would as well.




There are numerous devices that have a register that allows endianness 
to be toggled for the device.  That makes the actual endianness of the 
device dynamic which doesn't fit the memory region API very well IMHO.


static const MemoryRegionOps mydevice_ops = {
...
   .endianess_callback = mydevice_endianess,
...
};

Or

 memory_region_set_endianess(...);
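
A rough sketch of the callback variant, assuming a device whose control
register flips its byte order at run time.  The types and function names
below (DemoDevice, demo_device_endianness, demo_dispatch_read32) are
invented for this example and are not part of the proposed memory API;
the dispatch helper stands in for whatever the core would do on a
little-endian bus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { DEMO_LITTLE_ENDIAN, DEMO_BIG_ENDIAN } DemoEndian;

typedef struct {
    bool big_endian;     /* toggled by a guest-visible control register */
    uint32_t data;       /* register contents as the device model sees them */
} DemoDevice;

/* The callback: reports the device's current, dynamic endianness. */
static DemoEndian demo_device_endianness(const DemoDevice *d)
{
    assert(d != NULL);
    return d->big_endian ? DEMO_BIG_ENDIAN : DEMO_LITTLE_ENDIAN;
}

/* Portable 32-bit byte swap. */
static uint32_t demo_swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/*
 * Core-side dispatch for a 32-bit read on a little-endian bus: the
 * callback is consulted on every access, so a mode change takes effect
 * immediately without re-registering the region.
 */
static uint32_t demo_dispatch_read32(const DemoDevice *d)
{
    uint32_t v = d->data;

    if (demo_device_endianness(d) == DEMO_BIG_ENDIAN) {
        v = demo_swap32(v);
    }
    return v;
}
```

The trade-off against a setter such as the second option is that the
callback costs a call per access, while the setter requires the device to
notify the core every time the guest flips the mode.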






but I think it's a huge improvement over what we have now so I'm
inclined to commit once it includes documentation.



My problem is that to start leveraging it, everything must flow through
it. There are still several hundred call sites that are unconverted.


Really several hundred?  That surprises me.



$ git grep -w cpu_register_physical_memory | wc -l
222

$ git grep -w cpu_register_io_memory | wc -l
233

$ git grep -w qemu_ram_alloc | wc -l
113

$ git grep  memory_region_init | wc -l
134

--
error compiling committee.c: too many arguments to function
