[Qemu-devel] [PATCH 0/4] Add virtio disk identification support
This patch adds the final missing bits for support of passing a serial/id string to a virtio-blk guest driver. The guest-side component already exists in the virtio driver, and has recently been reworked by Ryan to export a /sys interface for retrival of the id from guest userland. Signed-off-by: john cooperjohn.coo...@redhat.com --- diff --git a/hw/virtio-blk.c b/hw/virtio-blk.c index 0bf929a..bd6b896 100644 --- a/hw/virtio-blk.c +++ b/hw/virtio-blk.c @@ -26,6 +26,7 @@ typedef struct VirtIOBlock QEMUBH *bh; BlockConf *conf; unsigned short sector_mask; +char sn[BLOCK_SERIAL_STRLEN]; } VirtIOBlock; static VirtIOBlock *to_virtio_blk(VirtIODevice *vdev) @@ -324,6 +325,12 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req, virtio_blk_handle_flush(req, mrb); } else if (req-out-type VIRTIO_BLK_T_SCSI_CMD) { virtio_blk_handle_scsi(req); +} else if (req-out-type VIRTIO_BLK_T_GET_ID) { +VirtIOBlock *s = req-dev; + +memcpy(req-elem.in_sg[0].iov_base, s-sn, + MIN(req-elem.in_sg[0].iov_len, sizeof(s-sn))); +virtio_blk_req_complete(req, VIRTIO_BLK_S_OK); } else if (req-out-type VIRTIO_BLK_T_OUT) { qemu_iovec_init_external(req-qiov, req-elem.out_sg[1], req-elem.out_num - 1); @@ -495,6 +502,12 @@ VirtIODevice *virtio_blk_init(DeviceState *dev, BlockConf *conf) s-sector_mask = (s-conf-logical_block_size / BDRV_SECTOR_SIZE) - 1; bdrv_guess_geometry(s-bs, cylinders, heads, secs); +/* NB: per existing s/n string convention we really intend to hard-limit + * the copy length to sizeof (s-sn) even in the case we're left without + * a trailing '\0' + */ +strncpy(s-sn, drive_get_serial(s-bs), sizeof (s-sn)); + s-vq = virtio_add_queue(s-vdev, 128, virtio_blk_handle_output); qemu_add_vm_change_state_handler(virtio_blk_dma_restart_cb, s); diff --git a/hw/virtio-blk.h b/hw/virtio-blk.h index 7a7ece3..fff46da 100644 --- a/hw/virtio-blk.h +++ b/hw/virtio-blk.h @@ -59,6 +59,9 @@ struct virtio_blk_config /* Flush the volatile write cache */ #define VIRTIO_BLK_T_FLUSH 4 +/* return the device ID string */ +#define VIRTIO_BLK_T_GET_ID 8 + /* Barrier before this op. */ #define VIRTIO_BLK_T_BARRIER0x8000 -- john.coo...@redhat.com
Re: [Qemu-devel] [PATCH 0/4] Add virtio disk identification support
Subject is confusing: suggests a series of four parts. john cooper john.coo...@redhat.com writes: This patch adds the final missing bits for support of passing a serial/id string to a virtio-blk guest driver. The guest-side component already exists in the virtio driver, and has recently been reworked by Ryan to export a /sys interface for retrival of the id from guest userland. Signed-off-by: john cooperjohn.coo...@redhat.com Looks like this conflicts with my PATCH v3 00/13] More block-related fixes and cleanups. Perhaps the easiest way to get this in would be to rebase to Kevin's block branch, which already has my series, and cc the respin to him. See ide_dev_initfn() and scsi_initfn() there. Kevin's tree: git://repo.or.cz/qemu/kevin.git
Re: [Qemu-devel] [PATCH 0/4] Add virtio disk identification support
Markus Armbruster wrote: Subject is confusing: suggests a series of four parts. Sorry. My bad for recycling old mail. Looks like this conflicts with my PATCH v3 00/13] More block-related fixes and cleanups. Perhaps the easiest way to get this in would be to rebase to Kevin's block branch, which already has my series, and cc the respin to him. See ide_dev_initfn() and scsi_initfn() there. Kevin's tree: git://repo.or.cz/qemu/kevin.git Will do. -john -- john.coo...@third-harmonic.com
[Qemu-devel] [Bug 595438] Re: KVM segmentation fault, using SCSI+writeback and linux 2.4 guest
(gdb) run Starting program: /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback [Thread debugging using libthread_db enabled] [New Thread 0xb7145b70 (LWP 4715)] pci_add_option_rom: failed to find romfile pxe-rtl8139.bin [New Thread 0xa54c4b70 (LWP 4747)] scsi-disk: Tag 0x0 already in use Program received signal SIGSEGV, Segmentation fault. 0x08468b10 in ?? () (gdb) bt #0 0x08468b10 in ?? () #1 0x080f0ef6 in ?? () #2 0x080d4cf9 in ?? () #3 0x080c470f in ?? () #4 0x080c47c7 in ?? () #5 0x08052266 in ?? () #6 0x0806dcc4 in ?? () #7 0x08055465 in ?? () #8 0xb7a42bd6 in __libc_start_main () from /lib/tls/i686/cmov/libc.so.6 #9 0x0804ec51 in ?? () (gdb) mma...@mmarkk-work:~/src/KVM$ pmap 4712 4712: /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback 08048000 2064K r-x-- /usr/bin/qemu-system-x86_64 0824c000 4K r /usr/bin/qemu-system-x86_64 0824d000 76K rw--- /usr/bin/qemu-system-x86_64 0826 3408K rw---[ anon ] 085b4000 64K rw---[ anon ] It seems, that 0x0804ec51 is in anonymous memory, does not it ? I will try to use debug libraries. Please write how I can help more. -- KVM segmentation fault, using SCSI+writeback and linux 2.4 guest https://bugs.launchpad.net/bugs/595438 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: New Bug description: I Use Ubuntu 32 bit 10.04 with standard KVM. I have Intel E7600 @ 3.06GHz processor with VMX In this system I Run: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -name spamsender -uuid b9cacd5e-08f7-41fd-78c8-89cec59af881 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/spamsender.monitor,server,nowait -monitor chardev:monitor -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback -net nic,macaddr=00:00:00:00:00:00,vlan=0,name=nic.0 -net tap,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus .iso image contain custom distro of 2.4-linux kernel based system. During install process (when .tar.gz actively unpacked), kvm dead with segmentation fault. And ONLY when I choose scsi virtual disk and writeback simultaneously. But, writeback+ide, writethrough+scsi works OK. I use qcow2. It seems, that qcow does not have such problems. Virtual machine get down at random time during file copy. It seems, when qcow2 file size need to be expanded.
[Qemu-devel] [Bug 595438] Re: KVM segmentation fault, using SCSI+writeback and linux 2.4 guest
scsi-disk: Tag 0x0 already in use maybe problem here? -- KVM segmentation fault, using SCSI+writeback and linux 2.4 guest https://bugs.launchpad.net/bugs/595438 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: New Bug description: I Use Ubuntu 32 bit 10.04 with standard KVM. I have Intel E7600 @ 3.06GHz processor with VMX In this system I Run: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -name spamsender -uuid b9cacd5e-08f7-41fd-78c8-89cec59af881 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/spamsender.monitor,server,nowait -monitor chardev:monitor -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback -net nic,macaddr=00:00:00:00:00:00,vlan=0,name=nic.0 -net tap,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus .iso image contain custom distro of 2.4-linux kernel based system. During install process (when .tar.gz actively unpacked), kvm dead with segmentation fault. And ONLY when I choose scsi virtual disk and writeback simultaneously. But, writeback+ide, writethrough+scsi works OK. I use qcow2. It seems, that qcow does not have such problems. Virtual machine get down at random time during file copy. It seems, when qcow2 file size need to be expanded.
[Qemu-devel] [Bug 595438] Re: KVM segmentation fault, using SCSI+writeback and linux 2.4 guest
sudo apt-get install libaio1-dbg libcomerr2-dbg libdbus-glib-1-2-dbg libgcrypt11-dbg keyutils-dbg libncurses5-dbg zlib1g-dbg libc6-dbg libcurl3-dbg libdirectfb-1.2-0-dbg libgnutls26-dbg libkrb5-dbg libice6-dbg libldap-2.4-2-dbg libogg-dbg libpulse0-dbg gsasl-dbg libsm6-dbg libtasn1-3-dbg libx11-6-dbg libxau6-dbg libxcb1-dbg libxdmcp6-dbg libxext6-dbg libxi6-dbg libxtst6-dbg libc-dbg is this sufficient ? -- KVM segmentation fault, using SCSI+writeback and linux 2.4 guest https://bugs.launchpad.net/bugs/595438 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: New Bug description: I Use Ubuntu 32 bit 10.04 with standard KVM. I have Intel E7600 @ 3.06GHz processor with VMX In this system I Run: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -name spamsender -uuid b9cacd5e-08f7-41fd-78c8-89cec59af881 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/spamsender.monitor,server,nowait -monitor chardev:monitor -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback -net nic,macaddr=00:00:00:00:00:00,vlan=0,name=nic.0 -net tap,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus .iso image contain custom distro of 2.4-linux kernel based system. During install process (when .tar.gz actively unpacked), kvm dead with segmentation fault. And ONLY when I choose scsi virtual disk and writeback simultaneously. But, writeback+ide, writethrough+scsi works OK. I use qcow2. It seems, that qcow does not have such problems. Virtual machine get down at random time during file copy. It seems, when qcow2 file size need to be expanded.
[Qemu-devel] Re: Status update
On Thu, Jul 1, 2010 at 8:30 PM, Eduard - Gabriel Munteanu eduard.munte...@linux360.ro wrote: On Wed, Jun 30, 2010 at 09:37:31AM +0100, Stefan Hajnoczi wrote: On Tue, Jun 29, 2010 at 6:25 PM, Eduard - Gabriel Munteanu eduard.munte...@linux360.ro wrote: On the other hand, we could just leave it alone for now. Changing mappings during DMA is stupid anyway: I don't think the guest can recover the results of DMA safely, even though it might be used on transfers in progress you simply don't care about anymore. Paul Brook suggested we could update the cpu_physical_memory_map() mappings somehow, but I think that's kinda difficult to accomplish. A malicious or broken guest shouldn't be able to crash or corrupt QEMU process memory. The IOMMU can only map from bus addresses to guest physical RAM (?) so the worst the guest can do here is corrupt itself? That's true, but it's fair to be concerned about the guest itself. Imagine it runs some possibly malicious apps which program the hardware to do DMA. That should be safe when a IOMMU is present. But suddenly the guest OS changes mappings and expects the IOMMU to enforce them as soon as invalidation commands are completed. The guest then reclaims the old space for other uses. This leaves an opportunity for those processes to corrupt or read sensitive data. As long as QEMU acts in the same way as real hardware we should be okay. Will real hardware change the mappings immediately and abort the DMA from the device if it tries to access an invalidated address? Stefan
[Qemu-devel] Re: [PATCH] device-assignment: Rework name of assigned pci device
[cc: kraxel] Hidetoshi Seto seto.hideto...@jp.fujitsu.com writes: (2010/06/30 15:53), Markus Armbruster wrote: Summary: upstream qemu commit b560a9ab broke -pcidevice and pci_add host in two ways: * Use without options id and name is broken when option host contains ':'. That's because id defaults to host. I recommend to fix it incompatibly: don't default id to host. The alternative is to get upstream qemu to accept ':' in qdev IDs again. * Funny characters in option name no longer work. Same as funny characters in id options all over the place. Intentional change. I recommend to do nothing. Thanks a lot. I'm not a person in really need, so now I'm going to follow your recommendation. Details inline. Hidetoshi Seto seto.hideto...@jp.fujitsu.com writes: Thanks Markus, (2010/06/29 14:28), Markus Armbruster wrote: Hidetoshi Seto seto.hideto...@jp.fujitsu.com writes: Hao, Xudong xudong@intel.com writes: When assign one PCI device, qemu fail to parse the command line: qemu-system_x86 -smp 2 -m 1024 -hda /path/to/img -pcidevice host=00:19.0 Error: qemu-system-x86_64: Parameter 'id' expects an identifier Identifiers consist of letters, digits, '-', '.', '_', starting with a letter. pcidevice argument parse error; please check the help text for usage Could not add assigned device host=00:19.0 https://bugs.launchpad.net/qemu/+bug/597932 This issue caused by qemu-kvm commit b560a9ab9be06afcbb78b3791ab836dad208a239. This patch is a response to the above report. Thanks, H.Seto = Because use of some characters in id is restricted recently, assigned device start to fail having implicit id that uses address string of host device, like 00:19.0 which includes restricted character ':'. It seems that this implicit id is intended to be run as name that could be passed with option -pcidevice ... ,name=... to specify a string to be used in log outputs. In other words it seems that dev-dev.qdev.id of assigned device had been used to have such name, that is user-defined string or address string of host. As far as I can tell, option name is just a leftover from pre-qdev days, kept for compatibility. Yea, I see. I don't know well about the history of such pre-qdev days... It's often useful to examine history to figure out what a piece of code attempts to accomplish. git-blame and git-log are your friends :) I often play with git-log, however I have a little trouble here since qemu tree is too young. The problem is that name for specific use is not equal to id for universal use. So it is better to remove this tricky mix up here. This patch introduces new function assigned_dev_name() that returns proper name string for the device. Now property name is explicitly defined in struct AssignedDevice. When if the device have neither name nor id, address string like :00:19.0 will be created and passed instead. Once created, new field r_name holds the string to be reused and to be released later. [...] @@ -1520,6 +1545,7 @@ static PCIDeviceInfo assign_info = { DEFINE_PROP(host, AssignedDevice, host, qdev_prop_hostaddr, PCIHostDevice), DEFINE_PROP_UINT32(iommu, AssignedDevice, use_iommu, 1), DEFINE_PROP_STRING(configfd, AssignedDevice, configfd_name), +DEFINE_PROP_STRING(name, AssignedDevice, u_name), DEFINE_PROP_END_OF_LIST(), }, }; @@ -1545,24 +1571,25 @@ device_init(assign_register_devices) QemuOpts *add_assigned_device(const char *arg) { QemuOpts *opts = NULL; -char host[64], id[64], dma[8]; +char host[64], buf[64], dma[8]; int r; +/* host must be with -pcidevice */ r = get_param_value(host, sizeof(host), host, arg); if (!r) goto bad; -r = get_param_value(id, sizeof(id), id, arg); -if (!r) -r = get_param_value(id, sizeof(id), name, arg); -if (!r) -r = get_param_value(id, sizeof(id), host, arg); -opts = qemu_opts_create(qemu_device_opts, id, 0); +opts = qemu_opts_create(qemu_device_opts, NULL, 0); I think you break option id here. Its value must become the qdev ID, visible in info qtree and usable as argument to device_del. And if option id is missing, option name must become the qdev ID, for backwards compatibility. Ah, I missed to check hot-add path - I had wonder why id could be here since I could not find documents that mentions use of id with -pcidevice. So, I should use id here if specified. That's right, But in case if id is missing and name is specified, I think there is no reason that the characters in name should be restricted in same manner with that in id. In case that there is neither id nor name but host (in fact it is original case) now ID BB:DD.F (like 00:19.0) will be rejected (because it starts with digit, plus it uses restricted ':'). In other words, your change already breaks backwards compatibility because it
Re: [Qemu-devel] [PATCH 1/2] block: Fix too early free in multiwrite
On Thu, Jul 1, 2010 at 3:31 PM, Kevin Wolf kw...@redhat.com wrote: bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). If num_requests doesn't have its final value yet, multiwrite_cb will falsely detect that all requests are completed and frees the mcb. However, the mcb is still used by other requests that are started only afterwards. When all requests are completed, it is freed for the second time. Fix this by setting the right num_requests from the beginning. Looks good to me. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 6 ++ 1 files changed, 2 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index c40dd2c..9719649 100644 --- a/block.c +++ b/block.c @@ -2198,6 +2198,7 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); // Run the aio requests + mcb-num_requests = num_reqs; for (i = 0; i num_reqs; i++) { acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, reqs[i].nb_sectors, multiwrite_cb, mcb); @@ -2206,16 +2207,13 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // We can only fail the whole thing if no request has been // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. - if (mcb-num_requests == 0) { + if (i == 0) { reqs[i].error = -EIO; goto fail; } else { - mcb-num_requests++; multiwrite_cb(mcb, -EIO); break; } - } else { - mcb-num_requests++; } } -- 1.6.6.1 Stefan
Re: [Qemu-devel] [PATCH 2/2] block: Handle multiwrite errors only when all requests have completed
On Thu, Jul 1, 2010 at 3:31 PM, Kevin Wolf kw...@redhat.com wrote: Don't try to be clever by freeing all temporary data and calling all callbacks when the return value (an error) is certain. Doing so has at least two important problems: * The temporary data that is freed (qiov, possibly zero buffer) is still used by the requests that have not yet completed. * Calling the callbacks for all requests in the multiwrite means for the caller that it may free buffers etc. which are still in use. Just remember the error value and do the cleanup when all requests have completed. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 5 + 1 files changed, 1 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 9719649..bff7d5a 100644 --- a/block.c +++ b/block.c @@ -2056,14 +2056,11 @@ static void multiwrite_cb(void *opaque, int ret) if (ret 0 !mcb-error) { mcb-error = ret; - multiwrite_user_cb(mcb); } mcb-num_requests--; if (mcb-num_requests == 0) { - if (mcb-error == 0) { - multiwrite_user_cb(mcb); - } + multiwrite_user_cb(mcb); I am a little confused at this point. Here is the meat of bdrv_aio_multiwrite() from the kevin/devel branch: // Run the aio requests mcb-num_requests = num_reqs; for (i = 0; i num_reqs; i++) { acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, reqs[i].nb_sectors, multiwrite_cb, mcb); if (acb == NULL) { // We can only fail the whole thing if no request has been // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. if (i == 0) { reqs[i].error = -EIO; goto fail; } else { multiwrite_cb(mcb, -EIO); break; } } } If bdrv_aio_write() fails on request 2 out of 3, then mcb-num_requests = 3 but only 2 requests were issued before breaking the loop. Now the 2 issued requests complete and call multiwrite_cb(). Since was mcb-num_requests = 3, it never reaches zero and neither multiwrite_user_cb(mcb) nor qemu_free(mcb) are ever called. Is it safe to adjust mcb-num_requests = i before calling multiwrite_cb() and breaking the loop in the failure case? That way the num-num_requests will reach zero. I hesitate a little because previous requests could have completed, multiwrite_cb() was called, and mcb-num_requests was decremented? Then the value of i cannot be used for mcb-num_requests because previous requests it counts have already completed. Stefan
Re: [Qemu-devel] [PATCH] win32: Add missing function setenv
On 07/01/10 19:53, Stefan Weil wrote: Two patches are needed anyway. For reasons of economy, I won't send a new patch. Feel free do send one which meets your criteria. Regards, Stefan Well if you are not interested in working the way the community works, please don't expect us to fix your bugs either. Jes
[Qemu-devel] Re: [PATCH] device-assignment: Rework name of assigned pci device
Hi, As far as I can tell, name predates the qdev conversion, and was used just for error messages and such. Yes, was already there when I touched the code the first time. It defaulted to host. When Gerd did the qdev conversion, he made id default to name, then host. See commit 6b5bbd04. Defaulting id that way was probably not such a good idea. We generally don't make up qdev IDs, because that risks collision with user-specified IDs. Since we've broken compatibility already, I figure we could just as well stop defaulting id to host. When we need to identify the device to the user, use id if it exists, else its PCI address. Agree, we should not make up defaults for 'id'. I did that in a few places where some naming existed already to ease transition. nics with name= used to get that as default id too. But in the end it turned out it caused more trouble than it helped ... cheers, Gerd
Re: [Qemu-devel] [PATCH 1/2] block: Fix too early free in multiwrite
On Thu, Jul 01, 2010 at 04:31:57PM +0200, Kevin Wolf wrote: bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). If num_requests doesn't have its final value yet, multiwrite_cb will falsely detect that all requests are completed and frees the mcb. However, the mcb is still used by other requests that are started only afterwards. When all requests are completed, it is freed for the second time. Fix this by setting the right num_requests from the beginning. Looks good, Reviewed-by: Christoph Hellwig h...@lst.de
Re: [Qemu-devel] [PATCH 2/2] block: Handle multiwrite errors only when all requests have completed
On Thu, Jul 01, 2010 at 04:31:58PM +0200, Kevin Wolf wrote: Don't try to be clever by freeing all temporary data and calling all callbacks when the return value (an error) is certain. Doing so has at least two important problems: * The temporary data that is freed (qiov, possibly zero buffer) is still used by the requests that have not yet completed. * Calling the callbacks for all requests in the multiwrite means for the caller that it may free buffers etc. which are still in use. Just remember the error value and do the cleanup when all requests have completed. Looks good.
Re: [Qemu-devel] Re: Status update
On Fri, Jul 02, 2010 at 09:03:39AM +0100, Stefan Hajnoczi wrote: On Thu, Jul 1, 2010 at 8:30 PM, Eduard - Gabriel Munteanu eduard.munte...@linux360.ro wrote: On Wed, Jun 30, 2010 at 09:37:31AM +0100, Stefan Hajnoczi wrote: On Tue, Jun 29, 2010 at 6:25 PM, Eduard - Gabriel Munteanu eduard.munte...@linux360.ro wrote: On the other hand, we could just leave it alone for now. Changing mappings during DMA is stupid anyway: I don't think the guest can recover the results of DMA safely, even though it might be used on transfers in progress you simply don't care about anymore. Paul Brook suggested we could update the cpu_physical_memory_map() mappings somehow, but I think that's kinda difficult to accomplish. A malicious or broken guest shouldn't be able to crash or corrupt QEMU process memory. ?The IOMMU can only map from bus addresses to guest physical RAM (?) so the worst the guest can do here is corrupt itself? That's true, but it's fair to be concerned about the guest itself. Imagine it runs some possibly malicious apps which program the hardware to do DMA. That should be safe when a IOMMU is present. But suddenly the guest OS changes mappings and expects the IOMMU to enforce them as soon as invalidation commands are completed. The guest then reclaims the old space for other uses. This leaves an opportunity for those processes to corrupt or read sensitive data. In such a case, OS should put device into quiescence by reset like pci bus reset or pcie function level reset. pci bus reset patch hasn't been merged yet, though. It needs clean up/generalization. As long as QEMU acts in the same way as real hardware we should be okay. Will real hardware change the mappings immediately and abort the DMA from the device if it tries to access an invalidated address? Stefan -- yamahata
[Qemu-devel] [Bug 595438] Re: KVM segmentation fault, using SCSI+writeback and linux 2.4 guest
Yeah. I have compile non-stripped kvm binary. (gdb) bt #0 0x0852cd88 in ?? () #1 0x080f0f16 in scsi_command_complete (r=0x86252d8, status=value optimized out, sense=value optimized out) at /home/mmarkk/src/KVM/qemu-kvm-0.12.3+noroms/hw/scsi-disk.c:105 #2 0x080d4d19 in qcow_aio_write_cb (opaque=0x85e68b8, ret=0) at block/qcow2.c:631 #3 0x080c472f in posix_aio_process_queue (opaque=0x846bd98) at posix-aio-compat.c:460 #4 0x080c47e7 in posix_aio_read (opaque=0x846bd98) at posix-aio-compat.c:501 #5 0x08052266 in main_loop_wait (timeout=1000) at /home/mmarkk/src/KVM/qemu-kvm-0.12.3+noroms/vl.c:3999 #6 0x0806dcc4 in kvm_main_loop () at /home/mmarkk/src/KVM/qemu-kvm-0.12.3+noroms/qemu-kvm.c:2122 #7 0x08055465 in main_loop (argc=14, argv=0xb3e4, envp=0xb420) at /home/mmarkk/src/KVM/qemu-kvm-0.12.3+noroms/vl.c:4210 #8 main (argc=14, argv=0xb3e4, envp=0xb420) at /home/mmarkk/src/KVM/qemu-kvm-0.12.3+noroms/vl.c:6238 -- KVM segmentation fault, using SCSI+writeback and linux 2.4 guest https://bugs.launchpad.net/bugs/595438 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: New Bug description: I Use Ubuntu 32 bit 10.04 with standard KVM. I have Intel E7600 @ 3.06GHz processor with VMX In this system I Run: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -name spamsender -uuid b9cacd5e-08f7-41fd-78c8-89cec59af881 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/spamsender.monitor,server,nowait -monitor chardev:monitor -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback -net nic,macaddr=00:00:00:00:00:00,vlan=0,name=nic.0 -net tap,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus .iso image contain custom distro of 2.4-linux kernel based system. During install process (when .tar.gz actively unpacked), kvm dead with segmentation fault. And ONLY when I choose scsi virtual disk and writeback simultaneously. But, writeback+ide, writethrough+scsi works OK. I use qcow2. It seems, that qcow does not have such problems. Virtual machine get down at random time during file copy. It seems, when qcow2 file size need to be expanded.
[Qemu-devel] [Bug 595438] Re: KVM segmentation fault, using SCSI+writeback and linux 2.4 guest
/* Helper function for command completion. */ static void scsi_command_complete(SCSIDiskReq *r, int status, int sense) { DPRINTF(Command complete tag=0x%x status=%d sense=%d\n, r-req.tag, status, sense); scsi_req_set_status(r-req, status, sense); scsi_req_complete(r-req); // - this is line #105 in my sources. scsi_remove_request(r); } What to do next? -- KVM segmentation fault, using SCSI+writeback and linux 2.4 guest https://bugs.launchpad.net/bugs/595438 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: New Bug description: I Use Ubuntu 32 bit 10.04 with standard KVM. I have Intel E7600 @ 3.06GHz processor with VMX In this system I Run: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -name spamsender -uuid b9cacd5e-08f7-41fd-78c8-89cec59af881 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/spamsender.monitor,server,nowait -monitor chardev:monitor -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback -net nic,macaddr=00:00:00:00:00:00,vlan=0,name=nic.0 -net tap,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus .iso image contain custom distro of 2.4-linux kernel based system. During install process (when .tar.gz actively unpacked), kvm dead with segmentation fault. And ONLY when I choose scsi virtual disk and writeback simultaneously. But, writeback+ide, writethrough+scsi works OK. I use qcow2. It seems, that qcow does not have such problems. Virtual machine get down at random time during file copy. It seems, when qcow2 file size need to be expanded.
Re: [Qemu-devel] [PATCH] ARM v4t/arm920t support
On Thu, Jul 01, 2010 at 04:35:53PM -0500, Rob Landley wrote: I just confirmed that Vincent Sanders' patch (which he posted on May 29, 2009, and again on November 27, 2009) still applies to (and works with )current qemu-git. It adds a -cpu arm920t option to qemu-system-arm which boots a Linux kernel configured with CONFIG_CPU_ARM920T=y, which isn't possible without this patch. Here is the patch again. There may be more work to be done on top of this, but this patch staying out of tree hasn't noticeably accelerated that work in the past year and change. Could it please be merged? Rob, while I apreciate you digging this back up and having another go, you do not need to explicitly copy me in ;-) Please feel free to take and run with these changes any way you see fit (you may interpret that as me giving you copyright, whatever), just do not include me in the madness! I have previously stated I have given up trying to do anything whatsoever for QEMU because Paul seems to think that adding v4t support means I should clean up and differentiate V5 support etc. Not to mention that it took two *years* of pain and agro to get Paul to actually say thats what he wanted in which time I have moved on and the nine Samsumng platforms I wanted to submit support for are now yestardays news and the emulations have bitrotted to death. -- Regards Vincent
Re: [Qemu-devel] Tracing: outstanding tasks
On Fri, Jul 2, 2010 at 5:23 AM, Ananth N Mavinakayanahalli ana...@in.ibm.com wrote: On Wed, Jun 30, 2010 at 12:51:59PM +0100, Stefan Hajnoczi wrote: On Wed, Jun 30, 2010 at 11:20 AM, Prerna Saxena pre...@linux.vnet.ibm.com wrote: On 06/26/2010 01:36 PM, Stefan Hajnoczi wrote: Here are the outstanding tasks for QEMU tracing, which Prerna and I have been working on. Tracing aids debugging, profiling, and observing execution via lightweight logging at key points in the code path. ... 5. Out-of-line trace file write-out Owner: Stefan Trace buffers are written out to file synchronously. The vcpu thread should not be blocked so an async write-out mechanism is needed. 6. Trace file command Owner: ? Traces are written out to hardcoded /tmp/trace.log. This must be configurable. Tracing at startup time should still be possible so configuration needs to happen early. Another feature needed would be a facility to trigger a sync of trace buffer to file even if the buffer isn't full. Good idea. A related piece that is not in the current tracing branch is syncing to trace buffer to file on exit. Right now a partially filled trace buffer is not written when QEMU exits. Stefan
[Qemu-devel] [Bug 595438] Re: KVM segmentation fault, using SCSI+writeback and linux 2.4 guest
void scsi_req_complete(SCSIRequest *req) { assert(req-status != -1); req-bus-complete(req-bus, SCSI_REASON_DONE, req-tag, req-status); } (gdb) bt 1 #0 0x0852cd88 in ?? () (More stack frames follow...) (gdb) frame 1 #1 0x080f0f16 in scsi_command_complete (r=0x86252d8, status=value optimized out, sense=value optimized out) at /home/mmarkk/src/KVM/qemu-kvm-0.12.3+noroms/hw/scsi-disk.c:105 105 scsi_req_complete(r-req); (gdb) print r-req-bus-complete $7 = (scsi_completionfn) 0x852cd88 (gdb) disassemble 0x852cd88 No function contains specified address. Corrupted memory ? -- KVM segmentation fault, using SCSI+writeback and linux 2.4 guest https://bugs.launchpad.net/bugs/595438 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: New Bug description: I Use Ubuntu 32 bit 10.04 with standard KVM. I have Intel E7600 @ 3.06GHz processor with VMX In this system I Run: LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/sbin:/sbin:/bin QEMU_AUDIO_DRV=none /usr/bin/kvm -M pc-0.12 -enable-kvm -m 256 -smp 1 -name spamsender -uuid b9cacd5e-08f7-41fd-78c8-89cec59af881 -chardev socket,id=monitor,path=/var/lib/libvirt/qemu/spamsender.monitor,server,nowait -monitor chardev:monitor -boot d -drive file=/mnt/megadiff/cdiso_400_130.iso,if=ide,media=cdrom,index=2 -drive file=/home/mmarkk/spamsender2.img,if=scsi,index=0,format=qcow2,cache=writeback -net nic,macaddr=00:00:00:00:00:00,vlan=0,name=nic.0 -net tap,vlan=0,name=tap.0 -chardev pty,id=serial0 -serial chardev:serial0 -parallel none -usb -vnc 127.0.0.1:0 -vga cirrus .iso image contain custom distro of 2.4-linux kernel based system. During install process (when .tar.gz actively unpacked), kvm dead with segmentation fault. And ONLY when I choose scsi virtual disk and writeback simultaneously. But, writeback+ide, writethrough+scsi works OK. I use qcow2. It seems, that qcow does not have such problems. Virtual machine get down at random time during file copy. It seems, when qcow2 file size need to be expanded.
[Qemu-devel] [PATCH] Makefile: Fix compilation for non-standard host kernel path
Set up host kernel include paths specified by --kerneldir When host kernel headers are placed in non-standard paths, the KVM_CFLAGS are presently invoked only for a few .c files (kvm*.c,vhost*.c) and not for other files like machine.c, cpus.c ..etc which also depend on linux/kvm.h Signed-off-by: Prerna Saxena pre...@linux.vnet.ibm.com --- Makefile.target |7 +-- 1 files changed, 5 insertions(+), 2 deletions(-) diff --git a/Makefile.target b/Makefile.target index d58b201..b433112 100644 --- a/Makefile.target +++ b/Makefile.target @@ -29,12 +29,15 @@ QEMU_PROG=qemu-system-$(TARGET_ARCH2)$(EXESUF) endif endif +# Set up host kernel include paths specified by --kerneldir +ifdef CONFIG_KVM +QEMU_CFLAGS+=$(KVM_CFLAGS) +endif + PROGS=$(QEMU_PROG) LIBS+=-lm -kvm.o kvm-all.o vhost.o vhost_net.o: QEMU_CFLAGS+=$(KVM_CFLAGS) - config-target.h: config-target.h-timestamp config-target.h-timestamp: config-target.mak -- 1.6.2.5 -- Prerna Saxena Linux Technology Centre, IBM Systems and Technology Lab, Bangalore, India
[Qemu-devel] [PATCH v2 0/2] block: Fix multiwrite error handling
The bdrv_aio_multiwrite error handling has some bugs that lead to premature cleanup, causing use-after-free and double free problems. v2: - Completely replaced patch 1 which Stefan found to be incorrect (thanks for the good review!). Hope I've got it right this time. Kevin Wolf (2): block: Fix early failure in multiwrite block: Handle multiwrite errors only when all requests have completed block.c | 40 ++-- 1 files changed, 30 insertions(+), 10 deletions(-)
[Qemu-devel] [PATCH v2 2/2] block: Handle multiwrite errors only when all requests have completed
Don't try to be clever by freeing all temporary data and calling all callbacks when the return value (an error) is certain. Doing so has at least two important problems: * The temporary data that is freed (qiov, possibly zero buffer) is still used by the requests that have not yet completed. * Calling the callbacks for all requests in the multiwrite means for the caller that it may free buffers etc. which are still in use. Just remember the error value and do the cleanup when all requests have completed. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c |5 + 1 files changed, 1 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index e65971c..dd6dd76 100644 --- a/block.c +++ b/block.c @@ -2042,14 +2042,11 @@ static void multiwrite_cb(void *opaque, int ret) if (ret 0 !mcb-error) { mcb-error = ret; -multiwrite_user_cb(mcb); } mcb-num_requests--; if (mcb-num_requests == 0) { -if (mcb-error == 0) { -multiwrite_user_cb(mcb); -} +multiwrite_user_cb(mcb); qemu_free(mcb); } } -- 1.6.6.1
[Qemu-devel] [PATCH v2 1/2] block: Fix early failure in multiwrite
bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). Current code doesn't consider this. For details see the comment added by this patch. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 35 +-- 1 files changed, 29 insertions(+), 6 deletions(-) diff --git a/block.c b/block.c index 9176dec..e65971c 100644 --- a/block.c +++ b/block.c @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // Check for mergable requests num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); -// Run the aio requests +/* + * Run the aio requests. As soon as one request can't be submitted + * successfully, fail all requests that are not yet submitted (we must + * return failure for all requests anyway) + * + * num_requests cannot be set to the right value immediately: If + * bdrv_aio_writev fails for some request, num_requests would be too high + * and therefore multiwrite_cb() would never recognize the multiwrite + * request as completed. We also cannot use the loop variable i to set it + * when the first request fails because the callback may already have been + * called for previously submitted requests. Thus, num_requests must be + * incremented for each request that is submitted. + * + * The problem that callbacks may be called early also means that we need + * to take care that num_requests doesn't become 0 before all requests are + * submitted - multiwrite_cb() would consider the multiwrite request + * completed. A dummy request that is completed by a manual call to + * multiwrite_cb() takes care of this. + */ +mcb-num_requests = 1; + for (i = 0; i num_reqs; i++) { +mcb-num_requests++; acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, reqs[i].nb_sectors, multiwrite_cb, mcb); @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // We can only fail the whole thing if no request has been // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. -if (mcb-num_requests == 0) { -reqs[i].error = -EIO; +if (i == 0) { goto fail; } else { -mcb-num_requests++; multiwrite_cb(mcb, -EIO); break; } -} else { -mcb-num_requests++; } } +/* Complete the dummy request */ +multiwrite_cb(mcb, 0); + return 0; fail: +for (i = 0; i mcb-num_callbacks; i++) { +reqs[i].error = -EIO; +} qemu_free(mcb); return -1; } -- 1.6.6.1
[Qemu-devel] slow ext4 O_SYNC writes (why qemu qcow2 is so slow on ext4 vs ext3)
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hello. I noticed that qcow2 images, esp. fresh ones (so that they receive lots of metadata updates) are very slow on my machine. And on IRC (#kvm), Sheldon Hearn found that on ext3, it is fast again. So I tested different combinations for a bit, and observed the following: for fresh qcow2 file, with default qemu cache settings, copying kernel source is about 10 times slower on ext4 than on ext3. Second copy (rewrite) is significantly faster in both cases (expectable), but still ~20% slower on ext4 than on ext3. Normal cache mode in qemu is writethrough, which translates to O_SYNC file open mode. With cache=none, which translates to O_DIRECT, metadata- intensive writes (fresh qcow) are about as slow as on ext4 with O_SYNC, and rewrite is expectedly faster, but now there's _no_ difference in speed between ext3 and ext4. I did a series of straces of the writer processes, -- time spent in pwrite() syscalls is significantly larger for ext4 with O_SYNC than with ext3 with O_SYNC, the diff is about 50 times. Also, with slower I/O in case of ext4, qemu-kvm starts more I/O threads, which, as it seems, slows whole thing down even further - I changed max_threads from default 64 to 16, and the speed improved slightly. Here, the diff. is again quite significant: on ext3 qemu spawns only 8 threads, while on ext4 all 64 I/O threads are spawned almost immediately. So I've two questions: 1. Why ext4 O_SYNC is too slow compared with ext3 O_SYNC? This is observed on 2.6.32 and 2.6.34 kernels, barriers or data={writeback|ordered} had no difference. I tested whole thing on a partition on a single drive, sheldonh used ext[34]fs on top of lvm on a raid1 volume. 2. The number of threads spawned for I/O... this is a good question, how to find an adequate cap. Different hw has different capabilities, and we may have more users doing I/O at the same time... Comments? Thanks! /mjt -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.9 (GNU/Linux) iJwEAQECAAYFAkwt36MACgkQUlPFrXTwyDj/CAQAlhaGjk4csnhlP1zaHFubFR8F qiD6HkCUPeofrNAqqbAQYmaK9rNuiFgdiSfkqB1mBCy9Y0ay69XQPXPmTsTH2y66 s+eRC6voIBtGKiPNQN7jSSrHhl3hC1g/FrByppQsM0laWxmW6nQKaZOnlR9vKdvt 2zNKV/9qfM0VXr8Yf6Y= =UIna -END PGP SIGNATURE-
[Qemu-devel] Re: [PATCH v2 1/2] block: Fix early failure in multiwrite
On Fri, Jul 2, 2010 at 1:07 PM, Kevin Wolf kw...@redhat.com wrote: bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). Current code doesn't consider this. For details see the comment added by this patch. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 35 +-- 1 files changed, 29 insertions(+), 6 deletions(-) diff --git a/block.c b/block.c index 9176dec..e65971c 100644 --- a/block.c +++ b/block.c @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // Check for mergable requests num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); - // Run the aio requests + /* + * Run the aio requests. As soon as one request can't be submitted + * successfully, fail all requests that are not yet submitted (we must + * return failure for all requests anyway) + * + * num_requests cannot be set to the right value immediately: If + * bdrv_aio_writev fails for some request, num_requests would be too high + * and therefore multiwrite_cb() would never recognize the multiwrite + * request as completed. We also cannot use the loop variable i to set it + * when the first request fails because the callback may already have been + * called for previously submitted requests. Thus, num_requests must be + * incremented for each request that is submitted. + * + * The problem that callbacks may be called early also means that we need + * to take care that num_requests doesn't become 0 before all requests are + * submitted - multiwrite_cb() would consider the multiwrite request + * completed. A dummy request that is completed by a manual call to + * multiwrite_cb() takes care of this. + */ + mcb-num_requests = 1; + for (i = 0; i num_reqs; i++) { + mcb-num_requests++; acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, reqs[i].nb_sectors, multiwrite_cb, mcb); @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // We can only fail the whole thing if no request has been // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. - if (mcb-num_requests == 0) { - reqs[i].error = -EIO; + if (i == 0) { goto fail; } else { - mcb-num_requests++; multiwrite_cb(mcb, -EIO); When bdrv_aio_writev() fails we don't know if the callback has been invoked by the block driver. Qcow2 will invoke the callback in some cases. This is a problem because num_requests will be decremented twice if we unconditionally call it here. Stefan
[Qemu-devel] Re: [PATCH v2 1/2] block: Fix early failure in multiwrite
Am 02.07.2010 15:18, schrieb Stefan Hajnoczi: On Fri, Jul 2, 2010 at 1:07 PM, Kevin Wolf kw...@redhat.com wrote: bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). Current code doesn't consider this. For details see the comment added by this patch. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 35 +-- 1 files changed, 29 insertions(+), 6 deletions(-) diff --git a/block.c b/block.c index 9176dec..e65971c 100644 --- a/block.c +++ b/block.c @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // Check for mergable requests num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); -// Run the aio requests +/* + * Run the aio requests. As soon as one request can't be submitted + * successfully, fail all requests that are not yet submitted (we must + * return failure for all requests anyway) + * + * num_requests cannot be set to the right value immediately: If + * bdrv_aio_writev fails for some request, num_requests would be too high + * and therefore multiwrite_cb() would never recognize the multiwrite + * request as completed. We also cannot use the loop variable i to set it + * when the first request fails because the callback may already have been + * called for previously submitted requests. Thus, num_requests must be + * incremented for each request that is submitted. + * + * The problem that callbacks may be called early also means that we need + * to take care that num_requests doesn't become 0 before all requests are + * submitted - multiwrite_cb() would consider the multiwrite request + * completed. A dummy request that is completed by a manual call to + * multiwrite_cb() takes care of this. + */ +mcb-num_requests = 1; + for (i = 0; i num_reqs; i++) { +mcb-num_requests++; acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, reqs[i].nb_sectors, multiwrite_cb, mcb); @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // We can only fail the whole thing if no request has been // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. -if (mcb-num_requests == 0) { -reqs[i].error = -EIO; +if (i == 0) { goto fail; } else { -mcb-num_requests++; multiwrite_cb(mcb, -EIO); When bdrv_aio_writev() fails we don't know if the callback has been invoked by the block driver. Qcow2 will invoke the callback in some cases. This is a problem because num_requests will be decremented twice if we unconditionally call it here. Talked to Stefan on IRC and we came to the conclusion that it's not a problem in fact: qcow_aio_writev() either returns NULL or calls a callback, but it never does both. If a block driver returned NULL and called a callback for the same request that would be a bug in the block driver. Kevin
[Qemu-devel] Re: [PATCH 01/14] Add new data type for fprintf like function pointers
On 04/09/2010 01:20 PM, Stefan Weil wrote: Some restrictions why qemu-common.h was not used might be no longer valid (I think they came from pre-tcg times). Nevertheless, cris-dis.c even says that it cannot include qemu-common.h (without giving a reason). I think these are no longer valid. In fact, almost everything is including the full-blown qemu-common.h, via some other header file. They may be valid only in cpu-exec.c and target-*/op_helper.c, but even then maybe not. :) In particular, I see two reasons to limit the number of included files when global registers are in use. The first is avoiding library calls since they may be unsafe some OS/host combinations, particularly SPARC/glibc. No includes - no prototypes - no calls. cpu-exec.c is careful to only use the global env when it's safe to do so, but logging goes through fprintf and target-*/op_helper.c files require logging. Apparently, printf/fprintf have been audited to work on SPARC/glibc too, so dyngen-exec.h allows those calls. This however does not *require* using custom declarations in place of stdio.h as done in dyngen-exec.h, it's just a small safety net. The second (more real) reason is inline assembly failures, for example (32-bit x86): register int e asm(edi); static inline int h() { int x; asm volatile (mov $0, %0 : =D (x)); } int g() { int f = e; h(); return e - f; } fails to compile because gcc cannot assign edi to %0 in h(). Some host headers may use assembly in a way that breaks qemu. With only one global register in use, however, it makes sense IMO to drop the custom inclusion hacks and see if anyone screams. Paolo
[Qemu-devel] [RFC] env stored in segment register for i386
On 07/02/2010 08:37 AM, Paolo Bonzini wrote: The second (more real) reason is inline assembly failures, for example (32-bit x86): register int e asm(edi); static inline int h() { int x; asm volatile (mov $0, %0 : =D (x)); } int g() { int f = e; h(); return e - f; } fails to compile because gcc cannot assign edi to %0 in h(). Some host headers may use assembly in a way that breaks qemu. With only one global register in use, however, it makes sense IMO to drop the custom inclusion hacks and see if anyone screams. A few months ago I developed a patch that would allow the global env variable to be accessed via %fs (plus a backing TLS variable), which means that no hardware register needs to be reserved for i386. I never quite got around to finishing it because I don't know how to set up a segment register in Windows, and it seemed like the kind of patch that could easily get quagmired. Is there any interest in a patch like this? Should I try to revive it? r~
[Qemu-devel] [PULL 00/23] Block patches
The following changes since commit 8713f8ffb87a28c94b4f22b9a9ec16c55381187e: Andi Kleen (1): Don't declare XSAVE as supported are available in the git repository at: git://repo.or.cz/qemu/kevin.git for-anthony Christoph Hellwig (1): block: allow filenames with colons again for host devices Kevin Wolf (6): qcow2: Fix error handling during metadata preallocation blkdebug: Fix set_state_opts definition blkdebug: Free QemuOpts after having read the config blkdebug: Initialize state as 1 block: Fix early failure in multiwrite block: Handle multiwrite errors only when all requests have completed MORITA Kazutaka (1): qemu-img: avoid calling exit(1) to release resources properly Markus Armbruster (14): scsi: scsi_bus_legacy_handle_cmdline() can fail, fix callers ide: Make it explicit that ide_create_drive() can't fail blockdev: Remove drive_get_serial() blockdev: New drive_get_by_blockdev() blockdev: Clean up automatic drive deletion qdev: Decouple qdev_prop_drive from DriveInfo blockdev: drive_get_by_id() is no longer used, remove block: Catch attempt to attach multiple devices to a blockdev qemu-option: New qemu_opts_reset() savevm: Survive hot-unplug of snapshot device block: Clean up bdrv_snapshots() block: Fix virtual media change for if=none ide: Make PIIX and ISA IDE init functions return the qdev pc: Fix CMOS info for drives defined with -device Ryan Harper (1): Don't reset bs-is_temporary in bdrv_open_common block.c | 125 ++- block.h |5 + block/blkdebug.c |7 ++- block/qcow2.c| 15 ++-- block_int.h |8 +- blockdev.c | 45 ++ blockdev.h |7 +- hw/esp.c |3 +- hw/fdc.c | 32 --- hw/ide.h | 13 ++- hw/ide/core.c| 18 ++-- hw/ide/internal.h|2 +- hw/ide/isa.c |8 +- hw/ide/piix.c|6 +- hw/ide/qdev.c| 22 -- hw/lsi53c895a.c |2 +- hw/pc.c | 94 + hw/pc.h |3 +- hw/pc_piix.c | 16 +++- hw/pci-hotplug.c | 11 ++- hw/qdev-properties.c | 47 +-- hw/qdev.h|7 +- hw/s390-virtio.c |2 +- hw/scsi-bus.c| 20 +++-- hw/scsi-disk.c | 21 +++-- hw/scsi-generic.c|7 +- hw/scsi.h|4 +- hw/usb-msd.c | 30 +-- hw/virtio-blk.c |3 +- hw/virtio-pci.c |4 +- qemu-img.c | 237 +++--- qemu-option.c|9 ++ qemu-option.h|1 + savevm.c | 31 +-- 34 files changed, 606 insertions(+), 259 deletions(-)
[Qemu-devel] [PATCH 02/23] block: allow filenames with colons again for host devices
From: Christoph Hellwig h...@lst.de Before the raw/file split we used to allow filenames with colons for host device only. While this was more by accident than by design people rely on it, so we need to bring it back. So move the host device probing to be before the protocol detection again. Signed-off-by: Christoph Hellwig h...@lst.de Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 29 ++--- 1 files changed, 18 insertions(+), 11 deletions(-) diff --git a/block.c b/block.c index e71a771..0aaec3b 100644 --- a/block.c +++ b/block.c @@ -288,23 +288,30 @@ BlockDriver *bdrv_find_protocol(const char *filename) char protocol[128]; int len; const char *p; -int is_drive; /* TODO Drivers without bdrv_file_open must be specified explicitly */ +/* + * XXX(hch): we really should not let host device detection + * override an explicit protocol specification, but moving this + * later breaks access to device names with colons in them. + * Thanks to the brain-dead persistent naming schemes on udev- + * based Linux systems those actually are quite common. + */ +drv1 = find_hdev_driver(filename); +if (drv1) { +return drv1; +} + #ifdef _WIN32 -is_drive = is_windows_drive(filename) || -is_windows_drive_prefix(filename); -#else -is_drive = 0; + if (is_windows_drive(filename) || + is_windows_drive_prefix(filename)) + return bdrv_find_format(file); #endif + p = strchr(filename, ':'); -if (!p || is_drive) { -drv1 = find_hdev_driver(filename); -if (!drv1) { -drv1 = bdrv_find_format(file); -} -return drv1; +if (!p) { +return bdrv_find_format(file); } len = p - filename; if (len sizeof(protocol) - 1) -- 1.6.6.1
[Qemu-devel] [PATCH 10/23] blockdev: drive_get_by_id() is no longer used, remove
From: Markus Armbruster arm...@redhat.com Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- blockdev.c | 12 blockdev.h |1 - 2 files changed, 0 insertions(+), 13 deletions(-) diff --git a/blockdev.c b/blockdev.c index 4848112..cecde2b 100644 --- a/blockdev.c +++ b/blockdev.c @@ -75,18 +75,6 @@ DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit) return NULL; } -DriveInfo *drive_get_by_id(const char *id) -{ -DriveInfo *dinfo; - -QTAILQ_FOREACH(dinfo, drives, next) { -if (strcmp(id, dinfo-id)) -continue; -return dinfo; -} -return NULL; -} - int drive_get_max_bus(BlockInterfaceType type) { int max_bus; diff --git a/blockdev.h b/blockdev.h index 32e6979..37f3a01 100644 --- a/blockdev.h +++ b/blockdev.h @@ -41,7 +41,6 @@ typedef struct DriveInfo { #define MAX_SCSI_DEVS 7 extern DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit); -extern DriveInfo *drive_get_by_id(const char *id); extern int drive_get_max_bus(BlockInterfaceType type); extern void drive_uninit(DriveInfo *dinfo); extern DriveInfo *drive_get_by_blockdev(BlockDriverState *bs); -- 1.6.6.1
[Qemu-devel] [PATCH 03/23] scsi: scsi_bus_legacy_handle_cmdline() can fail, fix callers
From: Markus Armbruster arm...@redhat.com None of its callers checks for failure. scsi_hot_add() can crash because of that: (qemu) drive_add 4 if=scsi,format=host_device,file=/dev/sg1 scsi-generic: scsi generic interface too old Segmentation fault (core dumped) Fix all callers, not just scsi_hot_add(). Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- hw/esp.c |3 +-- hw/lsi53c895a.c |2 +- hw/pci-hotplug.c |3 +++ hw/scsi-bus.c| 11 +++ hw/scsi.h|2 +- hw/usb-msd.c |3 +++ 6 files changed, 16 insertions(+), 8 deletions(-) diff --git a/hw/esp.c b/hw/esp.c index 7740879..349052a 100644 --- a/hw/esp.c +++ b/hw/esp.c @@ -679,8 +679,7 @@ static int esp_init1(SysBusDevice *dev) qdev_init_gpio_in(dev-qdev, parent_esp_reset, 1); scsi_bus_new(s-bus, dev-qdev, 0, ESP_MAX_DEVS, esp_command_complete); -scsi_bus_legacy_handle_cmdline(s-bus); -return 0; +return scsi_bus_legacy_handle_cmdline(s-bus); } static SysBusDeviceInfo esp_info = { diff --git a/hw/lsi53c895a.c b/hw/lsi53c895a.c index 9a37fed..1bb1caf 100644 --- a/hw/lsi53c895a.c +++ b/hw/lsi53c895a.c @@ -2176,7 +2176,7 @@ static int lsi_scsi_init(PCIDevice *dev) scsi_bus_new(s-bus, dev-qdev, 1, LSI_MAX_DEVS, lsi_command_complete); if (!dev-qdev.hotplugged) { -scsi_bus_legacy_handle_cmdline(s-bus); +return scsi_bus_legacy_handle_cmdline(s-bus); } return 0; } diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c index c39e640..55c9fe3 100644 --- a/hw/pci-hotplug.c +++ b/hw/pci-hotplug.c @@ -90,6 +90,9 @@ static int scsi_hot_add(Monitor *mon, DeviceState *adapter, */ dinfo-unit = qemu_opt_get_number(dinfo-opts, unit, -1); scsidev = scsi_bus_legacy_add_drive(scsibus, dinfo, dinfo-unit); +if (!scsidev) { +return -1; +} dinfo-unit = scsidev-id; if (printinfo) diff --git a/hw/scsi-bus.c b/hw/scsi-bus.c index 24bd060..d5b66c1 100644 --- a/hw/scsi-bus.c +++ b/hw/scsi-bus.c @@ -83,7 +83,6 @@ void scsi_qdev_register(SCSIDeviceInfo *info) } /* handle legacy '-drive if=scsi,...' cmd line args */ -/* FIXME callers should check for failure, but don't */ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, DriveInfo *dinfo, int unit) { const char *driver; @@ -98,18 +97,22 @@ SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, DriveInfo *dinfo, int unit) return DO_UPCAST(SCSIDevice, qdev, dev); } -void scsi_bus_legacy_handle_cmdline(SCSIBus *bus) +int scsi_bus_legacy_handle_cmdline(SCSIBus *bus) { DriveInfo *dinfo; -int unit; +int res = 0, unit; for (unit = 0; unit MAX_SCSI_DEVS; unit++) { dinfo = drive_get(IF_SCSI, bus-busnr, unit); if (dinfo == NULL) { continue; } -scsi_bus_legacy_add_drive(bus, dinfo, unit); +if (!scsi_bus_legacy_add_drive(bus, dinfo, unit)) { +res = -1; +break; +} } +return res; } void scsi_dev_clear_sense(SCSIDevice *dev) diff --git a/hw/scsi.h b/hw/scsi.h index b668e27..b1b5f73 100644 --- a/hw/scsi.h +++ b/hw/scsi.h @@ -98,7 +98,7 @@ static inline SCSIBus *scsi_bus_from_device(SCSIDevice *d) } SCSIDevice *scsi_bus_legacy_add_drive(SCSIBus *bus, DriveInfo *dinfo, int unit); -void scsi_bus_legacy_handle_cmdline(SCSIBus *bus); +int scsi_bus_legacy_handle_cmdline(SCSIBus *bus); void scsi_dev_clear_sense(SCSIDevice *dev); void scsi_dev_set_sense(SCSIDevice *dev, uint8_t key); diff --git a/hw/usb-msd.c b/hw/usb-msd.c index 003bd8a..8e9718c 100644 --- a/hw/usb-msd.c +++ b/hw/usb-msd.c @@ -531,6 +531,9 @@ static int usb_msd_initfn(USBDevice *dev) s-dev.speed = USB_SPEED_FULL; scsi_bus_new(s-bus, s-dev.qdev, 0, 1, usb_msd_command_complete); s-scsi_dev = scsi_bus_legacy_add_drive(s-bus, s-conf.dinfo, 0); +if (!s-scsi_dev) { +return -1; +} s-bus.qbus.allow_hotplug = 0; usb_msd_handle_reset(dev); -- 1.6.6.1
[Qemu-devel] [PATCH 06/23] Don't reset bs-is_temporary in bdrv_open_common
From: Ryan Harper ry...@us.ibm.com To fix https://bugs.launchpad.net/qemu/+bug/597402 where qemu fails to call unlink() on temporary snapshots due to bs-is_temporary getting clobbered in bdrv_open_common() after being set in bdrv_open() which calls the former. We don't need to initialize bs-is_temporary in bdrv_open_common(). Signed-off-by: Ryan Harper ry...@us.ibm.com Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c |1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/block.c b/block.c index 0aaec3b..31ca4c5 100644 --- a/block.c +++ b/block.c @@ -400,7 +400,6 @@ static int bdrv_open_common(BlockDriverState *bs, const char *filename, bs-file = NULL; bs-total_sectors = 0; -bs-is_temporary = 0; bs-encrypted = 0; bs-valid_key = 0; bs-open_flags = flags; -- 1.6.6.1
[Qemu-devel] [PATCH 19/23] ide: Make PIIX and ISA IDE init functions return the qdev
From: Markus Armbruster arm...@redhat.com Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- hw/ide.h | 11 ++- hw/ide/isa.c |8 hw/ide/piix.c |6 -- 3 files changed, 14 insertions(+), 11 deletions(-) diff --git a/hw/ide.h b/hw/ide.h index bb635b6..82b3c11 100644 --- a/hw/ide.h +++ b/hw/ide.h @@ -1,17 +1,18 @@ #ifndef HW_IDE_H #define HW_IDE_H -#include qdev.h +#include isa.h +#include pci.h /* ide-isa.c */ -int isa_ide_init(int iobase, int iobase2, int isairq, - DriveInfo *hd0, DriveInfo *hd1); +ISADevice *isa_ide_init(int iobase, int iobase2, int isairq, +DriveInfo *hd0, DriveInfo *hd1); /* ide-pci.c */ void pci_cmd646_ide_init(PCIBus *bus, DriveInfo **hd_table, int secondary_ide_enabled); -void pci_piix3_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn); -void pci_piix4_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn); +PCIDevice *pci_piix3_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn); +PCIDevice *pci_piix4_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn); void vt82c686b_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn); /* ide-macio.c */ diff --git a/hw/ide/isa.c b/hw/ide/isa.c index b6c6347..10777ca 100644 --- a/hw/ide/isa.c +++ b/hw/ide/isa.c @@ -75,8 +75,8 @@ static int isa_ide_initfn(ISADevice *dev) return 0; }; -int isa_ide_init(int iobase, int iobase2, int isairq, - DriveInfo *hd0, DriveInfo *hd1) +ISADevice *isa_ide_init(int iobase, int iobase2, int isairq, +DriveInfo *hd0, DriveInfo *hd1) { ISADevice *dev; ISAIDEState *s; @@ -86,14 +86,14 @@ int isa_ide_init(int iobase, int iobase2, int isairq, qdev_prop_set_uint32(dev-qdev, iobase2, iobase2); qdev_prop_set_uint32(dev-qdev, irq, isairq); if (qdev_init(dev-qdev) 0) -return -1; +return NULL; s = DO_UPCAST(ISAIDEState, dev, dev); if (hd0) ide_create_drive(s-bus, 0, hd0); if (hd1) ide_create_drive(s-bus, 1, hd1); -return 0; +return dev; } static ISADeviceInfo isa_ide_info = { diff --git a/hw/ide/piix.c b/hw/ide/piix.c index dad6e86..fa6 100644 --- a/hw/ide/piix.c +++ b/hw/ide/piix.c @@ -160,22 +160,24 @@ static int pci_piix4_ide_initfn(PCIDevice *dev) /* hd_table must contain 4 block drivers */ /* NOTE: for the PIIX3, the IRQs and IOports are hardcoded */ -void pci_piix3_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn) +PCIDevice *pci_piix3_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn) { PCIDevice *dev; dev = pci_create_simple(bus, devfn, piix3-ide); pci_ide_create_devs(dev, hd_table); +return dev; } /* hd_table must contain 4 block drivers */ /* NOTE: for the PIIX4, the IRQs and IOports are hardcoded */ -void pci_piix4_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn) +PCIDevice *pci_piix4_ide_init(PCIBus *bus, DriveInfo **hd_table, int devfn) { PCIDevice *dev; dev = pci_create_simple(bus, devfn, piix4-ide); pci_ide_create_devs(dev, hd_table); +return dev; } static PCIDeviceInfo piix_ide_info[] = { -- 1.6.6.1
[Qemu-devel] [PATCH 11/23] block: Catch attempt to attach multiple devices to a blockdev
From: Markus Armbruster arm...@redhat.com For instance, -device scsi-disk,drive=foo -device scsi-disk,drive=foo happily creates two SCSI disks connected to the same block device. It's all downhill from there. Device usb-storage deliberately attaches twice to the same blockdev, which fails with the fix in place. Detach before the second attach there. Also catch attempt to delete while a guest device model is attached. Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 22 ++ block.h |3 +++ block_int.h |2 ++ hw/fdc.c | 10 +- hw/ide/qdev.c|2 +- hw/pci-hotplug.c |6 +- hw/qdev-properties.c | 22 +- hw/qdev.h|3 ++- hw/s390-virtio.c |2 +- hw/scsi-bus.c|5 - hw/usb-msd.c | 12 11 files changed, 74 insertions(+), 15 deletions(-) diff --git a/block.c b/block.c index 31ca4c5..4c65035 100644 --- a/block.c +++ b/block.c @@ -665,6 +665,8 @@ void bdrv_close_all(void) void bdrv_delete(BlockDriverState *bs) { +assert(!bs-peer); + /* remove from list, if necessary */ if (bs-device_name[0] != '\0') { QTAILQ_REMOVE(bdrv_states, bs, list); @@ -678,6 +680,26 @@ void bdrv_delete(BlockDriverState *bs) qemu_free(bs); } +int bdrv_attach(BlockDriverState *bs, DeviceState *qdev) +{ +if (bs-peer) { +return -EBUSY; +} +bs-peer = qdev; +return 0; +} + +void bdrv_detach(BlockDriverState *bs, DeviceState *qdev) +{ +assert(bs-peer == qdev); +bs-peer = NULL; +} + +DeviceState *bdrv_get_attached(BlockDriverState *bs) +{ +return bs-peer; +} + /* * Run consistency checks on an image * diff --git a/block.h b/block.h index 6a157f4..88ac06e 100644 --- a/block.h +++ b/block.h @@ -71,6 +71,9 @@ int bdrv_file_open(BlockDriverState **pbs, const char *filename, int flags); int bdrv_open(BlockDriverState *bs, const char *filename, int flags, BlockDriver *drv); void bdrv_close(BlockDriverState *bs); +int bdrv_attach(BlockDriverState *bs, DeviceState *qdev); +void bdrv_detach(BlockDriverState *bs, DeviceState *qdev); +DeviceState *bdrv_get_attached(BlockDriverState *bs); int bdrv_check(BlockDriverState *bs); int bdrv_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); diff --git a/block_int.h b/block_int.h index e60aed4..a94b801 100644 --- a/block_int.h +++ b/block_int.h @@ -148,6 +148,8 @@ struct BlockDriverState { BlockDriver *drv; /* NULL means no media */ void *opaque; +DeviceState *peer; + char filename[1024]; char backing_file[1024]; /* if non zero, the image is a diff of this file image */ diff --git a/hw/fdc.c b/hw/fdc.c index 08712bc..1496cfa 100644 --- a/hw/fdc.c +++ b/hw/fdc.c @@ -1860,10 +1860,10 @@ FDCtrl *fdctrl_init_isa(DriveInfo **fds) dev = isa_create(isa-fdc); if (fds[0]) { -qdev_prop_set_drive(dev-qdev, driveA, fds[0]-bdrv); +qdev_prop_set_drive_nofail(dev-qdev, driveA, fds[0]-bdrv); } if (fds[1]) { -qdev_prop_set_drive(dev-qdev, driveB, fds[1]-bdrv); +qdev_prop_set_drive_nofail(dev-qdev, driveB, fds[1]-bdrv); } if (qdev_init(dev-qdev) 0) return NULL; @@ -1882,10 +1882,10 @@ FDCtrl *fdctrl_init_sysbus(qemu_irq irq, int dma_chann, fdctrl = sys-state; fdctrl-dma_chann = dma_chann; /* FIXME */ if (fds[0]) { -qdev_prop_set_drive(dev, driveA, fds[0]-bdrv); +qdev_prop_set_drive_nofail(dev, driveA, fds[0]-bdrv); } if (fds[1]) { -qdev_prop_set_drive(dev, driveB, fds[1]-bdrv); +qdev_prop_set_drive_nofail(dev, driveB, fds[1]-bdrv); } qdev_init_nofail(dev); sysbus_connect_irq(sys-busdev, 0, irq); @@ -1903,7 +1903,7 @@ FDCtrl *sun4m_fdctrl_init(qemu_irq irq, target_phys_addr_t io_base, dev = qdev_create(NULL, SUNW,fdtwo); if (fds[0]) { -qdev_prop_set_drive(dev, drive, fds[0]-bdrv); +qdev_prop_set_drive_nofail(dev, drive, fds[0]-bdrv); } qdev_init_nofail(dev); sys = DO_UPCAST(FDCtrlSysBus, busdev.qdev, dev); diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index a5fdab0..b34c473 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -83,7 +83,7 @@ IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive) dev = qdev_create(bus-qbus, ide-drive); qdev_prop_set_uint32(dev, unit, unit); -qdev_prop_set_drive(dev, drive, drive-bdrv); +qdev_prop_set_drive_nofail(dev, drive, drive-bdrv); qdev_init_nofail(dev); return DO_UPCAST(IDEDevice, qdev, dev); } diff --git a/hw/pci-hotplug.c b/hw/pci-hotplug.c index d743192..fe468d6 100644 --- a/hw/pci-hotplug.c +++ b/hw/pci-hotplug.c @@ -214,7 +214,11 @@ static PCIDevice *qemu_pci_hot_add_storage(Monitor *mon, return NULL;
[Qemu-devel] [PATCH 20/23] pc: Fix CMOS info for drives defined with -device
From: Markus Armbruster arm...@redhat.com Drives defined with -drive if=ide get get created along with the IDE controller, inside machine-init(). That's before cmos_init(). Drives defined with -device get created during generic device init. That's after cmos_init(). Because of that, CMOS has no information on them (type, geometry, translation). Older versions of Windows such as XP reportedly choke on that. Split off the part of CMOS initialization that needs to know about -device devices, and turn it into a reset handler, so it runs after device creation. Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- hw/ide.h |2 + hw/ide/qdev.c |7 hw/pc.c | 94 +++- hw/pc.h |3 +- hw/pc_piix.c | 16 +++--- 5 files changed, 81 insertions(+), 41 deletions(-) diff --git a/hw/ide.h b/hw/ide.h index 82b3c11..2b5ae7c 100644 --- a/hw/ide.h +++ b/hw/ide.h @@ -24,4 +24,6 @@ void mmio_ide_init (target_phys_addr_t membase, target_phys_addr_t membase2, qemu_irq irq, int shift, DriveInfo *hd0, DriveInfo *hd1); +void ide_get_bs(BlockDriverState *bs[], BusState *qbus); + #endif /* HW_IDE_H */ diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index b34c473..2977a16 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -88,6 +88,13 @@ IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive) return DO_UPCAST(IDEDevice, qdev, dev); } +void ide_get_bs(BlockDriverState *bs[], BusState *qbus) +{ +IDEBus *bus = DO_UPCAST(IDEBus, qbus, qbus); +bs[0] = bus-master ? bus-master-conf.bs : NULL; +bs[1] = bus-slave ? bus-slave-conf.bs : NULL; +} + /* - */ typedef struct IDEDrive { diff --git a/hw/pc.c b/hw/pc.c index 25ebafa..b577fb1 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -25,6 +25,7 @@ #include pc.h #include apic.h #include fdc.h +#include ide.h #include pci.h #include vmware_vga.h #include monitor.h @@ -275,14 +276,65 @@ static int pc_boot_set(void *opaque, const char *boot_device) return set_boot_dev(opaque, boot_device, 0); } -/* hd_table must contain 4 block drivers */ +typedef struct pc_cmos_init_late_arg { +ISADevice *rtc_state; +BusState *idebus0, *idebus1; +} pc_cmos_init_late_arg; + +static void pc_cmos_init_late(void *opaque) +{ +pc_cmos_init_late_arg *arg = opaque; +ISADevice *s = arg-rtc_state; +int val; +BlockDriverState *hd_table[4]; +int i; + +ide_get_bs(hd_table, arg-idebus0); +ide_get_bs(hd_table + 2, arg-idebus1); + +rtc_set_memory(s, 0x12, (hd_table[0] ? 0xf0 : 0) | (hd_table[1] ? 0x0f : 0)); +if (hd_table[0]) +cmos_init_hd(0x19, 0x1b, hd_table[0], s); +if (hd_table[1]) +cmos_init_hd(0x1a, 0x24, hd_table[1], s); + +val = 0; +for (i = 0; i 4; i++) { +if (hd_table[i]) { +int cylinders, heads, sectors, translation; +/* NOTE: bdrv_get_geometry_hint() returns the physical +geometry. It is always such that: 1 = sects = 63, 1 += heads = 16, 1 = cylinders = 16383. The BIOS +geometry can be different if a translation is done. */ +translation = bdrv_get_translation_hint(hd_table[i]); +if (translation == BIOS_ATA_TRANSLATION_AUTO) { +bdrv_get_geometry_hint(hd_table[i], cylinders, heads, sectors); +if (cylinders = 1024 heads = 16 sectors = 63) { +/* No translation. */ +translation = 0; +} else { +/* LBA translation. */ +translation = 1; +} +} else { +translation--; +} +val |= translation (i * 2); +} +} +rtc_set_memory(s, 0x39, val); + +qemu_unregister_reset(pc_cmos_init_late, opaque); +} + void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size, - const char *boot_device, DriveInfo **hd_table, + const char *boot_device, + BusState *idebus0, BusState *idebus1, FDCtrl *floppy_controller, ISADevice *s) { int val; int fd0, fd1, nb; -int i; +static pc_cmos_init_late_arg arg; /* various important CMOS locations needed by PC/Bochs bios */ @@ -351,38 +403,10 @@ void pc_cmos_init(ram_addr_t ram_size, ram_addr_t above_4g_mem_size, rtc_set_memory(s, REG_EQUIPMENT_BYTE, val); /* hard drives */ - -rtc_set_memory(s, 0x12, (hd_table[0] ? 0xf0 : 0) | (hd_table[1] ? 0x0f : 0)); -if (hd_table[0]) -cmos_init_hd(0x19, 0x1b, hd_table[0]-bdrv, s); -if (hd_table[1]) -cmos_init_hd(0x1a, 0x24, hd_table[1]-bdrv, s); - -val = 0; -for (i = 0; i 4; i++) { -if (hd_table[i]) { -int cylinders, heads, sectors, translation; -
[Qemu-devel] [PATCH 23/23] block: Handle multiwrite errors only when all requests have completed
Don't try to be clever by freeing all temporary data and calling all callbacks when the return value (an error) is certain. Doing so has at least two important problems: * The temporary data that is freed (qiov, possibly zero buffer) is still used by the requests that have not yet completed. * Calling the callbacks for all requests in the multiwrite means for the caller that it may free buffers etc. which are still in use. Just remember the error value and do the cleanup when all requests have completed. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c |5 + 1 files changed, 1 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index e65971c..dd6dd76 100644 --- a/block.c +++ b/block.c @@ -2042,14 +2042,11 @@ static void multiwrite_cb(void *opaque, int ret) if (ret 0 !mcb-error) { mcb-error = ret; -multiwrite_user_cb(mcb); } mcb-num_requests--; if (mcb-num_requests == 0) { -if (mcb-error == 0) { -multiwrite_user_cb(mcb); -} +multiwrite_user_cb(mcb); qemu_free(mcb); } } -- 1.6.6.1
[Qemu-devel] [PATCH 22/23] block: Fix early failure in multiwrite
bdrv_aio_writev may call the callback immediately (and it will commonly do so in error cases). Current code doesn't consider this. For details see the comment added by this patch. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 35 +-- 1 files changed, 29 insertions(+), 6 deletions(-) diff --git a/block.c b/block.c index 9176dec..e65971c 100644 --- a/block.c +++ b/block.c @@ -2183,8 +2183,29 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // Check for mergable requests num_reqs = multiwrite_merge(bs, reqs, num_reqs, mcb); -// Run the aio requests +/* + * Run the aio requests. As soon as one request can't be submitted + * successfully, fail all requests that are not yet submitted (we must + * return failure for all requests anyway) + * + * num_requests cannot be set to the right value immediately: If + * bdrv_aio_writev fails for some request, num_requests would be too high + * and therefore multiwrite_cb() would never recognize the multiwrite + * request as completed. We also cannot use the loop variable i to set it + * when the first request fails because the callback may already have been + * called for previously submitted requests. Thus, num_requests must be + * incremented for each request that is submitted. + * + * The problem that callbacks may be called early also means that we need + * to take care that num_requests doesn't become 0 before all requests are + * submitted - multiwrite_cb() would consider the multiwrite request + * completed. A dummy request that is completed by a manual call to + * multiwrite_cb() takes care of this. + */ +mcb-num_requests = 1; + for (i = 0; i num_reqs; i++) { +mcb-num_requests++; acb = bdrv_aio_writev(bs, reqs[i].sector, reqs[i].qiov, reqs[i].nb_sectors, multiwrite_cb, mcb); @@ -2192,22 +2213,24 @@ int bdrv_aio_multiwrite(BlockDriverState *bs, BlockRequest *reqs, int num_reqs) // We can only fail the whole thing if no request has been // submitted yet. Otherwise we'll wait for the submitted AIOs to // complete and report the error in the callback. -if (mcb-num_requests == 0) { -reqs[i].error = -EIO; +if (i == 0) { goto fail; } else { -mcb-num_requests++; multiwrite_cb(mcb, -EIO); break; } -} else { -mcb-num_requests++; } } +/* Complete the dummy request */ +multiwrite_cb(mcb, 0); + return 0; fail: +for (i = 0; i mcb-num_callbacks; i++) { +reqs[i].error = -EIO; +} qemu_free(mcb); return -1; } -- 1.6.6.1
[Qemu-devel] [PATCH 21/23] qemu-img: avoid calling exit(1) to release resources properly
From: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp This patch removes exit(1) from error(), and properly releases resources such as a block driver and an allocated memory. For testing the Sheepdog block driver with qemu-iotests, it is necessary to call bdrv_delete() before the program exits. Because the driver releases the lock of VM images in the close handler. Signed-off-by: MORITA Kazutaka morita.kazut...@lab.ntt.co.jp Signed-off-by: Kevin Wolf kw...@redhat.com --- qemu-img.c | 237 +++- 1 files changed, 185 insertions(+), 52 deletions(-) diff --git a/qemu-img.c b/qemu-img.c index ea091f0..700af21 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -39,14 +39,13 @@ typedef struct img_cmd_t { /* Default to cache=writeback as data integrity is not important for qemu-tcg. */ #define BDRV_O_FLAGS BDRV_O_CACHE_WB -static void QEMU_NORETURN error(const char *fmt, ...) +static void error(const char *fmt, ...) { va_list ap; va_start(ap, fmt); fprintf(stderr, qemu-img: ); vfprintf(stderr, fmt, ap); fprintf(stderr, \n); -exit(1); va_end(ap); } @@ -197,57 +196,76 @@ static BlockDriverState *bdrv_new_open(const char *filename, char password[256]; bs = bdrv_new(); -if (!bs) +if (!bs) { error(Not enough memory); +goto fail; +} if (fmt) { drv = bdrv_find_format(fmt); -if (!drv) +if (!drv) { error(Unknown file format '%s', fmt); +goto fail; +} } else { drv = NULL; } if (bdrv_open(bs, filename, flags, drv) 0) { error(Could not open '%s', filename); +goto fail; } if (bdrv_is_encrypted(bs)) { printf(Disk image '%s' is encrypted.\n, filename); -if (read_password(password, sizeof(password)) 0) +if (read_password(password, sizeof(password)) 0) { error(No password given); -if (bdrv_set_key(bs, password) 0) +goto fail; +} +if (bdrv_set_key(bs, password) 0) { error(invalid password); +goto fail; +} } return bs; +fail: +if (bs) { +bdrv_delete(bs); +} +return NULL; } -static void add_old_style_options(const char *fmt, QEMUOptionParameter *list, +static int add_old_style_options(const char *fmt, QEMUOptionParameter *list, int flags, const char *base_filename, const char *base_fmt) { if (flags BLOCK_FLAG_ENCRYPT) { if (set_option_parameter(list, BLOCK_OPT_ENCRYPT, on)) { error(Encryption not supported for file format '%s', fmt); +return -1; } } if (flags BLOCK_FLAG_COMPAT6) { if (set_option_parameter(list, BLOCK_OPT_COMPAT6, on)) { error(VMDK version 6 not supported for file format '%s', fmt); +return -1; } } if (base_filename) { if (set_option_parameter(list, BLOCK_OPT_BACKING_FILE, base_filename)) { error(Backing file not supported for file format '%s', fmt); +return -1; } } if (base_fmt) { if (set_option_parameter(list, BLOCK_OPT_BACKING_FMT, base_fmt)) { error(Backing file format not supported for file format '%s', fmt); +return -1; } } +return 0; } static int img_create(int argc, char **argv) { -int c, ret, flags; +int c, ret = 0, flags; const char *fmt = raw; const char *base_fmt = NULL; const char *filename; @@ -293,12 +311,16 @@ static int img_create(int argc, char **argv) /* Find driver and parse its options */ drv = bdrv_find_format(fmt); -if (!drv) +if (!drv) { error(Unknown file format '%s', fmt); +return 1; +} proto_drv = bdrv_find_protocol(filename); -if (!proto_drv) +if (!proto_drv) { error(Unknown protocol '%s', filename); +return 1; +} create_options = append_option_parameters(create_options, drv-create_options); @@ -307,7 +329,7 @@ static int img_create(int argc, char **argv) if (options !strcmp(options, ?)) { print_option_help(create_options); -return 0; +goto out; } /* Create parameter list with default values */ @@ -319,6 +341,8 @@ static int img_create(int argc, char **argv) param = parse_option_parameters(options, create_options, param); if (param == NULL) { error(Invalid options for file format '%s'., fmt); +ret = -1; +goto out; } } @@ -328,7 +352,10 @@ static int img_create(int argc, char **argv) } /* Add old-style options to parameters */ -add_old_style_options(fmt, param, flags, base_filename, base_fmt); +ret = add_old_style_options(fmt, param, flags, base_filename, base_fmt); +if (ret 0) { +
[Qemu-devel] [PATCH 17/23] block: Clean up bdrv_snapshots()
From: Markus Armbruster arm...@redhat.com Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c |9 - 1 files changed, 4 insertions(+), 5 deletions(-) diff --git a/block.c b/block.c index feda755..003d132 100644 --- a/block.c +++ b/block.c @@ -1789,19 +1789,18 @@ BlockDriverState *bdrv_snapshots(void) { BlockDriverState *bs; -if (bs_snapshots) +if (bs_snapshots) { return bs_snapshots; +} bs = NULL; while ((bs = bdrv_next(bs))) { if (bdrv_can_snapshot(bs)) { -goto ok; +bs_snapshots = bs; +return bs; } } return NULL; - ok: -bs_snapshots = bs; -return bs; } int bdrv_snapshot_create(BlockDriverState *bs, -- 1.6.6.1
[Qemu-devel] [PATCH 16/23] savevm: Survive hot-unplug of snapshot device
From: Markus Armbruster arm...@redhat.com savevm.c keeps a pointer to the snapshot block device. If you manage to get that device deleted, the pointer dangles, and the next snapshot operation will crash burn. Unplugging a guest device that uses it does the trick: $ MALLOC_PERTURB_=234 qemu-system-x86_64 [...] QEMU 0.12.50 monitor - type 'help' for more information (qemu) info snapshots No available block device supports snapshots (qemu) drive_add auto if=none,file=tmp.qcow2 OK (qemu) device_add usb-storage,id=foo,drive=none1 (qemu) info snapshots Snapshot devices: none1 Snapshot list (from none1): IDTAG VM SIZEDATE VM CLOCK (qemu) device_del foo (qemu) info snapshots Snapshot devices: Segmentation fault (core dumped) Move management of that pointer to block.c, and zap it when the device it points becomes unusable. Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c | 26 ++ block.h |1 + savevm.c | 31 --- 3 files changed, 31 insertions(+), 27 deletions(-) diff --git a/block.c b/block.c index 4c65035..feda755 100644 --- a/block.c +++ b/block.c @@ -63,6 +63,9 @@ static QTAILQ_HEAD(, BlockDriverState) bdrv_states = static QLIST_HEAD(, BlockDriver) bdrv_drivers = QLIST_HEAD_INITIALIZER(bdrv_drivers); +/* The device to use for VM snapshots */ +static BlockDriverState *bs_snapshots; + /* If non-zero, use only whitelisted block drivers */ static int use_bdrv_whitelist; @@ -629,6 +632,9 @@ unlink_and_fail: void bdrv_close(BlockDriverState *bs) { if (bs-drv) { +if (bs == bs_snapshots) { +bs_snapshots = NULL; +} if (bs-backing_hd) { bdrv_delete(bs-backing_hd); bs-backing_hd = NULL; @@ -677,6 +683,7 @@ void bdrv_delete(BlockDriverState *bs) bdrv_delete(bs-file); } +assert(bs != bs_snapshots); qemu_free(bs); } @@ -1778,6 +1785,25 @@ int bdrv_can_snapshot(BlockDriverState *bs) return 1; } +BlockDriverState *bdrv_snapshots(void) +{ +BlockDriverState *bs; + +if (bs_snapshots) +return bs_snapshots; + +bs = NULL; +while ((bs = bdrv_next(bs))) { +if (bdrv_can_snapshot(bs)) { +goto ok; +} +} +return NULL; + ok: +bs_snapshots = bs; +return bs; +} + int bdrv_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info) { diff --git a/block.h b/block.h index 88ac06e..012c2a1 100644 --- a/block.h +++ b/block.h @@ -193,6 +193,7 @@ const char *bdrv_get_encrypted_filename(BlockDriverState *bs); void bdrv_get_backing_filename(BlockDriverState *bs, char *filename, int filename_size); int bdrv_can_snapshot(BlockDriverState *bs); +BlockDriverState *bdrv_snapshots(void); int bdrv_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info); int bdrv_snapshot_goto(BlockDriverState *bs, diff --git a/savevm.c b/savevm.c index 20354a8..f1f450e 100644 --- a/savevm.c +++ b/savevm.c @@ -83,9 +83,6 @@ #include qemu_socket.h #include qemu-queue.h -/* point to the block driver where the snapshots are managed */ -static BlockDriverState *bs_snapshots; - #define SELF_ANNOUNCE_ROUNDS 5 #ifndef ETH_P_RARP @@ -1575,26 +1572,6 @@ out: return ret; } -static BlockDriverState *get_bs_snapshots(void) -{ -BlockDriverState *bs; - -if (bs_snapshots) -return bs_snapshots; -/* FIXME what if bs_snapshots gets hot-unplugged? */ - -bs = NULL; -while ((bs = bdrv_next(bs))) { -if (bdrv_can_snapshot(bs)) { -goto ok; -} -} -return NULL; - ok: -bs_snapshots = bs; -return bs; -} - static int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info, const char *name) { @@ -1674,7 +1651,7 @@ void do_savevm(Monitor *mon, const QDict *qdict) } } -bs = get_bs_snapshots(); +bs = bdrv_snapshots(); if (!bs) { monitor_printf(mon, No block device can accept snapshots\n); return; @@ -1769,7 +1746,7 @@ int load_vmstate(const char *name) } } -bs = get_bs_snapshots(); +bs = bdrv_snapshots(); if (!bs) { error_report(No block device supports snapshots); return -EINVAL; @@ -1833,7 +1810,7 @@ void do_delvm(Monitor *mon, const QDict *qdict) int ret; const char *name = qdict_get_str(qdict, name); -bs = get_bs_snapshots(); +bs = bdrv_snapshots(); if (!bs) { monitor_printf(mon, No block device supports snapshots\n); return; @@ -1863,7 +1840,7 @@ void do_info_snapshots(Monitor *mon) int nb_sns, i; char buf[256]; -bs = get_bs_snapshots(); +bs = bdrv_snapshots(); if (!bs) {
[Qemu-devel] [PATCH 15/23] blkdebug: Initialize state as 1
state = 0 in rules means that the rule is valid for any state. Therefore it's impossible to have a rule that works only in the initial state. This changes the initial state from 0 to 1 to make this possible. Signed-off-by: Kevin Wolf kw...@redhat.com --- block/blkdebug.c |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/block/blkdebug.c b/block/blkdebug.c index 78cbff4..2a63df9 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -301,6 +301,9 @@ static int blkdebug_open(BlockDriverState *bs, const char *filename, int flags) } filename = c + 1; +/* Set initial state */ +s-vars.state = 1; + /* Open the backing file */ ret = bdrv_file_open(bs-file, filename, flags); if (ret 0) { -- 1.6.6.1
[Qemu-devel] [PATCH 09/23] qdev: Decouple qdev_prop_drive from DriveInfo
From: Markus Armbruster arm...@redhat.com Make the property point to BlockDriverState, cutting out the DriveInfo middleman. This prepares the ground for block devices that don't have a DriveInfo. Currently all user-defined ones have a DriveInfo, because the only way to define one is -drive friends (they go through drive_init()). DriveInfo is closely tied to -drive, and like -drive, it mixes information about host and guest part of the block device. I'm working towards a new way to define block devices, with clean host/guest separation, and I need to get DriveInfo out of the way for that. Fortunately, the device models are perfectly happy with BlockDriverState, except for two places: ide_drive_initfn() and scsi_disk_initfn() need to check the DriveInfo for a serial number set with legacy -drive serial=... Use drive_get_by_blockdev() there. Device model code should now use DriveInfo only when explicitly dealing with drives defined the old way, i.e. without -device. Signed-off-by: Markus Armbruster arm...@redhat.com Reviewed-by: Christoph Hellwig h...@lst.de Signed-off-by: Kevin Wolf kw...@redhat.com --- block_int.h |6 ++ hw/fdc.c | 22 ++ hw/ide/core.c| 17 + hw/ide/internal.h|2 +- hw/ide/qdev.c| 12 hw/pci-hotplug.c |4 ++-- hw/qdev-properties.c | 21 - hw/qdev.h|6 +++--- hw/s390-virtio.c |2 +- hw/scsi-bus.c|8 hw/scsi-disk.c | 16 +++- hw/scsi-generic.c|6 +++--- hw/scsi.h|2 +- hw/usb-msd.c | 15 +++ hw/virtio-blk.c |2 +- hw/virtio-pci.c |4 ++-- 16 files changed, 73 insertions(+), 72 deletions(-) diff --git a/block_int.h b/block_int.h index b64a009..e60aed4 100644 --- a/block_int.h +++ b/block_int.h @@ -210,10 +210,8 @@ void *qemu_blockalign(BlockDriverState *bs, size_t size); int is_windows_drive(const char *filename); #endif -struct DriveInfo; - typedef struct BlockConf { -struct DriveInfo *dinfo; +BlockDriverState *bs; uint16_t physical_block_size; uint16_t logical_block_size; uint16_t min_io_size; @@ -234,7 +232,7 @@ static inline unsigned int get_physical_block_exp(BlockConf *conf) } #define DEFINE_BLOCK_PROPERTIES(_state, _conf) \ -DEFINE_PROP_DRIVE(drive, _state, _conf.dinfo),\ +DEFINE_PROP_DRIVE(drive, _state, _conf.bs), \ DEFINE_PROP_UINT16(logical_block_size, _state,\ _conf.logical_block_size, 512), \ DEFINE_PROP_UINT16(physical_block_size, _state, \ diff --git a/hw/fdc.c b/hw/fdc.c index 45a876d..08712bc 100644 --- a/hw/fdc.c +++ b/hw/fdc.c @@ -80,7 +80,6 @@ typedef enum FDiskFlags { } FDiskFlags; typedef struct FDrive { -DriveInfo *dinfo; BlockDriverState *bs; /* Drive status */ FDriveType drive; @@ -100,7 +99,6 @@ typedef struct FDrive { static void fd_init(FDrive *drv) { /* Drive */ -drv-bs = drv-dinfo ? drv-dinfo-bdrv : NULL; drv-drive = FDRIVE_DRV_NONE; drv-perpendicular = 0; /* Disk */ @@ -1862,10 +1860,10 @@ FDCtrl *fdctrl_init_isa(DriveInfo **fds) dev = isa_create(isa-fdc); if (fds[0]) { -qdev_prop_set_drive(dev-qdev, driveA, fds[0]); +qdev_prop_set_drive(dev-qdev, driveA, fds[0]-bdrv); } if (fds[1]) { -qdev_prop_set_drive(dev-qdev, driveB, fds[1]); +qdev_prop_set_drive(dev-qdev, driveB, fds[1]-bdrv); } if (qdev_init(dev-qdev) 0) return NULL; @@ -1884,10 +1882,10 @@ FDCtrl *fdctrl_init_sysbus(qemu_irq irq, int dma_chann, fdctrl = sys-state; fdctrl-dma_chann = dma_chann; /* FIXME */ if (fds[0]) { -qdev_prop_set_drive(dev, driveA, fds[0]); +qdev_prop_set_drive(dev, driveA, fds[0]-bdrv); } if (fds[1]) { -qdev_prop_set_drive(dev, driveB, fds[1]); +qdev_prop_set_drive(dev, driveB, fds[1]-bdrv); } qdev_init_nofail(dev); sysbus_connect_irq(sys-busdev, 0, irq); @@ -1905,7 +1903,7 @@ FDCtrl *sun4m_fdctrl_init(qemu_irq irq, target_phys_addr_t io_base, dev = qdev_create(NULL, SUNW,fdtwo); if (fds[0]) { -qdev_prop_set_drive(dev, drive, fds[0]); +qdev_prop_set_drive(dev, drive, fds[0]-bdrv); } qdev_init_nofail(dev); sys = DO_UPCAST(FDCtrlSysBus, busdev.qdev, dev); @@ -2030,8 +2028,8 @@ static ISADeviceInfo isa_fdc_info = { .qdev.vmsd = vmstate_isa_fdc, .qdev.reset = fdctrl_external_reset_isa, .qdev.props = (Property[]) { -DEFINE_PROP_DRIVE(driveA, FDCtrlISABus, state.drives[0].dinfo), -DEFINE_PROP_DRIVE(driveB, FDCtrlISABus, state.drives[1].dinfo), +DEFINE_PROP_DRIVE(driveA, FDCtrlISABus, state.drives[0].bs), +DEFINE_PROP_DRIVE(driveB,
[Qemu-devel] [PATCH 07/23] blockdev: New drive_get_by_blockdev()
From: Markus Armbruster arm...@redhat.com Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- blockdev.c | 12 blockdev.h |1 + 2 files changed, 13 insertions(+), 0 deletions(-) diff --git a/blockdev.c b/blockdev.c index e0495e5..ba4f66f 100644 --- a/blockdev.c +++ b/blockdev.c @@ -78,6 +78,18 @@ int drive_get_max_bus(BlockInterfaceType type) return max_bus; } +DriveInfo *drive_get_by_blockdev(BlockDriverState *bs) +{ +DriveInfo *dinfo; + +QTAILQ_FOREACH(dinfo, drives, next) { +if (dinfo-bdrv == bs) { +return dinfo; +} +} +return NULL; +} + static void bdrv_format_print(void *opaque, const char *name) { fprintf(stderr, %s, name); diff --git a/blockdev.h b/blockdev.h index a936785..6ab592f 100644 --- a/blockdev.h +++ b/blockdev.h @@ -40,6 +40,7 @@ extern DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit); extern DriveInfo *drive_get_by_id(const char *id); extern int drive_get_max_bus(BlockInterfaceType type); extern void drive_uninit(DriveInfo *dinfo); +extern DriveInfo *drive_get_by_blockdev(BlockDriverState *bs); extern QemuOpts *drive_add(const char *file, const char *fmt, ...); extern DriveInfo *drive_init(QemuOpts *arg, int default_to_scsi, -- 1.6.6.1
[Qemu-devel] [PATCH 08/23] blockdev: Clean up automatic drive deletion
From: Markus Armbruster arm...@redhat.com We automatically delete blockdev host parts on unplug of the guest device. Too much magic, but we can't change that now. The delete happens early in the guest device teardown, before the connection to the host part is severed. Thus, the guest part's pointer to the host part dangles for a brief time. No actual harm comes from this, but we'll catch such dangling pointers a few commits down the road. Clean up the dangling pointers by delaying the automatic deletion until the guest part's pointer is gone. Device usb-storage deliberately makes two qdev properties refer to the same drive, because it automatically creates a second device. Again, too much magic we can't change now. Multiple references worked okay before, but now free_drive() dies for the second one. Zap the extra reference. Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- blockdev.c | 23 +++ blockdev.h |4 hw/qdev-properties.c | 10 ++ hw/scsi-disk.c |2 +- hw/scsi-generic.c|2 +- hw/usb-msd.c | 20 hw/virtio-pci.c |2 +- 7 files changed, 56 insertions(+), 7 deletions(-) diff --git a/blockdev.c b/blockdev.c index ba4f66f..4848112 100644 --- a/blockdev.c +++ b/blockdev.c @@ -17,6 +17,29 @@ static QTAILQ_HEAD(drivelist, DriveInfo) drives = QTAILQ_HEAD_INITIALIZER(drives); +/* + * We automatically delete the drive when a device using it gets + * unplugged. Questionable feature, but we can't just drop it. + * Device models call blockdev_mark_auto_del() to schedule the + * automatic deletion, and generic qdev code calls blockdev_auto_del() + * when deletion is actually safe. + */ +void blockdev_mark_auto_del(BlockDriverState *bs) +{ +DriveInfo *dinfo = drive_get_by_blockdev(bs); + +dinfo-auto_del = 1; +} + +void blockdev_auto_del(BlockDriverState *bs) +{ +DriveInfo *dinfo = drive_get_by_blockdev(bs); + +if (dinfo-auto_del) { +drive_uninit(dinfo); +} +} + QemuOpts *drive_add(const char *file, const char *fmt, ...) { va_list ap; diff --git a/blockdev.h b/blockdev.h index 6ab592f..32e6979 100644 --- a/blockdev.h +++ b/blockdev.h @@ -13,6 +13,9 @@ #include block.h #include qemu-queue.h +void blockdev_mark_auto_del(BlockDriverState *bs); +void blockdev_auto_del(BlockDriverState *bs); + typedef enum { IF_NONE, IF_IDE, IF_SCSI, IF_FLOPPY, IF_PFLASH, IF_MTD, IF_SD, IF_VIRTIO, IF_XEN, @@ -28,6 +31,7 @@ typedef struct DriveInfo { BlockInterfaceType type; int bus; int unit; +int auto_del; /* see blockdev_mark_auto_del() */ QemuOpts *opts; char serial[BLOCK_SERIAL_STRLEN + 1]; QTAILQ_ENTRY(DriveInfo) next; diff --git a/hw/qdev-properties.c b/hw/qdev-properties.c index 5b7fd77..d7dc234 100644 --- a/hw/qdev-properties.c +++ b/hw/qdev-properties.c @@ -313,6 +313,15 @@ static int parse_drive(DeviceState *dev, Property *prop, const char *str) return 0; } +static void free_drive(DeviceState *dev, Property *prop) +{ +DriveInfo **ptr = qdev_get_prop_ptr(dev, prop); + +if (*ptr) { +blockdev_auto_del((*ptr)-bdrv); +} +} + static int print_drive(DeviceState *dev, Property *prop, char *dest, size_t len) { DriveInfo **ptr = qdev_get_prop_ptr(dev, prop); @@ -325,6 +334,7 @@ PropertyInfo qdev_prop_drive = { .size = sizeof(DriveInfo*), .parse = parse_drive, .print = print_drive, +.free = free_drive, }; /* --- character device --- */ diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index 2b38984..d76e640 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -1043,7 +1043,7 @@ static void scsi_destroy(SCSIDevice *dev) SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev); scsi_disk_purge_requests(s); -drive_uninit(s-qdev.conf.dinfo); +blockdev_mark_auto_del(s-qdev.conf.dinfo-bdrv); } static int scsi_disk_initfn(SCSIDevice *dev) diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c index e31060e..1859c94 100644 --- a/hw/scsi-generic.c +++ b/hw/scsi-generic.c @@ -453,7 +453,7 @@ static void scsi_destroy(SCSIDevice *d) r = DO_UPCAST(SCSIGenericReq, req, QTAILQ_FIRST(s-qdev.requests)); scsi_remove_request(r); } -drive_uninit(s-qdev.conf.dinfo); +blockdev_mark_auto_del(s-qdev.conf.dinfo-bdrv); } static int scsi_generic_initfn(SCSIDevice *dev) diff --git a/hw/usb-msd.c b/hw/usb-msd.c index 8e9718c..3dbfcab 100644 --- a/hw/usb-msd.c +++ b/hw/usb-msd.c @@ -522,24 +522,36 @@ static void usb_msd_password_cb(void *opaque, int err) static int usb_msd_initfn(USBDevice *dev) { MSDState *s = DO_UPCAST(MSDState, dev, dev); +DriveInfo *dinfo = s-conf.dinfo; -if (!s-conf.dinfo || !s-conf.dinfo-bdrv) { +if (!dinfo || !dinfo-bdrv) { error_report(usb-msd: drive property not set); return -1; } +/* + * Hack alert:
[Qemu-devel] [PATCH 05/23] blockdev: Remove drive_get_serial()
From: Markus Armbruster arm...@redhat.com Unused since commit 6ced55a5. Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- blockdev.c | 12 blockdev.h |1 - 2 files changed, 0 insertions(+), 13 deletions(-) diff --git a/blockdev.c b/blockdev.c index 3b8c606..e0495e5 100644 --- a/blockdev.c +++ b/blockdev.c @@ -78,18 +78,6 @@ int drive_get_max_bus(BlockInterfaceType type) return max_bus; } -const char *drive_get_serial(BlockDriverState *bdrv) -{ -DriveInfo *dinfo; - -QTAILQ_FOREACH(dinfo, drives, next) { -if (dinfo-bdrv == bdrv) -return dinfo-serial; -} - -return \0; -} - static void bdrv_format_print(void *opaque, const char *name) { fprintf(stderr, %s, name); diff --git a/blockdev.h b/blockdev.h index 23ea576..a936785 100644 --- a/blockdev.h +++ b/blockdev.h @@ -40,7 +40,6 @@ extern DriveInfo *drive_get(BlockInterfaceType type, int bus, int unit); extern DriveInfo *drive_get_by_id(const char *id); extern int drive_get_max_bus(BlockInterfaceType type); extern void drive_uninit(DriveInfo *dinfo); -extern const char *drive_get_serial(BlockDriverState *bdrv); extern QemuOpts *drive_add(const char *file, const char *fmt, ...); extern DriveInfo *drive_init(QemuOpts *arg, int default_to_scsi, -- 1.6.6.1
[Qemu-devel] [PATCH 18/23] block: Fix virtual media change for if=none
From: Markus Armbruster arm...@redhat.com BlockDriverState member removable controls whether virtual media change (monitor commands change, eject) is allowed. It is set when the type hint is BDRV_TYPE_CDROM or BDRV_TYPE_FLOPPY. The type hint is only set by drive_init(). It sets BDRV_TYPE_FLOPPY for if=floppy. It sets BDRV_TYPE_CDROM for media=cdrom and if=ide, scsi, xen, or none. if=ide and if=scsi work, because the type hint makes it a CD-ROM. if=xen likewise, I think. For the same reason, if=none works when it's used by ide-drive or scsi-disk. For other guest devices, there are problems: * fdc: you can't change virtual media $ qemu [...] -drive if=none,id=foo,... -global isa-fdc.driveA=foo QEMU 0.12.50 monitor - type 'help' for more information (qemu) eject foo Device 'foo' is not removable unless you add media=cdrom, but that makes it readonly. * virtio: if you add media=cdrom, you can change virtual media. If you eject, the guest gets I/O errors. If you change, the guest sees the drive's contents suddenly change. * scsi-generic: if you add media=cdrom, you can change virtual media. I didn't test what that does to the guest or the physical device, but it can't be pretty. Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c |8 block.h |1 + hw/fdc.c | 10 -- hw/ide/core.c |1 + hw/scsi-disk.c|5 - hw/scsi-generic.c |1 + hw/virtio-blk.c |1 + 7 files changed, 24 insertions(+), 3 deletions(-) diff --git a/block.c b/block.c index 003d132..9176dec 100644 --- a/block.c +++ b/block.c @@ -1299,6 +1299,14 @@ BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read) return is_read ? bs-on_read_error : bs-on_write_error; } +void bdrv_set_removable(BlockDriverState *bs, int removable) +{ +bs-removable = removable; +if (removable bs == bs_snapshots) { +bs_snapshots = NULL; +} +} + int bdrv_is_removable(BlockDriverState *bs) { return bs-removable; diff --git a/block.h b/block.h index 012c2a1..3d03b3e 100644 --- a/block.h +++ b/block.h @@ -162,6 +162,7 @@ int bdrv_get_translation_hint(BlockDriverState *bs); void bdrv_set_on_error(BlockDriverState *bs, BlockErrorAction on_read_error, BlockErrorAction on_write_error); BlockErrorAction bdrv_get_on_error(BlockDriverState *bs, int is_read); +void bdrv_set_removable(BlockDriverState *bs, int removable); int bdrv_is_removable(BlockDriverState *bs); int bdrv_is_read_only(BlockDriverState *bs); int bdrv_is_sg(BlockDriverState *bs); diff --git a/hw/fdc.c b/hw/fdc.c index 1496cfa..6c74878 100644 --- a/hw/fdc.c +++ b/hw/fdc.c @@ -1847,10 +1847,16 @@ static void fdctrl_result_timer(void *opaque) static void fdctrl_connect_drives(FDCtrl *fdctrl) { unsigned int i; +FDrive *drive; for (i = 0; i MAX_FD; i++) { -fd_init(fdctrl-drives[i]); -fd_revalidate(fdctrl-drives[i]); +drive = fdctrl-drives[i]; + +fd_init(drive); +fd_revalidate(drive); +if (drive-bs) { +bdrv_set_removable(drive-bs, 1); +} } } diff --git a/hw/ide/core.c b/hw/ide/core.c index cc4591b..ebdceb5 100644 --- a/hw/ide/core.c +++ b/hw/ide/core.c @@ -2629,6 +2629,7 @@ void ide_init_drive(IDEState *s, BlockDriverState *bs, pstrcpy(s-version, sizeof(s-version), QEMU_VERSION); } ide_reset(s); +bdrv_set_removable(bs, s-is_cdrom); } static void ide_init1(IDEBus *bus, int unit) diff --git a/hw/scsi-disk.c b/hw/scsi-disk.c index e900eff..3e41011 100644 --- a/hw/scsi-disk.c +++ b/hw/scsi-disk.c @@ -1049,6 +1049,7 @@ static void scsi_destroy(SCSIDevice *dev) static int scsi_disk_initfn(SCSIDevice *dev) { SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev, dev); +int is_cd; DriveInfo *dinfo; if (!s-qdev.conf.bs) { @@ -1056,6 +1057,7 @@ static int scsi_disk_initfn(SCSIDevice *dev) return -1; } s-bs = s-qdev.conf.bs; +is_cd = bdrv_get_type_hint(s-bs) == BDRV_TYPE_CDROM; if (!s-serial) { /* try to fall back to value set with legacy -drive serial=... */ @@ -1072,7 +1074,7 @@ static int scsi_disk_initfn(SCSIDevice *dev) return -1; } -if (bdrv_get_type_hint(s-bs) == BDRV_TYPE_CDROM) { +if (is_cd) { s-qdev.blocksize = 2048; } else { s-qdev.blocksize = s-qdev.conf.logical_block_size; @@ -1081,6 +1083,7 @@ static int scsi_disk_initfn(SCSIDevice *dev) s-qdev.type = TYPE_DISK; qemu_add_vm_change_state_handler(scsi_dma_restart_cb, s); +bdrv_set_removable(s-bs, is_cd); return 0; } diff --git a/hw/scsi-generic.c b/hw/scsi-generic.c index 79347f4..3915e78 100644 --- a/hw/scsi-generic.c +++ b/hw/scsi-generic.c @@ -509,6 +509,7 @@ static int scsi_generic_initfn(SCSIDevice *dev) DPRINTF(block size %d\n, s-qdev.blocksize);
[Qemu-devel] Re: Status update
On Fri, Jul 02, 2010 at 09:03:39AM +0100, Stefan Hajnoczi wrote: That's true, but it's fair to be concerned about the guest itself. Imagine it runs some possibly malicious apps which program the hardware to do DMA. That should be safe when a IOMMU is present. But suddenly the guest OS changes mappings and expects the IOMMU to enforce them as soon as invalidation commands are completed. The guest then reclaims the old space for other uses. This leaves an opportunity for those processes to corrupt or read sensitive data. As long as QEMU acts in the same way as real hardware we should be okay. Will real hardware change the mappings immediately and abort the DMA from the device if it tries to access an invalidated address? Stefan Real, non-IOMMU aware hardware looks like this (simplified): Device - IOMMU - Memory (1) Most QEMU translated devices would do exactly that. But AIO is something like this: Device + ---translation--- IOMMU (2) ^ | -R/W Memory There are two reasons this form isn't reducible to the former in case of AIO: 1. It uses cpu_physical_memory_map(). 2. The actual I/O happens on a separate thread. Real hardware of form (1) has no problems since all access is serialized through the IOMMU. That doesn't mean mapping updates happen instantly. But software can wait (and be notified) for the updates to happen. Real hardware of form (2) is comprised of devices which have their own IOTLB, I think. But unlike cpu_physical_memory_map() in software-land, these mappings can be updated during I/O. (These devices are necessarily IOMMU-aware.) As I see it, this mmaping trick is quite crucial to AIO and I'm not sure there's a way around it. The solution I proposed is making the IOMMU wait* for AIO to finish. Perhaps we can also break mmaping into smaller chunks so they complete in a reasonable time. * - Waiting here means to delay notifying the guest OS that invalidation commands completed. This is the important part: if the guest is told the mappings have been updated, then they really have to be. Eduard
[Qemu-devel] [PATCH 04/23] ide: Make it explicit that ide_create_drive() can't fail
From: Markus Armbruster arm...@redhat.com All callers of ide_create_drive() ignore its value. Currently harmless, because it fails only when qdev_init() fails, which fails only when ide_drive_initfn() fails, which never fails. Brittle. Change it to die instead of silently ignoring failure. Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- hw/ide/qdev.c |3 +-- 1 files changed, 1 insertions(+), 2 deletions(-) diff --git a/hw/ide/qdev.c b/hw/ide/qdev.c index 0f9f22e..127478b 100644 --- a/hw/ide/qdev.c +++ b/hw/ide/qdev.c @@ -84,8 +84,7 @@ IDEDevice *ide_create_drive(IDEBus *bus, int unit, DriveInfo *drive) dev = qdev_create(bus-qbus, ide-drive); qdev_prop_set_uint32(dev, unit, unit); qdev_prop_set_drive(dev, drive, drive); -if (qdev_init(dev) 0) -return NULL; +qdev_init_nofail(dev); return DO_UPCAST(IDEDevice, qdev, dev); } -- 1.6.6.1
[Qemu-devel] [PATCH v3 01/16] Remove uses of ram.last_offset (aka last_ram_offset)
We currently need this either to allocate the next ram_addr_t for a new block, or for total memory to be migrated. Both of which we can calculate without need of this to keep us in a contiguous address space. Signed-off-by: Alex Williamson alex.william...@redhat.com --- arch_init.c | 23 --- cpu-all.h |1 - exec.c | 19 ++- 3 files changed, 30 insertions(+), 13 deletions(-) diff --git a/arch_init.c b/arch_init.c index eb5b67c..109dcef 100644 --- a/arch_init.c +++ b/arch_init.c @@ -108,9 +108,10 @@ static int ram_save_block(QEMUFile *f) static ram_addr_t current_addr = 0; ram_addr_t saved_addr = current_addr; ram_addr_t addr = 0; +uint64_t total_ram = ram_bytes_total(); int bytes_sent = 0; -while (addr ram_list.last_offset) { +while (addr total_ram) { if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) { uint8_t *p; @@ -133,7 +134,7 @@ static int ram_save_block(QEMUFile *f) break; } addr += TARGET_PAGE_SIZE; -current_addr = (saved_addr + addr) % ram_list.last_offset; +current_addr = (saved_addr + addr) % total_ram; } return bytes_sent; @@ -145,8 +146,9 @@ static ram_addr_t ram_save_remaining(void) { ram_addr_t addr; ram_addr_t count = 0; +uint64_t total_ram = ram_bytes_total(); -for (addr = 0; addr ram_list.last_offset; addr += TARGET_PAGE_SIZE) { +for (addr = 0; addr total_ram; addr += TARGET_PAGE_SIZE) { if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) { count++; } @@ -167,7 +169,13 @@ uint64_t ram_bytes_transferred(void) uint64_t ram_bytes_total(void) { -return ram_list.last_offset; +RAMBlock *block; +uint64_t total = 0; + +QLIST_FOREACH(block, ram_list.blocks, next) +total += block-length; + +return total; } int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) @@ -188,10 +196,11 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) } if (stage == 1) { +uint64_t total_ram = ram_bytes_total(); bytes_transferred = 0; /* Make sure all dirty bits are set */ -for (addr = 0; addr ram_list.last_offset; addr += TARGET_PAGE_SIZE) { +for (addr = 0; addr total_ram; addr += TARGET_PAGE_SIZE) { if (!cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) { cpu_physical_memory_set_dirty(addr); } @@ -200,7 +209,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) /* Enable dirty memory tracking */ cpu_physical_memory_set_dirty_tracking(1); -qemu_put_be64(f, ram_list.last_offset | RAM_SAVE_FLAG_MEM_SIZE); +qemu_put_be64(f, total_ram | RAM_SAVE_FLAG_MEM_SIZE); } bytes_transferred_last = bytes_transferred; @@ -259,7 +268,7 @@ int ram_load(QEMUFile *f, void *opaque, int version_id) addr = TARGET_PAGE_MASK; if (flags RAM_SAVE_FLAG_MEM_SIZE) { -if (addr != ram_list.last_offset) { +if (addr != ram_bytes_total()) { return -EINVAL; } } diff --git a/cpu-all.h b/cpu-all.h index e31c2de..dbb2139 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -870,7 +870,6 @@ typedef struct RAMBlock { typedef struct RAMList { uint8_t *phys_dirty; -ram_addr_t last_offset; QLIST_HEAD(ram, RAMBlock) blocks; } RAMList; extern RAMList ram_list; diff --git a/exec.c b/exec.c index 137175c..a6b3f21 100644 --- a/exec.c +++ b/exec.c @@ -2767,6 +2767,17 @@ static void *file_ram_alloc(ram_addr_t memory, const char *path) } #endif +static ram_addr_t find_ram_offset(ram_addr_t size) +{ +RAMBlock *block; +ram_addr_t last = 0; + +QLIST_FOREACH(block, ram_list.blocks, next) +last = MAX(last, block-offset + block-length); + +return last; +} + ram_addr_t qemu_ram_alloc(ram_addr_t size) { RAMBlock *new_block; @@ -2800,18 +2811,16 @@ ram_addr_t qemu_ram_alloc(ram_addr_t size) madvise(new_block-host, size, MADV_MERGEABLE); #endif } -new_block-offset = ram_list.last_offset; +new_block-offset = find_ram_offset(size); new_block-length = size; QLIST_INSERT_HEAD(ram_list.blocks, new_block, next); ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty, -(ram_list.last_offset + size) TARGET_PAGE_BITS); -memset(ram_list.phys_dirty + (ram_list.last_offset TARGET_PAGE_BITS), +(new_block-offset + size) TARGET_PAGE_BITS); +memset(ram_list.phys_dirty + (new_block-offset TARGET_PAGE_BITS), 0xff, size TARGET_PAGE_BITS); -ram_list.last_offset += size; - if (kvm_enabled()) kvm_setup_guest_memory(new_block-host, size);
[Qemu-devel] [PATCH v3 08/16] virtio-net: Incorporate a DeviceState pointer and let savevm track instances
Stuff a pointer to the DeviceState into the VirtIONet structure so that we can easily remove the vmstate entry later. Also, let vmstate track the instance number (it should always be zero internally since the device path should now be unique). Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/virtio-net.c |7 --- 1 files changed, 4 insertions(+), 3 deletions(-) diff --git a/hw/virtio-net.c b/hw/virtio-net.c index e9768e0..f41db45 100644 --- a/hw/virtio-net.c +++ b/hw/virtio-net.c @@ -60,6 +60,7 @@ typedef struct VirtIONet uint8_t *macs; } mac_table; uint32_t *vlans; +DeviceState *qdev; } VirtIONet; /* TODO @@ -890,7 +891,6 @@ static void virtio_net_vmstate_change(void *opaque, int running, int reason) VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf) { VirtIONet *n; -static int virtio_net_id; n = (VirtIONet *)virtio_common_init(virtio-net, VIRTIO_ID_NET, sizeof(struct virtio_net_config), @@ -923,7 +923,8 @@ VirtIODevice *virtio_net_init(DeviceState *dev, NICConf *conf) n-vlans = qemu_mallocz(MAX_VLAN 3); -register_savevm(NULL, virtio-net, virtio_net_id++, VIRTIO_NET_VM_VERSION, +n-qdev = dev; +register_savevm(dev, virtio-net, -1, VIRTIO_NET_VM_VERSION, virtio_net_save, virtio_net_load, n); n-vmstate = qemu_add_vm_change_state_handler(virtio_net_vmstate_change, n); @@ -941,7 +942,7 @@ void virtio_net_exit(VirtIODevice *vdev) qemu_purge_queued_packets(n-nic-nc); -unregister_savevm(NULL, virtio-net, n); +unregister_savevm(n-qdev, virtio-net, n); qemu_free(n-mac_table.macs); qemu_free(n-vlans);
[Qemu-devel] [PATCH v3 10/16] ramblocks: Make use of DeviceState pointer and BusInfo.get_dev_path
With these two pieces in place, we can start naming ramblocks. When the device is present and it lives on a bus that provides a device path, we concatenate the path and the provided name. Otherwise we just use name. The resulting id string must be unique. For now we assume an allocation for the same name and size is a device that has been removed and reinserted and return the same block. This will go away once qemu_ram_free() is implemented. Signed-off-by: Alex Williamson alex.william...@redhat.com --- cpu-all.h |1 + exec.c| 29 +++-- 2 files changed, 28 insertions(+), 2 deletions(-) diff --git a/cpu-all.h b/cpu-all.h index dbb2139..5d8342b 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -865,6 +865,7 @@ typedef struct RAMBlock { uint8_t *host; ram_addr_t offset; ram_addr_t length; +char idstr[256]; QLIST_ENTRY(RAMBlock) next; } RAMBlock; diff --git a/exec.c b/exec.c index 164ba16..fd47d5b 100644 --- a/exec.c +++ b/exec.c @@ -36,6 +36,7 @@ #include qemu-common.h #include tcg.h #include hw/hw.h +#include hw/qdev.h #include osdep.h #include kvm.h #include qemu-timer.h @@ -2780,10 +2781,34 @@ static ram_addr_t find_ram_offset(ram_addr_t size) ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size) { -RAMBlock *new_block; +RAMBlock *new_block, *block; size = TARGET_PAGE_ALIGN(size); -new_block = qemu_malloc(sizeof(*new_block)); +new_block = qemu_mallocz(sizeof(*new_block)); + +if (dev dev-parent_bus dev-parent_bus-info-get_dev_path) { +char *id = dev-parent_bus-info-get_dev_path(dev); +if (id) { +snprintf(new_block-idstr, sizeof(new_block-idstr), %s/, id); +qemu_free(id); +} +} +pstrcat(new_block-idstr, sizeof(new_block-idstr), name); + +QLIST_FOREACH(block, ram_list.blocks, next) { +if (!strcmp(block-idstr, new_block-idstr)) { +if (block-length == new_block-length) { +fprintf(stderr, RAMBlock \%s\ exists, assuming lack of +free.\n, new_block-idstr); +qemu_free(new_block); +return block-offset; +} else { +fprintf(stderr, RAMBlock \%s\ already registered with +different size, abort\n, new_block-idstr); +abort(); +} +} +} if (mem_path) { #if defined (__linux__) !defined(TARGET_S390X)
[Qemu-devel] [PATCH 14/23] blkdebug: Free QemuOpts after having read the config
Forgetting to free them means that the next instance inherits all rules and gets its own rules only additionally. Signed-off-by: Kevin Wolf kw...@redhat.com --- block/blkdebug.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/block/blkdebug.c b/block/blkdebug.c index 4ec8ca6..78cbff4 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -267,6 +267,8 @@ static int read_config(BDRVBlkdebugState *s, const char *filename) ret = 0; fail: +qemu_opts_reset(inject_error_opts); +qemu_opts_reset(set_state_opts); fclose(f); return ret; } -- 1.6.6.1
[Qemu-devel] [PATCH v3 04/16] pci: Implement BusInfo.get_dev_path()
This works great for PCI since a segment:bus:dev.fn uniquely describes a global address. No need to traverse up the qdev tree. PCI segment support is a placeholder for compatibility once we support multiple segments. Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pci.c | 14 ++ 1 files changed, 14 insertions(+), 0 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index 7787005..1e77ae6 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -58,11 +58,13 @@ struct PCIBus { }; static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent); +static char *pcibus_get_dev_path(DeviceState *dev); static struct BusInfo pci_bus_info = { .name = PCI, .size = sizeof(PCIBus), .print_dev = pcibus_dev_print, +.get_dev_path = pcibus_get_dev_path, .props = (Property[]) { DEFINE_PROP_PCI_DEVFN(addr, PCIDevice, devfn, -1), DEFINE_PROP_STRING(romfile, PCIDevice, romfile), @@ -1853,6 +1855,18 @@ static void pcibus_dev_print(Monitor *mon, DeviceState *dev, int indent) } } +static char *pcibus_get_dev_path(DeviceState *dev) +{ +PCIDevice *d = (PCIDevice *)dev; +char path[16]; + +snprintf(path, sizeof(path), %04x:%02x:%02x.%x, + pci_find_domain(d-bus), d-config[PCI_SECONDARY_BUS], + PCI_SLOT(d-devfn), PCI_FUNC(d-devfn)); + +return strdup(path); +} + static PCIDeviceInfo bridge_info = { .qdev.name= pci-bridge, .qdev.size= sizeof(PCIBridge),
[Qemu-devel] [PATCH 12/23] qemu-option: New qemu_opts_reset()
From: Markus Armbruster arm...@redhat.com Signed-off-by: Markus Armbruster arm...@redhat.com Signed-off-by: Kevin Wolf kw...@redhat.com --- qemu-option.c |9 + qemu-option.h |1 + 2 files changed, 10 insertions(+), 0 deletions(-) diff --git a/qemu-option.c b/qemu-option.c index 7f70d0f..30327d4 100644 --- a/qemu-option.c +++ b/qemu-option.c @@ -719,6 +719,15 @@ QemuOpts *qemu_opts_create(QemuOptsList *list, const char *id, int fail_if_exist return opts; } +void qemu_opts_reset(QemuOptsList *list) +{ +QemuOpts *opts, *next_opts; + +QTAILQ_FOREACH_SAFE(opts, list-head, next, next_opts) { +qemu_opts_del(opts); +} +} + int qemu_opts_set(QemuOptsList *list, const char *id, const char *name, const char *value) { diff --git a/qemu-option.h b/qemu-option.h index 4823219..9e2406c 100644 --- a/qemu-option.h +++ b/qemu-option.h @@ -115,6 +115,7 @@ int qemu_opt_foreach(QemuOpts *opts, qemu_opt_loopfunc func, void *opaque, QemuOpts *qemu_opts_find(QemuOptsList *list, const char *id); QemuOpts *qemu_opts_create(QemuOptsList *list, const char *id, int fail_if_exists); +void qemu_opts_reset(QemuOptsList *list); int qemu_opts_set(QemuOptsList *list, const char *id, const char *name, const char *value); const char *qemu_opts_id(QemuOpts *opts); -- 1.6.6.1
[Qemu-devel] [PATCH v3 03/16] qdev: Add a get_dev_path() function to BusInfo
This function is meant to provide a stable device path for buses which are able to implement it. If a bus has a globally unique addresses scheme, one address level may be sufficient to provide a path. Other buses may need to recursively traverse up the qdev tree. Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/qdev.h |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/hw/qdev.h b/hw/qdev.h index be5ad67..d64619f 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -49,10 +49,13 @@ struct DeviceState { }; typedef void (*bus_dev_printfn)(Monitor *mon, DeviceState *dev, int indent); +typedef char *(*bus_get_dev_path)(DeviceState *dev); + struct BusInfo { const char *name; size_t size; bus_dev_printfn print_dev; +bus_get_dev_path get_dev_path; Property *props; };
[Qemu-devel] [PATCH v3 00/16] Make migration work with hotplug
v3: Still looking for comments... changes: - Rebase to latest upstream to pickup a few more callers - Fix a bug in patch introducing qemu_ram_free(). Since we now try to place blocks into existing holes in the address space, the end of the current block isn't necessarily the top ram_addr_t for realloc'ing the dirty bitmap. v2: Not too many comments, hope that's because everyone agrees ;) A couple minor changes. The 2nd patch is new and provides a bit of an optimization for large memory pc guets. The first two patches stand on their own even if we're undecided about the rest. Thanks, Alex changes: - Use pci_find_domain() for PCI domain, thanks Isaku - Convert pc to allocate all ram in one chunk, which avoids penalizing large memory VMs bouncing between ramblocks during migration. v1: Ok, new approach. I'm going to attempt to extract myself for the canonical device path approach, because we're missing too many pieces to make that work. Instead, I'll take Anthony's advice and try to simplify. We still want a unique name for ramblocks and savevm, but the hotplug problem today is only for PCI devices. PCI conveniently has globally unique, dare I say canonical, addressing in the form of domain:bus:device.func. To get to this, let's add a new function on the BusInfo structure called get_dev_path(). For a PCI device, we can simply traverse up the qdev tree to the BusInfo structure, look for the function, and call it to return a global PCI address. For some buses, these functions could chain up to their parent bus appending strings together to get a unique path. An example would be USB, where the USB port number may not be unique. If we traverse up to the PCI device providing USB, and then to the PCI bus, we get a globally unique PCI path, appended with a USB port number. To make this work for ramblocks and savevm, we need a DeviceState pointer when the they are create/registered, and we need a caller provided context in case there are multiple ramblocks/savevm associated with a device. Savevm already provies the context, and I've attempted to make reasonable guesses at these for the ramblocks. Note that most of the ramblocks aren't associated with a device, so I don't think it makes sense to link savevm and ramblocks together with the same absolute id string. Once we have savevm with unique id strings, rather than hotplug unfriendly instance numbers, we can be sure that the right driver instance is loading the correct vmstate. I've also implemented a compat field for this, so we can still accept incoming migrations from previous versions. Once we have ramblocks with a unique id string, we can switch to using id + offset for migration, which enables a ram_addr_t space that supports gaps, which enables us to implement qemu_ram_free(). With that, I think we can finally do migrations reliable after hotplug! Note that the target VM still needs to be created to match the current devices and bus addresses of the source VM. We can also still maintain compatibility for migrations here by bumping the ram migration version and supporting both new and old (just hope the source hasn't done any hotplugs). Sound reasonable? Is get_dev_path the right name? In the right place? The PCI return is currently :bb:dd.f, should this be PCI::bb:dd.f? Something else? Thanks, Alex --- Alex Williamson (16): ramblocks: No more being lazy about duplicate names pci: Free the space allocated for the option rom on removal qemu_ram_free: Implement it savevm: Create a new continue flag to avoid resending block name savevm: Use RAM blocks for basis of migration savevm: Migrate RAM based on name/offset ramblocks: Make use of DeviceState pointer and BusInfo.get_dev_path qemu_ram_alloc: Add DeviceState and name parameters virtio-net: Incorporate a DeviceState pointer and let savevm track instances eepro100: Add a dev field to eeprom new/free functions savevm: Make use of DeviceState savevm: Add DeviceState param pci: Implement BusInfo.get_dev_path() qdev: Add a get_dev_path() function to BusInfo pc: Allocate all ram in a single qemu_ram_alloc() Remove uses of ram.last_offset (aka last_ram_offset) arch_init.c | 183 +++-- audio/audio.c |2 block-migration.c |4 - cpu-all.h |5 + cpu-common.h |2 exec.c| 107 +--- hw/adb.c |4 - hw/ads7846.c |2 hw/an5206.c |4 - hw/arm_gic.c |2 hw/arm_timer.c|4 - hw/armv7m.c |9 +- hw/armv7m_nvic.c |2 hw/axis_dev88.c |4 - hw/cirrus_vga.c |2 hw/cuda.c |2
[Qemu-devel] [PATCH 13/23] blkdebug: Fix set_state_opts definition
The list head was initialized to point to the wrong list, so all actions ended up being handled as inject-error even if they were set-state in fact. Signed-off-by: Kevin Wolf kw...@redhat.com --- block/blkdebug.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/block/blkdebug.c b/block/blkdebug.c index 98fed94..4ec8ca6 100644 --- a/block/blkdebug.c +++ b/block/blkdebug.c @@ -111,7 +111,7 @@ static QemuOptsList inject_error_opts = { static QemuOptsList set_state_opts = { .name = set-state, -.head = QTAILQ_HEAD_INITIALIZER(inject_error_opts.head), +.head = QTAILQ_HEAD_INITIALIZER(set_state_opts.head), .desc = { { .name = event, -- 1.6.6.1
[Qemu-devel] [PATCH v3 13/16] savevm: Create a new continue flag to avoid resending block name
Allows us to compress the protocol a bit by setting a flag on the offset which indicates we're still working within the same block as last time. That way we can avoid sending the block name for every page. Suggested by Anthony Liguori. Signed-off-by: Alex Williamson alex.william...@redhat.com --- arch_init.c | 94 +++ 1 files changed, 50 insertions(+), 44 deletions(-) diff --git a/arch_init.c b/arch_init.c index 186645b..2f082f3 100644 --- a/arch_init.c +++ b/arch_init.c @@ -87,6 +87,7 @@ const uint32_t arch_type = QEMU_ARCH; #define RAM_SAVE_FLAG_MEM_SIZE 0x04 #define RAM_SAVE_FLAG_PAGE 0x08 #define RAM_SAVE_FLAG_EOS 0x10 +#define RAM_SAVE_FLAG_CONTINUE 0x20 static int is_dup_page(uint8_t *page, uint8_t ch) { @@ -120,6 +121,7 @@ static int ram_save_block(QEMUFile *f) do { if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) { uint8_t *p; +int cont = (block == last_block) ? RAM_SAVE_FLAG_CONTINUE : 0; cpu_physical_memory_reset_dirty(current_addr, current_addr + TARGET_PAGE_SIZE, @@ -128,17 +130,21 @@ static int ram_save_block(QEMUFile *f) p = block-host + offset; if (is_dup_page(p, *p)) { -qemu_put_be64(f, offset | RAM_SAVE_FLAG_COMPRESS); -qemu_put_byte(f, strlen(block-idstr)); -qemu_put_buffer(f, (uint8_t *)block-idstr, -strlen(block-idstr)); +qemu_put_be64(f, offset | cont | RAM_SAVE_FLAG_COMPRESS); +if (!cont) { +qemu_put_byte(f, strlen(block-idstr)); +qemu_put_buffer(f, (uint8_t *)block-idstr, +strlen(block-idstr)); +} qemu_put_byte(f, *p); bytes_sent = 1; } else { -qemu_put_be64(f, offset | RAM_SAVE_FLAG_PAGE); -qemu_put_byte(f, strlen(block-idstr)); -qemu_put_buffer(f, (uint8_t *)block-idstr, -strlen(block-idstr)); +qemu_put_be64(f, offset | cont | RAM_SAVE_FLAG_PAGE); +if (!cont) { +qemu_put_byte(f, strlen(block-idstr)); +qemu_put_buffer(f, (uint8_t *)block-idstr, +strlen(block-idstr)); +} qemu_put_buffer(f, p, TARGET_PAGE_SIZE); bytes_sent = TARGET_PAGE_SIZE; } @@ -289,6 +295,36 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) return (stage == 2) (expected_time = migrate_max_downtime()); } +static inline void *host_from_stream_offset(QEMUFile *f, +ram_addr_t offset, +int flags) +{ +static RAMBlock *block = NULL; +char id[256]; +uint8_t len; + +if (flags RAM_SAVE_FLAG_CONTINUE) { +if (!block) { +fprintf(stderr, Ack, bad migration stream!\n); +return NULL; +} + +return block-host + offset; +} + +len = qemu_get_byte(f); +qemu_get_buffer(f, (uint8_t *)id, len); +id[len] = 0; + +QLIST_FOREACH(block, ram_list.blocks, next) { +if (!strncmp(id, block-idstr, sizeof(id))) +return block-host + offset; +} + +fprintf(stderr, Can't find block %s!\n, id); +return NULL; +} + int ram_load(QEMUFile *f, void *opaque, int version_id) { ram_addr_t addr; @@ -346,26 +382,11 @@ int ram_load(QEMUFile *f, void *opaque, int version_id) void *host; uint8_t ch; -if (version_id == 3) { +if (version_id == 3) host = qemu_get_ram_ptr(addr); -} else { -RAMBlock *block; -char id[256]; -uint8_t len; - -len = qemu_get_byte(f); -qemu_get_buffer(f, (uint8_t *)id, len); -id[len] = 0; +else +host = host_from_stream_offset(f, addr, flags); -QLIST_FOREACH(block, ram_list.blocks, next) { -if (!strncmp(id, block-idstr, sizeof(id))) -break; -} -if (!block) -return -EINVAL; - -host = block-host + addr; -} ch = qemu_get_byte(f); memset(host, ch, TARGET_PAGE_SIZE); #ifndef _WIN32 @@ -377,26 +398,11 @@ int ram_load(QEMUFile *f, void *opaque, int version_id) } else if (flags RAM_SAVE_FLAG_PAGE) { void *host; -if (version_id == 3) { +if (version_id == 3) host = qemu_get_ram_ptr(addr); -} else { -RAMBlock *block; -char
[Qemu-devel] [PATCH v3 09/16] qemu_ram_alloc: Add DeviceState and name parameters
These will be used to generate unique id strings for ramblocks. The name field is required, the device pointer is optional as most callers don't have a device. When there's no device or the device isn't a child of a bus implementing BusInfo.get_dev_path, the name should be unique for the platform. Signed-off-by: Alex Williamson alex.william...@redhat.com --- cpu-common.h |2 +- exec.c|2 +- hw/an5206.c |4 ++-- hw/armv7m.c |9 ++--- hw/axis_dev88.c |4 ++-- hw/dummy_m68k.c |2 +- hw/etraxfs.c |6 +++--- hw/g364fb.c |2 +- hw/gumstix.c |6 -- hw/integratorcp.c |4 ++-- hw/mainstone.c|6 -- hw/mcf5208.c |4 ++-- hw/mips_fulong2e.c|4 ++-- hw/mips_jazz.c|4 ++-- hw/mips_malta.c |4 ++-- hw/mips_mipssim.c |4 ++-- hw/mips_r4k.c |6 +++--- hw/musicpal.c | 11 +++ hw/omap1.c|6 -- hw/omap2.c|6 -- hw/omap_sx1.c | 12 hw/onenand.c |2 +- hw/palm.c |3 ++- hw/pc.c |7 --- hw/pci.c |7 ++- hw/petalogix_s3adsp1800_mmu.c |7 --- hw/ppc405_boards.c| 18 +- hw/ppc405_uc.c|2 +- hw/ppc4xx_devs.c |4 +++- hw/ppc_newworld.c |6 +++--- hw/ppc_oldworld.c |6 +++--- hw/ppc_prep.c |4 ++-- hw/ppce500_mpc8544ds.c|3 ++- hw/pxa2xx.c | 12 hw/r2d.c |4 ++-- hw/realview.c |6 +++--- hw/s390-virtio.c |2 +- hw/sm501.c|2 +- hw/spitz.c|2 +- hw/sun4m.c|8 hw/sun4u.c|4 ++-- hw/syborg.c |2 +- hw/tc6393xb.c |2 +- hw/tcx.c |2 +- hw/tosa.c |2 +- hw/versatilepb.c |2 +- hw/vga.c |2 +- hw/vmware_vga.c |2 +- 48 files changed, 132 insertions(+), 99 deletions(-) diff --git a/cpu-common.h b/cpu-common.h index b24cecc..71e7933 100644 --- a/cpu-common.h +++ b/cpu-common.h @@ -40,7 +40,7 @@ static inline void cpu_register_physical_memory(target_phys_addr_t start_addr, } ram_addr_t cpu_get_physical_page_desc(target_phys_addr_t addr); -ram_addr_t qemu_ram_alloc(ram_addr_t); +ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size); void qemu_ram_free(ram_addr_t addr); /* This should only be used for ram local to a device. */ void *qemu_get_ram_ptr(ram_addr_t addr); diff --git a/exec.c b/exec.c index c8be508..164ba16 100644 --- a/exec.c +++ b/exec.c @@ -2778,7 +2778,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size) return last; } -ram_addr_t qemu_ram_alloc(ram_addr_t size) +ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size) { RAMBlock *new_block; diff --git a/hw/an5206.c b/hw/an5206.c index f584d88..b9f19a9 100644 --- a/hw/an5206.c +++ b/hw/an5206.c @@ -54,11 +54,11 @@ static void an5206_init(ram_addr_t ram_size, /* DRAM at address zero */ cpu_register_physical_memory(0, ram_size, -qemu_ram_alloc(ram_size) | IO_MEM_RAM); +qemu_ram_alloc(NULL, an5206.ram, ram_size) | IO_MEM_RAM); /* Internal SRAM. */ cpu_register_physical_memory(AN5206_RAMBAR_ADDR, 512, -qemu_ram_alloc(512) | IO_MEM_RAM); +qemu_ram_alloc(NULL, an5206.sram, 512) | IO_MEM_RAM); mcf5206_init(AN5206_MBAR_ADDR, env); diff --git a/hw/armv7m.c b/hw/armv7m.c index 854261d..588ec98 100644 --- a/hw/armv7m.c +++ b/hw/armv7m.c @@ -200,9 +200,11 @@ qemu_irq *armv7m_init(int flash_size, int sram_size, /* Flash programming is done via the SCU, so pretend it is ROM. */ cpu_register_physical_memory(0, flash_size, - qemu_ram_alloc(flash_size) | IO_MEM_ROM); + qemu_ram_alloc(NULL, armv7m.flash, +flash_size) | IO_MEM_ROM); cpu_register_physical_memory(0x2000, sram_size, - qemu_ram_alloc(sram_size) | IO_MEM_RAM); + qemu_ram_alloc(NULL, armv7m.sram, +sram_size) | IO_MEM_RAM); armv7m_bitband_init(); nvic = qdev_create(NULL, armv7m_nvic); @@ -236,7 +238,8 @@ qemu_irq *armv7m_init(int flash_size, int sram_size, space. This stops qemu complaining
[Qemu-devel] [PATCH v3 02/16] pc: Allocate all ram in a single qemu_ram_alloc()
This will benefit us when we migrate based on ramblock name since we won't be bouncing between separate blocks. Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pc.c | 22 +- 1 files changed, 9 insertions(+), 13 deletions(-) diff --git a/hw/pc.c b/hw/pc.c index 25ebafa..de60686 100644 --- a/hw/pc.c +++ b/hw/pc.c @@ -877,27 +877,23 @@ void pc_memory_init(ram_addr_t ram_size, *above_4g_mem_size_p = above_4g_mem_size; *below_4g_mem_size_p = below_4g_mem_size; +#if TARGET_PHYS_ADDR_BITS == 32 +if (above_4g_mem_size 0) { +hw_error(To much RAM for 32-bit physical address); +} +#endif linux_boot = (kernel_filename != NULL); /* allocate RAM */ -ram_addr = qemu_ram_alloc(below_4g_mem_size); +ram_addr = qemu_ram_alloc(below_4g_mem_size + above_4g_mem_size); cpu_register_physical_memory(0, 0xa, ram_addr); cpu_register_physical_memory(0x10, below_4g_mem_size - 0x10, ram_addr + 0x10); - -/* above 4giga memory allocation */ -if (above_4g_mem_size 0) { -#if TARGET_PHYS_ADDR_BITS == 32 -hw_error(To much RAM for 32-bit physical address); -#else -ram_addr = qemu_ram_alloc(above_4g_mem_size); -cpu_register_physical_memory(0x1ULL, - above_4g_mem_size, - ram_addr); +#if TARGET_PHYS_ADDR_BITS 32 +cpu_register_physical_memory(0x1ULL, above_4g_mem_size, + ram_addr + below_4g_mem_size); #endif -} - /* BIOS load */ if (bios_name == NULL)
[Qemu-devel] [PATCH v3 06/16] savevm: Make use of DeviceState
For callers that pass a device we can traverse up the qdev tree and make use of the BusInfo.get_dev_path information for creating unique savevm id strings. This avoids needing to rely on the instance number, which can cause problems with device initialization order and hotplug. For compatibility, we also store away the old id string and instance so we can accept migrations from VMs as we add new get_dev_path implementations. Signed-off-by: Alex Williamson alex.william...@redhat.com --- savevm.c | 84 ++ 1 files changed, 79 insertions(+), 5 deletions(-) diff --git a/savevm.c b/savevm.c index 0052406..e4f50b1 100644 --- a/savevm.c +++ b/savevm.c @@ -72,6 +72,7 @@ #include qemu-common.h #include hw/hw.h +#include hw/qdev.h #include net.h #include monitor.h #include sysemu.h @@ -988,6 +989,11 @@ const VMStateInfo vmstate_info_unused_buffer = { .put = put_unused_buffer, }; +typedef struct CompatEntry { +char idstr[256]; +int instance_id; +} CompatEntry; + typedef struct SaveStateEntry { QTAILQ_ENTRY(SaveStateEntry) entry; char idstr[256]; @@ -1001,6 +1007,7 @@ typedef struct SaveStateEntry { LoadStateHandler *load_state; const VMStateDescription *vmsd; void *opaque; +CompatEntry *compat; } SaveStateEntry; @@ -1022,6 +1029,23 @@ static int calculate_new_instance_id(const char *idstr) return instance_id; } +static int calculate_compat_instance_id(const char *idstr) +{ +SaveStateEntry *se; +int instance_id = 0; + +QTAILQ_FOREACH(se, savevm_handlers, entry) { +if (!se-compat) +continue; + +if (strcmp(idstr, se-compat-idstr) == 0 + instance_id = se-compat-instance_id) { +instance_id = se-compat-instance_id + 1; +} +} +return instance_id; +} + /* TODO: Individual devices generally have very little idea about the rest of the system, so instance_id should be removed/replaced. Meanwhile pass -1 as instance_id if you do not already have a clearly @@ -1039,7 +1063,6 @@ int register_savevm_live(DeviceState *dev, SaveStateEntry *se; se = qemu_mallocz(sizeof(SaveStateEntry)); -pstrcpy(se-idstr, sizeof(se-idstr), idstr); se-version_id = version_id; se-section_id = global_section_id++; se-set_params = set_params; @@ -1049,11 +1072,28 @@ int register_savevm_live(DeviceState *dev, se-opaque = opaque; se-vmsd = NULL; +if (dev dev-parent_bus dev-parent_bus-info-get_dev_path) { +char *id = dev-parent_bus-info-get_dev_path(dev); +if (id) { +pstrcpy(se-idstr, sizeof(se-idstr), id); +pstrcat(se-idstr, sizeof(se-idstr), /); +qemu_free(id); + +se-compat = qemu_mallocz(sizeof(CompatEntry)); +pstrcpy(se-compat-idstr, sizeof(se-compat-idstr), idstr); +se-compat-instance_id = instance_id == -1 ? + calculate_compat_instance_id(idstr) : instance_id; +instance_id = -1; +} +} +pstrcat(se-idstr, sizeof(se-idstr), idstr); + if (instance_id == -1) { -se-instance_id = calculate_new_instance_id(idstr); +se-instance_id = calculate_new_instance_id(se-idstr); } else { se-instance_id = instance_id; } +assert(!se-compat || se-instance_id == 0); /* add at the end of list */ QTAILQ_INSERT_TAIL(savevm_handlers, se, entry); return 0; @@ -1074,9 +1114,20 @@ int register_savevm(DeviceState *dev, void unregister_savevm(DeviceState *dev, const char *idstr, void *opaque) { SaveStateEntry *se, *new_se; +char id[256] = ; + +if (dev dev-parent_bus dev-parent_bus-info-get_dev_path) { +char *path = dev-parent_bus-info-get_dev_path(dev); +if (path) { +pstrcpy(id, sizeof(id), path); +pstrcat(id, sizeof(id), /); +qemu_free(path); +} +} +pstrcat(id, sizeof(id), idstr); QTAILQ_FOREACH_SAFE(se, savevm_handlers, entry, new_se) { -if (strcmp(se-idstr, idstr) == 0 se-opaque == opaque) { +if (strcmp(se-idstr, id) == 0 se-opaque == opaque) { QTAILQ_REMOVE(savevm_handlers, se, entry); qemu_free(se); } @@ -1094,7 +1145,6 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id, assert(alias_id == -1 || required_for_version = vmsd-minimum_version_id); se = qemu_mallocz(sizeof(SaveStateEntry)); -pstrcpy(se-idstr, sizeof(se-idstr), vmsd-name); se-version_id = vmsd-version_id; se-section_id = global_section_id++; se-save_live_state = NULL; @@ -1104,11 +1154,28 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id, se-vmsd = vmsd; se-alias_id = alias_id; +if (dev dev-parent_bus dev-parent_bus-info-get_dev_path) { +char *id = dev-parent_bus-info-get_dev_path(dev); +if (id) { +
[Qemu-devel] [PATCH 1/2] qemu-img check: Distinguish different kinds of errors
People think that their images are corrupted when in fact there are just some leaked clusters. Differentiating several error cases should make the messages more comprehensible. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c| 10 ++-- block.h| 10 - qemu-img.c | 62 +-- 3 files changed, 63 insertions(+), 19 deletions(-) diff --git a/block.c b/block.c index dd6dd76..b0ceef0 100644 --- a/block.c +++ b/block.c @@ -710,15 +710,19 @@ DeviceState *bdrv_get_attached(BlockDriverState *bs) /* * Run consistency checks on an image * - * Returns the number of errors or -errno when an internal error occurs + * Returns 0 if the check could be completed (it doesn't mean that the image is + * free of errors) or -errno when an internal error occured. The results of the + * check are stored in res. */ -int bdrv_check(BlockDriverState *bs) +int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res) { if (bs-drv-bdrv_check == NULL) { return -ENOTSUP; } -return bs-drv-bdrv_check(bs); +memset(res, 0, sizeof(*res)); +res-corruptions = bs-drv-bdrv_check(bs); +return res-corruptions 0 ? res-corruptions : 0; } /* commit COW file into the raw image */ diff --git a/block.h b/block.h index 3d03b3e..c2a7e4c 100644 --- a/block.h +++ b/block.h @@ -74,7 +74,6 @@ void bdrv_close(BlockDriverState *bs); int bdrv_attach(BlockDriverState *bs, DeviceState *qdev); void bdrv_detach(BlockDriverState *bs, DeviceState *qdev); DeviceState *bdrv_get_attached(BlockDriverState *bs); -int bdrv_check(BlockDriverState *bs); int bdrv_read(BlockDriverState *bs, int64_t sector_num, uint8_t *buf, int nb_sectors); int bdrv_write(BlockDriverState *bs, int64_t sector_num, @@ -97,6 +96,15 @@ int bdrv_change_backing_file(BlockDriverState *bs, const char *backing_file, const char *backing_fmt); void bdrv_register(BlockDriver *bdrv); + +typedef struct BdrvCheckResult { +int corruptions; +int leaks; +int check_errors; +} BdrvCheckResult; + +int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res); + /* async block I/O */ typedef struct BlockDriverAIOCB BlockDriverAIOCB; typedef void BlockDriverCompletionFunc(void *opaque, int ret); diff --git a/qemu-img.c b/qemu-img.c index 700af21..1782ac9 100644 --- a/qemu-img.c +++ b/qemu-img.c @@ -425,11 +425,20 @@ out: return 0; } +/* + * Checks an image for consistency. Exit codes: + * + * 0 - Check completed, image is good + * 1 - Check not completed because of internal errors + * 2 - Check completed, image is corrupted + * 3 - Check completed, image has leaked clusters, but is good otherwise + */ static int img_check(int argc, char **argv) { int c, ret; const char *filename, *fmt; BlockDriverState *bs; +BdrvCheckResult result; fmt = NULL; for(;;) { @@ -453,28 +462,51 @@ static int img_check(int argc, char **argv) if (!bs) { return 1; } -ret = bdrv_check(bs); -switch(ret) { -case 0: -printf(No errors were found on the image.\n); -break; -case -ENOTSUP: +ret = bdrv_check(bs, result); + +if (ret == -ENOTSUP) { error(This image format does not support checks); -break; -default: -if (ret 0) { -error(An error occurred during the check); -} else { -printf(%d errors were found on the image.\n, ret); +return 1; +} + +if (!(result.corruptions || result.leaks || result.check_errors)) { +printf(No errors were found on the image.\n); +} else { +if (result.corruptions) { +printf(\n%d errors were found on the image.\n +Data may be corrupted, or further writes to the image +may corrupt it.\n, +result.corruptions); +} + +if (result.leaks) { +printf(\n%d leaked clusters were found on the image.\n +This means waste of disk space, but no harm to data.\n, +result.leaks); +} + +if (result.check_errors) { +printf(\n%d internal errors have occurred during the check.\n, +result.check_errors); } -break; } bdrv_delete(bs); -if (ret) { + +if (ret 0 || result.check_errors) { +printf(\nAn error has occurred during the check: %s\n +The check is not complete and may have missed error.\n, +strerror(-ret)); return 1; } -return 0; + +if (result.corruptions) { +return 2; +} else if (result.leaks) { +return 3; +} else { +return 0; +} } static int img_commit(int argc, char **argv) -- 1.6.6.1
[Qemu-devel] [PATCH v3 11/16] savevm: Migrate RAM based on name/offset
Synchronize RAM blocks with the target and migrate using name/offset pairs. This ensures both source and target have the same view of RAM and that we get the right bits into the right slot. Signed-off-by: Alex Williamson alex.william...@redhat.com --- arch_init.c | 118 ++- vl.c|2 + 2 files changed, 108 insertions(+), 12 deletions(-) diff --git a/arch_init.c b/arch_init.c index 109dcef..37aad9d 100644 --- a/arch_init.c +++ b/arch_init.c @@ -113,20 +113,33 @@ static int ram_save_block(QEMUFile *f) while (addr total_ram) { if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) { +RAMBlock *block; +ram_addr_t offset; uint8_t *p; cpu_physical_memory_reset_dirty(current_addr, current_addr + TARGET_PAGE_SIZE, MIGRATION_DIRTY_FLAG); -p = qemu_get_ram_ptr(current_addr); +QLIST_FOREACH(block, ram_list.blocks, next) { +if (current_addr - block-offset block-length) +break; +} +offset = current_addr - block-offset; +p = block-host + offset; if (is_dup_page(p, *p)) { -qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_COMPRESS); +qemu_put_be64(f, offset | RAM_SAVE_FLAG_COMPRESS); +qemu_put_byte(f, strlen(block-idstr)); +qemu_put_buffer(f, (uint8_t *)block-idstr, +strlen(block-idstr)); qemu_put_byte(f, *p); bytes_sent = 1; } else { -qemu_put_be64(f, current_addr | RAM_SAVE_FLAG_PAGE); +qemu_put_be64(f, offset | RAM_SAVE_FLAG_PAGE); +qemu_put_byte(f, strlen(block-idstr)); +qemu_put_buffer(f, (uint8_t *)block-idstr, +strlen(block-idstr)); qemu_put_buffer(f, p, TARGET_PAGE_SIZE); bytes_sent = TARGET_PAGE_SIZE; } @@ -196,6 +209,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) } if (stage == 1) { +RAMBlock *block; uint64_t total_ram = ram_bytes_total(); bytes_transferred = 0; @@ -210,6 +224,12 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) cpu_physical_memory_set_dirty_tracking(1); qemu_put_be64(f, total_ram | RAM_SAVE_FLAG_MEM_SIZE); + +QLIST_FOREACH(block, ram_list.blocks, next) { +qemu_put_byte(f, strlen(block-idstr)); +qemu_put_buffer(f, (uint8_t *)block-idstr, strlen(block-idstr)); +qemu_put_be64(f, block-length); +} } bytes_transferred_last = bytes_transferred; @@ -257,7 +277,7 @@ int ram_load(QEMUFile *f, void *opaque, int version_id) ram_addr_t addr; int flags; -if (version_id != 3) { +if (version_id 3 || version_id 4) { return -EINVAL; } @@ -268,23 +288,99 @@ int ram_load(QEMUFile *f, void *opaque, int version_id) addr = TARGET_PAGE_MASK; if (flags RAM_SAVE_FLAG_MEM_SIZE) { -if (addr != ram_bytes_total()) { -return -EINVAL; +if (version_id == 3) { +if (addr != ram_bytes_total()) { +return -EINVAL; +} +} else { +/* Synchronize RAM block list */ +char id[256]; +ram_addr_t length; +ram_addr_t total_ram_bytes = addr; + +while (total_ram_bytes) { +RAMBlock *block; +uint8_t len; + +len = qemu_get_byte(f); +qemu_get_buffer(f, (uint8_t *)id, len); +id[len] = 0; +length = qemu_get_be64(f); + +QLIST_FOREACH(block, ram_list.blocks, next) { +if (!strncmp(id, block-idstr, sizeof(id))) { +if (block-length != length) +return -EINVAL; +break; +} +} + +if (!block) { +if (!qemu_ram_alloc(NULL, id, length)) +return -ENOMEM; +} + +total_ram_bytes -= length; +} } } if (flags RAM_SAVE_FLAG_COMPRESS) { -uint8_t ch = qemu_get_byte(f); -memset(qemu_get_ram_ptr(addr), ch, TARGET_PAGE_SIZE); +void *host; +uint8_t ch; + +if (version_id == 3) { +host = qemu_get_ram_ptr(addr); +} else { +RAMBlock *block; +char id[256]; +
[Qemu-devel] [PATCH v3 07/16] eepro100: Add a dev field to eeprom new/free functions
This allows us to create a more meaningful savevm string. Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/eepro100.c |4 ++-- hw/eeprom93xx.c |8 hw/eeprom93xx.h |4 ++-- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/hw/eepro100.c b/hw/eepro100.c index 0ddca8b..2b75c8f 100644 --- a/hw/eepro100.c +++ b/hw/eepro100.c @@ -1835,7 +1835,7 @@ static int pci_nic_uninit(PCIDevice *pci_dev) cpu_unregister_io_memory(s-mmio_index); vmstate_unregister(pci_dev-qdev, s-vmstate, s); -eeprom93xx_free(s-eeprom); +eeprom93xx_free(pci_dev-qdev, s-eeprom); qemu_del_vlan_client(s-nic-nc); return 0; } @@ -1862,7 +1862,7 @@ static int e100_nic_init(PCIDevice *pci_dev) /* Add 64 * 2 EEPROM. i82557 and i82558 support a 64 word EEPROM, * i82559 and later support 64 or 256 word EEPROM. */ -s-eeprom = eeprom93xx_new(EEPROM_SIZE); +s-eeprom = eeprom93xx_new(pci_dev-qdev, EEPROM_SIZE); /* Handler for memory-mapped I/O */ s-mmio_index = diff --git a/hw/eeprom93xx.c b/hw/eeprom93xx.c index 6ba546f..660b28f 100644 --- a/hw/eeprom93xx.c +++ b/hw/eeprom93xx.c @@ -289,7 +289,7 @@ void eeprom93xx_reset(eeprom_t *eeprom) } #endif -eeprom_t *eeprom93xx_new(uint16_t nwords) +eeprom_t *eeprom93xx_new(DeviceState *dev, uint16_t nwords) { /* Add a new EEPROM (with 16, 64 or 256 words). */ eeprom_t *eeprom; @@ -316,15 +316,15 @@ eeprom_t *eeprom93xx_new(uint16_t nwords) /* Output DO is tristate, read results in 1. */ eeprom-eedo = 1; logout(eeprom = 0x%p, nwords = %u\n, eeprom, nwords); -vmstate_register(NULL, 0, vmstate_eeprom, eeprom); +vmstate_register(dev, 0, vmstate_eeprom, eeprom); return eeprom; } -void eeprom93xx_free(eeprom_t *eeprom) +void eeprom93xx_free(DeviceState *dev, eeprom_t *eeprom) { /* Destroy EEPROM. */ logout(eeprom = 0x%p\n, eeprom); -vmstate_unregister(NULL, vmstate_eeprom, eeprom); +vmstate_unregister(dev, vmstate_eeprom, eeprom); qemu_free(eeprom); } diff --git a/hw/eeprom93xx.h b/hw/eeprom93xx.h index 47282d3..8ba0e28 100644 --- a/hw/eeprom93xx.h +++ b/hw/eeprom93xx.h @@ -23,10 +23,10 @@ typedef struct _eeprom_t eeprom_t; /* Create a new EEPROM with (nwords * 2) bytes. */ -eeprom_t *eeprom93xx_new(uint16_t nwords); +eeprom_t *eeprom93xx_new(DeviceState *dev, uint16_t nwords); /* Destroy an existing EEPROM. */ -void eeprom93xx_free(eeprom_t *eeprom); +void eeprom93xx_free(DeviceState *dev, eeprom_t *eeprom); /* Read from the EEPROM. */ uint16_t eeprom93xx_read(eeprom_t *eeprom);
[Qemu-devel] [PATCH v3 15/16] pci: Free the space allocated for the option rom on removal
Signed-off-by: Alex Williamson alex.william...@redhat.com --- hw/pci.c | 11 +++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/hw/pci.c b/hw/pci.c index fe7c5c3..a7ff566 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -76,6 +76,7 @@ static struct BusInfo pci_bus_info = { static void pci_update_mappings(PCIDevice *d); static void pci_set_irq(void *opaque, int irq_num, int level); static int pci_add_option_rom(PCIDevice *pdev); +static void pci_del_option_rom(PCIDevice *pdev); static uint16_t pci_default_sub_vendor_id = PCI_SUBVENDOR_ID_REDHAT_QUMRANET; static uint16_t pci_default_sub_device_id = PCI_SUBDEVICE_ID_QEMU; @@ -709,6 +710,7 @@ static int pci_unregister_device(DeviceState *dev) return ret; pci_unregister_io_regions(pci_dev); +pci_del_option_rom(pci_dev); do_pci_unregister_device(pci_dev); return 0; } @@ -1765,6 +1767,15 @@ static int pci_add_option_rom(PCIDevice *pdev) return 0; } +static void pci_del_option_rom(PCIDevice *pdev) +{ +if (!pdev-rom_offset) +return; + +qemu_ram_free(pdev-rom_offset); +pdev-rom_offset = 0; +} + /* Reserve space and add capability to the linked list in pci config space */ int pci_add_capability_at_offset(PCIDevice *pdev, uint8_t cap_id, uint8_t offset, uint8_t size)
Re: [Qemu-devel] Re: Status update
On Fri, Jul 02, 2010 at 06:41:55PM +0900, Isaku Yamahata wrote: On Fri, Jul 02, 2010 at 09:03:39AM +0100, Stefan Hajnoczi wrote: On Thu, Jul 1, 2010 at 8:30 PM, Eduard - Gabriel Munteanu eduard.munte...@linux360.ro wrote: But suddenly the guest OS changes mappings and expects the IOMMU to enforce them as soon as invalidation commands are completed. The guest then reclaims the old space for other uses. This leaves an opportunity for those processes to corrupt or read sensitive data. In such a case, OS should put device into quiescence by reset like pci bus reset or pcie function level reset. pci bus reset patch hasn't been merged yet, though. It needs clean up/generalization. -- yamahata I wouldn't count on that. When the IOMMU notifies software of command completion, then that notification should be correct. So if we count on 'pci bus reset' we either don't execute INVALIDATE_* and COMPLETION_WAIT commands, or we issue bogus notifications (e.g. they'd be nops). That goes against the specs, and I'm not sure there's any good reason a non-KVM/QEMU-aware OS would reset the device in _all_ cases. For some background on this, mappings updates are followed by INVALIDATE_* commands and then a COMPLETION_WAIT (to wait for invalidation to finish). Eduard
[Qemu-devel] [PATCH v3 12/16] savevm: Use RAM blocks for basis of migration
We don't want to assume a contiguous address space, so migrate based on RAM blocks instead of a fixed linear address map. This will allow us to have holes in the ram_addr_t namespace, so we can implement qemu_ram_free(). Signed-off-by: Alex Williamson alex.william...@redhat.com --- arch_init.c | 67 +-- 1 files changed, 42 insertions(+), 25 deletions(-) diff --git a/arch_init.c b/arch_init.c index 37aad9d..186645b 100644 --- a/arch_init.c +++ b/arch_init.c @@ -105,27 +105,26 @@ static int is_dup_page(uint8_t *page, uint8_t ch) static int ram_save_block(QEMUFile *f) { -static ram_addr_t current_addr = 0; -ram_addr_t saved_addr = current_addr; -ram_addr_t addr = 0; -uint64_t total_ram = ram_bytes_total(); +static RAMBlock *last_block = NULL; +static ram_addr_t last_offset = 0; +RAMBlock *block = last_block; +ram_addr_t offset = last_offset; +ram_addr_t current_addr; int bytes_sent = 0; -while (addr total_ram) { +if (!block) +block = QLIST_FIRST(ram_list.blocks); + +current_addr = block-offset + offset; + +do { if (cpu_physical_memory_get_dirty(current_addr, MIGRATION_DIRTY_FLAG)) { -RAMBlock *block; -ram_addr_t offset; uint8_t *p; cpu_physical_memory_reset_dirty(current_addr, current_addr + TARGET_PAGE_SIZE, MIGRATION_DIRTY_FLAG); -QLIST_FOREACH(block, ram_list.blocks, next) { -if (current_addr - block-offset block-length) -break; -} -offset = current_addr - block-offset; p = block-host + offset; if (is_dup_page(p, *p)) { @@ -146,9 +145,21 @@ static int ram_save_block(QEMUFile *f) break; } -addr += TARGET_PAGE_SIZE; -current_addr = (saved_addr + addr) % total_ram; -} + +offset += TARGET_PAGE_SIZE; +if (offset = block-length) { +offset = 0; +block = QLIST_NEXT(block, next); +if (!block) +block = QLIST_FIRST(ram_list.blocks); +} + +current_addr = block-offset + offset; + +} while (current_addr != last_block-offset + last_offset); + +last_block = block; +last_offset = offset; return bytes_sent; } @@ -157,13 +168,16 @@ static uint64_t bytes_transferred; static ram_addr_t ram_save_remaining(void) { -ram_addr_t addr; +RAMBlock *block; ram_addr_t count = 0; -uint64_t total_ram = ram_bytes_total(); -for (addr = 0; addr total_ram; addr += TARGET_PAGE_SIZE) { -if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) { -count++; +QLIST_FOREACH(block, ram_list.blocks, next) { +ram_addr_t addr; +for (addr = block-offset; addr block-offset + block-length; + addr += TARGET_PAGE_SIZE) { +if (cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) { +count++; +} } } @@ -210,20 +224,23 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque) if (stage == 1) { RAMBlock *block; -uint64_t total_ram = ram_bytes_total(); bytes_transferred = 0; /* Make sure all dirty bits are set */ -for (addr = 0; addr total_ram; addr += TARGET_PAGE_SIZE) { -if (!cpu_physical_memory_get_dirty(addr, MIGRATION_DIRTY_FLAG)) { -cpu_physical_memory_set_dirty(addr); +QLIST_FOREACH(block, ram_list.blocks, next) { +for (addr = block-offset; addr block-offset + block-length; + addr += TARGET_PAGE_SIZE) { +if (!cpu_physical_memory_get_dirty(addr, + MIGRATION_DIRTY_FLAG)) { +cpu_physical_memory_set_dirty(addr); +} } } /* Enable dirty memory tracking */ cpu_physical_memory_set_dirty_tracking(1); -qemu_put_be64(f, total_ram | RAM_SAVE_FLAG_MEM_SIZE); +qemu_put_be64(f, ram_bytes_total() | RAM_SAVE_FLAG_MEM_SIZE); QLIST_FOREACH(block, ram_list.blocks, next) { qemu_put_byte(f, strlen(block-idstr));
[Qemu-devel] [PATCH v3 14/16] qemu_ram_free: Implement it
Now that we can support a ram_addr_t space with holes, we can implement qemu_ram_free(). Signed-off-by: Alex Williamson alex.william...@redhat.com --- cpu-all.h |3 +++ exec.c| 62 + 2 files changed, 61 insertions(+), 4 deletions(-) diff --git a/cpu-all.h b/cpu-all.h index 5d8342b..224ca40 100644 --- a/cpu-all.h +++ b/cpu-all.h @@ -867,6 +867,9 @@ typedef struct RAMBlock { ram_addr_t length; char idstr[256]; QLIST_ENTRY(RAMBlock) next; +#if defined(__linux__) !defined(TARGET_S390X) +int fd; +#endif } RAMBlock; typedef struct RAMList { diff --git a/exec.c b/exec.c index fd47d5b..b9a1471 100644 --- a/exec.c +++ b/exec.c @@ -2701,7 +2701,9 @@ static long gethugepagesize(const char *path) return fs.f_bsize; } -static void *file_ram_alloc(ram_addr_t memory, const char *path) +static void *file_ram_alloc(RAMBlock *block, +ram_addr_t memory, +const char *path) { char *filename; void *area; @@ -2764,12 +2766,39 @@ static void *file_ram_alloc(ram_addr_t memory, const char *path) close(fd); return (NULL); } +block-fd = fd; return area; } #endif static ram_addr_t find_ram_offset(ram_addr_t size) { +RAMBlock *block, *next_block; +ram_addr_t offset, mingap = ULONG_MAX; + +if (QLIST_EMPTY(ram_list.blocks)) +return 0; + +QLIST_FOREACH(block, ram_list.blocks, next) { +ram_addr_t end, next = ULONG_MAX; + +end = block-offset + block-length; + +QLIST_FOREACH(next_block, ram_list.blocks, next) { +if (next_block-offset = end) { +next = MIN(next, next_block-offset); +} +} +if (next - end = size next - end mingap) { +offset = end; +mingap = next - end; +} +} +return offset; +} + +static ram_addr_t last_ram_offset(void) +{ RAMBlock *block; ram_addr_t last = 0; @@ -2812,7 +2841,7 @@ ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size) if (mem_path) { #if defined (__linux__) !defined(TARGET_S390X) -new_block-host = file_ram_alloc(size, mem_path); +new_block-host = file_ram_alloc(new_block, size, mem_path); if (!new_block-host) { new_block-host = qemu_vmalloc(size); #ifdef MADV_MERGEABLE @@ -2842,7 +2871,7 @@ ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size) QLIST_INSERT_HEAD(ram_list.blocks, new_block, next); ram_list.phys_dirty = qemu_realloc(ram_list.phys_dirty, -(new_block-offset + size) TARGET_PAGE_BITS); + last_ram_offset() TARGET_PAGE_BITS); memset(ram_list.phys_dirty + (new_block-offset TARGET_PAGE_BITS), 0xff, size TARGET_PAGE_BITS); @@ -2854,7 +2883,32 @@ ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size) void qemu_ram_free(ram_addr_t addr) { -/* TODO: implement this. */ +RAMBlock *block; + +QLIST_FOREACH(block, ram_list.blocks, next) { +if (addr == block-offset) { +QLIST_REMOVE(block, next); +if (mem_path) { +#if defined (__linux__) !defined(TARGET_S390X) +if (block-fd) { +munmap(block-host, block-length); +close(block-fd); +} else { +qemu_vfree(block-host); +} +#endif +} else { +#if defined(TARGET_S390X) defined(CONFIG_KVM) +munmap(block-host, block-length); +#else +qemu_vfree(block-host); +#endif +} +qemu_free(block); +return; +} +} + } /* Return a host pointer to ram allocated with qemu_ram_alloc.
[Qemu-devel] [PATCH 2/2] qcow2/vdi: Change check to distinguish error cases
This distinguishes between harmless leaks and real corruption. Hopefully users better understand what qemu-img check wants to tell them. Signed-off-by: Kevin Wolf kw...@redhat.com --- block.c|3 +- block/qcow2-refcount.c | 120 ++-- block/qcow2.c |4 +- block/qcow2.h |2 +- block/vdi.c| 10 ++-- block_int.h|7 ++- 6 files changed, 79 insertions(+), 67 deletions(-) diff --git a/block.c b/block.c index b0ceef0..65cf4dc 100644 --- a/block.c +++ b/block.c @@ -721,8 +721,7 @@ int bdrv_check(BlockDriverState *bs, BdrvCheckResult *res) } memset(res, 0, sizeof(*res)); -res-corruptions = bs-drv-bdrv_check(bs); -return res-corruptions 0 ? res-corruptions : 0; +return bs-drv-bdrv_check(bs, res); } /* commit COW file into the raw image */ diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c index 4a96d98..4c19e7e 100644 --- a/block/qcow2-refcount.c +++ b/block/qcow2-refcount.c @@ -884,9 +884,10 @@ int qcow2_update_snapshot_refcount(BlockDriverState *bs, * This is used to construct a temporary refcount table out of L1 and L2 tables * which can be compared the the refcount table saved in the image. * - * Returns the number of errors in the image that were found + * Modifies the number of errors in res. */ -static int inc_refcounts(BlockDriverState *bs, +static void inc_refcounts(BlockDriverState *bs, + BdrvCheckResult *res, uint16_t *refcount_table, int refcount_table_size, int64_t offset, int64_t size) @@ -894,30 +895,32 @@ static int inc_refcounts(BlockDriverState *bs, BDRVQcowState *s = bs-opaque; int64_t start, last, cluster_offset; int k; -int errors = 0; if (size = 0) -return 0; +return; start = offset ~(s-cluster_size - 1); last = (offset + size - 1) ~(s-cluster_size - 1); for(cluster_offset = start; cluster_offset = last; cluster_offset += s-cluster_size) { k = cluster_offset s-cluster_bits; -if (k 0 || k = refcount_table_size) { +if (k 0) { fprintf(stderr, ERROR: invalid cluster offset=0x% PRIx64 \n, cluster_offset); -errors++; +res-corruptions++; +} else if (k = refcount_table_size) { +fprintf(stderr, Warning: cluster offset=0x% PRIx64 is after +the end of the image file, can't properly check refcounts.\n, +cluster_offset); +res-check_errors++; } else { if (++refcount_table[k] == 0) { fprintf(stderr, ERROR: overflow cluster offset=0x% PRIx64 \n, cluster_offset); -errors++; +res-corruptions++; } } } - -return errors; } /* @@ -928,14 +931,13 @@ static int inc_refcounts(BlockDriverState *bs, * Returns the number of errors found by the checks or -errno if an internal * error occurred. */ -static int check_refcounts_l2(BlockDriverState *bs, +static int check_refcounts_l2(BlockDriverState *bs, BdrvCheckResult *res, uint16_t *refcount_table, int refcount_table_size, int64_t l2_offset, int check_copied) { BDRVQcowState *s = bs-opaque; uint64_t *l2_table, offset; int i, l2_size, nb_csectors, refcount; -int errors = 0; /* Read L2 table from disk */ l2_size = s-l2_size * sizeof(uint64_t); @@ -955,16 +957,15 @@ static int check_refcounts_l2(BlockDriverState *bs, copied flag must never be set for compressed clusters\n, offset s-cluster_bits); offset = ~QCOW_OFLAG_COPIED; -errors++; +res-corruptions++; } /* Mark cluster as used */ nb_csectors = ((offset s-csize_shift) s-csize_mask) + 1; offset = s-cluster_offset_mask; -errors += inc_refcounts(bs, refcount_table, - refcount_table_size, - offset ~511, nb_csectors * 512); +inc_refcounts(bs, res, refcount_table, refcount_table_size, +offset ~511, nb_csectors * 512); } else { /* QCOW_OFLAG_COPIED must be set iff refcount == 1 */ if (check_copied) { @@ -974,35 +975,35 @@ static int check_refcounts_l2(BlockDriverState *bs, if (refcount 0) { fprintf(stderr, Can't get refcount for offset % PRIx64 : %s\n, entry, strerror(-refcount)); +goto fail; } if ((refcount == 1) != ((entry QCOW_OFLAG_COPIED) != 0)) {
[Qemu-devel] [PATCH v3 16/16] ramblocks: No more being lazy about duplicate names
Now that we have a working qemu_ram_free() and the primary runtime user of it has been updated, don't be lenient about duplicate id strings. We also shouldn't need to create them ondemand at the target. Signed-off-by: Alex Williamson alex.william...@redhat.com --- arch_init.c |5 +++-- exec.c | 13 +++-- 2 files changed, 6 insertions(+), 12 deletions(-) diff --git a/arch_init.c b/arch_init.c index 2f082f3..47bb4b2 100644 --- a/arch_init.c +++ b/arch_init.c @@ -369,8 +369,9 @@ int ram_load(QEMUFile *f, void *opaque, int version_id) } if (!block) { -if (!qemu_ram_alloc(NULL, id, length)) -return -ENOMEM; +fprintf(stderr, Unknown ramblock \%s\, cannot +accept migration\n, id); +return -EINVAL; } total_ram_bytes -= length; diff --git a/exec.c b/exec.c index b9a1471..5420f56 100644 --- a/exec.c +++ b/exec.c @@ -2826,16 +2826,9 @@ ram_addr_t qemu_ram_alloc(DeviceState *dev, const char *name, ram_addr_t size) QLIST_FOREACH(block, ram_list.blocks, next) { if (!strcmp(block-idstr, new_block-idstr)) { -if (block-length == new_block-length) { -fprintf(stderr, RAMBlock \%s\ exists, assuming lack of -free.\n, new_block-idstr); -qemu_free(new_block); -return block-offset; -} else { -fprintf(stderr, RAMBlock \%s\ already registered with -different size, abort\n, new_block-idstr); -abort(); -} +fprintf(stderr, RAMBlock \%s\ already registered, abort!\n, +new_block-idstr); +abort(); } }
[Qemu-devel] [PATCH 0/2] Improve qemu-img check output
qemu-img check produces messages that are hard to understand. Even worse is that in the end it just says something like 42 errors with no further explanation. Recently I got bug reports from people who though that their image was corrupted, when in fact there were only a few leaked clusters after a crash. This series tries to tell the user if it's real corruption, leaked clusters or something that went wrong during the check. Kevin Wolf (2): qemu-img check: Distinguish different kinds of errors qcow2/vdi: Change check to distinguish error cases block.c|9 ++- block.h| 10 - block/qcow2-refcount.c | 120 ++-- block/qcow2.c |4 +- block/qcow2.h |2 +- block/vdi.c| 10 ++-- block_int.h|7 ++- qemu-img.c | 62 +++-- 8 files changed, 140 insertions(+), 84 deletions(-)
[Qemu-devel] Re: [PATCH v7 0/4] Inter-VM shared memory device
On Tue, Jun 15, 2010 at 2:23 PM, Cam Macdonell c...@cs.ualberta.ca wrote: Latest patch for PCI shared memory device that maps a host shared memory object to be shared between guests new in this series - replace marking memory from v6 with marking device as unmigratable indicating that it should be unplugged before migration and re-added after. - 'peer' case changed to require removal before migration, only 'master' devices can be migrated while attached. v6 - migration support with 'master' and 'peer' roles for guest to determine who owns memory - modified phys_ram_dirty array for marking memory as not to be migrated v5: - fixed segfault for non-server case - code style fixes - removed limit on the number of guests - shared memory server is now in qemu.git/contrib - made ioeventfd setup function generic - removed interrupts when guest joined (let application handle it) v4: - moved to single Doorbell register and use datamatch to trigger different VMs rather than one register per eventfd - remove writing arbitrary values to eventfds. Only values of 1 are now written to ensure correct usage Cam Macdonell (4): Device specification for shared memory PCI device Add function to assign ioeventfd to MMIO. Support marking a device as non-migratable Inter-VM shared memory PCI device Makefile.target | 3 + docs/specs/ivshmem_device_spec.txt | 96 + hw/hw.h | 1 + hw/ivshmem.c | 823 kvm-all.c | 32 ++ kvm.h | 1 + qemu-char.c | 6 + qemu-char.h | 3 + qemu-doc.texi | 43 ++ savevm.c | 32 ++- 10 files changed, 1037 insertions(+), 3 deletions(-) create mode 100644 docs/specs/ivshmem_device_spec.txt create mode 100644 hw/ivshmem.c Hi, Are there outstanding concerns with this patchset? Can it be merged? I can rebase if necessary. Thanks, Cam
Re: [Qemu-devel] [Bug 600589] [NEW] xchg r8,rax treated as nop
You wrote 1 июля 2010 г., 19:43:06: On Thu, 1 Jul 2010, Richard Henderson wrote: On 07/01/2010 05:04 AM, Vic3Dexe wrote: Public bug reported: xchg r8,rax (49h 90h) executed as nop (90h) in long mode, in other words REX not used. qemu 0.12.4, host Win 7 x64, running qemu-system-x86_64.exe. ** Affects: qemu Importance: Undecided Status: New Verified. Test case for x86_64-linux-user: .globl main .type main, @function main: movl$0, %r8d movl$1, %eax xchgq %r8, %rax ret Expected result is exit status 0. No surprise really: target-i386/translate.c lines 6665-... case 0x90: /* nop */ /* XXX: xchg + rex handling */ /* XXX: correct lock test for all insn */ The code to handle that just isn't there. Sorry for inconvenience, I just forgot to look in source. :) Do you plan to fix it in the near future? -- Best regards, Vic3dexe mailto:vic3d...@gmail.com
[Qemu-devel] Re: [PATCH] ARM v4t/arm920t support
On Thursday 01 July 2010 16:50:29 Paul Brook wrote: Here is the patch again. There may be more work to be done on top of this, but this patch staying out of tree hasn't noticeably accelerated that work in the past year and change. Could it please be merged? As mentioned previously, V5 should be split into its component parts. If this patch needs to be split up/tweaked/polished a bit to be acceptable, I'm interested in at least trying to do this work myself, but unfortunately the robots.txt file of lists.nongnu.org still prevents Google from indexing the mailing list web archive. Could you give me a hint where mentioned previously might be found? Rob -- GPLv3: as worthy a successor as The Phantom Meanace, as timely as Duke Nukem Forever, and as welcome as New Coke.
Re: [Qemu-devel] [Bug 600589] [NEW] xchg r8,rax treated as nop
On 07/02/2010 12:13 PM, vic3d...@gmail.com wrote: Sorry for inconvenience, I just forgot to look in source. :) Do you plan to fix it in the near future? No, in the near past. ;-) The fix was committed to qemu.git head yesterday. r~
[Qemu-devel] [PATCH 1/2] QMP: Introduce the documentation for query-qdm
--- qemu-monitor.hx | 68 +++ 1 files changed, 68 insertions(+), 0 deletions(-) diff --git a/qemu-monitor.hx b/qemu-monitor.hx index 9f62b94..5348899 100644 --- a/qemu-monitor.hx +++ b/qemu-monitor.hx @@ -2490,6 +2490,74 @@ STEXI show device tree @item info qdm show qdev device model list +ETEXI +SQMP +query-qdm +- + +Describe the capabilities of all devices registered with qdev. + +The returned output is a list, each element is a json-object describing a single +device type. + +Each json-object contains the following: + +- name: the short name of the device (json-string) +- bus: the name of the bus type for the device (json-string) +- alias: an alias by which the device is also known (json-string, optional) +- description: a long description the device (json-string, optional) +- creatable: whether this device can be created on command line (json-boolean) +- props: a list where each element is an json-object that describes a property +of the device. Each json-object contains the following: + - name: the short name of the property (json-string) + - info: short description of the property (json-string) + - type: the data type of the property value (json-string) + +Example: + +- { execute: query-qdm } +- { + return: [ +{ + name: virtio-9p-pci, + creatable: true, + bus: PCI, + props: [ + { + name: indirect_desc, + type: bit, + info: on/off + }, + { + name: mount_tag, + type: string, + info: string + }, + { + name: fsdev, + type: string, + info: string + } + ] +}, +{ + name: virtio-balloon-pci, + creatable: true, + bus: PCI, + props: [ + { + name: indirect_desc, + type: bit, + info: on/off + } + ] +}, + +] + +EQMP + +STEXI @item info roms show roms @end table -- 1.7.1
[Qemu-devel] [PATCH 0/2] QMP: Introduce query-qdm
This series introduces the documentation for the query-qdm command and the conversion of the monitor command 'info qdm' to QMP. The documentation and code are based on a patch previously sent to qemu-devel by Daniel P. Berrange: http://lists.gnu.org/archive/html/qemu-devel/2010-06/msg00931.html Changes from the Daniel's original patch: - Split the patch in two, taking out the documentation from the code - Reworded some parts of the documentation and added data types - Small cleanups and renamed do_info_devices() to do_info_qdm() - Added do_info_qdm_print() to be used in the monitor Regards, Miguel
[Qemu-devel] [PATCH 2/2] monitor: Convert 'info qdm' to QMP
Converts the 'info qdm' command to QMP, allowing the discovery of all devices known to the QEMU binary without relying on command line paramaters like -device ? and -device devtype,? This change does not modify the output of the 'info qdm' monitor command. Signed-off-by: Miguel Di Ciurcio Filho miguel.fi...@gmail.com --- hw/qdev.c | 118 +++- hw/qdev.h |3 +- monitor.c |3 +- 3 files changed, 120 insertions(+), 4 deletions(-) diff --git a/hw/qdev.c b/hw/qdev.c index 61f999c..23f0540 100644 --- a/hw/qdev.c +++ b/hw/qdev.c @@ -29,6 +29,7 @@ #include qdev.h #include sysemu.h #include monitor.h +#include qjson.h static int qdev_hotplug = 0; @@ -777,13 +778,126 @@ void do_info_qtree(Monitor *mon) qbus_print(mon, main_system_bus, 0); } -void do_info_qdm(Monitor *mon) +static void qdm_list_iter(QObject *obj, void *opaque) +{ + +Monitor *mon = opaque; +QDict *dev = qobject_to_qdict(obj); + +monitor_printf(mon, name \%s\, bus %s, qdict_get_str(dev, name), +qdict_get_str(dev, bus)); + +if (qdict_haskey(dev, alias)) { +monitor_printf(mon, , alias \%s\, qdict_get_str(dev, alias)); +} + +if (qdict_haskey(dev, description)) { +monitor_printf(mon, , desc \%s\, qdict_get_str(dev, description)); +} + +if (!qdict_get_bool(dev, creatable)) { +monitor_printf(mon, , no-user); +} + +monitor_printf(mon, \n); +} + +void do_info_qdm_print(Monitor *mon, const QObject *ret_data) +{ +QList *devs; + +devs = qobject_to_qlist(ret_data); +qlist_iter(devs, qdm_list_iter, mon); +} + +static const char *qdev_property_type_to_string(int type) { +switch (type) { +case PROP_TYPE_UNSPEC: + return unknown; +case PROP_TYPE_UINT8: + return uint8; +case PROP_TYPE_UINT16: + return uint16; +case PROP_TYPE_UINT32: + return uint32; +case PROP_TYPE_INT32: + return int32; +case PROP_TYPE_UINT64: + return uint64; +case PROP_TYPE_TADDR: + return taddr; +case PROP_TYPE_MACADDR: + return macaddr; +case PROP_TYPE_DRIVE: + return drive; +case PROP_TYPE_CHR: + return chr; +case PROP_TYPE_STRING: + return string; +case PROP_TYPE_NETDEV: + return netdev; +case PROP_TYPE_VLAN: + return vlan; +case PROP_TYPE_PTR: + return pointer; +case PROP_TYPE_BIT: + return bit; +} + +return NULL; +} + +void do_info_qdm(Monitor *mon, QObject **ret_data) { DeviceInfo *info; +QList *devs = qlist_new(); for (info = device_info_list; info != NULL; info = info-next) { -qdev_print_devinfo(info); +QObject *obj; +QDict *dev; +QList *props = qlist_new(); +Property *prop; + +for (prop = info-props; prop prop-name; prop++) { +QObject *entry; +/* + * TODO Properties without a parser are just for dirty hacks. + * qdev_prop_ptr is the only such PropertyInfo. It's marked + * for removal. This conditional should be removed along with + * it. + */ +if (!prop-info-parse) { +continue; /* no way to set it, don't show */ +} + +const char *type = qdev_property_type_to_string(prop-info-type); + +entry = qobject_from_jsonf({ 'name': %s, 'info': %s, 'type': %s }, + prop-name, prop-info-name, type); + +qlist_append_obj(props, entry); +} + +obj = qobject_from_jsonf({ 'name': %s, 'bus': %s, 'props': %p, 'creatable': %i }, + info-name, + info-bus_info-name, + props, + info-no_user ? 0 : 1); + +dev = qobject_to_qdict(obj); + +if (info-alias) { +qdict_put(dev, alias, qstring_from_str(info-alias)); +} + +if (info-desc) { +qdict_put(dev, description, qstring_from_str(info-desc)); +} + +qlist_append(devs, dev); } + +*ret_data = QOBJECT(devs); } int do_device_add(Monitor *mon, const QDict *qdict, QObject **ret_data) diff --git a/hw/qdev.h b/hw/qdev.h index be5ad67..f44e4a2 100644 --- a/hw/qdev.h +++ b/hw/qdev.h @@ -181,7 +181,8 @@ void qbus_free(BusState *bus); /*** monitor commands ***/ void do_info_qtree(Monitor *mon); -void do_info_qdm(Monitor *mon); +void do_info_qdm_print(Monitor *mon, const QObject *ret_data); +void do_info_qdm(Monitor *mon, QObject **ret_data); int do_device_add(Monitor *mon, const QDict *qdict, QObject **ret_data); int do_device_del(Monitor *mon, const QDict *qdict, QObject **ret_data); diff --git a/monitor.c b/monitor.c index 170b269..91b87f9 100644 --- a/monitor.c +++ b/monitor.c @@ -2550,7 +2550,8 @@ static const mon_cmd_t info_cmds[] = {
[Qemu-devel] Re: qemu-kvm.git unittest failures
On Thu, Jul 01, 2010 at 01:25:51PM -0300, Lucas Meneghel Rodrigues wrote: Now that we already have a mechanism to perform automated and regular unittesting, let me start by reporting the first problems I'm seeing with the unittests. Some (or all) of the problems might be due to inappropriate parameters passed to qemu, so if that's the case, please point out the correction. 1) idt_test /home/lmr/Code/autotest-git/client/tests/kvm/qemu -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20100701-112446-wftX',server,nowait -serial unix:'/tmp/serial-20100701-112446-wftX',server,nowait -m 512 -kernel '/home/lmr/Code/autotest-git/client/tests/kvm/unittests/idt_test.flat' -vnc :1 -chardev file,id=testlog,path=/tmp/testlog-20100701-112446-wftX -device testdev,chardev=testlog -S 11:24:48 DEBUG| VM appears to be alive with PID 12524 11:24:48 INFO | Waiting for unittest idt_test to complete, timeout 600, output in /tmp/testlog-20100701-112446-wftX 11:24:49 DEBUG| (qemu) (Process terminated with status 7) 11:24:49 ERROR| Unit test idt_test failed 11:24:49 INFO | Unit test log collected and available under /home/lmr/Code/autotest-git/client/results/default/kvm.unittest/debug/idt_test.log Config entry: [idt_test] file = idt_test.flat Means no additional params other than the flat file is being used - see the command line used above. The log is not very helpful stating what's going wrong: This is fixed by kvm.git's: commit 6a7382e966e07f10137b7d6106ebabfeb76998d9 Author: Avi Kivity a...@redhat.com Date: Thu Jun 10 17:02:15 2010 +0300 KVM: Fix mov cr4 #GP at wrong instruction enabling apic Starting IDT test unhandled excecption Log is attached as well. Problem caused by a recent change, fix committed to qemu-kvm.git. 2) access /home/lmr/Code/autotest-git/client/tests/kvm/qemu -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20100701-112446-wftX',server,nowait -serial unix:'/tmp/serial-20100701-112446-wftX',server,nowait -m 512 -kernel '/home/lmr/Code/autotest-git/client/tests/kvm/unittests/access.flat' -vnc :1 -chardev file,id=testlog,path=/tmp/testlog-20100701-112446-wftX -device testdev,chardev=testlog -S 11:25:50 DEBUG| VM appears to be alive with PID 12636 11:25:50 INFO | Waiting for unittest access to complete, timeout 600, output in /tmp/testlog-20100701-112446-wftX 11:29:28 DEBUG| (qemu) (Process terminated with status 1) 11:29:29 ERROR| Unit test access failed 11:29:29 INFO | Unit test log collected and available under /home/lmr/Code/autotest-git/client/results/default/kvm.unittest/debug/access.log Config entry: [access] file = access.flat Massive log, compressed and attached. run test pde.p user: FAIL: error code 5 expected 4 test pte.rw pde.p user: FAIL: error code 5 expected 4 test pte.user pde.p user: FAIL: error code 5 expected 4 test pte.rw pte.user pde.p user: FAIL: error code 5 expected 4 test pte.a pde.p user: FAIL: error code 5 expected 4 test pte.rw pte.a pde.p user: FAIL: error code 5 expected 4 test pte.user pte.a pde.p user: FAIL: error code 5 expected 4 P flag (bit 0). This flag is 0 if there is no valid translation for the linear address because the P flag was 0 in one of the paging-structure entries used to translate that address. Avi, a walk ignoring access permissions should be done to properly set the P flag on error code. Does anybody care? 3) apic /home/lmr/Code/autotest-git/client/tests/kvm/qemu -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20100701-112446-wftX',server,nowait -serial unix:'/tmp/serial-20100701-112446-wftX',server,nowait -m 512 -kernel '/home/lmr/Code/autotest-git/client/tests/kvm/unittests/apic.flat' -vnc :1 -chardev file,id=testlog,path=/tmp/testlog-20100701-112446-wftX -device testdev,chardev=testlog -S 11:29:30 DEBUG| VM appears to be alive with PID 12713 11:29:30 INFO | Waiting for unittest apic to complete, timeout 600, output in /tmp/testlog-20100701-112446-wftX 11:30:55 WARNI| VM 'vm1' failed to produce a screendump 11:31:46 WARNI| VM 'vm1' failed to produce a screendump 11:36:03 WARNI| VM 'vm1' failed to produce a screendump 11:36:49 WARNI| VM 'vm1' failed to produce a screendump 11:38:17 WARNI| VM 'vm1' failed to produce a screendump 11:39:31 DEBUG| Timeout elapsed 11:39:31 ERROR| Exception happened during apic: Timeout elapsed (600s) 11:39:31 INFO | Unit test log collected and available under /home/lmr/Code/autotest-git/client/results/default/kvm.unittest/debug/apic.log Config entry: [apic] file = apic.flat smp = 2 extra_params: -cpu qemu64,+x2apic So, -smp 2 and -cpu qemu64,+x2apic were used on the command line. Log attached. I am not sure whether more time is necessary to run the test. Please advise. 4) emulator /home/lmr/Code/autotest-git/client/tests/kvm/qemu -name 'vm1' -monitor unix:'/tmp/monitor-humanmonitor1-20100701-112446-wftX',server,nowait -serial
[Qemu-devel] [Bug 586175] Re: Windows XP/2003 doesn't boot
Long time no post, i tried to install Win2k3 through RIS/PXE this time. I still get the same error at boot time: A disk read error occurred. Press Ctrl + Alt + Del to restart. Neither the Win2k3 install source nor the VirtIO drivers are defective. Something's just wrong with QEMU. Currently qemu.git is able to compile itself properly, so I'll check it out. Without libvirt (because it can't parse qemu-kvm-devel as version string :/ https://bugzilla.redhat.com/show_bug.cgi?id=609741 ) ** Bug watch added: Red Hat Bugzilla #609741 https://bugzilla.redhat.com/show_bug.cgi?id=609741 -- Windows XP/2003 doesn't boot https://bugs.launchpad.net/bugs/586175 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: Incomplete Status in Debian GNU/Linux: New Status in Fedora: Unknown Bug description: Hello everyone, my qemu doesn't boot any Windows XP/2003 installations if I try to boot the image. If I boot the install cd first, it's boot manager counts down and triggers the boot on it's own. That's kinda stupid. I'm using libvirt, but even by a simple qemu-kvm -drive file=image.img,media=disk,if=ide,boot=on it won't boot. Qemu hangs at the message Booting from Hard Disk... I'm using qemu-kvm-0.12.4 with SeaBIOS 0.5.1 on Gentoo (No-Multilib and AMD64). It's a server, that means I'm using VNC as the primary graphic output but i don't think it should be an issue.
[Qemu-devel] [Bug 586420] Re: WinXP install cd hangs at boot time if machine started with floppy
@Jes: No, it doesn't boot on its own, it's a simple FAT formatted floppy image. I've even tried to format a real floppy on Windows, copied the drivers on it, and saved the whole floppy as an image with rawrite. I also tried other floppy images, but: QEMU 0.12.4 hangs if I try to boot from the Win2k3 install cd AND use a floppy image at the same time. If I boot from the Win2k8 install cd, it works fine. The install cd of Win2k3 isn't corrupted. It just can't be because I even tried it with an ISO image downloaded directly from MS. And both images boot up everywhere, as well on real machines as on VirtualBox. -- WinXP install cd hangs at boot time if machine started with floppy https://bugs.launchpad.net/bugs/586420 You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. Status in QEMU: Incomplete Bug description: I have a second problem: I wanted to install Windows Server 2003 on a virtio drive, so I tried to start the machine with the install cd as the boot drive and a floppy image with the viostor drivers. The problem is, the install cd hangs at boot time. If I start VNC I just see a black ground with 640x480. I've also tried this with the install cd of Windows Server 2008 R2 and it works! Could it be that the BIOS screws up because the older install cds are using the floppy emulation to boot the setup?