Re: [Qemu-devel] [PATCH 2/6] kvm: support using KVM_MEM_READONLY flag for readonly regions
On 04/28/2013 04:32 PM, Jordan Justen wrote: A slot that uses KVM_MEM_READONLY can be read from and code can execute from the region, but writes will trap. For regions that are readonly and also not writeable, we force the slot to be removed so reads or writes to the region will trap. (A memory region in this state is not executable within kvm.) Reviewed-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
Re: [Qemu-devel] [PATCH 4/4] usb-host: raise libusbx minimum version to 1.0.13
Hi, I'm wondering how best to address QEMU's libusb support on FreeBSD, and discovered the libusb vs. libusbx saga. Is it safe to assume that in the Linux world pkg-config libusb-1.0 is generally going to refer to libusbx? In recent linux distributions yes. FreeBSD has its own libusb-compatible implementation, but currently lacks libusb_get_port_path and perhaps others, and if libusbx is virtually universal on Linux we presumably want to grow these same interfaces. Yes. Even better would be to get the freebsd support merged into libusbx. /me suspects the reason why freebsd has its own implementation is basically the same why the libusbx exists in the first place: unfriendly libusb upstream. So if you tried + failed to merge the freebsd bits to libusb in the past it is worth trying again to get them into libusbx, then switch over freebsd to libusbx too. cheers, Gerd
Re: [Qemu-devel] [PATCH 3/6] kvm: workaround a possible KVM bug when using KVM_MEM_READONLY
On 04/28/2013 04:32 PM, Jordan Justen wrote: On a Linux 3.8.0 based kernel, I occasionally saw a situation where the memory region would continue to trap on memory read even though KVM_MEM_READONLY was set. I found that if I set the slot to a size of 0, and before setting the slot, it would then behave as expected. Yes, the KVM_MEM_READONLY flag can not be directly changed, see the commit 75d61fbcf563373696578570e914f555e12c8d97 on kvm tree. Do you think the way which deletes the memslot before changing the KVM_MEM_READONLY hurts the performance? If yes, i will make it be directly changed. Thanks for you implementing the READONLY memory in Qemu. Reviewed-by: Xiao Guangrong xiaoguangr...@linux.vnet.ibm.com
[Qemu-devel] [PATCH] spapr_llan: fix device reenabling
Normally, the tap device is polled by QEMU if a guest NIC can receive packets. If a guest NIC is stopped during transfer (rmmod or ifdown), it may still have packets in a queue which have to be send to the guest before QEMU enables polling of a tap interface via tap_update_fd_handler(). However the spapr_llan device was missing the qemu_flush_queued_packets() call so the tap_send_completed() callback was never called and therefore tap interface polling was not enabled ever. The patch fixes this problem. Signed-off-by: Alexey Kardashevskiy a...@ozlabs.ru --- hw/net/spapr_llan.c |2 ++ 1 file changed, 2 insertions(+) diff --git a/hw/net/spapr_llan.c b/hw/net/spapr_llan.c index cca3d1a..46f7d5f 100644 --- a/hw/net/spapr_llan.c +++ b/hw/net/spapr_llan.c @@ -336,6 +336,8 @@ static target_ulong h_register_logical_lan(PowerPCCPU *cpu, spapr_vio_dma_set(sdev, VLAN_BD_ADDR(rec_queue), 0, VLAN_BD_LEN(rec_queue)); dev-isopen = 1; +qemu_flush_queued_packets(qemu_get_queue(dev-nic)); + return H_SUCCESS; } -- 1.7.10.4
Re: [Qemu-devel] [PATCH] qemu-iotests: Filter out vmdk creation options
Am 03.05.2013 um 03:31 hat Fam Zheng geschrieben: Cover new image creation options for vmdk, so we can use '-o zeroed_grain=XXX' and '-o subformat=XXX' to run the tests successfully. Signed-off-by: Fam Zheng f...@redhat.com --- tests/qemu-iotests/common.rc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index a536bf7..442cf51 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -127,6 +127,8 @@ _make_test_img() -e s# compat='[^']*'##g \ -e s# compat6=\\(on\\|off\\)##g \ -e s# static=\\(on\\|off\\)##g \ +-e s# zeroed_grain=\\(on\\|off\\)##g \ +-e s# subformat='[^']*'##g \ -e s# lazy_refcounts=\\(on\\|off\\)##g # Start an NBD server on the image file, which is what we'll be talking to Maybe add adapter_type as well? It's a string option, so it's not displayed if it's not set, but if we want to make it testable (with an explicit ./check -o option or in a new test case that adds it), we need to filter it out. But the patch obviously helps even without adapter_type, so anyway: Reviewed-by: Kevin Wolf kw...@redhat.com
Re: [Qemu-devel] [PATCH] qemu-iotests: Filter out vmdk creation options
On Fri, 05/03 08:48, Kevin Wolf wrote: Am 03.05.2013 um 03:31 hat Fam Zheng geschrieben: Cover new image creation options for vmdk, so we can use '-o zeroed_grain=XXX' and '-o subformat=XXX' to run the tests successfully. Signed-off-by: Fam Zheng f...@redhat.com --- tests/qemu-iotests/common.rc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index a536bf7..442cf51 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -127,6 +127,8 @@ _make_test_img() -e s# compat='[^']*'##g \ -e s# compat6=\\(on\\|off\\)##g \ -e s# static=\\(on\\|off\\)##g \ +-e s# zeroed_grain=\\(on\\|off\\)##g \ +-e s# subformat='[^']*'##g \ -e s# lazy_refcounts=\\(on\\|off\\)##g # Start an NBD server on the image file, which is what we'll be talking to Maybe add adapter_type as well? It's a string option, so it's not displayed if it's not set, but if we want to make it testable (with an explicit ./check -o option or in a new test case that adds it), we need to filter it out. But the patch obviously helps even without adapter_type, so anyway: Reviewed-by: Kevin Wolf kw...@redhat.com Sure, I'll send another patch for that option. Thanks. -- Fam
[Qemu-devel] [PATCH] qemu-iotests: Filter out 'adapter_type'
Filter out vmdk creation option 'adapter_type' for vmdk. So that tests with an explicit './check -o adapter_type=XXX' will not fail. --- tests/qemu-iotests/common.rc | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index 442cf51..31eb62b 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -129,6 +129,7 @@ _make_test_img() -e s# static=\\(on\\|off\\)##g \ -e s# zeroed_grain=\\(on\\|off\\)##g \ -e s# subformat='[^']*'##g \ +-e s# adapter_type='[^']*'##g \ -e s# lazy_refcounts=\\(on\\|off\\)##g # Start an NBD server on the image file, which is what we'll be talking to -- 1.8.1.4
Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock
On Fri, May 3, 2013 at 12:58 AM, Jan Kiszka jan.kis...@siemens.com wrote: Hi Pingfan, On 2012-12-06 08:28, liu ping fan wrote: Any suggestion? Or new design idea for this? Finally... I'm getting back to this. I'm currently trying to make use of this series, adapting it to my needs (selective BQL-free dispatching of PIO regions). Glad that you are back :) Is there a newer version available on your side? This one obviously no No, but I can see the code and rebase next week. longer applies due to all the code movements in QEMU. But it also seems to contain some bugs, at least in patch 5 (mixed up page number vs. page address around for address_space_section_lookup_ref). Will pay some time to see it. Then we should get rid of the ref/unref callbacks. Making a memory region BQL-free must be as simple as setting a flag or (more likely) adding a reference to the owning QOM object in the region. Reimplementing ref/unref in device models over and over again is clearly a no-go. Maybe I'm currently forgetting a use case where overloading the At the beginning, Avi suggest to enforce mr-opaque to be Device object, but due to the nested embedded Object, we fail. And finally Avi suggest ref/unref interface. From my point, we can save lots of reimplementing ref/unref in device models by telling whether mr-opauque is Object or not. And leave not object case to reimplement ref/unref. reference functions is needed, so please help my memory in that case. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [PATCH] curl: fix curl read
CURL library API has changed, the current curl driver is not working. This patch rewrites the use of API as well as the structure of internal states. (It is hard to split this to multiple patches as basically all these changes need to work together.) BDRVCURLState holds the pointer to curl multi interface (man 3 libcurl-multi), and 4 lists for internal states: - CURLState holds state for libcurl connection (man 3 libcurl-easy) - CURLSockInfo holds information for libcurl socket interface (man 3 curl_multi_socket_action). - CURLDataCache holds the user data read from libcurl, it is in a list ordered by access, the used cache is moved to list head on access, so the tail element is freed first. BDRVCURLState.cache_quota is the threshold to start freeing cache. - CURLAIOCB holds ongoing aio information. Signed-off-by: Fam Zheng f...@redhat.com --- block/curl.c | 553 --- 1 file changed, 336 insertions(+), 217 deletions(-) diff --git a/block/curl.c b/block/curl.c index b8935fd..e5ad36f 100644 --- a/block/curl.c +++ b/block/curl.c @@ -38,14 +38,9 @@ CURLPROTO_FTP | CURLPROTO_FTPS | \ CURLPROTO_TFTP) -#define CURL_NUM_STATES 8 -#define CURL_NUM_ACB8 #define SECTOR_SIZE 512 #define READ_AHEAD_SIZE (256 * 1024) - -#define FIND_RET_NONE 0 -#define FIND_RET_OK 1 -#define FIND_RET_WAIT 2 +#define CURL_CACHE_QUOTA 10 struct BDRVCURLState; @@ -56,171 +51,219 @@ typedef struct CURLAIOCB { int64_t sector_num; int nb_sectors; - -size_t start; -size_t end; +QLIST_ENTRY(CURLAIOCB) next; } CURLAIOCB; -typedef struct CURLState -{ +typedef struct CURLDataCache { +char *data; +size_t base_pos; +size_t data_len; +size_t write_pos; +/* Ref count for CURLState */ +int use_count; +QLIST_ENTRY(CURLDataCache) next; +} CURLDataCache; + +typedef struct CURLState { struct BDRVCURLState *s; -CURLAIOCB *acb[CURL_NUM_ACB]; CURL *curl; -char *orig_buf; -size_t buf_start; -size_t buf_off; -size_t buf_len; -char range[128]; +#define CURL_RANGE_SIZE 128 +char range[CURL_RANGE_SIZE]; char errmsg[CURL_ERROR_SIZE]; -char in_use; +CURLDataCache *cache; +QLIST_ENTRY(CURLState) next; } CURLState; +typedef struct CURLSockInfo { +curl_socket_t fd; +int action; +struct BDRVCURLState *s; +QLIST_ENTRY(CURLSockInfo) next; +} CURLSockInfo; + typedef struct BDRVCURLState { CURLM *multi; size_t len; -CURLState states[CURL_NUM_STATES]; +QLIST_HEAD(, CURLState) curl_states; +QLIST_HEAD(, CURLAIOCB) acbs; +QLIST_HEAD(, CURLSockInfo) socks; char *url; size_t readahead_size; +QEMUTimer *timer; +/* List of data cache ordered by access, freed from tail */ +QLIST_HEAD(, CURLDataCache) cache; +/* Threshold to release unused cache when cache list is longer than it */ +int cache_quota; +/* Whether http server accept range in header */ +bool accept_range; +/* Whether certificated ssl only */ +bool ssl_no_cert; } BDRVCURLState; static void curl_clean_state(CURLState *s); -static void curl_multi_do(void *arg); +static void curl_fd_handler(void *arg); static int curl_aio_flush(void *opaque); +static CURLDataCache *curl_find_cache(BDRVCURLState *bs, + size_t start, size_t len) +{ +CURLDataCache *c; +QLIST_FOREACH(c, bs-cache, next) { +if (start = c-base_pos +start + len = c-base_pos + c-write_pos) { +return c; +} +} +return NULL; +} + static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action, void *s, void *sp) { +BDRVCURLState *bs = (BDRVCURLState *)s; +/*int running;*/ DPRINTF(CURL (AIO): Sock action %d on fd %d\n, action, fd); +CURLSockInfo *sock = (CURLSockInfo *)sp; +if (!sp) { +sock = g_malloc0(sizeof(CURLSockInfo)); +sock-fd = fd; +sock-s = bs; +QLIST_INSERT_HEAD(bs-socks, sock, next); +curl_multi_assign(bs-multi, fd, sock); +} switch (action) { case CURL_POLL_IN: -qemu_aio_set_fd_handler(fd, curl_multi_do, NULL, curl_aio_flush, s); +qemu_aio_set_fd_handler(fd, curl_fd_handler, NULL, +curl_aio_flush, sock); +sock-action |= CURL_CSELECT_IN; break; case CURL_POLL_OUT: -qemu_aio_set_fd_handler(fd, NULL, curl_multi_do, curl_aio_flush, s); +qemu_aio_set_fd_handler(fd, NULL, curl_fd_handler, curl_aio_flush, +sock); +sock-action |= CURL_CSELECT_OUT; break; case CURL_POLL_INOUT: -qemu_aio_set_fd_handler(fd, curl_multi_do, curl_multi_do, -curl_aio_flush, s); +
Re: [Qemu-devel] [PATCH v7 0/7] push mmio dispatch out of big lock
On 2013-05-03 09:37, liu ping fan wrote: On Fri, May 3, 2013 at 12:58 AM, Jan Kiszka jan.kis...@siemens.com wrote: Hi Pingfan, On 2012-12-06 08:28, liu ping fan wrote: Any suggestion? Or new design idea for this? Finally... I'm getting back to this. I'm currently trying to make use of this series, adapting it to my needs (selective BQL-free dispatching of PIO regions). Glad that you are back :) Is there a newer version available on your side? This one obviously no No, but I can see the code and rebase next week. I've already rebased and started to fix/cleanup. Today I will look into the ref/unref vs. Object topic and try a first unlocked device model. I can share afterward so that we do not need to do the work twice. longer applies due to all the code movements in QEMU. But it also seems to contain some bugs, at least in patch 5 (mixed up page number vs. page address around for address_space_section_lookup_ref). Will pay some time to see it. Fixed. Specifically subpages were broken. As I've converted portio to use the memory core dispatcher, I'm getting a lot of them now and triggered the bugs immediately. There is still some nesting issue around coalesced MMIO flushing. Need to look into this today as well. So far I've worked around it by assuming that nesting is fine if we enter the dispatcher with BQL held. Then we should get rid of the ref/unref callbacks. Making a memory region BQL-free must be as simple as setting a flag or (more likely) adding a reference to the owning QOM object in the region. Reimplementing ref/unref in device models over and over again is clearly a no-go. Maybe I'm currently forgetting a use case where overloading the At the beginning, Avi suggest to enforce mr-opaque to be Device object, but due to the nested embedded Object, we fail. And finally Avi suggest ref/unref interface. From my point, we can save lots of reimplementing ref/unref in device models by telling whether mr-opauque is Object or not. And leave not object case to reimplement ref/unref. We can't change the semantics of opaque as long as old_mmio / old_portio are around. But we need a flag anyway to indicate if a region is depending on BQL or not. Adding a separate Object *owner to MemoryRegion can serve both purposes. Then we define something like void memory_region_set_local_locking(MemoryRegion *mr, bool local_locking, Object *owner); to control the property (if local_locking is true, owner must be non-NULL, of course). That's quite similar to my old prototype here that had memory_region_set/clear_global_locking. Jan -- Siemens AG, Corporate Technology, CT RTC ITP SDP-DE Corporate Competence Center Embedded Linux
[Qemu-devel] [Bug 685096] Re: USB Passthrough not working for Windows 7 guest
This is also affecting Windows Server 2008 and happens with all usb storage devices I tested. -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/685096 Title: USB Passthrough not working for Windows 7 guest Status in QEMU: Confirmed Status in “qemu-kvm” package in Ubuntu: Confirmed Bug description: USB Passthrough from host to guest is not working for a 32-bit Windows 7 guest, while it works perfectly for a 32-bit Windows XP guest. The device appears in the device manager of Windows 7, but with Error code 10: device cannot start. I have tried this with numerous USB thumbdrives and a USB wireless NIC, all with the same result. The device name and functionality is recognized, so at least some USB negotiation is taking place. I am trying this with the latest git-pull of QEMU-KVM. The command line to launch qemu-kvm for win7 is: sudo /home/user/local_install/bin/qemu-system-x86_64 -cpu core2duo -m 1024 -smp 2 -vga std -hda ./disk_images/win7.qcow -vnc :1 -boot c -usb -usbdevice tablet -usbdevice host:0781:5150 The command line to launch qemu-kvm for winxp is: sudo /home/user/local_install/bin/qemu-system-x86_64 -cpu core2duo -m 1024 -smp 2 -usb -vga std -hda ./winxpsp3.qcow -vnc :0 -boot c -usbdevice tablet -usbdevice host:0781:5150 Any help is appreciated. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/685096/+subscriptions
Re: [Qemu-devel] [PATCH V4 0/5] block: make qmp_transaction extendable
Am 02.05.2013 um 04:26 hat Wenchao Xia geschrieben: This serial will package backing chain snapshot code as one case, to make it possible adding more operations later. v2: Address Kevin's comments: Use the same prototype prepare, commit, rollback model in original code, commit should never fail. v3: Address Stefan's comments: 3/5, 4/5: remove *action parameter since later only BlkTransactionStates* is needed. 5/5: embbed BlkTransactionStates in ExternalSnapshotStates, *opaque is removed, related call back function format change for external snapshot. Address Kevin's comments: removed all indention in commit message. 1/5: return void for prepare() function, *errp plays the role as error checker. 5/5: mark *commit callback must exist, *rollback callback can be NULL. Align callback = in const BdrvActionOps external_snapshot_ops to the same colum. Address Eric's comments: 1/5: better commit message. 5/5: better commit message and comments in code that only one of rollback() or commit() will be called. v4: 5/5: document clean() callback will always be called if it present, declare static for global variable actions, use array plus .instance_size to remove switch checking code according to caller input. Wenchao Xia (5): 1 block: package preparation code in qmp_transaction() 2 block: move input parsing code in qmp_transaction() 3 block: package committing code in qmp_transaction() 4 block: package rollback code in qmp_transaction() 5 block: make all steps in qmp_transaction() as callback blockdev.c | 263 ++- 1 files changed, 169 insertions(+), 94 deletions(-) I would have used s/rollback/abort/ for consistency with bdrv_reopen_*(), but this doesn't make a difference for the correctness, so: Reviewed-by: Kevin Wolf kw...@redhat.com
Re: [Qemu-devel] [PATCH v2 1/6] hostmem: make hostmem single, not per Vring related
On Fri, May 03, 2013 at 10:45:17AM +0800, Liu Ping Fan wrote: diff --git a/include/exec/memory.h b/include/exec/memory.h index 9e88320..2761668 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -887,6 +887,7 @@ void *address_space_map(AddressSpace *as, hwaddr addr, void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, int is_write, hwaddr access_len); +void hostmem_init(void); This should go in hostmem.h. exec.c must now include hostmem.h. You need to take care of the ./configure dependency: hostmem.o is only build when CONFIG_VIRTIO_BLK_DATA_PLANE is 'y'.
Re: [Qemu-devel] [PATCH v2 2/6] hostmem: AddressSpace has its own map and maintained by RCU prepared style
On Fri, May 03, 2013 at 10:45:18AM +0800, Liu Ping Fan wrote: +/** + * Install new regions list + */ +static void hostmem_listener_commit(MemoryListener *listener) +{ +HostMem *tmp; +AddrSpaceMem *as_mem = container_of(listener, AddrSpaceMem, listener); + +/* writer of cur_hostmem next_hostmem is serialed by biglock s/serialed/serialized/ + * in hotplug path. So only take care of r/w on cur_hostmem + */ Indentation. @@ -164,18 +203,30 @@ void hostmem_init(void) .coalesced_mmio_del = hostmem_listener_coalesced_mmio_dummy, .priority = 10, }; +as_mem-cur_hostmem = g_new0(HostMem, 1); +as_mem-cur_hostmem-ref = 1; +memory_listener_register(as_mem-listener, as); -memory_listener_register(system_mem-listener, address_space_memory); -if (system_mem-num_new_regions 0) { -hostmem_listener_commit(system_mem-listener); -} The point of this if statement was to make the newly added regions visible. I guess it is not necessary because exec.c is calling us before any memory gets initialized now?
Re: [Qemu-devel] [PATCH] qemu-iotests: Filter out vmdk creation options
On Fri, May 03, 2013 at 09:31:40AM +0800, Fam Zheng wrote: Cover new image creation options for vmdk, so we can use '-o zeroed_grain=XXX' and '-o subformat=XXX' to run the tests successfully. Signed-off-by: Fam Zheng f...@redhat.com --- tests/qemu-iotests/common.rc | 2 ++ 1 file changed, 2 insertions(+) Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan
Re: [Qemu-devel] [PATCH v5 0/6] vmdk: zeroed-grain GTE support
On Thu, May 02, 2013 at 10:25:21AM +0800, Fam Zheng wrote: Added support for zeroed-grain GTE to VMDK according to VMDK Spec 5.0[1]. [1] Virtual Disk Format 5.0 - VMware, http://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf?src=vmdk Changes since v4: - 6/6: Correct endianess: *m_data-l2_cache_entry = offset; Changes since v3: - 5/6: Remove tmp - 6/6: Update L2 Cache Remove redundant assignment to m_data Simplify l2_offset assignment to m_data Fix m_data.offset endian. Changes since v2: - all: Added 5/6 (vmdk: store fields of VmdkMetaData in cpu endian) - 6/6: Avoid side-effect of vmdk_L2update. Change function comment to gtkdoc stype. Fix VMDK4_FLAG_ZG. Changes since v1: - all: fix From: field - 1/5: squash one line of ret code macro change from 2/5 - 2/5: change VMDK4_FLAG_ZG to VMDK4_FLAG_ZERO_GRAIN - 3/5: move BLOCK_OPT_ZEROED_GRAIN defination from block_int.h to vmdk.c - 5/5: fix metadata update issue, unit test with cases 033 034 Fam Zheng (6): vmdk: named return code. vmdk: add support for “zeroed‐grain” GTE vmdk: Add option to create zeroed-grain image vmdk: change magic number to macro vmdk: store fields of VmdkMetaData in cpu endian vmdk: add bdrv_co_write_zeroes block/vmdk.c | 208 +-- 1 file changed, 145 insertions(+), 63 deletions(-) -- 1.8.1.4 Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan
Re: [Qemu-devel] [PATCH v2 4/6] hostmem: hostmem listener pin RAM-Device by refcnt
On Fri, May 03, 2013 at 10:45:20AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com With ref()/unref() interface of MemoryRegion, we can pin RAM-Device when using its memory, and release it when done. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com --- hw/virtio/dataplane/hostmem.c | 44 +++- hw/virtio/dataplane/vring.c |8 +++--- include/hw/virtio/dataplane/hostmem.h |4 ++- 3 files changed, 44 insertions(+), 12 deletions(-) diff --git a/hw/virtio/dataplane/hostmem.c b/hw/virtio/dataplane/hostmem.c index 1fd3e06..0e28dfc 100644 --- a/hw/virtio/dataplane/hostmem.c +++ b/hw/virtio/dataplane/hostmem.c @@ -42,18 +42,28 @@ static void hostmem_ref(HostMem *hostmem) static void hostmem_unref(HostMem *hostmem) { -int t; +int i, t; +HostMemRegion *hmr; t = __sync_sub_and_fetch(hostmem-ref, 1); assert(t = 0); if (!t) { +for (i = 0; i hostmem-num_current_regions; i++) { +hmr = hostmem-current_regions[i]; +/* Fix me, when memory hotunplug implement + * assert(hmr-mr_ops-unref) + */ +if (hmr-mr-ops hmr-mr-ops-unref) { +hmr-mr-ops-unref(); +} This patch should use memory_region_ref()/unref() which you introduced in the previous patch instead of open-coding this. @@ -158,6 +176,18 @@ static void hostmem_listener_append_region(MemoryListener *listener, hostmem_append_new_region(as_mem-next_hostmem, section); } +static void hostmem_listener_append_region(MemoryListener *listener, + MemoryRegionSection *section) +{ +hostmem_listener_nop_region(listener, section); +/* Fix me, when memory hotunplug implement + * assert(section-mr-ops-ref) + */ +if (section-mr-ops section-mr-ops-ref) { +section-mr-ops-ref(); +} +} Why does append increment the refcount while nop does not? It seems that we always need to increment the memory region refcount since we're building a completely new hostmem. The refcount ownership is not passed from the current hostmen to the next hostmem, all memory regions are released in hostmem_unref().
Re: [Qemu-devel] [PATCH] target-mips: fix calculation of overflow for SHLL.PH and SHLL.QB
On Sun, Apr 28, 2013 at 03:18:36AM +0200, Petar Jovanovic wrote: From: Petar Jovanovic petar.jovano...@imgtec.com This change corrects and simplifies how discard is calculated for shift left logical vector instructions. It is used to detect overflow and set bit 22 in the DSPControl register. The existing tests (shll_ph.c, shll_qb.c) are extended with the corner cases that expose incorrectness in the previous implementation. Signed-off-by: Petar Jovanovic petar.jovano...@imgtec.com --- target-mips/dsp_helper.c| 30 ++ tests/tcg/mips/mips32-dsp/shll_ph.c | 33 - tests/tcg/mips/mips32-dsp/shll_qb.c | 23 +-- 3 files changed, 59 insertions(+), 27 deletions(-) diff --git a/target-mips/dsp_helper.c b/target-mips/dsp_helper.c index f975da0..805247d 100644 --- a/target-mips/dsp_helper.c +++ b/target-mips/dsp_helper.c @@ -682,49 +682,31 @@ static inline uint8_t mipsdsp_sat8_reduce_precision(uint16_t a, static inline uint8_t mipsdsp_lshift8(uint8_t a, uint8_t s, CPUMIPSState *env) { -uint8_t sign; uint8_t discard; -if (s == 0) { -return a; -} else { -sign = (a 7) 0x01; -if (sign != 0) { -discard = (((0x01 (8 - s)) - 1) s) | - ((a (6 - (s - 1))) ((0x01 s) - 1)); -} else { -discard = a (6 - (s - 1)); -} +if (s != 0) { +discard = a (8 - s); if (discard != 0x00) { set_DSPControl_overflow_flag(1, 22, env); } -return a s; } +return a s; } static inline uint16_t mipsdsp_lshift16(uint16_t a, uint8_t s, CPUMIPSState *env) { -uint8_t sign; uint16_t discard; -if (s == 0) { -return a; -} else { -sign = (a 15) 0x01; -if (sign != 0) { -discard = (((0x01 (16 - s)) - 1) s) | - ((a (14 - (s - 1))) ((0x01 s) - 1)); -} else { -discard = a (14 - (s - 1)); -} +if (s != 0) { +discard = (int16_t)a (15 - s); if ((discard != 0x) (discard != 0x)) { set_DSPControl_overflow_flag(1, 22, env); } -return a s; } +return a s; } diff --git a/tests/tcg/mips/mips32-dsp/shll_ph.c b/tests/tcg/mips/mips32-dsp/shll_ph.c index b8f1ff5..5fa58cc 100644 --- a/tests/tcg/mips/mips32-dsp/shll_ph.c +++ b/tests/tcg/mips/mips32-dsp/shll_ph.c @@ -11,7 +11,38 @@ int main() resultdsp = 1; __asm -(shll.ph %0, %2, 0x0B\n\t +(wrdsp $0\n\t + shll.ph %0, %2, 0x0B\n\t + rddsp %1\n\t + : =r(rd), =r(dsp) + : r(rt) +); +dsp = (dsp 22) 0x01; +assert(dsp == resultdsp); +assert(rd == result); + +rt= 0x7fff8000; +result= 0xfffe; +resultdsp = 1; + +__asm +(wrdsp $0\n\t + shll.ph %0, %2, 0x01\n\t + rddsp %1\n\t + : =r(rd), =r(dsp) + : r(rt) +); +dsp = (dsp 22) 0x01; +assert(dsp == resultdsp); +assert(rd == result); + +rt= 0x0001; +result= 0x8000; +resultdsp = 1; + +__asm +(wrdsp $0\n\t + shll.ph %0, %2, 0x0F\n\t rddsp %1\n\t : =r(rd), =r(dsp) : r(rt) diff --git a/tests/tcg/mips/mips32-dsp/shll_qb.c b/tests/tcg/mips/mips32-dsp/shll_qb.c index 8c1b91c..729716d 100644 --- a/tests/tcg/mips/mips32-dsp/shll_qb.c +++ b/tests/tcg/mips/mips32-dsp/shll_qb.c @@ -11,12 +11,14 @@ int main() resultdsp = 0x00; __asm -(shll.qb %0, %2, 0x00\n\t +(wrdsp $0\n\t + shll.qb %0, %2, 0x00\n\t rddsp %1\n\t : =r(rd), =r(dsp) : r(rt) ); dsp = (dsp 22) 0x01; +assert(dsp == resultdsp); assert(rd == result); rt = 0x87654321; @@ -24,12 +26,29 @@ int main() resultdsp = 0x01; __asm -(shll.qb %0, %2, 0x03\n\t +(wrdsp $0\n\t + shll.qb %0, %2, 0x03\n\t rddsp %1\n\t : =r(rd), =r(dsp) : r(rt) ); dsp = (dsp 22) 0x01; +assert(dsp == resultdsp); +assert(rd == result); + +rt = 0x0001; +result = 0x0080; +resultdsp = 0x00; + +__asm +(wrdsp $0\n\t + shll.qb %0, %2, 0x07\n\t + rddsp %1\n\t + : =r(rd), =r(dsp) + : r(rt) +); +dsp = (dsp 22) 0x01; +assert(dsp == resultdsp); assert(rd == result); return 0; Thanks, applied. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH v2 02/12] block: update error reporting for bdrv_snapshot_delete() and related functions
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: @@ -2528,15 +2530,13 @@ void do_delvm(Monitor *mon, const QDict *qdict) bs1 = NULL; while ((bs1 = bdrv_next(bs1))) { if (bdrv_can_snapshot(bs1)) { -ret = bdrv_snapshot_delete(bs1, name); -if (ret 0) { -if (ret == -ENOTSUP) -monitor_printf(mon, - Snapshots not supported on device '%s'\n, - bdrv_get_device_name(bs1)); -else -monitor_printf(mon, Error %d while deleting snapshot on - '%s'\n, ret, bdrv_get_device_name(bs1)); +bdrv_snapshot_delete(bs1, name, local_err); +if (error_is_set(local_err)) { +qerror_report(ERROR_CLASS_GENERIC_ERROR, Failed to delete + old snapshot on device '%s': %s, Here in do_delvm() it doesn't make sense to talk about an old snapshot. Probably some unchanged copy paste from above? + bdrv_get_device_name(bs), + error_get_pretty(local_err)); +error_free(local_err); } } } Kevin
Re: [Qemu-devel] [PATCH for-1.5 0/2] more tcg-arm cleanups
On Mon, Apr 29, 2013 at 08:08:21AM -0700, Richard Henderson wrote: The first patch really ought to get into 1.5. The second patch is a re-done version of the 19/19 patch from version 5 of the previous patch series. We had dropped that due to wanting to avoid 16MB assumption changes. The revised patch merely changes how we handle the constant that needs loading. Please apply. r~ Richard Henderson (2): tcg-arm: Fix 64-bit tlb load for pre-v6 tcg-arm: Use movi32 in exit_tb tcg/arm/tcg-target.c | 18 -- 1 file changed, 8 insertions(+), 10 deletions(-) Thanks, both applied. -- Aurelien Jarno GPG: 1024D/F1BCDB73 aurel...@aurel32.net http://www.aurel32.net
Re: [Qemu-devel] [PATCH v2 5/6] Vring: use hostmem's RAM safe api
On Fri, May 03, 2013 at 10:45:21AM +0800, Liu Ping Fan wrote: @@ -109,11 +110,14 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring) static int get_indirect(Vring *vring, struct iovec iov[], struct iovec *iov_end, unsigned int *out_num, unsigned int *in_num, -struct vring_desc *indirect) +struct vring_desc *indirect, +MemoryRegion ***mrs) { struct vring_desc desc; unsigned int i = 0, count, found = 0; - +MemoryRegion **cur = *mrs; +int ret = 0; +MemoryRegion *tmp; /* Sanity check */ if (unlikely(indirect-len % sizeof(desc))) { error_report(Invalid length in indirect descriptor: @@ -132,19 +136,23 @@ static int get_indirect(Vring *vring, return -EFAULT; } +*mrs[0] = NULL; The goto fail case is broken when we fail with cur *mrs before calling hostmem_lookup(..., cur, ...) since *cur will be undefined. do { struct vring_desc *desc_ptr; /* Translate indirect descriptor */ desc_ptr = hostmem_lookup(indirect-addr + found * sizeof(desc), - sizeof(desc), NULL, false); + sizeof(desc), + tmp, Please use a more descriptive name, like desc_mr. This function is fairly involved so the variable names should be descriptive. + false); if (!desc_ptr) { error_report(Failed to map indirect descriptor addr %# PRIx64 len %zu, (uint64_t)indirect-addr + found * sizeof(desc), sizeof(desc)); vring-broken = true; -return -EFAULT; +ret = -EFAULT; +goto fail; } desc = *desc_ptr; @@ -155,31 +163,40 @@ static int get_indirect(Vring *vring, error_report(Loop detected: last one at %u indirect size %u, i, count); vring-broken = true; -return -EFAULT; +memory_region_unref(tmp); +ret = -EFAULT; +goto fail; } No need to hold onto tmp and handle all these error cases. After the barrier() desc_ptr is no longer used because we've loaded the descriptor into a local struct. Please unref tmp right after barrier(). @@ -209,15 +240,20 @@ static int get_indirect(Vring *vring, * never a valid descriptor number) if none was found. A negative code is * returned on error. * + * @mrs should be the same cnt as iov[] + * * Stolen from linux/drivers/vhost/vhost.c. */ int vring_pop(VirtIODevice *vdev, Vring *vring, struct iovec iov[], struct iovec *iov_end, - unsigned int *out_num, unsigned int *in_num) + unsigned int *out_num, unsigned int *in_num, + MemoryRegion **mrs) { struct vring_desc desc; unsigned int i, head, found = 0, num = vring-vr.num; uint16_t avail_idx, last_avail_idx; +MemoryRegion **indirect, **cur = mrs; +int ret = 0; /* If there was a fatal error then refuse operation */ if (vring-broken) { @@ -263,17 +299,20 @@ int vring_pop(VirtIODevice *vdev, Vring *vring, *out_num = *in_num = 0; i = head; +mrs[0] = NULL; Same goto fail issue here when cur *mrs before hostmem_lookup(..., cur, ...) has been called. do { if (unlikely(i = num)) { error_report(Desc index is %u %u, head = %u, i, num, head); vring-broken = true; -return -EFAULT; +ret = -EFAULT; +goto fail; } if (unlikely(++found num)) { error_report(Loop detected: last one at %u vq size %u head %u, i, num, head); vring-broken = true; -return -EFAULT; +ret = -EFAULT; +goto fail; } desc = vring-vr.desc[i]; @@ -281,10 +320,13 @@ int vring_pop(VirtIODevice *vdev, Vring *vring, barrier(); if (desc.flags VRING_DESC_F_INDIRECT) { -int ret = get_indirect(vring, iov, iov_end, out_num, in_num, desc); +indirect = cur; +int ret = get_indirect(vring, iov, iov_end, out_num, in_num, desc, +indirect); if (ret 0) { -return ret; + goto fail; Indentation.
Re: [Qemu-devel] [PATCH v2 6/6] virtio-blk: release reference to RAM's memoryRegion
On Fri, May 03, 2013 at 10:45:22AM +0800, Liu Ping Fan wrote: From: Liu Ping Fan pingf...@linux.vnet.ibm.com virtio-blk will unref RAM's memoryRegion when the io-req has been done. So we can avoid to call bdrv_drain_all() when RAM hot unplug. Signed-off-by: Liu Ping Fan pingf...@linux.vnet.ibm.com --- hw/block/dataplane/virtio-blk.c | 51 +- 1 files changed, 39 insertions(+), 12 deletions(-) diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index 3bb57d1..047e1df 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -35,6 +35,8 @@ enum { typedef struct { struct iocb iocb; /* Linux AIO control block */ +MemoryRegion *mrs[VRING_MAX]; +int mrs_cnt; QEMUIOVector *inhdr;/* iovecs for virtio_blk_inhdr */ unsigned int head; /* vring descriptor index */ struct iovec *bounce_iov; /* used if guest buffers are unaligned */ @@ -121,6 +123,10 @@ static void complete_request(struct iocb *iocb, ssize_t ret, void *opaque) * transferred plus the status bytes. */ vring_push(s-vring, req-head, len + sizeof(hdr)); +while (--req-mrs_cnt = 0) { +memory_region_unref(req-mrs[req-mrs_cnt]); +} The vring has the property that every virtqueue element popped will be pushed back. Therefore it might be nicer to hide the MemoryRegion unref inside vring_push(). The user wouldn't have to know about MemoryRegion - that's especially nice since there are other devices besides virtio-blk (like virtio-net) that would need to be modified if we use the current approach. Stefan
Re: [Qemu-devel] [PATCH v2 03/12] savevm: update bdrv_snapshot_find() to find snapshot by id or name and add error parameter
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: Finding snapshot by a name which could also be an id isn't best way how to do it. There will be rewrite of savevm, loadvm and delvm to improve the behavior of these commands. The savevm and loadvm will have their own patch series. Now bdrv_snapshot_find takes more parameters. The name parameter will be matched only against the name of the snapshot and the same applies to id parameter. There is one exception. If you set the last parameter, the name parameter will be matched against the name or the id of a snapshot. This exception is only for backward compatibility for other commands and it will be dropped after all commands will be rewritten. We only need to know if that snapshot exists or not. We don't care about any error message. If snapshot exists it returns TRUE otherwise it returns FALSE. There is also new Error parameter which will containt error messeage if something goes wrong. Signed-off-by: Pavel Hrdina phrd...@redhat.com --- savevm.c | 93 ++-- 1 file changed, 67 insertions(+), 26 deletions(-) diff --git a/savevm.c b/savevm.c index ba97c41..1622c55 100644 --- a/savevm.c +++ b/savevm.c @@ -2262,26 +2262,66 @@ out: return ret; } -static int bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info, - const char *name) +static bool bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info, + const char *name, const char *id, Error **errp, + bool old_match) { QEMUSnapshotInfo *sn_tab, *sn; -int nb_sns, i, ret; +int nb_sns, i; +bool found = false; + +assert(name || id); -ret = -ENOENT; nb_sns = bdrv_snapshot_list(bs, sn_tab); -if (nb_sns 0) -return ret; -for(i = 0; i nb_sns; i++) { +if (nb_sns 0) { +error_setg_errno(errp, -nb_sns, Failed to get a snapshot list); +return found; +} + +if (nb_sns == 0) { +error_setg(errp, Device has no snapshots); This is not an error case, please remove the error_setg(). If the caller indeed needs to have a snapshot, it can generate an error by himself. We cannot assume here that every caller needs it. +return found; +} + +for (i = 0; i nb_sns; i++) { sn = sn_tab[i]; -if (!strcmp(sn-id_str, name) || !strcmp(sn-name, name)) { -*sn_info = *sn; -ret = 0; -break; +if (name id) { +if (!strcmp(sn-id_str, id) !strcmp(sn-name, name)) { +*sn_info = *sn; +found = true; +break; +} +} else if (name) { +/* for compatibility for old bdrv_snapshot_find call + * will be removed */ +if (old_match) { +if (!strcmp(sn-id_str, name) || !strcmp(sn-name, name)) { +*sn_info = *sn; +found = true; +break; +} +} else { +if (!strcmp(sn-name, name)) { +*sn_info = *sn; +found = true; +break; +} +} +} else if (id) { +if (!strcmp(sn-id_str, id)) { +*sn_info = *sn; +found = true; +break; +} } } + +if (!found) { +error_setg(errp, Failed to find snapshot '%s', name ? name : id); +} Same here. + g_free(sn_tab); -return ret; +return found; } /* @@ -2296,7 +2336,7 @@ static int del_existing_snapshots(Monitor *mon, const char *name) bs = NULL; while ((bs = bdrv_next(bs))) { if (bdrv_can_snapshot(bs) -bdrv_snapshot_find(bs, snapshot, name) = 0) +bdrv_snapshot_find(bs, snapshot, name, NULL, NULL, true)) { bdrv_snapshot_delete(bs, name, local_err); if (error_is_set(local_err)) { @@ -2358,8 +2398,7 @@ void do_savevm(Monitor *mon, const QDict *qdict) sn-vm_clock_nsec = qemu_get_clock_ns(vm_clock); if (name) { -ret = bdrv_snapshot_find(bs, old_sn, name); -if (ret = 0) { +if (bdrv_snapshot_find(bs, old_sn, name, NULL, NULL, true)) { pstrcpy(sn-name, sizeof(sn-name), old_sn-name); pstrcpy(sn-id_str, sizeof(sn-id_str), old_sn-id_str); } else { @@ -2440,6 +2479,7 @@ int load_vmstate(const char *name) BlockDriverState *bs, *bs_vm_state; QEMUSnapshotInfo sn; QEMUFile *f; +Error *local_err = NULL; int ret; bs_vm_state = bdrv_snapshots(); @@ -2449,9 +2489,10 @@ int load_vmstate(const char *name) } /* Don't even try to load empty VM states */
[Qemu-devel] Query on network support TCP redir for range of ports
Hi All, Is it possible to specify the range of port to be redirected by qemu? for example, I am want qemu to use the 2000 to 2500 port by qemu. Is it supported? Or do i need to pass all the ports by one by one at invocation? Thanks and Regards, Srikanth Baratam, Bangalore
Re: [Qemu-devel] [PATCH v2 04/12] qapi: Convert delvm
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: QMP command vm-snapshot-delete takes two parameters: name and id. They are optional, but one of the name or id must be provided. If both are provided they will match only the snapshot with the same name and id. The command returns SnapshotInfo only if the snapshot exists, otherwise it returns an error message. HMP command delvm now uses the new vm-snapshot-delete, but behave slightly different. The delvm takes one optional flag -i and one parameter name. If you provide only the name parameter, it will match only snapshots with that name. If you also provide the flag, it will match only snapshots with name as id. Information about successfully deleted snapshot will be printed, otherwise an error message will be printed. These improves behavior of the command to be more strict on selecting snapshots because actual behavior is wrong. Now if you want to delete snapshot with name '2' but there is no snapshot with that name it could delete snapshot with id '2' and that isn't what you want. There is also small hack to ensure that in each block device with different driver type the correct snapshot is deleted. The 'qcow2' and 'sheepdog' internally search at first for id but 'rbd' has only name and therefore search only for name. Signed-off-by: Pavel Hrdina phrd...@redhat.com One general comment: I'm not sure how much sense it really makes to delete snapshots based on ID on multiple images. The user doesn't have any influence on the ID, I think, so a VM-wide snapshot can only be identified by name across multiple images. --- a/savevm.c +++ b/savevm.c @@ -40,6 +40,7 @@ #include trace.h #include qemu/bitops.h #include qemu/iov.h +#include block/block_int.h #define SELF_ANNOUNCE_ROUNDS 5 @@ -2556,31 +2557,73 @@ int load_vmstate(const char *name) return 0; } -void do_delvm(Monitor *mon, const QDict *qdict) +SnapshotInfo *qmp_vm_snapshot_delete(const bool has_name, const char *name, + const bool has_id, const char *id, + Error **errp) { -BlockDriverState *bs, *bs1; +BlockDriverState *bs; +SnapshotInfo *info = NULL; +QEMUSnapshotInfo sn; Error *local_err = NULL; -const char *name = qdict_get_str(qdict, name); + +if (!has_name !has_id) { +error_setg(errp, Name or id must be provided); +return NULL; +} + +if (!has_name) { +name = NULL; +} +if (!has_id) { +id = NULL; +} bs = bdrv_snapshots(); if (!bs) { -monitor_printf(mon, No block device supports snapshots\n); -return; +error_setg(errp, No block device supports snapshots); +return NULL; } -bs1 = NULL; -while ((bs1 = bdrv_next(bs1))) { -if (bdrv_can_snapshot(bs1)) { -bdrv_snapshot_delete(bs1, name, local_err); +if (!bdrv_snapshot_find(bs, sn, name, id, local_err, false)) { +error_propagate(errp, local_err); +return NULL; +} Why is this necessary? The check didn't exist before. + +info = g_malloc0(sizeof(SnapshotInfo)); +info-id = g_strdup(sn.id_str); +info-name = g_strdup(sn.name); +info-date_nsec = sn.date_nsec; +info-date_sec = sn.date_sec; +info-vm_state_size = sn.vm_state_size; +info-vm_clock_nsec = sn.vm_clock_nsec % 10; +info-vm_clock_sec = sn.vm_clock_nsec / 10; + +bs = NULL; +while ((bs = bdrv_next(bs))) { +if (bdrv_can_snapshot(bs) +bdrv_snapshot_find(bs, sn, name, id, NULL, false)) { Aha, this is the reason: The command is underspecified and you change some behaviour that isn't mentioned in the documentation. Before, the command would return an error if a device that supports snapshots doesn't have the specific snapshot. After the patch, it would silently ignore the snapshot - except in bdrv_snapshots(), which is more or less randomly selected. Why is this an improvement? Independent of what we decide here, the result should be added to the QMP documentation. +/* Small hack to ensure that proper snapshot is deleted for every + * block driver. This will be fixed soon. */ +if (!strcmp(bs-drv-format_name, rbd)) { +bdrv_snapshot_delete(bs, sn.name, local_err); +} else { +bdrv_snapshot_delete(bs, sn.id_str, local_err); +} if (error_is_set(local_err)) { -qerror_report(ERROR_CLASS_GENERIC_ERROR, Failed to delete - old snapshot on device '%s': %s, - bdrv_get_device_name(bs), - error_get_pretty(local_err)); +error_setg(errp, Failed to delete old snapshot on + device '%s': %s,
Re: [Qemu-devel] [PATCH v2 05/12] block: update error reporting for bdrv_snapshot_goto() and related functions
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: Signed-off-by: Pavel Hrdina phrd...@redhat.com --- a/block/sheepdog.c +++ b/block/sheepdog.c @@ -1867,7 +1867,9 @@ cleanup: return ret; } -static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id) +static void sd_snapshot_goto(BlockDriverState *bs, + const char *snapshot_id, + Error **errp) { BDRVSheepdogState *s = bs-opaque; BDRVSheepdogState *old_s; @@ -1892,13 +1894,13 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id) ret = find_vdi_name(s, vdi, snapid, tag, vid, 1); if (ret) { -error_report(Failed to find_vdi_name); +error_setg_errno(errp, -ret, Failed to find VDI snapshot '%s', vdi); Isn't snapid what identifies the snapshot? Or maybe the combination of vdi and snapid. goto out; } fd = connect_to_sdog(s); if (fd 0) { -ret = fd; +error_setg_errno(errp, -fd, Failed to connect to sdog); goto out; } @@ -1909,14 +1911,14 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id) closesocket(fd); if (ret) { +error_setg_errno(errp, -ret, Failed to open VDI snapshot '%s', vdi); goto out; } memcpy(s-inode, buf, sizeof(s-inode)); if (!s-inode.vm_state_size) { -error_report(Invalid snapshot); -ret = -ENOENT; +error_setg(errp, Invalid snapshot '%s', snapshot_id); Here as well. goto out; } @@ -1925,16 +1927,12 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id) g_free(buf); g_free(old_s); -return 0; +return; out: /* recover bdrv_sd_state */ memcpy(s, old_s, sizeof(BDRVSheepdogState)); g_free(buf); g_free(old_s); - -error_report(failed to open. recover old bdrv_sd_state.); - -return ret; } static void sd_snapshot_delete(BlockDriverState *bs, --- a/savevm.c +++ b/savevm.c @@ -2529,11 +2529,11 @@ int load_vmstate(const char *name) bs = NULL; while ((bs = bdrv_next(bs))) { if (bdrv_can_snapshot(bs)) { -ret = bdrv_snapshot_goto(bs, name); -if (ret 0) { -error_report(Error %d while activating snapshot '%s' on '%s', - ret, name, bdrv_get_device_name(bs)); -return ret; +bdrv_snapshot_goto(bs, name, local_err); +if (error_is_set(local_err)) { +qerror_report_err(local_err); Shouldn't we add the device name on which the failure occurred? +error_free(local_err); +return -EIO; } } } Kevin
Re: [Qemu-devel] [PATCH] qemu-iotests: Filter out 'adapter_type'
On Fri, May 03, 2013 at 03:31:16PM +0800, Fam Zheng wrote: Filter out vmdk creation option 'adapter_type' for vmdk. So that tests with an explicit './check -o adapter_type=XXX' will not fail. --- tests/qemu-iotests/common.rc | 1 + 1 file changed, 1 insertion(+) Thanks, applied to my block tree: https://github.com/stefanha/qemu/commits/block Stefan
Re: [Qemu-devel] [PATCH] curl: fix curl read
On Fri, May 03, 2013 at 04:00:09PM +0800, Fam Zheng wrote: CURL library API has changed, the current curl driver is not working. This patch rewrites the use of API as well as the structure of internal states. (It is hard to split this to multiple patches as basically all these changes need to work together.) Which libcurl APIs have changed? I expect libcurl to be backwards compatible. What is the minimum libcurl version needed with this patch applied? Stefan
Re: [Qemu-devel] [PATCH v2 07/12] savevm: update error reporting for qemu_loadvm_state()
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: Signed-off-by: Pavel Hrdina phrd...@redhat.com --- include/sysemu/sysemu.h | 2 +- migration.c | 11 ++--- savevm.c| 65 - 3 files changed, 40 insertions(+), 38 deletions(-) @@ -2212,8 +2216,8 @@ int qemu_loadvm_state(QEMUFile *f) ret = vmstate_load(f, le-se, le-version_id); if (ret 0) { -fprintf(stderr, qemu: warning: error while loading state for instance 0x%x of device '%s'\n, -instance_id, idstr); +error_setg(errp, Error while loading state for instance + 0x%x of device '%s', instance_id, idstr); error_setg_errno? goto out; } break; @@ -2227,40 +2231,35 @@ int qemu_loadvm_state(QEMUFile *f) } } if (le == NULL) { -fprintf(stderr, Unknown savevm section %d\n, section_id); -ret = -EINVAL; +error_setg(errp, Unknown vmstate section %d, section_id); goto out; } ret = vmstate_load(f, le-se, le-version_id); if (ret 0) { -fprintf(stderr, qemu: warning: error while loading state section id %d\n, +error_setg(errp, Error while loading state section id %d, section_id); Here, too. goto out; } break; default: -fprintf(stderr, Unknown savevm section type %d\n, section_type); -ret = -EINVAL; +error_setg(errp, Unknown vmstate section type %d, section_type); goto out; } } Kevin
Re: [Qemu-devel] [PATCH V4 0/5] block: make qmp_transaction extendable
On Thu, May 02, 2013 at 10:26:41AM +0800, Wenchao Xia wrote: This serial will package backing chain snapshot code as one case, to make it possible adding more operations later. v2: Address Kevin's comments: Use the same prototype prepare, commit, rollback model in original code, commit should never fail. v3: Address Stefan's comments: 3/5, 4/5: remove *action parameter since later only BlkTransactionStates* is needed. 5/5: embbed BlkTransactionStates in ExternalSnapshotStates, *opaque is removed, related call back function format change for external snapshot. Address Kevin's comments: removed all indention in commit message. 1/5: return void for prepare() function, *errp plays the role as error checker. 5/5: mark *commit callback must exist, *rollback callback can be NULL. Align callback = in const BdrvActionOps external_snapshot_ops to the same colum. Address Eric's comments: 1/5: better commit message. 5/5: better commit message and comments in code that only one of rollback() or commit() will be called. v4: 5/5: document clean() callback will always be called if it present, declare static for global variable actions, use array plus .instance_size to remove switch checking code according to caller input. Wenchao Xia (5): 1 block: package preparation code in qmp_transaction() 2 block: move input parsing code in qmp_transaction() 3 block: package committing code in qmp_transaction() 4 block: package rollback code in qmp_transaction() 5 block: make all steps in qmp_transaction() as callback blockdev.c | 263 ++- 1 files changed, 169 insertions(+), 94 deletions(-) Good for QEMU 1.6. Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Re: [Qemu-devel] [PATCH] curl: fix curl read
On Fri, 05/03 13:09, Stefan Hajnoczi wrote: On Fri, May 03, 2013 at 04:00:09PM +0800, Fam Zheng wrote: CURL library API has changed, the current curl driver is not working. This patch rewrites the use of API as well as the structure of internal states. (It is hard to split this to multiple patches as basically all these changes need to work together.) Which libcurl APIs have changed? I expect libcurl to be backwards compatible. Function curl_multi_socket_all (see man 3 curl_multi_socket) is deprecated, and not working as I tried the current curl driver, build against libcurl 7.27.0. It's deprecated to be replaced by curl_multi_socket_action. The version does not concern, they are all introduced in the same version, 7.15.4 [1], and has been there for long. [1]: http://curl.haxx.se/libcurl/c/curl_multi_socket_action.html -- Fam
Re: [Qemu-devel] [PATCH 2/8] target-ppc: Convert ppc cpu savevm to VMStateDescription
Am 03.05.2013 03:38, schrieb David Gibson: The savevm code for the powerpc cpu emulation is currently based around the old register_savevm() rather than register_vmstate() method. It's also rather broken, missing some important state on some CPU models. This patch completely rewrites the savevm for target-ppc, using the new VMStateDescription approach. Exactly what needs to be saved in what configurations has been more carefully examined, too. This introduces a new version (5) of the cpu save format. The old load function is retained to support version 4 images. Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- target-ppc/cpu.h |9 +- target-ppc/machine.c | 542 ++ 2 files changed, 460 insertions(+), 91 deletions(-) [...] diff --git a/target-ppc/machine.c b/target-ppc/machine.c index 2d10adb..594fe6a 100644 --- a/target-ppc/machine.c +++ b/target-ppc/machine.c [...] +void cpu_save(QEMUFile *f, void *opaque) +{ +vmstate_save_state(f, vmstate_cpu, opaque); + +} + +int cpu_load(QEMUFile *f, void *opaque, int version_id) +{ +return vmstate_load_state(f, vmstate_cpu, opaque, version_id); +} Please drop cpu_{save,load}() and use the VMStateDescription-based registration mechanism cpu_class_set_vmsd() from PowerPCCPU's instance_init in translate_init.c. I'm pretty certain I CC'ed you on that series... Thanks, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH v2 08/12] qapi: Convert loadvm
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: QMP command vm-snapshot-load and HMP command loadvm behave similar to vm-snapshot-delete and delvm. The only different is that they will load the snapshot instead of deleting it. Signed-off-by: Pavel Hrdina phrd...@redhat.com bs = NULL; while ((bs = bdrv_next(bs))) { if (bdrv_can_snapshot(bs)) { -bdrv_snapshot_goto(bs, name, local_err); +/* Small hack to ensure that proper snapshot is deleted for every + * block driver. This will be fixed soon. */ +if (!strcmp(bs-drv-format_name, rbd)) { +bdrv_snapshot_goto(bs, sn.name, local_err); +} else { +bdrv_snapshot_goto(bs, sn.id_str, local_err); +} I think I wanted to comment this in an earlier patch in this series, but didn't actually do it, so here is what I intended to write there: Umm... Just no. Special casing an image format should _never_ happen. Even more so outside block.c (and it's disliked even there). It's a sign that we're doing something seriously wrong. if (error_is_set(local_err)) { -qerror_report_err(local_err); +error_setg(errp, Failed to load snapshot for device '%s': %s, + bdrv_get_device_name(bs), + error_get_pretty(local_err)); error_free(local_err); -return -EIO; +return NULL; } } } --- a/vl.c +++ b/vl.c @@ -4376,8 +4376,13 @@ int main(int argc, char **argv, char **envp) qemu_system_reset(VMRESET_SILENT); if (loadvm) { -if (load_vmstate(loadvm) 0) { +Error *local_err = NULL; +qmp_vm_snapshot_load(true, loadvm, false, NULL, local_err); + +if (error_is_set(local_err)) { autostart = 0; +qerror_report_err(local_err); +error_free(local_err); } } Should we add something like Loading the snapshot failed? It's probably not clear from all possible error messages. Kevin
Re: [Qemu-devel] [PATCH qom-cpu 6/9] target-i386: Add feature-words property
On Mon, 22 Apr 2013 16:00:17 -0300 Eduardo Habkost ehabk...@redhat.com wrote: This property will be useful for libvirt, as libvirt already has logic based on low-level feature bits (not feature names), so it will be really easy to convert the current libvirt logic to something using the feature-words property. The property will have two main use cases: - Checking host capabilities, by checking the features of the host CPU model - Checking which features are enabled on each CPU model patch doesn't apply to current qom-cpu, more comments below. Example output: $ ./QMP/qmp --path=/tmp/m qom-get --path=/machine/unattached/device[1] --property=feature-words item[0].cpuid-register: EDX item[0].cpuid-input-eax: 2147483658 item[0].features: 0 item[1].cpuid-register: EAX item[1].cpuid-input-eax: 1073741825 item[1].features: 0 item[2].cpuid-register: EDX item[2].cpuid-input-eax: 3221225473 item[2].features: 0 item[3].cpuid-register: ECX item[3].cpuid-input-eax: 2147483649 item[3].features: 101 item[4].cpuid-register: EDX item[4].cpuid-input-eax: 2147483649 item[4].features: 563346425 item[5].cpuid-register: EBX item[5].cpuid-input-eax: 7 item[5].features: 0 item[5].cpuid-input-ecx: 0 could item be represented as CPUID function in format used in Intel's SDM? item[5].CPUID: EAX=7h,ECX=0h item[5].EBX: 0 item[5].EAX: 0 for simplicity/uniformity ECX could be not optional, always present, and ignored when not needed. item[6].cpuid-register: ECX item[6].cpuid-input-eax: 1 item[6].features: 2155880449 item[7].cpuid-register: EDX item[7].cpuid-input-eax: 1 item[7].features: 126614521 Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- Changes v1 - v2: * Merge the non-qapi and qapi patches, to keep series simpler * Use the feature word array series as base, so we don't have to set the feature word values one-by-one in the code * Change type name of property from x86-cpu-feature-words to X86CPUFeatureWordInfo * Remove cpu-qapi-schema.json and simply add the type definitions to qapi-schema.json, to keep the changes simpler * This required compiling qapi-types.o and qapi-visit.o into *-user as well --- .gitignore| 2 ++ Makefile.objs | 7 +- qapi-schema.json | 31 target-i386/cpu.c | 70 +-- 4 files changed, 97 insertions(+), 13 deletions(-) diff --git a/.gitignore b/.gitignore index 487813a..71408e3 100644 --- a/.gitignore +++ b/.gitignore @@ -21,6 +21,8 @@ linux-headers/asm qapi-generated qapi-types.[ch] qapi-visit.[ch] +cpu-qapi-types.[ch] +cpu-qapi-visit.[ch] still needed? qmp-commands.h qmp-marshal.c qemu-doc.html diff --git a/Makefile.objs b/Makefile.objs index a473348..1835d37 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -78,10 +78,15 @@ common-obj-$(CONFIG_SMARTCARD_NSS) += $(libcacard-y) ## # qapi -common-obj-y += qmp-marshal.o qapi-visit.o qapi-types.o +common-obj-y += qmp-marshal.o common-obj-y += qmp.o hmp.o endif +## +# some qapi visitors are used by both system and user emulation: + +common-obj-y += qapi-visit.o qapi-types.o + ### # Target-independent parts used in system and user emulation common-obj-y += qemu-log.o diff --git a/qapi-schema.json b/qapi-schema.json index 751d3c2..424434f 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3505,3 +3505,34 @@ '*asl_compiler_rev': 'uint32', '*file': 'str', '*data': 'str' }} + +# @X86CPURegister32 +# +# A X86 32-bit register +# +# Since: 1.5 +## +{ 'enum': 'X86CPURegister32', + 'data': [ 'EAX', 'EBX', 'ECX', 'EDX', 'ESP', 'EBP', 'ESI', 'EDI' ] } + +## +# @X86CPUFeatureWordInfo +# +# Information about a X86 CPU feature word +# +# @cpuid-input-eax: Input EAX value for CPUID instruction for that feature word +# +# @cpuid-input-ecx: #optional Input ECX value for CPUID instruction for that +# feature word +# +# @cpuid-register: Output register containing the feature bits +# +# @features: value of output register, containing the feature bits +# +# Since: 1.5 +## +{ 'type': 'X86CPUFeatureWordInfo', + 'data': { 'cpuid-input-eax': 'int', +'*cpuid-input-ecx': 'int', +'cpuid-register': 'X86CPURegister32', +'features': 'int' } } diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 314931e..757749c 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -30,6 +30,8 @@ #include qemu/config-file.h #include qapi/qmp/qerror.h +#include qapi-types.h +#include qapi-visit.h #include qapi/visitor.h #include
Re: [Qemu-devel] [PATCH 7/7] prep: QOM'ify System I/O
Am 02.05.2013 22:09, schrieb Hervé Poussineau: Most of the functionality is extracted from hw/ppc/prep.c. Also add support for board identification/equipment registers. Document it for the IBM 43p emulation. Cc: Julio Guerra gu...@julio.in Signed-off-by: Hervé Poussineau hpous...@reactos.org --- docs/ibm_43p.cfg |5 + hw/ppc/Makefile.objs |1 + hw/ppc/prep_systemio.c | 298 trace-events |4 + 4 files changed, 308 insertions(+) create mode 100644 hw/ppc/prep_systemio.c Haven't reviewed the full patch yet, but since this is not modifying hw/ppc/prep.c, it is duplicating code rather than QOM'ifying the existing code. Have you looked into Julio's patch whom you CC? I'm still not sure how to solve things for 1.5 (and this series a consider -next). Regards, Andreas
Re: [Qemu-devel] [PATCH 2/7] qom: handle registration of new types when initializing the first ones
Am 02.05.2013 22:08, schrieb Hervé Poussineau: When initializing all types in object_class_foreach, called by object_class_get_list, some new types may be registered. Those will change the type internal hashtable which is currently enumerated, and may crash QEMU. Fix it, by adding a second hash table which contains all the non-initialized types, merged to the main one before each round of initializations. Bug has been detected when registering dynamic types containing an interface. Signed-off-by: Hervé Poussineau hpous...@reactos.org --- qom/object.c | 45 + 1 file changed, 37 insertions(+), 8 deletions(-) Could you be more specific about how to reproduce the problem? Is it a generic issue or specific to some later patch in this series? I find neither object_class_get_list() nor object_class_for_each() being used in this series. And registering types during object_class_for_each() doesn't sound right... CC'ing Anthony and Paolo. Andreas diff --git a/qom/object.c b/qom/object.c index 75e6aac..e0a24dc 100644 --- a/qom/object.c +++ b/qom/object.c @@ -65,25 +65,39 @@ struct TypeImpl static Type type_interface; +static GHashTable *type_table_to_initialize; +static GHashTable *type_table_initialized; + static GHashTable *type_table_get(void) { -static GHashTable *type_table; - -if (type_table == NULL) { -type_table = g_hash_table_new(g_str_hash, g_str_equal); +if (!type_table_initialized) { +type_table_initialized = g_hash_table_new(g_str_hash, g_str_equal); } -return type_table; +return type_table_initialized; } static void type_table_add(TypeImpl *ti) { -g_hash_table_insert(type_table_get(), (void *)ti-name, ti); +GHashTable **type_table; +if (ti-class) { +type_table = type_table_initialized; +} else { +type_table = type_table_to_initialize; +} +if (!*type_table) { +*type_table = g_hash_table_new(g_str_hash, g_str_equal); +} +g_hash_table_insert(*type_table, (void *)ti-name, ti); } static TypeImpl *type_table_lookup(const char *name) { -return g_hash_table_lookup(type_table_get(), name); +TypeImpl *ret = g_hash_table_lookup(type_table_get(), name); +if (!ret type_table_to_initialize) { +ret = g_hash_table_lookup(type_table_to_initialize, name); +} +return ret; } static TypeImpl *type_register_internal(const TypeInfo *info) @@ -573,13 +587,28 @@ static void object_class_foreach_tramp(gpointer key, gpointer value, data-fn(k, data-opaque); } +static void object_class_merge(gpointer key, gpointer value, + gpointer opaque) +{ +g_hash_table_insert(type_table_get(), key, value); +} + void object_class_foreach(void (*fn)(ObjectClass *klass, void *opaque), const char *implements_type, bool include_abstract, void *opaque) { OCFData data = { fn, implements_type, include_abstract, opaque }; -g_hash_table_foreach(type_table_get(), object_class_foreach_tramp, data); +while (type_table_to_initialize + g_hash_table_size(type_table_to_initialize) 0) { +g_hash_table_foreach(type_table_to_initialize, object_class_merge, + NULL); +g_hash_table_destroy(type_table_to_initialize); +type_table_to_initialize = NULL; + +g_hash_table_foreach(type_table_get(), object_class_foreach_tramp, + data); +} } int object_child_foreach(Object *obj, int (*fn)(Object *child, void *opaque), -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
[Qemu-devel] [PULL 00/15] Block patches
Details on the patches: 1. Fam Zheng's VMDK zeroed-grain GTEs implements zero cluster support in VMDK. We need this to correctly read files containing zero clusters - it's essentially a bugfix. 2. Jeff Cody's VHDX series implements read-only support for the new Hyper-V image format. The series has been on the list for a while and stripped down to make it mergable for QEMU 1.5. Not all image files are supported yet but this already allows for new v2v migrations. 3. Kevin Wolf's qmp_block_resize error clarification. 4. My NBD fix for new Linux nbd drivers that can send 1 MB requests. The following changes since commit 8ca27ce2e1150486ea2db4116a03706b28294f16: Merge remote-tracking branch 'afaerber/qom-cpu' into staging (2013-05-02 10:57:01 -0500) are available in the git repository at: git://github.com/stefanha/qemu.git block for you to fetch changes up to 86abefd61e23325162e59e5bfb8f0346eda62541: qemu-iotests: Filter out 'adapter_type' (2013-05-03 13:06:22 +0200) Fam Zheng (8): vmdk: named return code. vmdk: add support for “zeroed‐grain” GTE vmdk: Add option to create zeroed-grain image vmdk: change magic number to macro vmdk: store fields of VmdkMetaData in cpu endian vmdk: add bdrv_co_write_zeroes qemu-iotests: Filter out vmdk creation options qemu-iotests: Filter out 'adapter_type' Jeff Cody (4): qemu: add castagnoli crc32c checksum algorithm block: vhdx header for the QEMU support of VHDX images block: initial VHDX driver support framework - supports open and probe block: add read-only support to VHDX image format. Kevin Wolf (1): blockdev: Replace undefined error in qmp_block_resize Stefan Hajnoczi (2): nbd: use g_slice_new() instead of a freelist nbd: support large NBD requests block/Makefile.objs | 1 + block/vhdx.c | 972 +++ block/vhdx.h | 325 +++ block/vmdk.c | 208 ++--- blockdev.c | 6 +- include/block/nbd.h | 3 +- include/qemu/crc32c.h| 35 ++ nbd.c| 36 +- tests/qemu-iotests/common.rc | 3 + util/Makefile.objs | 1 + util/crc32c.c| 115 + 11 files changed, 1618 insertions(+), 87 deletions(-) create mode 100644 block/vhdx.c create mode 100644 block/vhdx.h create mode 100644 include/qemu/crc32c.h create mode 100644 util/crc32c.c -- 1.8.1.4
[Qemu-devel] [PATCH 04/15] block: add read-only support to VHDX image format.
From: Jeff Cody jc...@redhat.com This adds in read-only support to the VHDX image format. This supports reads for fixed-size, and dynamic sized VHDX images. Differencing files are still unsupported. The image must be opened without BDRV_O_RDWR set, because we do not yet update the headers. I.e., pass 'readonly=on' in the drive image options from the QEMU commandline. Signed-off-by: Jeff Cody jc...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vhdx.c | 123 ++- 1 file changed, 121 insertions(+), 2 deletions(-) diff --git a/block/vhdx.c b/block/vhdx.c index 0ee10a7..e9704b1 100644 --- a/block/vhdx.c +++ b/block/vhdx.c @@ -114,6 +114,17 @@ typedef struct VHDXMetadataEntries { } VHDXMetadataEntries; +typedef struct VHDXSectorInfo { +uint32_t bat_idx; /* BAT entry index */ +uint32_t sectors_avail; /* sectors available in payload block */ +uint32_t bytes_left;/* bytes left in the block after data to r/w */ +uint32_t bytes_avail; /* bytes available in payload block */ +uint64_t file_offset; /* absolute offset in bytes, in file */ +uint64_t block_offset; /* block offset, in bytes */ +} VHDXSectorInfo; + + + typedef struct BDRVVHDXState { CoMutex lock; @@ -792,7 +803,7 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags) goto fail; } -/* TODO: differencing files, read, write */ +/* TODO: differencing files, write */ return 0; fail: @@ -810,10 +821,118 @@ static int vhdx_reopen_prepare(BDRVReopenState *state, } +/* + * Perform sector to block offset translations, to get various + * sector and file offsets into the image. See VHDXSectorInfo + */ +static void vhdx_block_translate(BDRVVHDXState *s, int64_t sector_num, + int nb_sectors, VHDXSectorInfo *sinfo) +{ +uint32_t block_offset; + +sinfo-bat_idx = sector_num s-sectors_per_block_bits; +/* effectively a modulo - this gives us the offset into the block + * (in sector sizes) for our sector number */ +block_offset = sector_num - (sinfo-bat_idx s-sectors_per_block_bits); +/* the chunk ratio gives us the interleaving of the sector + * bitmaps, so we need to advance our page block index by the + * sector bitmaps entry number */ +sinfo-bat_idx += sinfo-bat_idx s-chunk_ratio_bits; + +/* the number of sectors we can read/write in this cycle */ +sinfo-sectors_avail = s-sectors_per_block - block_offset; + +sinfo-bytes_left = sinfo-sectors_avail s-logical_sector_size_bits; + +if (sinfo-sectors_avail nb_sectors) { +sinfo-sectors_avail = nb_sectors; +} + +sinfo-bytes_avail = sinfo-sectors_avail s-logical_sector_size_bits; + +sinfo-file_offset = s-bat[sinfo-bat_idx] VHDX_BAT_FILE_OFF_BITS; + +sinfo-block_offset = block_offset s-logical_sector_size_bits; + +/* The file offset must be past the header section, so must be 0 */ +if (sinfo-file_offset == 0) { +return; +} + +/* block offset is the offset in vhdx logical sectors, in + * the payload data block. Convert that to a byte offset + * in the block, and add in the payload data block offset + * in the file, in bytes, to get the final read address */ + +sinfo-file_offset = 20; /* now in bytes, rather than 1MB units */ +sinfo-file_offset += sinfo-block_offset; +} + + + static coroutine_fn int vhdx_co_readv(BlockDriverState *bs, int64_t sector_num, int nb_sectors, QEMUIOVector *qiov) { -return -ENOTSUP; +BDRVVHDXState *s = bs-opaque; +int ret = 0; +VHDXSectorInfo sinfo; +uint64_t bytes_done = 0; +QEMUIOVector hd_qiov; + +qemu_iovec_init(hd_qiov, qiov-niov); + +qemu_co_mutex_lock(s-lock); + +while (nb_sectors 0) { +/* We are a differencing file, so we need to inspect the sector bitmap + * to see if we have the data or not */ +if (s-params.data_bits VHDX_PARAMS_HAS_PARENT) { +/* not supported yet */ +ret = -ENOTSUP; +goto exit; +} else { +vhdx_block_translate(s, sector_num, nb_sectors, sinfo); + +qemu_iovec_reset(hd_qiov); +qemu_iovec_concat(hd_qiov, qiov, bytes_done, sinfo.bytes_avail); + +/* check the payload block state */ +switch (s-bat[sinfo.bat_idx] VHDX_BAT_STATE_BIT_MASK) { +case PAYLOAD_BLOCK_NOT_PRESENT: /* fall through */ +case PAYLOAD_BLOCK_UNDEFINED: /* fall through */ +case PAYLOAD_BLOCK_UNMAPPED:/* fall through */ +case PAYLOAD_BLOCK_ZERO: +/* return zero */ +qemu_iovec_memset(hd_qiov, 0, 0, sinfo.bytes_avail); +break; +case PAYLOAD_BLOCK_FULL_PRESENT: +qemu_co_mutex_unlock(s-lock); +ret = bdrv_co_readv(bs-file, +
[Qemu-devel] [PATCH 02/15] block: vhdx header for the QEMU support of VHDX images
From: Jeff Cody jc...@redhat.com This is based on Microsoft's VHDX specification: VHDX Format Specification v0.95, published 4/12/2012 https://www.microsoft.com/en-us/download/details.aspx?id=29681 These structures define the various header, metadata, and other block structures defined in the VHDX specification. Signed-off-by: Jeff Cody jc...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vhdx.h | 311 +++ 1 file changed, 311 insertions(+) create mode 100644 block/vhdx.h diff --git a/block/vhdx.h b/block/vhdx.h new file mode 100644 index 000..fcddd37 --- /dev/null +++ b/block/vhdx.h @@ -0,0 +1,311 @@ +/* + * Block driver for Hyper-V VHDX Images + * + * Copyright (c) 2013 Red Hat, Inc., + * + * Authors: + * Jeff Cody jc...@redhat.com + * + * This is based on the VHDX Format Specification v0.95, published 4/12/2012 + * by Microsoft: + * https://www.microsoft.com/en-us/download/details.aspx?id=29681 + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. + * + */ + +#ifndef BLOCK_VHDX_H +#define BLOCK_VHDX_H + +/* Structures and fields present in the VHDX file */ + +/* The header section has the following blocks, + * each block is 64KB: + * + * _ + * | File Id. | Header 1| Header 2 | Region Table | Reserved (768KB) | + * |--|---||--|| + * | | || | | + * 0.64KB...128KB192KB..256KB1MB + */ + +#define VHDX_HEADER_BLOCK_SIZE (64*1024) + +#define VHDX_FILE_ID_OFFSET 0 +#define VHDX_HEADER1_OFFSET (VHDX_HEADER_BLOCK_SIZE*1) +#define VHDX_HEADER2_OFFSET (VHDX_HEADER_BLOCK_SIZE*2) +#define VHDX_REGION_TABLE_OFFSET(VHDX_HEADER_BLOCK_SIZE*3) + + +/* + * A note on the use of MS-GUID fields. For more details on the GUID, + * please see: https://en.wikipedia.org/wiki/Globally_unique_identifier. + * + * The VHDX specification only states that these are MS GUIDs, and which + * bytes are data1-data4. It makes no mention of what algorithm should be used + * to generate the GUID, nor what standard. However, looking at the specified + * known GUID fields, it appears the GUIDs are: + * Standard/DCE GUID type (noted by 10b in the MSB of byte 0 of .data4) + * Random algorithm(noted by 0x4XXX for .data3) + */ + +/* HEADER SECTION STRUCTURES */ + +/* These structures are ones that are defined in the VHDX specification + * document */ + +typedef struct VHDXFileIdentifier { +uint64_tsignature; /* vhdxfile in ASCII */ +uint16_tcreator[256]; /* optional; utf-16 string to identify + the vhdx file creator. Diagnotistic + only */ +} VHDXFileIdentifier; + + +/* the guid is a 16 byte unique ID - the definition for this used by + * Microsoft is not just 16 bytes though - it is a structure that is defined, + * so we need to follow it here so that endianness does not trip us up */ + +typedef struct MSGUID { +uint32_t data1; +uint16_t data2; +uint16_t data3; +uint8_t data4[8]; +} MSGUID; + +#define guid_eq(a, b) \ +(memcmp((a), (b), sizeof(MSGUID)) == 0) + +#define VHDX_HEADER_SIZE (4*1024) /* although the vhdx_header struct in disk + is only 582 bytes, for purposes of crc + the header is the first 4KB of the 64KB + block */ + +/* The full header is 4KB, although the actual header data is much smaller. + * But for the checksum calculation, it is over the entire 4KB structure, + * not just the defined portion of it */ +typedef struct QEMU_PACKED VHDXHeader { +uint32_tsignature; /* head in ASCII */ +uint32_tchecksum; /* CRC-32C hash of the whole header */ +uint64_tsequence_number;/* Seq number of this header. Each + VHDX file has 2 of these headers, + and only the header with the highest + sequence number is valid */ +MSGUID file_write_guid; /* 128 bit unique identifier. Must be + updated to new, unique value before + the first modification is made to + file */ +MSGUID data_write_guid;/* 128 bit unique identifier. Must be + updated to new, unique value before +
[Qemu-devel] [PATCH 03/15] block: initial VHDX driver support framework - supports open and probe
From: Jeff Cody jc...@redhat.com This is the initial block driver framework for VHDX image support (i.e. Hyper-V image file formats), that supports opening VHDX files, and parsing the headers. This commit does not yet enable: - reading - writing - updating the header - differencing files (images with parents) - log replay / dirty logs (only clean images) This is based on Microsoft's VHDX specification: VHDX Format Specification v0.95, published 4/12/2012 https://www.microsoft.com/en-us/download/details.aspx?id=29681 Signed-off-by: Jeff Cody jc...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/Makefile.objs | 1 + block/vhdx.c| 853 block/vhdx.h| 14 + 3 files changed, 868 insertions(+) create mode 100644 block/vhdx.c diff --git a/block/Makefile.objs b/block/Makefile.objs index 6c4b5bc..5f0358a 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -2,6 +2,7 @@ block-obj-y += raw.o cow.o qcow.o vdi.o vmdk.o cloop.o dmg.o bochs.o vpc.o vvfat block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o block-obj-y += qed-check.o +block-obj-y += vhdx.o block-obj-y += parallels.o blkdebug.o blkverify.o block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o block-obj-$(CONFIG_POSIX) += raw-posix.o diff --git a/block/vhdx.c b/block/vhdx.c new file mode 100644 index 000..0ee10a7 --- /dev/null +++ b/block/vhdx.c @@ -0,0 +1,853 @@ +/* + * Block driver for Hyper-V VHDX Images + * + * Copyright (c) 2013 Red Hat, Inc., + * + * Authors: + * Jeff Cody jc...@redhat.com + * + * This is based on the VHDX Format Specification v0.95, published 4/12/2012 + * by Microsoft: + * https://www.microsoft.com/en-us/download/details.aspx?id=29681 + * + * This work is licensed under the terms of the GNU LGPL, version 2 or later. + * See the COPYING.LIB file in the top-level directory. + * + */ + +#include qemu-common.h +#include block/block_int.h +#include qemu/module.h +#include qemu/crc32c.h +#include block/vhdx.h + + +/* Several metadata and region table data entries are identified by + * guids in a MS-specific GUID format. */ + + +/* --- Known Region Table GUIDs -- */ +static const MSGUID bat_guid = { .data1 = 0x2dc27766, + .data2 = 0xf623, + .data3 = 0x4200, + .data4 = { 0x9d, 0x64, 0x11, 0x5e, + 0x9b, 0xfd, 0x4a, 0x08} }; + +static const MSGUID metadata_guid = { .data1 = 0x8b7ca206, + .data2 = 0x4790, + .data3 = 0x4b9a, + .data4 = { 0xb8, 0xfe, 0x57, 0x5f, + 0x05, 0x0f, 0x88, 0x6e} }; + + + +/* --- Known Metadata Entry GUIDs -- */ +static const MSGUID file_param_guid = { .data1 = 0xcaa16737, + .data2 = 0xfa36, + .data3 = 0x4d43, + .data4 = { 0xb3, 0xb6, 0x33, 0xf0, + 0xaa, 0x44, 0xe7, 0x6b} }; + +static const MSGUID virtual_size_guid = { .data1 = 0x2FA54224, + .data2 = 0xcd1b, + .data3 = 0x4876, + .data4 = { 0xb2, 0x11, 0x5d, 0xbe, + 0xd8, 0x3b, 0xf4, 0xb8} }; + +static const MSGUID page83_guid = { .data1 = 0xbeca12ab, + .data2 = 0xb2e6, + .data3 = 0x4523, + .data4 = { 0x93, 0xef, 0xc3, 0x09, + 0xe0, 0x00, 0xc7, 0x46} }; + + +static const MSGUID phys_sector_guid = { .data1 = 0xcda348c7, + .data2 = 0x445d, + .data3 = 0x4471, + .data4 = { 0x9c, 0xc9, 0xe9, 0x88, + 0x52, 0x51, 0xc5, 0x56} }; + +static const MSGUID parent_locator_guid = { .data1 = 0xa8d35f2d, +.data2 = 0xb30b, +.data3 = 0x454d, +.data4 = { 0xab, 0xf7, 0xd3, + 0xd8, 0x48, 0x34, + 0xab, 0x0c} }; + +static const MSGUID logical_sector_guid = { .data1 = 0x8141bf1d, +.data2 =
[Qemu-devel] [PATCH 01/15] qemu: add castagnoli crc32c checksum algorithm
From: Jeff Cody jc...@redhat.com This adds the Castagnoli CRC32C algorithm, using the 0x11EDC6F41 polynomial. This is extracted from the linux kernel cryptographic crc32.c module. The algorithm is based on: Castagnoli93: Guy Castagnoli and Stefan Braeuer and Martin Herrman Optimization of Cyclic Redundancy-Check Codes with 24 and 32 Parity Bits, IEEE Transactions on Communication, Volume 41, Number 6, June 1993 Signed-off-by: Jeff Cody jc...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- include/qemu/crc32c.h | 35 +++ util/Makefile.objs| 1 + util/crc32c.c | 115 ++ 3 files changed, 151 insertions(+) create mode 100644 include/qemu/crc32c.h create mode 100644 util/crc32c.c diff --git a/include/qemu/crc32c.h b/include/qemu/crc32c.h new file mode 100644 index 000..56d1c3b --- /dev/null +++ b/include/qemu/crc32c.h @@ -0,0 +1,35 @@ +/* + * Castagnoli CRC32C Checksum Algorithm + * + * Polynomial: 0x11EDC6F41 + * + * Castagnoli93: Guy Castagnoli and Stefan Braeuer and Martin Herrman + * Optimization of Cyclic Redundancy-Check Codes with 24 + * and 32 Parity Bits,IEEE Transactions on Communication, + *Volume 41, Number 6, June 1993 + * + * Copyright (c) 2013 Red Hat, Inc., + * + * Authors: + * Jeff Cody jc...@redhat.com + * + * Based on the Linux kernel cryptographic crc32c module, + * + * Copyright (c) 2004 Cisco Systems, Inc. + * Copyright (c) 2008 Herbert Xu herb...@gondor.apana.org.au + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. + * + */ + +#ifndef QEMU_CRC32_H +#define QEMU_CRC32_H + +#include qemu-common.h + +uint32_t crc32c(uint32_t crc, const uint8_t *data, unsigned int length); + +#endif diff --git a/util/Makefile.objs b/util/Makefile.objs index c5652f5..4a1bd4e 100644 --- a/util/Makefile.objs +++ b/util/Makefile.objs @@ -10,3 +10,4 @@ util-obj-$(CONFIG_POSIX) += compatfd.o util-obj-y += iov.o aes.o qemu-config.o qemu-sockets.o uri.o notify.o util-obj-y += qemu-option.o qemu-progress.o util-obj-y += hexdump.o +util-obj-y += crc32c.o diff --git a/util/crc32c.c b/util/crc32c.c new file mode 100644 index 000..8866327 --- /dev/null +++ b/util/crc32c.c @@ -0,0 +1,115 @@ +/* + * Castagnoli CRC32C Checksum Algorithm + * + * Polynomial: 0x11EDC6F41 + * + * Castagnoli93: Guy Castagnoli and Stefan Braeuer and Martin Herrman + * Optimization of Cyclic Redundancy-Check Codes with 24 + * and 32 Parity Bits,IEEE Transactions on Communication, + *Volume 41, Number 6, June 1993 + * + * Copyright (c) 2013 Red Hat, Inc., + * + * Authors: + * Jeff Cody jc...@redhat.com + * + * Based on the Linux kernel cryptographic crc32c module, + * + * Copyright (c) 2004 Cisco Systems, Inc. + * Copyright (c) 2008 Herbert Xu herb...@gondor.apana.org.au + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License as published by the Free + * Software Foundation; either version 2 of the License, or (at your option) + * any later version. + * + */ + +#include qemu-common.h +#include qemu/crc32c.h + +/* + * This is the CRC-32C table + * Generated with: + * width = 32 bits + * poly = 0x1EDC6F41 + * reflect input bytes = true + * reflect output bytes = true + */ + +static const uint32_t crc32c_table[256] = { +0xL, 0xF26B8303L, 0xE13B70F7L, 0x1350F3F4L, +0xC79A971FL, 0x35F1141CL, 0x26A1E7E8L, 0xD4CA64EBL, +0x8AD958CFL, 0x78B2DBCCL, 0x6BE22838L, 0x9989AB3BL, +0x4D43CFD0L, 0xBF284CD3L, 0xAC78BF27L, 0x5E133C24L, +0x105EC76FL, 0xE235446CL, 0xF165B798L, 0x030E349BL, +0xD7C45070L, 0x25AFD373L, 0x36FF2087L, 0xC494A384L, +0x9A879FA0L, 0x68EC1CA3L, 0x7BBCEF57L, 0x89D76C54L, +0x5D1D08BFL, 0xAF768BBCL, 0xBC267848L, 0x4E4DFB4BL, +0x20BD8EDEL, 0xD2D60DDDL, 0xC186FE29L, 0x33ED7D2AL, +0xE72719C1L, 0x154C9AC2L, 0x061C6936L, 0xF477EA35L, +0xAA64D611L, 0x580F5512L, 0x4B5FA6E6L, 0xB93425E5L, +0x6DFE410EL, 0x9F95C20DL, 0x8CC531F9L, 0x7EAEB2FAL, +0x30E349B1L, 0xC288CAB2L, 0xD1D83946L, 0x23B3BA45L, +0xF779DEAEL, 0x05125DADL, 0x1642AE59L, 0xE4292D5AL, +0xBA3A117EL, 0x4851927DL, 0x5B016189L, 0xA96AE28AL, +0x7DA08661L, 0x8FCB0562L, 0x9C9BF696L, 0x6EF07595L, +0x417B1DBCL, 0xB3109EBFL, 0xA0406D4BL, 0x522BEE48L, +0x86E18AA3L, 0x748A09A0L, 0x67DAFA54L, 0x95B17957L, +0xCBA24573L, 0x39C9C670L, 0x2A993584L, 0xD8F2B687L, +0x0C38D26CL, 0xFE53516FL, 0xED03A29BL, 0x1F682198L, +0x5125DAD3L, 0xA34E59D0L, 0xB01EAA24L, 0x42752927L, +0x96BF4DCCL, 0x64D4CECFL, 0x77843D3BL, 0x85EFBE38L, +0xDBFC821CL, 0x2997011FL, 0x3AC7F2EBL, 0xC8AC71E8L, +
[Qemu-devel] [PATCH 05/15] blockdev: Replace undefined error in qmp_block_resize
From: Kevin Wolf kw...@redhat.com We have an errno value that can be displayed, so we should just do that. An easy way to reproduce this case is to resize a raw image to a size that is too large for the host file system. Signed-off-by: Kevin Wolf kw...@redhat.com Reviewed-by: Eric Blake ebl...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/blockdev.c b/blockdev.c index 6e293e9..7c9d8dd 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1118,6 +1118,7 @@ int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data) void qmp_block_resize(const char *device, int64_t size, Error **errp) { BlockDriverState *bs; +int ret; bs = bdrv_find(device); if (!bs) { @@ -1133,7 +1134,8 @@ void qmp_block_resize(const char *device, int64_t size, Error **errp) /* complete all in-flight operations before resizing the device */ bdrv_drain_all(); -switch (bdrv_truncate(bs, size)) { +ret = bdrv_truncate(bs, size); +switch (ret) { case 0: break; case -ENOMEDIUM: @@ -1149,7 +1151,7 @@ void qmp_block_resize(const char *device, int64_t size, Error **errp) error_set(errp, QERR_DEVICE_IN_USE, device); break; default: -error_set(errp, QERR_UNDEFINED_ERROR); +error_setg_errno(errp, -ret, Could not resize); break; } } -- 1.8.1.4
[Qemu-devel] [PATCH 06/15] vmdk: named return code.
From: Fam Zheng f...@redhat.com Internal routines in vmdk.c previously return -1 on error and 0 on success. More return values are useful for future changes such as zeroed-grain GTE. Change all the magic `return 0` and `return -1` to macro names: * VMDK_OK 0 * VMDK_ERROR (-1) * VMDK_UNALLOC (-2) * VMDK_ZEROED (-3) Signed-off-by: Fam Zheng f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vmdk.c | 60 ++-- 1 file changed, 34 insertions(+), 26 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 7bad757..16aa29c 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -37,6 +37,14 @@ #define VMDK4_FLAG_MARKER (1 17) #define VMDK4_GD_AT_END 0xULL + +/* VMDK internal error codes */ +#define VMDK_OK 0 +#define VMDK_ERROR (-1) +/* Cluster not allocated */ +#define VMDK_UNALLOC (-2) +#define VMDK_ZEROED (-3) + typedef struct { uint32_t version; uint32_t flags; @@ -578,22 +586,22 @@ static int vmdk_parse_description(const char *desc, const char *opt_name, opt_pos = strstr(desc, opt_name); if (!opt_pos) { -return -1; +return VMDK_ERROR; } /* Skip =\ following opt_name */ opt_pos += strlen(opt_name) + 2; if (opt_pos = end) { -return -1; +return VMDK_ERROR; } opt_end = opt_pos; while (opt_end end *opt_end != '') { opt_end++; } if (opt_end == end || buf_size opt_end - opt_pos + 1) { -return -1; +return VMDK_ERROR; } pstrcpy(buf, opt_end - opt_pos + 1, opt_pos); -return 0; +return VMDK_OK; } /* Open an extent file and append to bs array */ @@ -772,7 +780,7 @@ static int get_whole_cluster(BlockDriverState *bs, int ret; if (!vmdk_is_cid_valid(bs)) { -return -1; +return VMDK_ERROR; } /* floor offset to cluster */ @@ -780,17 +788,17 @@ static int get_whole_cluster(BlockDriverState *bs, ret = bdrv_read(bs-backing_hd, offset 9, whole_grain, extent-cluster_sectors); if (ret 0) { -return -1; +return VMDK_ERROR; } /* Write grain only into the active image */ ret = bdrv_write(extent-file, cluster_offset, whole_grain, extent-cluster_sectors); if (ret 0) { -return -1; +return VMDK_ERROR; } } -return 0; +return VMDK_OK; } static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data) @@ -803,7 +811,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data) (m_data-offset), sizeof(m_data-offset) ) 0) { -return -1; +return VMDK_ERROR; } /* update backup L2 table */ if (extent-l1_backup_table_offset != 0) { @@ -814,11 +822,11 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data) + (m_data-l2_index * sizeof(m_data-offset)), (m_data-offset), sizeof(m_data-offset) ) 0) { -return -1; +return VMDK_ERROR; } } -return 0; +return VMDK_OK; } static int get_cluster_offset(BlockDriverState *bs, @@ -837,17 +845,17 @@ static int get_cluster_offset(BlockDriverState *bs, } if (extent-flat) { *cluster_offset = extent-flat_start_offset; -return 0; +return VMDK_OK; } offset -= (extent-end_sector - extent-sectors) * SECTOR_SIZE; l1_index = (offset 9) / extent-l1_entry_sectors; if (l1_index = extent-l1_size) { -return -1; +return VMDK_ERROR; } l2_offset = extent-l1_table[l1_index]; if (!l2_offset) { -return -1; +return VMDK_UNALLOC; } for (i = 0; i L2_CACHE_SIZE; i++) { if (l2_offset == extent-l2_cache_offsets[i]) { @@ -877,7 +885,7 @@ static int get_cluster_offset(BlockDriverState *bs, l2_table, extent-l2_size * sizeof(uint32_t) ) != extent-l2_size * sizeof(uint32_t)) { -return -1; +return VMDK_ERROR; } extent-l2_cache_offsets[min_index] = l2_offset; @@ -888,7 +896,7 @@ static int get_cluster_offset(BlockDriverState *bs, if (!*cluster_offset) { if (!allocate) { -return -1; +return VMDK_UNALLOC; } /* Avoid the L2 tables update for the images that have snapshots. */ @@ -911,7 +919,7 @@ static int get_cluster_offset(BlockDriverState *bs, */ if (get_whole_cluster( bs, extent, *cluster_offset, offset, allocate) == -1) { -return -1; +return VMDK_ERROR; } if (m_data) { @@ -923,7 +931,7 @@ static int get_cluster_offset(BlockDriverState *bs, } } *cluster_offset = 9; -return 0; +
[Qemu-devel] [PATCH 07/15] vmdk: add support for “zeroed‐grain” GTE
From: Fam Zheng f...@redhat.com Introduced support for zeroed-grain GTE, as specified in Virtual Disk Format 5.0[1]. Recent VMware hosted platform products support a new “zeroed‐grain” grain table entry (GTE). The zeroed‐grain GTE returns all zeros on read. In other words, the zeroed‐grain GTE indicates that a grain in the child disk is zero‐filled but does not actually occupy space in storage. A sparse extent with zeroed‐grain GTE has the following in its header: * SparseExtentHeader.version = 2 * SparseExtentHeader.flags has bit 2 set Other than the new flag and the possibly zeroed‐grain GTE, version 2 sparse extents are identical to version 1. Also, a zeroed‐grain GTE has value 0x1 in the GT table. [1] Virtual Disk Format 5.0, http://www.vmware.com/support/developer/vddk/vmdk_50_technote.pdf?src=vmdk Signed-off-by: Fam Zheng f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vmdk.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 16aa29c..7e07c0f 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -33,10 +33,13 @@ #define VMDK4_MAGIC (('K' 24) | ('D' 16) | ('M' 8) | 'V') #define VMDK4_COMPRESSION_DEFLATE 1 #define VMDK4_FLAG_RGD (1 1) +/* Zeroed-grain enable bit */ +#define VMDK4_FLAG_ZERO_GRAIN (1 2) #define VMDK4_FLAG_COMPRESS (1 16) #define VMDK4_FLAG_MARKER (1 17) #define VMDK4_GD_AT_END 0xULL +#define VMDK_GTE_ZEROED 0x1 /* VMDK internal error codes */ #define VMDK_OK 0 @@ -81,6 +84,8 @@ typedef struct VmdkExtent { bool flat; bool compressed; bool has_marker; +bool has_zero_grain; +int version; int64_t sectors; int64_t end_sector; int64_t flat_start_offset; @@ -569,6 +574,8 @@ static int vmdk_open_vmdk4(BlockDriverState *bs, extent-compressed = le16_to_cpu(header.compressAlgorithm) == VMDK4_COMPRESSION_DEFLATE; extent-has_marker = le32_to_cpu(header.flags) VMDK4_FLAG_MARKER; +extent-version = le32_to_cpu(header.version); +extent-has_zero_grain = le32_to_cpu(header.flags) VMDK4_FLAG_ZERO_GRAIN; ret = vmdk_init_tables(bs, extent); if (ret) { /* free extent allocated by vmdk_add_extent */ @@ -839,6 +846,7 @@ static int get_cluster_offset(BlockDriverState *bs, unsigned int l1_index, l2_offset, l2_index; int min_index, i, j; uint32_t min_count, *l2_table, tmp = 0; +bool zeroed = false; if (m_data) { m_data-valid = 0; @@ -894,9 +902,13 @@ static int get_cluster_offset(BlockDriverState *bs, l2_index = ((offset 9) / extent-cluster_sectors) % extent-l2_size; *cluster_offset = le32_to_cpu(l2_table[l2_index]); -if (!*cluster_offset) { +if (extent-has_zero_grain *cluster_offset == VMDK_GTE_ZEROED) { +zeroed = true; +} + +if (!*cluster_offset || zeroed) { if (!allocate) { -return VMDK_UNALLOC; +return zeroed ? VMDK_ZEROED : VMDK_UNALLOC; } /* Avoid the L2 tables update for the images that have snapshots. */ @@ -967,8 +979,8 @@ static int coroutine_fn vmdk_co_is_allocated(BlockDriverState *bs, ret = get_cluster_offset(bs, extent, NULL, sector_num * 512, 0, offset); qemu_co_mutex_unlock(s-lock); -/* get_cluster_offset returning 0 means success */ -ret = !ret; + +ret = (ret == VMDK_OK || ret == VMDK_ZEROED); index_in_cluster = sector_num % extent-cluster_sectors; n = extent-cluster_sectors - index_in_cluster; @@ -,9 +1123,9 @@ static int vmdk_read(BlockDriverState *bs, int64_t sector_num, if (n nb_sectors) { n = nb_sectors; } -if (ret) { +if (ret != VMDK_OK) { /* if not allocated, try to read from parent image, if exist */ -if (bs-backing_hd) { +if (bs-backing_hd ret != VMDK_ZEROED) { if (!vmdk_is_cid_valid(bs)) { return -EINVAL; } -- 1.8.1.4
[Qemu-devel] [PATCH 09/15] vmdk: change magic number to macro
From: Fam Zheng f...@redhat.com Two hard coded flag bits are changed to macros. Signed-off-by: Fam Zheng f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vmdk.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/block/vmdk.c b/block/vmdk.c index cc19e20..0463d3b 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -32,6 +32,7 @@ #define VMDK3_MAGIC (('C' 24) | ('O' 16) | ('W' 8) | 'D') #define VMDK4_MAGIC (('K' 24) | ('D' 16) | ('M' 8) | 'V') #define VMDK4_COMPRESSION_DEFLATE 1 +#define VMDK4_FLAG_NL_DETECT (1 0) #define VMDK4_FLAG_RGD (1 1) /* Zeroed-grain enable bit */ #define VMDK4_FLAG_ZERO_GRAIN (1 2) @@ -1287,7 +1288,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize, magic = cpu_to_be32(VMDK4_MAGIC); memset(header, 0, sizeof(header)); header.version = zeroed_grain ? 2 : 1; -header.flags = 3 +header.flags = VMDK4_FLAG_RGD | VMDK4_FLAG_NL_DETECT | (compress ? VMDK4_FLAG_COMPRESS | VMDK4_FLAG_MARKER : 0) | (zeroed_grain ? VMDK4_FLAG_ZERO_GRAIN : 0); header.compressAlgorithm = compress ? VMDK4_COMPRESSION_DEFLATE : 0; -- 1.8.1.4
[Qemu-devel] [PATCH 12/15] qemu-iotests: Filter out vmdk creation options
From: Fam Zheng f...@redhat.com Cover new image creation options for vmdk, so we can use '-o zeroed_grain=XXX' and '-o subformat=XXX' to run the tests successfully. Signed-off-by: Fam Zheng f...@redhat.com Reviewed-by: Kevin Wolf kw...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- tests/qemu-iotests/common.rc | 2 ++ 1 file changed, 2 insertions(+) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index a536bf7..442cf51 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -127,6 +127,8 @@ _make_test_img() -e s# compat='[^']*'##g \ -e s# compat6=\\(on\\|off\\)##g \ -e s# static=\\(on\\|off\\)##g \ +-e s# zeroed_grain=\\(on\\|off\\)##g \ +-e s# subformat='[^']*'##g \ -e s# lazy_refcounts=\\(on\\|off\\)##g # Start an NBD server on the image file, which is what we'll be talking to -- 1.8.1.4
[Qemu-devel] [PATCH 08/15] vmdk: Add option to create zeroed-grain image
From: Fam Zheng f...@redhat.com Add image create option zeroed-grain to enable zeroed-grain GTE feature of vmdk sparse extents. When this option is on, header version of newly created extent will be 2 and VMDK4_FLAG_ZERO_GRAIN flag bit will be set. Signed-off-by: Fam Zheng f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vmdk.c | 22 +- 1 file changed, 17 insertions(+), 5 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 7e07c0f..cc19e20 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -48,6 +48,8 @@ #define VMDK_UNALLOC (-2) #define VMDK_ZEROED (-3) +#define BLOCK_OPT_ZEROED_GRAIN zeroed_grain + typedef struct { uint32_t version; uint32_t flags; @@ -1262,7 +1264,7 @@ static coroutine_fn int vmdk_co_write(BlockDriverState *bs, int64_t sector_num, static int vmdk_create_extent(const char *filename, int64_t filesize, - bool flat, bool compress) + bool flat, bool compress, bool zeroed_grain) { int ret, i; int fd = 0; @@ -1284,9 +1286,10 @@ static int vmdk_create_extent(const char *filename, int64_t filesize, } magic = cpu_to_be32(VMDK4_MAGIC); memset(header, 0, sizeof(header)); -header.version = 1; -header.flags = -3 | (compress ? VMDK4_FLAG_COMPRESS | VMDK4_FLAG_MARKER : 0); +header.version = zeroed_grain ? 2 : 1; +header.flags = 3 + | (compress ? VMDK4_FLAG_COMPRESS | VMDK4_FLAG_MARKER : 0) + | (zeroed_grain ? VMDK4_FLAG_ZERO_GRAIN : 0); header.compressAlgorithm = compress ? VMDK4_COMPRESSION_DEFLATE : 0; header.capacity = filesize / 512; header.granularity = 128; @@ -1467,6 +1470,7 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options) char parent_desc_line[BUF_SIZE] = ; uint32_t parent_cid = 0x; uint32_t number_heads = 16; +bool zeroed_grain = false; const char desc_template[] = # Disk DescriptorFile\n version=1\n @@ -1502,6 +1506,8 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options) flags |= options-value.n ? BLOCK_FLAG_COMPAT6 : 0; } else if (!strcmp(options-name, BLOCK_OPT_SUBFMT)) { fmt = options-value.s; +} else if (!strcmp(options-name, BLOCK_OPT_ZEROED_GRAIN)) { +zeroed_grain |= options-value.n; } options++; } @@ -1588,7 +1594,8 @@ static int vmdk_create(const char *filename, QEMUOptionParameter *options) snprintf(ext_filename, sizeof(ext_filename), %s%s, path, desc_filename); -if (vmdk_create_extent(ext_filename, size, flat, compress)) { +if (vmdk_create_extent(ext_filename, size, + flat, compress, zeroed_grain)) { return -EINVAL; } filesize -= size; @@ -1714,6 +1721,11 @@ static QEMUOptionParameter vmdk_create_options[] = { VMDK flat extent format, can be one of {monolithicSparse (default) | monolithicFlat | twoGbMaxExtentSparse | twoGbMaxExtentFlat | streamOptimized} }, +{ +.name = BLOCK_OPT_ZEROED_GRAIN, +.type = OPT_FLAG, +.help = Enable efficient zero writes using the zeroed-grain GTE feature +}, { NULL } }; -- 1.8.1.4
[Qemu-devel] [PATCH 13/15] nbd: use g_slice_new() instead of a freelist
Use GLib's efficient slice allocator instead of open-coding the request freelist. This patch simplifies the NBDRequest code. Now we qemu_blockalign() the req-data buffer each time but the next patch switches from a fixed size buffer to a dynamic size anyway. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- nbd.c | 23 ++- 1 file changed, 6 insertions(+), 17 deletions(-) diff --git a/nbd.c b/nbd.c index 85187ff..761f4ec 100644 --- a/nbd.c +++ b/nbd.c @@ -98,7 +98,6 @@ struct NBDExport { off_t size; uint32_t nbdflags; QTAILQ_HEAD(, NBDClient) clients; -QSIMPLEQ_HEAD(, NBDRequest) requests; QTAILQ_ENTRY(NBDExport) next; }; @@ -850,13 +849,8 @@ static NBDRequest *nbd_request_get(NBDClient *client) assert(client-nb_requests = MAX_NBD_REQUESTS - 1); client-nb_requests++; -if (QSIMPLEQ_EMPTY(exp-requests)) { -req = g_malloc0(sizeof(NBDRequest)); -req-data = qemu_blockalign(exp-bs, NBD_BUFFER_SIZE); -} else { -req = QSIMPLEQ_FIRST(exp-requests); -QSIMPLEQ_REMOVE_HEAD(exp-requests, entry); -} +req = g_slice_new0(NBDRequest); +req-data = qemu_blockalign(exp-bs, NBD_BUFFER_SIZE); nbd_client_get(client); req-client = client; return req; @@ -865,7 +859,10 @@ static NBDRequest *nbd_request_get(NBDClient *client) static void nbd_request_put(NBDRequest *req) { NBDClient *client = req-client; -QSIMPLEQ_INSERT_HEAD(client-exp-requests, req, entry); + +qemu_vfree(req-data); +g_slice_free(NBDRequest, req); + if (client-nb_requests-- == MAX_NBD_REQUESTS) { qemu_notify_event(); } @@ -877,7 +874,6 @@ NBDExport *nbd_export_new(BlockDriverState *bs, off_t dev_offset, void (*close)(NBDExport *)) { NBDExport *exp = g_malloc0(sizeof(NBDExport)); -QSIMPLEQ_INIT(exp-requests); exp-refcount = 1; QTAILQ_INIT(exp-clients); exp-bs = bs; @@ -953,13 +949,6 @@ void nbd_export_put(NBDExport *exp) exp-close(exp); } -while (!QSIMPLEQ_EMPTY(exp-requests)) { -NBDRequest *first = QSIMPLEQ_FIRST(exp-requests); -QSIMPLEQ_REMOVE_HEAD(exp-requests, entry); -qemu_vfree(first-data); -g_free(first); -} - g_free(exp); } } -- 1.8.1.4
[Qemu-devel] [PATCH 15/15] qemu-iotests: Filter out 'adapter_type'
From: Fam Zheng f...@redhat.com Filter out vmdk creation option 'adapter_type' for vmdk. So that tests with an explicit './check -o adapter_type=XXX' will not fail. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- tests/qemu-iotests/common.rc | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index 442cf51..31eb62b 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -129,6 +129,7 @@ _make_test_img() -e s# static=\\(on\\|off\\)##g \ -e s# zeroed_grain=\\(on\\|off\\)##g \ -e s# subformat='[^']*'##g \ +-e s# adapter_type='[^']*'##g \ -e s# lazy_refcounts=\\(on\\|off\\)##g # Start an NBD server on the image file, which is what we'll be talking to -- 1.8.1.4
[Qemu-devel] [PATCH 10/15] vmdk: store fields of VmdkMetaData in cpu endian
From: Fam Zheng f...@redhat.com Previously VmdkMetaData.offset is stored little endian while other fields are cpu endian. This changes offset to cpu endian and convert before writing to image. Signed-off-by: Fam Zheng f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vmdk.c | 17 - 1 file changed, 8 insertions(+), 9 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index 0463d3b..d98f304 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -813,14 +813,15 @@ static int get_whole_cluster(BlockDriverState *bs, static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data) { +uint32_t offset; +QEMU_BUILD_BUG_ON(sizeof(offset) != sizeof(m_data-offset)); +offset = cpu_to_le32(m_data-offset); /* update L2 table */ if (bdrv_pwrite_sync( extent-file, ((int64_t)m_data-l2_offset * 512) + (m_data-l2_index * sizeof(m_data-offset)), -(m_data-offset), -sizeof(m_data-offset) -) 0) { +offset, sizeof(offset)) 0) { return VMDK_ERROR; } /* update backup L2 table */ @@ -830,8 +831,7 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data) extent-file, ((int64_t)m_data-l2_offset * 512) + (m_data-l2_index * sizeof(m_data-offset)), -(m_data-offset), sizeof(m_data-offset) -) 0) { +offset, sizeof(offset)) 0) { return VMDK_ERROR; } } @@ -848,7 +848,7 @@ static int get_cluster_offset(BlockDriverState *bs, { unsigned int l1_index, l2_offset, l2_index; int min_index, i, j; -uint32_t min_count, *l2_table, tmp = 0; +uint32_t min_count, *l2_table; bool zeroed = false; if (m_data) { @@ -924,8 +924,7 @@ static int get_cluster_offset(BlockDriverState *bs, } *cluster_offset = 9; -tmp = cpu_to_le32(*cluster_offset); -l2_table[l2_index] = tmp; +l2_table[l2_index] = cpu_to_le32(*cluster_offset); /* First of all we write grain itself, to avoid race condition * that may to corrupt the image. @@ -938,7 +937,7 @@ static int get_cluster_offset(BlockDriverState *bs, } if (m_data) { -m_data-offset = tmp; +m_data-offset = *cluster_offset; m_data-l1_index = l1_index; m_data-l2_index = l2_index; m_data-l2_offset = l2_offset; -- 1.8.1.4
[Qemu-devel] [PULL 0/2] Net patches
Two bug fixes for QEMU 1.5. The following changes since commit 8ca27ce2e1150486ea2db4116a03706b28294f16: Merge remote-tracking branch 'afaerber/qom-cpu' into staging (2013-05-02 10:57:01 -0500) are available in the git repository at: git://github.com/stefanha/qemu.git net for you to fetch changes up to 7873df408dd44eb92840b108211d5aa5db7db526: tap: properly initialize vhostfds (2013-05-03 13:53:46 +0200) Amos Kong (1): net: make network client name unique Jason Wang (1): tap: properly initialize vhostfds net/net.c | 7 ++- net/tap.c | 2 +- 2 files changed, 3 insertions(+), 6 deletions(-) -- 1.8.1.4
[Qemu-devel] [PATCH 11/15] vmdk: add bdrv_co_write_zeroes
From: Fam Zheng f...@redhat.com Use special offset to write zeroes efficiently, when zeroed-grain GTE is available. If zero-write an allocated cluster, cluster is leaked because its offset pointer is overwritten by 0x1. Signed-off-by: Fam Zheng f...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/vmdk.c | 86 +++- 1 file changed, 68 insertions(+), 18 deletions(-) diff --git a/block/vmdk.c b/block/vmdk.c index d98f304..608daaf 100644 --- a/block/vmdk.c +++ b/block/vmdk.c @@ -124,6 +124,7 @@ typedef struct VmdkMetaData { unsigned int l2_index; unsigned int l2_offset; int valid; +uint32_t *l2_cache_entry; } VmdkMetaData; typedef struct VmdkGrainMarker { @@ -835,6 +836,9 @@ static int vmdk_L2update(VmdkExtent *extent, VmdkMetaData *m_data) return VMDK_ERROR; } } +if (m_data-l2_cache_entry) { +*m_data-l2_cache_entry = offset; +} return VMDK_OK; } @@ -905,6 +909,14 @@ static int get_cluster_offset(BlockDriverState *bs, l2_index = ((offset 9) / extent-cluster_sectors) % extent-l2_size; *cluster_offset = le32_to_cpu(l2_table[l2_index]); +if (m_data) { +m_data-valid = 1; +m_data-l1_index = l1_index; +m_data-l2_index = l2_index; +m_data-offset = *cluster_offset; +m_data-l2_offset = l2_offset; +m_data-l2_cache_entry = l2_table[l2_index]; +} if (extent-has_zero_grain *cluster_offset == VMDK_GTE_ZEROED) { zeroed = true; } @@ -938,10 +950,6 @@ static int get_cluster_offset(BlockDriverState *bs, if (m_data) { m_data-offset = *cluster_offset; -m_data-l1_index = l1_index; -m_data-l2_index = l2_index; -m_data-l2_offset = l2_offset; -m_data-valid = 1; } } *cluster_offset = 9; @@ -1164,8 +1172,17 @@ static coroutine_fn int vmdk_co_read(BlockDriverState *bs, int64_t sector_num, return ret; } +/** + * vmdk_write: + * @zeroed: buf is ignored (data is zero), use zeroed_grain GTE feature + * if possible, otherwise return -ENOTSUP. + * @zero_dry_run: used for zeroed == true only, don't update L2 table, just + * + * Returns: error code with 0 for success. + */ static int vmdk_write(BlockDriverState *bs, int64_t sector_num, - const uint8_t *buf, int nb_sectors) + const uint8_t *buf, int nb_sectors, + bool zeroed, bool zero_dry_run) { BDRVVmdkState *s = bs-opaque; VmdkExtent *extent = NULL; @@ -1211,7 +1228,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num, cluster_offset); } } -if (ret) { +if (ret == VMDK_ERROR) { return -EINVAL; } extent_begin_sector = extent-end_sector - extent-sectors; @@ -1221,17 +1238,34 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num, if (n nb_sectors) { n = nb_sectors; } - -ret = vmdk_write_extent(extent, -cluster_offset, index_in_cluster * 512, -buf, n, sector_num); -if (ret) { -return ret; -} -if (m_data.valid) { -/* update L2 tables */ -if (vmdk_L2update(extent, m_data) == -1) { -return -EIO; +if (zeroed) { +/* Do zeroed write, buf is ignored */ +if (extent-has_zero_grain +index_in_cluster == 0 +n = extent-cluster_sectors) { +n = extent-cluster_sectors; +if (!zero_dry_run) { +m_data.offset = VMDK_GTE_ZEROED; +/* update L2 tables */ +if (vmdk_L2update(extent, m_data) != VMDK_OK) { +return -EIO; +} +} +} else { +return -ENOTSUP; +} +} else { +ret = vmdk_write_extent(extent, +cluster_offset, index_in_cluster * 512, +buf, n, sector_num); +if (ret) { +return ret; +} +if (m_data.valid) { +/* update L2 tables */ +if (vmdk_L2update(extent, m_data) != VMDK_OK) { +return -EIO; +} } } nb_sectors -= n; @@ -1257,7 +1291,22 @@ static coroutine_fn int vmdk_co_write(BlockDriverState *bs, int64_t sector_num, int ret; BDRVVmdkState *s = bs-opaque; qemu_co_mutex_lock(s-lock); -ret = vmdk_write(bs, sector_num, buf, nb_sectors); +ret = vmdk_write(bs, sector_num, buf, nb_sectors, false, false); +qemu_co_mutex_unlock(s-lock); +return ret; +} + +static int coroutine_fn
[Qemu-devel] [PATCH 1/2] net: make network client name unique
From: Amos Kong ak...@redhat.com assign_name() creates a name MODEL.NUM, where MODEL is the client's model, and NUM is the number of MODELs that already exist. Markus added NIC naming for non-VLAN clients in commit 53e51d85. commit d33d93b2 incorrectly added a judgement of net-hub. It caused net clients created with -netdev get same names. eg: # qemu-upstream -device virtio-net-pci,netdev=h1 -netdev tap,id=h1 \ -device virtio-net-pci,netdev=h2 -netdev tap,id=h2 .. (qemu) info network virtio-net-pci.0: index=0,type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:56 \ h1: index=0,type=tap,ifname=tap0,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown virtio-net-pci.0: index=0,type=nic,model=virtio-net-pci,macaddr=52:54:00:12:34:57 \ h2: index=0,type=tap,ifname=tap1,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown This patch removed the check of nic-hub, and created unique names for all net clients that have same model. v2: update commitlog comments Signed-off-by: Amos Kong ak...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- net/net.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/net/net.c b/net/net.c index 7869161..43a74e4 100644 --- a/net/net.c +++ b/net/net.c @@ -157,8 +157,7 @@ void qemu_macaddr_default_if_unset(MACAddr *macaddr) /** * Generate a name for net client * - * Only net clients created with the legacy -net option need this. Naming is - * mandatory for net clients created with -netdev. + * Only net clients created with the legacy -net option and NICs need this. */ static char *assign_name(NetClientState *nc1, const char *model) { @@ -170,9 +169,7 @@ static char *assign_name(NetClientState *nc1, const char *model) if (nc == nc1) { continue; } -/* For compatibility only bump id for net clients on a vlan */ -if (strcmp(nc-model, model) == 0 -net_hub_id_for_client(nc, NULL) == 0) { +if (strcmp(nc-model, model) == 0) { id++; } } -- 1.8.1.4
[Qemu-devel] [PATCH 14/15] nbd: support large NBD requests
The Linux nbd driver recently increased the maximum supported request size up to 32 MB: commit 078be02b80359a541928c899c2631f39628f56df Author: Michal Belczyk belc...@bsd.krakow.pl Date: Tue Apr 30 15:28:28 2013 -0700 nbd: increase default and max request sizes Raise the default max request size for nbd to 128KB (from 127KB) to get it 4KB aligned. This patch also allows the max request size to be increased (via /sys/block/nbdx/queue/max_sectors_kb) to 32MB. QEMU's 1 MB buffers are too small to handle these requests. This patch allocates data buffers dynamically and allows up to 32 MB per request. Reported-by: Nick Thomas n...@bytemark.co.uk Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- include/block/nbd.h | 3 ++- nbd.c | 17 +++-- 2 files changed, 13 insertions(+), 7 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 0903d7a..c90f5e4 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -58,7 +58,8 @@ enum { #define NBD_DEFAULT_PORT 10809 -#define NBD_BUFFER_SIZE (1024*1024) +/* Maximum size of a single READ/WRITE data buffer */ +#define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024) ssize_t nbd_wr_sync(int fd, void *buffer, size_t size, bool do_read); int tcp_socket_incoming(const char *address, uint16_t port); diff --git a/nbd.c b/nbd.c index 761f4ec..2606403 100644 --- a/nbd.c +++ b/nbd.c @@ -844,13 +844,11 @@ void nbd_client_close(NBDClient *client) static NBDRequest *nbd_request_get(NBDClient *client) { NBDRequest *req; -NBDExport *exp = client-exp; assert(client-nb_requests = MAX_NBD_REQUESTS - 1); client-nb_requests++; req = g_slice_new0(NBDRequest); -req-data = qemu_blockalign(exp-bs, NBD_BUFFER_SIZE); nbd_client_get(client); req-client = client; return req; @@ -860,7 +858,9 @@ static void nbd_request_put(NBDRequest *req) { NBDClient *client = req-client; -qemu_vfree(req-data); +if (req-data) { +qemu_vfree(req-data); +} g_slice_free(NBDRequest, req); if (client-nb_requests-- == MAX_NBD_REQUESTS) { @@ -1007,6 +1007,7 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *reque { NBDClient *client = req-client; int csock = client-sock; +uint32_t command; ssize_t rc; client-recv_coroutine = qemu_coroutine_self(); @@ -1018,9 +1019,9 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *reque goto out; } -if (request-len NBD_BUFFER_SIZE) { +if (request-len NBD_MAX_BUFFER_SIZE) { LOG(len (%u) is larger than max len (%u), -request-len, NBD_BUFFER_SIZE); +request-len, NBD_MAX_BUFFER_SIZE); rc = -EINVAL; goto out; } @@ -1034,7 +1035,11 @@ static ssize_t nbd_co_receive_request(NBDRequest *req, struct nbd_request *reque TRACE(Decoding type); -if ((request-type NBD_CMD_MASK_COMMAND) == NBD_CMD_WRITE) { +command = request-type NBD_CMD_MASK_COMMAND; +if (command == NBD_CMD_READ || command == NBD_CMD_WRITE) { +req-data = qemu_blockalign(client-exp-bs, request-len); +} +if (command == NBD_CMD_WRITE) { TRACE(Reading %u byte(s), request-len); if (qemu_co_recv(csock, req-data, request-len) != request-len) { -- 1.8.1.4
[Qemu-devel] [PATCH 2/2] tap: properly initialize vhostfds
From: Jason Wang jasow...@redhat.com Only tap-vhostfd were checked net_init_tap_one(), but tap-vhostfds were forgot, this will lead qemu to ignore all fds passed by management through vhostfds, and tries to create vhost_net device itself. Fix by adding this check also. Reportyed-by: Michal Privoznik mpriv...@redhat.com Cc: Michal Privoznik mpriv...@redhat.com Cc: qemu-sta...@nongnu.org Signed-off-by: Jason Wang jasow...@redhat.com Reviewed-by: Michal Privoznik mpriv...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- net/tap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/tap.c b/net/tap.c index 17bdf01..e0b7a2a 100644 --- a/net/tap.c +++ b/net/tap.c @@ -623,7 +623,7 @@ static int net_init_tap_one(const NetdevTapOptions *tap, NetClientState *peer, vhostfdname || (tap-has_vhostforce tap-vhostforce)) { int vhostfd; -if (tap-has_vhostfd) { +if (tap-has_vhostfd || tap-has_vhostfds) { vhostfd = monitor_handle_fd_param(cur_mon, vhostfdname); if (vhostfd == -1) { return -1; -- 1.8.1.4
[Qemu-devel] [PATCH 1/4] kvm-all: add kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl tracepoints
From: Kazuya Saito saito.kaz...@jp.fujitsu.com This patch adds tracepoints at ioctl to kvm. Tracing these ioctl is useful for clarification whether the cause of troubles is qemu or kvm. Signed-off-by: Kazuya Saito saito.kaz...@jp.fujitsu.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- kvm-all.c| 4 trace-events | 5 + 2 files changed, 9 insertions(+) diff --git a/kvm-all.c b/kvm-all.c index f6c0f4a..4f73b98 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -33,6 +33,7 @@ #include exec/memory.h #include exec/address-spaces.h #include qemu/event_notifier.h +#include trace.h /* This check must be after config-host.h is included */ #ifdef CONFIG_EVENTFD @@ -1687,6 +1688,7 @@ int kvm_ioctl(KVMState *s, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_ioctl(type, arg); ret = ioctl(s-fd, type, arg); if (ret == -1) { ret = -errno; @@ -1704,6 +1706,7 @@ int kvm_vm_ioctl(KVMState *s, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_vm_ioctl(type, arg); ret = ioctl(s-vmfd, type, arg); if (ret == -1) { ret = -errno; @@ -1721,6 +1724,7 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_vcpu_ioctl(cpu-cpu_index, type, arg); ret = ioctl(cpu-kvm_fd, type, arg); if (ret == -1) { ret = -errno; diff --git a/trace-events b/trace-events index 55e80be..d5bc7a5 100644 --- a/trace-events +++ b/trace-events @@ -1153,3 +1153,8 @@ virtio_ccw_new_device(int cssid, int ssid, int schid, int devno, const char *dev # migration.c migrate_set_state(int new_state) new state %d + +# kvm-all.c +kvm_ioctl(int type, void *arg) type %d, arg %p +kvm_vm_ioctl(int type, void *arg) type %d, arg %p +kvm_vcpu_ioctl(int cpu_index, int type, void *arg) cpu_index %d, type %d, arg %p -- 1.8.1.4
[Qemu-devel] [PULL 0/4] Tracing patches
This tracing pull request is long overdue for QEMU 1.5. Eiichi Tsukata's ftrace backend makes it easy to correlate QEMU events with host kernel events. He also reports good performance. Kazuya Saito's trace events make it easier to observe the KVM run loop. The following changes since commit 8ca27ce2e1150486ea2db4116a03706b28294f16: Merge remote-tracking branch 'afaerber/qom-cpu' into staging (2013-05-02 10:57:01 -0500) are available in the git repository at: git://github.com/stefanha/qemu.git tracing for you to fetch changes up to e64dd5efb2c6d522a3bc9d096cd49a4e53f0ae10: trace: document ftrace backend (2013-05-03 13:58:09 +0200) Eiichi Tsukata (2): trace: Add ftrace tracing backend trace: document ftrace backend Kazuya Saito (2): kvm-all: add kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl tracepoints kvm-all: add kvm_run_exit tracepoint configure | 8 +++ docs/tracing.txt| 16 ++ kvm-all.c | 5 ++ scripts/tracetool/backend/ftrace.py | 54 +++ trace-events| 7 +++ trace/Makefile.objs | 1 + trace/ftrace.c | 102 trace/ftrace.h | 10 8 files changed, 203 insertions(+) create mode 100644 scripts/tracetool/backend/ftrace.py create mode 100644 trace/ftrace.c create mode 100644 trace/ftrace.h -- 1.8.1.4
[Qemu-devel] [PATCH 2/4] kvm-all: add kvm_run_exit tracepoint
From: Kazuya Saito saito.kaz...@jp.fujitsu.com This patch enable us to know exit reason of KVM_RUN. It will help us know where the trouble is caused. Signed-off-by: Kazuya Saito saito.kaz...@jp.fujitsu.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- kvm-all.c| 1 + trace-events | 2 ++ 2 files changed, 3 insertions(+) diff --git a/kvm-all.c b/kvm-all.c index 4f73b98..3a31602 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -1627,6 +1627,7 @@ int kvm_cpu_exec(CPUArchState *env) abort(); } +trace_kvm_run_exit(cpu-cpu_index, run-exit_reason); switch (run-exit_reason) { case KVM_EXIT_IO: DPRINTF(handle_io\n); diff --git a/trace-events b/trace-events index d5bc7a5..17d75ab 100644 --- a/trace-events +++ b/trace-events @@ -1158,3 +1158,5 @@ migrate_set_state(int new_state) new state %d kvm_ioctl(int type, void *arg) type %d, arg %p kvm_vm_ioctl(int type, void *arg) type %d, arg %p kvm_vcpu_ioctl(int cpu_index, int type, void *arg) cpu_index %d, type %d, arg %p +kvm_run_exit(int cpu_index, uint32_t reason) cpu_index %d, reason %d + -- 1.8.1.4
[Qemu-devel] [PATCH 3/4] trace: Add ftrace tracing backend
From: Eiichi Tsukata eiichi.tsukata...@hitachi.com This patch adds a ftrace tracing backend which sends trace event to ftrace marker file. You can effectively compare qemu trace data and kernel(especially, kvm.ko when using KVM) trace data. The ftrace backend is restricted to Linux only. To try out the ftrace backend: $ ./configure --trace-backend=ftrace $ make if you use KVM, enable kvm events in ftrace: # sudo echo 1 /sys/kernel/debug/tracing/events/kvm/enable After running qemu by root user, you can get the trace: # cat /sys/kernel/debug/tracing/trace Signed-off-by: Eiichi Tsukata eiichi.tsukata...@hitachi.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- configure | 8 +++ scripts/tracetool/backend/ftrace.py | 54 +++ trace/Makefile.objs | 1 + trace/ftrace.c | 102 trace/ftrace.h | 10 5 files changed, 175 insertions(+) create mode 100644 scripts/tracetool/backend/ftrace.py create mode 100644 trace/ftrace.c create mode 100644 trace/ftrace.h diff --git a/configure b/configure index c4d85ba..e818e8b 100755 --- a/configure +++ b/configure @@ -4004,6 +4004,14 @@ if test $trace_backend = dtrace; then echo CONFIG_TRACE_SYSTEMTAP=y $config_host_mak fi fi +if test $trace_backend = ftrace; then + if test $linux = yes ; then +echo CONFIG_TRACE_FTRACE=y $config_host_mak +trace_default=no + else +feature_not_found ftrace(trace backend) + fi +fi echo CONFIG_TRACE_FILE=$trace_file $config_host_mak if test $trace_default = yes; then echo CONFIG_TRACE_DEFAULT=y $config_host_mak diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py new file mode 100644 index 000..888c361 --- /dev/null +++ b/scripts/tracetool/backend/ftrace.py @@ -0,0 +1,54 @@ +#!/usr/bin/env python +# -*- coding: utf-8 -*- + + +Ftrace built-in backend. + + +__author__ = Eiichi Tsukata eiichi.tsukata...@hitachi.com +__copyright__ = Copyright (C) 2013 Hitachi, Ltd. +__license__= GPL version 2 or (at your option) any later version + +__maintainer__ = Stefan Hajnoczi +__email__ = stefa...@redhat.com + + +from tracetool import out + + +PUBLIC = True + + +def c(events): +pass + +def h(events): +out('#include trace/ftrace.h', +'#include trace/control.h', +'', +) + +for e in events: +argnames = , .join(e.args.names()) +if len(e.args) 0: +argnames = , + argnames + +out('static inline void trace_%(name)s(%(args)s)', +'{', +'char ftrace_buf[MAX_TRACE_STRLEN];', +'int unused __attribute__ ((unused));', +'int trlen;', +'bool _state = trace_event_get_state(%(event_id)s);', +'if (_state) {', +'trlen = snprintf(ftrace_buf, MAX_TRACE_STRLEN,', +' %(name)s %(fmt)s \\n %(argnames)s);', +'trlen = MIN(trlen, MAX_TRACE_STRLEN - 1);', +'unused = write(trace_marker_fd, ftrace_buf, trlen);', +'}', +'}', +name = e.name, +args = e.args, +event_id = TRACE_ + e.name.upper(), +fmt = e.fmt.rstrip(\n), +argnames = argnames, +) diff --git a/trace/Makefile.objs b/trace/Makefile.objs index a043072..3b88e49 100644 --- a/trace/Makefile.objs +++ b/trace/Makefile.objs @@ -76,5 +76,6 @@ endif util-obj-$(CONFIG_TRACE_DEFAULT) += default.o util-obj-$(CONFIG_TRACE_SIMPLE) += simple.o util-obj-$(CONFIG_TRACE_STDERR) += stderr.o +util-obj-$(CONFIG_TRACE_FTRACE) += ftrace.o util-obj-y += control.o util-obj-y += generated-tracers.o diff --git a/trace/ftrace.c b/trace/ftrace.c new file mode 100644 index 000..46b7fdb --- /dev/null +++ b/trace/ftrace.c @@ -0,0 +1,102 @@ +/* + * Ftrace trace backend + * + * Copyright (C) 2013 Hitachi, Ltd. + * Created by Eiichi Tsukata eiichi.tsukata...@hitachi.com + * + * This work is licensed under the terms of the GNU GPL, version 2. See + * the COPYING file in the top-level directory. + * + */ + +#include stdio.h +#include string.h +#include fcntl.h +#include limits.h +#include trace.h +#include trace/control.h + +int trace_marker_fd; + +static int find_debugfs(char *debugfs) +{ +char type[100]; +FILE *fp; + +fp = fopen(/proc/mounts, r); +if (fp == NULL) { +return 0; +} + +while (fscanf(fp, %*s % STR(PATH_MAX) s %99s %*s %*d %*d\n, + debugfs, type) == 2) { +if (strcmp(type, debugfs) == 0) { +break; +} +} +fclose(fp); + +if (strcmp(type, debugfs) != 0) { +return 0; +} +return 1; +} + +void trace_print_events(FILE *stream, fprintf_function stream_printf) +{ +TraceEventID i; + +for (i = 0; i trace_event_count(); i++) { +
[Qemu-devel] [PATCH 4/4] trace: document ftrace backend
From: Eiichi Tsukata eiichi.tsukata...@hitachi.com Add documentation of ftrace backend. Signed-off-by: Eiichi Tsukata eiichi.tsukata...@hitachi.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- docs/tracing.txt | 16 1 file changed, 16 insertions(+) diff --git a/docs/tracing.txt b/docs/tracing.txt index cf53c17..60ff9c5 100644 --- a/docs/tracing.txt +++ b/docs/tracing.txt @@ -175,6 +175,22 @@ unless you have specific needs for more advanced backends. The simple backend currently does not capture string arguments, it simply records the char* pointer value instead of the string that is pointed to. +=== Ftrace === + +The ftrace backend writes trace data to ftrace marker. This effectively +sends trace events to ftrace ring buffer, and you can compare qemu trace +data and kernel(especially kvm.ko when using KVM) trace data. + +if you use KVM, enable kvm events in ftrace: + + # echo 1 /sys/kernel/debug/tracing/events/kvm/enable + +After running qemu by root user, you can get the trace: + + # cat /sys/kernel/debug/tracing/trace + +Restriction: ftrace backend is restricted to Linux only. + Monitor commands * trace-file on|off|flush|set path -- 1.8.1.4
Re: [Qemu-devel] [PATCH 1/4] kvm-all: add kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl tracepoints
Am 03.05.2013 14:01, schrieb Stefan Hajnoczi: From: Kazuya Saito saito.kaz...@jp.fujitsu.com This patch adds tracepoints at ioctl to kvm. Tracing these ioctl is useful for clarification whether the cause of troubles is qemu or kvm. Signed-off-by: Kazuya Saito saito.kaz...@jp.fujitsu.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- kvm-all.c| 4 trace-events | 5 + 2 files changed, 9 insertions(+) diff --git a/kvm-all.c b/kvm-all.c index f6c0f4a..4f73b98 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -33,6 +33,7 @@ #include exec/memory.h #include exec/address-spaces.h #include qemu/event_notifier.h +#include trace.h /* This check must be after config-host.h is included */ #ifdef CONFIG_EVENTFD @@ -1687,6 +1688,7 @@ int kvm_ioctl(KVMState *s, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_ioctl(type, arg); ret = ioctl(s-fd, type, arg); if (ret == -1) { ret = -errno; @@ -1704,6 +1706,7 @@ int kvm_vm_ioctl(KVMState *s, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_vm_ioctl(type, arg); ret = ioctl(s-vmfd, type, arg); if (ret == -1) { ret = -errno; @@ -1721,6 +1724,7 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_vcpu_ioctl(cpu-cpu_index, type, arg); ret = ioctl(cpu-kvm_fd, type, arg); if (ret == -1) { ret = -errno; diff --git a/trace-events b/trace-events index 55e80be..d5bc7a5 100644 --- a/trace-events +++ b/trace-events @@ -1153,3 +1153,8 @@ virtio_ccw_new_device(int cssid, int ssid, int schid, int devno, const char *dev # migration.c migrate_set_state(int new_state) new state %d + +# kvm-all.c +kvm_ioctl(int type, void *arg) type %d, arg %p +kvm_vm_ioctl(int type, void *arg) type %d, arg %p +kvm_vcpu_ioctl(int cpu_index, int type, void *arg) cpu_index %d, type %d, arg %p Sorry that I'm just seeing this patch now (wasn't CC'ed), but I wonder whether cpu_index is the best thing to trace here? Can we still change trace event API or would we have to nack/change now? CC'ing Igor since he just introduced a cpu_get_arch_id() and there's also a kvm_arch_vcpu_id() introduced earlier by Eduardo. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [libvirt] [PATCH 3/3] gtk: add devices menu to allow changing removable block devices
On Thu, May 02, 2013 at 10:40:06AM -0500, Anthony Liguori wrote: Kevin Wolf kw...@redhat.com writes: + +if (strcmp(type, ide-cd) == 0) { +disk_type = DT_CDROM; +} else if (strcmp(type, isa-fdc) == 0) { +disk_type = DT_FLOPPY; +} else { +disk_type = DT_NORMAL; +} Same thing here, comparing against strings is a hack. Devices should probably have a property that says what kind of device they are. Ack, this is nasty. I would like to eliminate this. There is a type field in BlockInfo but: # @type: This field is returned only for compatibility reasons, it should #not be used (always returns 'unknown') I vaguely remember this happening but I don't remember the specific reason why. I would definitely prefer that we filled out type correctly. I think Markus was involved in this. Markus or Luiz, do you remember the story here? The reason is that BlockInfo is about the backend and it simply doesn't know (ever since we introduced if=none, this was buggy, so we just abandoned it at some point). We would have to ask the device, not the block layer. Yes, this makes sense. We could introduce an interface that all disks implemented that returned information about whether it was a CD-ROM, Floppy, etc. How does libvirt cope with this today? I presume they do something similar to what this patch is doing in terms of hard coding device names. Sorry, not really sure what your question is here - how does libvirt cope with what exactly ? Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC] Moving Hard Freeze to Monday, May 6th.
On 01/05/13 17:24, Rob Landley wrote: On 04/30/2013 01:36:39 PM, Anthony Liguori wrote: The current release schedule has hard freeze happening tomorrow. There are a few things still outstanding including cpu hotplug and updating SeaBIOS. We still need to resolve how to handle the softfloat code too. I am particularly interested in this last one for the 1.5 release. Sparc's OpenBios also needs a refresh to handle longer command lines: http://lists.gnu.org/archive/html/qemu-devel/2013-04/msg03880.html Rob The plan is to hopefully spin out an OpenBIOS 1.1 release over this weekend in time for the 1.5 hard freeze (the version in git master is old as the 1.4 OpenBIOS images were reverted last minute due to a regression). ATB, Mark.
Re: [Qemu-devel] [PATCH 1/2] Add i.MX I2C device emulator.
Am 02.05.2013 20:33, schrieb Jean-Christophe DUBOIS: On 05/02/2013 02:16 PM, Peter Crosthwaite wrote: On Thu, May 2, 2013 at 5:53 AM, Jean-Christophe DUBOIS j...@tribudubois.net wrote: +static int imx_i2c_init(SysBusDevice *dev) +{ Use of the SysBusDeviceClass::init function is deprecated. Please use DeviceClass::realise or Object::init. With no reliance on properties I would suggest this one can be done as just an Object::init fn. Could you point me to the documentation or an up to date example? Documentation is in include/hw/qdev-core.h and include/qom/object.h. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH v2 10/12] savevm: update error reporting of qemu_savevm_state() and related functions
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: Signed-off-by: Pavel Hrdina phrd...@redhat.com --- include/sysemu/sysemu.h | 7 --- migration.c | 6 +++--- savevm.c| 38 +++--- 3 files changed, 26 insertions(+), 25 deletions(-) --- a/migration.c +++ b/migration.c @@ -508,7 +508,7 @@ static void *migration_thread(void *opaque) bool old_vm_running = false; DPRINTF(beginning savevm\n); -qemu_savevm_state_begin(s-file, s-params); +qemu_savevm_state_begin(s-file, s-params, NULL); while (s-state == MIG_STATE_ACTIVE) { int64_t current_time; @@ -519,7 +519,7 @@ static void *migration_thread(void *opaque) pending_size = qemu_savevm_state_pending(s-file, max_size); DPRINTF(pending size %lu max %lu\n, pending_size, max_size); if (pending_size pending_size = max_size) { -qemu_savevm_state_iterate(s-file); +qemu_savevm_state_iterate(s-file, NULL); } else { DPRINTF(done iterating\n); qemu_mutex_lock_iothread(); @@ -528,7 +528,7 @@ static void *migration_thread(void *opaque) old_vm_running = runstate_is_running(); vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); qemu_file_set_rate_limit(s-file, INT_MAX); -qemu_savevm_state_complete(s-file); +qemu_savevm_state_complete(s-file, NULL); qemu_mutex_unlock_iothread(); if (!qemu_file_get_error(s-file)) { migrate_finish_set_state(s, MIG_STATE_COMPLETED); Why is it okay to ignore any errors in this function? diff --git a/savevm.c b/savevm.c index 6458621..749d49d 100644 --- a/savevm.c +++ b/savevm.c @@ -1972,37 +1976,33 @@ void qemu_savevm_state_cancel(void) } } -static int qemu_savevm_state(QEMUFile *f) +static void qemu_savevm_state(QEMUFile *f, Error **errp) { -int ret; MigrationParams params = { .blk = 0, .shared = 0 }; -if (qemu_savevm_state_blocked(NULL)) { -return -EINVAL; +if (qemu_savevm_state_blocked(errp)) { +return; } qemu_mutex_unlock_iothread(); -qemu_savevm_state_begin(f, params); +qemu_savevm_state_begin(f, params, errp); qemu_mutex_lock_iothread(); You need to check errp here. while (qemu_file_get_error(f) == 0) { -if (qemu_savevm_state_iterate(f) 0) { +if (qemu_savevm_state_iterate(f, errp) 0) { break; } And here as well. The problem is that you must not overwrite an error, therefore you need to explicitly free and ignore unused errors, and decide which one takes precedence. } -ret = qemu_file_get_error(f); -if (ret == 0) { -qemu_savevm_state_complete(f); -ret = qemu_file_get_error(f); +if (!qemu_file_get_error(f)) { +qemu_savevm_state_complete(f, errp); } -if (ret != 0) { +if (qemu_file_get_error(f)) { qemu_savevm_state_cancel(); } -return ret; } Kevin
Re: [Qemu-devel] [PATCH v2 11/12] qapi: Convert savevm
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: QMP command vm-snapshot-save takes one parameter name and the name is mandatory. The command returns SnapshotInfo on success, otherwise it returns an error message. If there is a snapshot with the same name it also returns an error message and if you want to overwrite that snapshot, you will have to at first call vm-snapshot-delete. HMP command savevm now has one optional parameter name and one flag -f. If the name parameter isn't provided the HMP command will create new one for internally used vm-snapshot-save. You can also specify the -f flag for overwrite existing snapshot which will internally use vm-snapshot-delete before vm-snapshot-save, otherwise it will print an error message if there is already a snapshot with the same name. If something else goes wrong, an error message will be printed. These improves behavior of the command to let you select only the name of the snapshot you want to create. This will ensure that if you want snapshot with the name '2', it will not rewrite or fail if there is any snapshot with id '2'. Signed-off-by: Pavel Hrdina phrd...@redhat.com --- a/hmp.c +++ b/hmp.c @@ -1491,3 +1491,52 @@ void hmp_vm_snapshot_load(Monitor *mon, const QDict *qdict) qapi_free_SnapshotInfo(info); hmp_handle_error(mon, local_err); } + +void hmp_vm_snapshot_save(Monitor *mon, const QDict *qdict) +{ +const char *name = qdict_get_try_str(qdict, name); +bool force = qdict_get_try_bool(qdict, force, 0); +Error *local_err = NULL; +SnapshotInfo *info = NULL; +qemu_timeval tv; +struct tm tm; +char tmp_name[256]; + +if (!name) { +localtime_r((const time_t *)tv.tv_sec, tm); +strftime(tmp_name, sizeof(tmp_name), vm-%Y%m%d%H%M%S, tm); +name = tmp_name; +} + +if (force) { +info = qmp_vm_snapshot_delete(true, name, false, NULL, local_err); +/* We don't need print info about deleted snapshot. It still needs to + * be freed. */ +qapi_free_SnapshotInfo(info); +if (error_is_set(local_err)) { +hmp_handle_error(mon, local_err); +return; +} +} Deleting a snapshot that doesn't exist returns an error. This means that you can use -f _only_ to overwrite a snapshot. I think this is unexpected, with -f all cases that work without -f should keep working. + +info = qmp_vm_snapshot_save(name, local_err); + +if (info) { +char buf[256]; +QEMUSnapshotInfo sn = { +.vm_state_size = info-vm_state_size, +.date_sec = info-date_sec, +.date_nsec = info-date_nsec, +.vm_clock_nsec = info-vm_clock_sec * 10 + + info-vm_clock_nsec, +}; +pstrcpy(sn.id_str, sizeof(sn.id_str), info-id); +pstrcpy(sn.name, sizeof(sn.name), info-name); +monitor_printf(mon, Created snapshot's info:\n); +monitor_printf(mon, %s\n, bdrv_snapshot_dump(buf, sizeof(buf), NULL)); +monitor_printf(mon, %s\n, bdrv_snapshot_dump(buf, sizeof(buf), sn)); +} + +qapi_free_SnapshotInfo(info); +hmp_handle_error(mon, local_err); +} --- a/savevm.c +++ b/savevm.c @@ -2398,36 +2369,24 @@ void do_savevm(Monitor *mon, const QDict *qdict) sn-date_nsec = tv.tv_usec * 1000; sn-vm_clock_nsec = qemu_get_clock_ns(vm_clock); -if (name) { -if (bdrv_snapshot_find(bs, old_sn, name, NULL, NULL, true)) { -pstrcpy(sn-name, sizeof(sn-name), old_sn-name); -pstrcpy(sn-id_str, sizeof(sn-id_str), old_sn-id_str); -} else { -pstrcpy(sn-name, sizeof(sn-name), name); -} -} else { -/* cast below needed for OpenBSD where tv_sec is still 'long' */ -localtime_r((const time_t *)tv.tv_sec, tm); -strftime(sn-name, sizeof(sn-name), vm-%Y%m%d%H%M%S, tm); -} - -/* Delete old snapshots of the same name */ -if (name del_existing_snapshots(mon, name) 0) { +if (bdrv_snapshot_find(bs, old_sn, name, NULL, NULL, false)) { +error_setg(errp, Snapshot '%s' exists, name); This is only checked for the VM state device. For all other devices the case isn't handled any more. goto the_end; +} else { +pstrcpy(sn-name, sizeof(sn-name), name); } /* save the VM state */ f = qemu_fopen_bdrv(bs, 1); if (!f) { -monitor_printf(mon, Could not open VM state file\n); +error_setg(errp, Failed to open '%s' file, bdrv_get_device_name(bs)); goto the_end; } qemu_savevm_state(f, local_err); vm_state_size = qemu_ftell(f); qemu_fclose(f); if (error_is_set(local_err)) { -qerror_report_err(local_err); -error_free(local_err); +error_propagate(errp, local_err); goto the_end; } @@ -2440,18
Re: [Qemu-devel] [PATCH v2 12/12] savevm: remove backward compatibility from bdrv_snapshot_find()
Am 24.04.2013 um 17:32 hat Pavel Hrdina geschrieben: Signed-off-by: Pavel Hrdina phrd...@redhat.com --- savevm.c | 35 --- 1 file changed, 12 insertions(+), 23 deletions(-) diff --git a/savevm.c b/savevm.c index 2e849b8..45d46c6 100644 --- a/savevm.c +++ b/savevm.c @@ -2263,8 +2263,7 @@ out: } static bool bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info, - const char *name, const char *id, Error **errp, - bool old_match) + const char *name, const char *id, Error **errp) { QEMUSnapshotInfo *sn_tab, *sn; Error *local_err = NULL; @@ -2293,20 +2292,10 @@ static bool bdrv_snapshot_find(BlockDriverState *bs, QEMUSnapshotInfo *sn_info, break; } } else if (name) { -/* for compatibility for old bdrv_snapshot_find call - * will be removed */ -if (old_match) { -if (!strcmp(sn-id_str, name) || !strcmp(sn-name, name)) { -*sn_info = *sn; -found = true; -break; -} -} else { -if (!strcmp(sn-name, name)) { -*sn_info = *sn; -found = true; -break; -} +if (!strcmp(sn-name, name)) { +*sn_info = *sn; +found = true; +break; } } else if (id) { if (!strcmp(sn-id_str, id)) { @@ -2369,7 +2358,7 @@ SnapshotInfo *qmp_vm_snapshot_save(const char *name, Error **errp) sn-date_nsec = tv.tv_usec * 1000; sn-vm_clock_nsec = qemu_get_clock_ns(vm_clock); -if (bdrv_snapshot_find(bs, old_sn, name, NULL, NULL, false)) { +if (bdrv_snapshot_find(bs, old_sn, name, NULL, NULL)) { error_setg(errp, Snapshot '%s' exists, name); goto the_end; } else { @@ -2478,7 +2467,7 @@ SnapshotInfo *qmp_vm_snapshot_load(bool has_name, const char *name, } /* Don't even try to load empty VM states */ -if (!bdrv_snapshot_find(bs_vm_state, sn, name, id, local_err, false)) { +if (!bdrv_snapshot_find(bs_vm_state, sn, name, id, local_err)) { error_setg(errp, Snapshot doesn't exist: %s, error_get_pretty(local_err)); error_free(local_err); @@ -2506,7 +2495,7 @@ SnapshotInfo *qmp_vm_snapshot_load(bool has_name, const char *name, return NULL; } -if (!bdrv_snapshot_find(bs, sn, name, id, local_err, false)) { +if (!bdrv_snapshot_find(bs, sn, name, id, local_err)) { error_setg(errp, Snapshot doesn't exist for device '%s': %s, bdrv_get_device_name(bs), error_get_pretty(local_err)); error_free(local_err); @@ -2599,7 +2588,7 @@ SnapshotInfo *qmp_vm_snapshot_delete(const bool has_name, const char *name, return NULL; } -if (!bdrv_snapshot_find(bs, sn, name, id, local_err, false)) { +if (!bdrv_snapshot_find(bs, sn, name, id, local_err)) { error_propagate(errp, local_err); return NULL; } @@ -2616,7 +2605,7 @@ SnapshotInfo *qmp_vm_snapshot_delete(const bool has_name, const char *name, bs = NULL; while ((bs = bdrv_next(bs))) { if (bdrv_can_snapshot(bs) -bdrv_snapshot_find(bs, sn, name, id, NULL, false)) { +bdrv_snapshot_find(bs, sn, name, id, NULL)) { /* Small hack to ensure that proper snapshot is deleted for every * block driver. This will be fixed soon. */ if (!strcmp(bs-drv-format_name, rbd)) { @@ -2678,8 +2667,8 @@ void do_info_snapshots(Monitor *mon, const QDict *qdict) while ((bs1 = bdrv_next(bs1))) { if (bdrv_can_snapshot(bs1) bs1 != bs) { -if (!bdrv_snapshot_find(bs1, sn_info, sn-id_str, NULL, NULL, -true)) { +if (!bdrv_snapshot_find(bs1, sn_info, sn-name, sn-id_str, +NULL)) { Now you require that both name and ID of the snapshots are equal. I think in practice it's only the name that matches (which means that the code is bad already before this patch). Kevin
Re: [Qemu-devel] [Qemu-trivial] [PATCH v2] bsd-user: OS-agnostic 64-bit SYSCTL types
30.04.2013 17:29, Ed Maste wrote: Use existence of type as #ifdef condition rather than FreeBSD-specific version check, as suggested by Patrick Welche. Also handle the signed (CTLTYPE_S64) case identically to the unsigned (CTLTYPE_U64) case, per later patches in the FreeBSD ports tree (emulators/qemu-devel/files/patch-z-arm-bsd-user-001). Applied to trivial-patches tree (8e4125147a8bc4da2c1e8eced722ebe31bafc19e), thank you! /mjt
Re: [Qemu-devel] [PATCH qom-cpu 6/9] target-i386: Add feature-words property
On Fri, May 03, 2013 at 01:34:23PM +0200, Igor Mammedov wrote: On Mon, 22 Apr 2013 16:00:17 -0300 Eduardo Habkost ehabk...@redhat.com wrote: This property will be useful for libvirt, as libvirt already has logic based on low-level feature bits (not feature names), so it will be really easy to convert the current libvirt logic to something using the feature-words property. The property will have two main use cases: - Checking host capabilities, by checking the features of the host CPU model - Checking which features are enabled on each CPU model patch doesn't apply to current qom-cpu, more comments below. Example output: $ ./QMP/qmp --path=/tmp/m qom-get --path=/machine/unattached/device[1] --property=feature-words item[0].cpuid-register: EDX item[0].cpuid-input-eax: 2147483658 item[0].features: 0 item[1].cpuid-register: EAX item[1].cpuid-input-eax: 1073741825 item[1].features: 0 item[2].cpuid-register: EDX item[2].cpuid-input-eax: 3221225473 item[2].features: 0 item[3].cpuid-register: ECX item[3].cpuid-input-eax: 2147483649 item[3].features: 101 item[4].cpuid-register: EDX item[4].cpuid-input-eax: 2147483649 item[4].features: 563346425 item[5].cpuid-register: EBX item[5].cpuid-input-eax: 7 item[5].features: 0 item[5].cpuid-input-ecx: 0 could item be represented as CPUID function in format used in Intel's SDM? We could, but maybe it would make the interface harder to use and not easier? Even when two feature words are returned in the same CPUID leaf, they are independent and separate feature-words that must be checked individually by libvirt, so I believe returning one feature-word per array-item makes more sense. Having an extra item in the array would make it clear for libvirt that QEMU has a new feature-word that libvirt doesn't know about, and easier to spot than an extra field in an existing array item. item[5].CPUID: EAX=7h,ECX=0h What would be the data type of this CPUID field? Are you suggesting returning a string to be parsed manually? item[5].EBX: 0 item[5].EAX: 0 for simplicity/uniformity ECX could be not optional, always present, and ignored when not needed. Then how would we represent the fact that ECX is optional? If ECX is not important for a given CPUID leaf, I don't see what's the point of including it with a fake value. Note that this interface is not supposed to be a comprehensive CPUID dump with all CPUID bits returned by the CPU, but just a list of CPU capability words that happen to be returned inside CPUID leaves. item[6].cpuid-register: ECX item[6].cpuid-input-eax: 1 item[6].features: 2155880449 item[7].cpuid-register: EDX item[7].cpuid-input-eax: 1 item[7].features: 126614521 Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- Changes v1 - v2: * Merge the non-qapi and qapi patches, to keep series simpler * Use the feature word array series as base, so we don't have to set the feature word values one-by-one in the code * Change type name of property from x86-cpu-feature-words to X86CPUFeatureWordInfo * Remove cpu-qapi-schema.json and simply add the type definitions to qapi-schema.json, to keep the changes simpler * This required compiling qapi-types.o and qapi-visit.o into *-user as well --- .gitignore| 2 ++ Makefile.objs | 7 +- qapi-schema.json | 31 target-i386/cpu.c | 70 +-- 4 files changed, 97 insertions(+), 13 deletions(-) diff --git a/.gitignore b/.gitignore index 487813a..71408e3 100644 --- a/.gitignore +++ b/.gitignore @@ -21,6 +21,8 @@ linux-headers/asm qapi-generated qapi-types.[ch] qapi-visit.[ch] +cpu-qapi-types.[ch] +cpu-qapi-visit.[ch] still needed? Not needed anymore. Thanks for spotting it. qmp-commands.h qmp-marshal.c qemu-doc.html diff --git a/Makefile.objs b/Makefile.objs index a473348..1835d37 100644 --- a/Makefile.objs +++ b/Makefile.objs @@ -78,10 +78,15 @@ common-obj-$(CONFIG_SMARTCARD_NSS) += $(libcacard-y) ## # qapi -common-obj-y += qmp-marshal.o qapi-visit.o qapi-types.o +common-obj-y += qmp-marshal.o common-obj-y += qmp.o hmp.o endif +## +# some qapi visitors are used by both system and user emulation: + +common-obj-y += qapi-visit.o qapi-types.o + ### # Target-independent parts used in system and user emulation common-obj-y += qemu-log.o diff --git a/qapi-schema.json b/qapi-schema.json index 751d3c2..424434f 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -3505,3 +3505,34 @@ '*asl_compiler_rev': 'uint32',
[Qemu-devel] [PATCH for-1.5] qdev: skip bus check for bus-less devices in qdev_unplug()
Since commit 2f7bd829db qdev: Fix device_add bus assumptions it's possible to device_add bus-less device, but if such device is unplugged it will dereference NULL parent_bus in qdev_unplug(). Fix it by taking in account that parent_bus might be NULL and skipping bus check. Signed-off-by: Igor Mammedov imamm...@redhat.com --- hw/core/qdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/core/qdev.c b/hw/core/qdev.c index ab1d8f5..069ac90 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -208,7 +208,7 @@ void qdev_unplug(DeviceState *dev, Error **errp) { DeviceClass *dc = DEVICE_GET_CLASS(dev); -if (!dev-parent_bus-allow_hotplug) { +if (dev-parent_bus !dev-parent_bus-allow_hotplug) { error_set(errp, QERR_BUS_NO_HOTPLUG, dev-parent_bus-name); return; } -- 1.8.2.1
Re: [Qemu-devel] [PATCH 1/4] kvm-all: add kvm_ioctl, kvm_vm_ioctl, kvm_vcpu_ioctl tracepoints
On Fri, May 03, 2013 at 02:12:14PM +0200, Andreas Färber wrote: Am 03.05.2013 14:01, schrieb Stefan Hajnoczi: From: Kazuya Saito saito.kaz...@jp.fujitsu.com This patch adds tracepoints at ioctl to kvm. Tracing these ioctl is useful for clarification whether the cause of troubles is qemu or kvm. Signed-off-by: Kazuya Saito saito.kaz...@jp.fujitsu.com Reviewed-by: Paolo Bonzini pbonz...@redhat.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- kvm-all.c| 4 trace-events | 5 + 2 files changed, 9 insertions(+) diff --git a/kvm-all.c b/kvm-all.c index f6c0f4a..4f73b98 100644 --- a/kvm-all.c +++ b/kvm-all.c @@ -33,6 +33,7 @@ #include exec/memory.h #include exec/address-spaces.h #include qemu/event_notifier.h +#include trace.h /* This check must be after config-host.h is included */ #ifdef CONFIG_EVENTFD @@ -1687,6 +1688,7 @@ int kvm_ioctl(KVMState *s, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_ioctl(type, arg); ret = ioctl(s-fd, type, arg); if (ret == -1) { ret = -errno; @@ -1704,6 +1706,7 @@ int kvm_vm_ioctl(KVMState *s, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_vm_ioctl(type, arg); ret = ioctl(s-vmfd, type, arg); if (ret == -1) { ret = -errno; @@ -1721,6 +1724,7 @@ int kvm_vcpu_ioctl(CPUState *cpu, int type, ...) arg = va_arg(ap, void *); va_end(ap); +trace_kvm_vcpu_ioctl(cpu-cpu_index, type, arg); ret = ioctl(cpu-kvm_fd, type, arg); if (ret == -1) { ret = -errno; diff --git a/trace-events b/trace-events index 55e80be..d5bc7a5 100644 --- a/trace-events +++ b/trace-events @@ -1153,3 +1153,8 @@ virtio_ccw_new_device(int cssid, int ssid, int schid, int devno, const char *dev # migration.c migrate_set_state(int new_state) new state %d + +# kvm-all.c +kvm_ioctl(int type, void *arg) type %d, arg %p +kvm_vm_ioctl(int type, void *arg) type %d, arg %p +kvm_vcpu_ioctl(int cpu_index, int type, void *arg) cpu_index %d, type %d, arg %p Sorry that I'm just seeing this patch now (wasn't CC'ed), but I wonder whether cpu_index is the best thing to trace here? Can we still change trace event API or would we have to nack/change now? CC'ing Igor since he just introduced a cpu_get_arch_id() and there's also a kvm_arch_vcpu_id() introduced earlier by Eduardo. Being kvm_vcpu_ioctl() a very low-level KVM function, I believe the KVM VCPU id (the argument passed to KVM_CREATE_VCPU, that needs to be the APIC ID on x86, and is returned by kvm_arch_vcpu_id()) is the best CPU identifier to be included here. cpu_index is the most ambiguous and least reliable CPU identifier we have today. I wouldn't use it in any new code. -- Eduardo
Re: [Qemu-devel] [RFC/PATCH 0/1] cpu hotplug for s390
I've done some investigating into using the device_add hmp/qmp command to support hot-plugging cpus on S390. The alternative suggestion was to simply use a new cpu_add hmp/qmp command. device_add accepts all of the same options as the -device command line parameter takes. This would imply that to hot-plug cpu's using device add we would need to allow command line arguments of type -device cpu. All of the implications of this are not currently clear to me. How would this interact with the -smp option, for example, how many cpus are created in this case: qemu -smp 2 -device cpu,id=cpu0 -device cpu,id=cpu1, -device cpu,id=cpu2 Is -smp invalid when cpu devices are specified? We would have to fill the smp_cpus variable after all (cpu) devices have been parsed. Since device_add requires a QOM object name (driver parameter) we seem to have two choices. 1. device_add cpu 2. device_add s390-cpu But cpu is actually an abstract QOM class and cannot be instantiated by object_new(cpu) as is done in device_add processing. So we need to use s390-cpu. This adds an architecture specific flavor to cpu hotplug. I would think we'd want to avoid that somehow. perhaps we simply translate that parameter during early device_add processing? Another issue is that the s390-cpu QOM object class is a child of main-system-bus. This bus does not support hotplug: sysbus-allow_hotplug=0. In order to implement cpu hotplug we would need to either switch sysbus-allow_hotplug to 1, or the s390-cpu QOM object class would need to move to a bus that supports hotplug. I'm not sure what the implications of either choice would be. I'm interested in thoughts and comments. Thanks! -- -- Jason J. Herne (jjhe...@linux.vnet.ibm.com)
Re: [Qemu-devel] [libvirt] [PATCH 3/3] gtk: add devices menu to allow changing removable block devices
Daniel P. Berrange berra...@redhat.com writes: On Thu, May 02, 2013 at 10:40:06AM -0500, Anthony Liguori wrote: Kevin Wolf kw...@redhat.com writes: + +if (strcmp(type, ide-cd) == 0) { +disk_type = DT_CDROM; +} else if (strcmp(type, isa-fdc) == 0) { +disk_type = DT_FLOPPY; +} else { +disk_type = DT_NORMAL; +} Same thing here, comparing against strings is a hack. Devices should probably have a property that says what kind of device they are. Ack, this is nasty. I would like to eliminate this. There is a type field in BlockInfo but: # @type: This field is returned only for compatibility reasons, it should #not be used (always returns 'unknown') I vaguely remember this happening but I don't remember the specific reason why. I would definitely prefer that we filled out type correctly. I think Markus was involved in this. Markus or Luiz, do you remember the story here? The reason is that BlockInfo is about the backend and it simply doesn't know (ever since we introduced if=none, this was buggy, so we just abandoned it at some point). We would have to ask the device, not the block layer. Yes, this makes sense. We could introduce an interface that all disks implemented that returned information about whether it was a CD-ROM, Floppy, etc. How does libvirt cope with this today? I presume they do something similar to what this patch is doing in terms of hard coding device names. Sorry, not really sure what your question is here - how does libvirt cope with what exactly ? Given a device, how do you figure out if it's a cdrom/floppy/whatever without hard coding a mapping of class name - device type. Pretty sure libvirt just has a class name mapping, right? Regards, Anthony Liguori Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [PATCH v2 1/2] ARM: Allow boards to provide an fdt blob
On 30 April 2013 17:04, John Rigby john.ri...@linaro.org wrote: If no fdt is provided on command line and the new field get_dtb in struct arm_boot_info is set then call it to get a device tree blob. Also allow dumping of device tree by calling qemu_devtree_dumpdtb near the end of load_dtb. Also ... in a commit message is usually a clue that you should split the patch :-) Signed-off-by: John Rigby john.ri...@linaro.org --- hw/arm/boot.c| 31 --- include/hw/arm/arm.h |6 ++ 2 files changed, 26 insertions(+), 11 deletions(-) diff --git a/hw/arm/boot.c b/hw/arm/boot.c index f451529..de71edf 100644 --- a/hw/arm/boot.c +++ b/hw/arm/boot.c @@ -235,19 +235,27 @@ static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo) int size, rc; uint32_t acells, scells, hival; -filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, binfo-dtb_filename); -if (!filename) { -fprintf(stderr, Couldn't open dtb file %s\n, binfo-dtb_filename); -return -1; -} +if (binfo-dtb_filename) { +filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, binfo-dtb_filename); +if (!filename) { +fprintf(stderr, Couldn't open dtb file %s\n, binfo-dtb_filename); +return -1; +} -fdt = load_device_tree(filename, size); -if (!fdt) { -fprintf(stderr, Couldn't open dtb file %s\n, filename); +fdt = load_device_tree(filename, size); +if (!fdt) { +fprintf(stderr, Couldn't open dtb file %s\n, filename); +g_free(filename); +return -1; +} g_free(filename); -return -1; +} else if (binfo-get_dtb) { +fdt = binfo-get_dtb(addr, binfo, size); +if (!fdt) { +fprintf(stderr, Couldn't get dtb blob from board func\n); I think it's better to avoid being too abbreviated in error messages; Attempt to create dtb blob for this board model failed\n, perhaps? +return -1; +} } -g_free(filename); acells = qemu_devtree_getprop_cell(fdt, /, #address-cells); scells = qemu_devtree_getprop_cell(fdt, /, #size-cells); @@ -304,6 +312,7 @@ static int load_dtb(hwaddr addr, const struct arm_boot_info *binfo) fprintf(stderr, couldn't set /chosen/linux,initrd-end\n); } } +qemu_devtree_dumpdtb(fdt, size); cpu_physical_memory_write(addr, fdt, size); @@ -440,7 +449,7 @@ void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info) /* for device tree boot, we pass the DTB directly in r2. Otherwise * we point to the kernel args. */ -if (info-dtb_filename) { +if (info-dtb_filename || info-get_dtb) { /* Place the DTB after the initrd in memory. Note that some * kernels will trash anything in the 4K page the initrd * ends in, so make sure the DTB isn't caught up in that. diff --git a/include/hw/arm/arm.h b/include/hw/arm/arm.h index 7b2b02d..4c56a1b 100644 --- a/include/hw/arm/arm.h +++ b/include/hw/arm/arm.h @@ -31,6 +31,10 @@ struct arm_boot_info { const char *kernel_cmdline; const char *initrd_filename; const char *dtb_filename; +/* if a board is able to create a dtb without a dtb file then it + * sets get_dtb. This will only be used if no dtb file is provided. + */ +void *(*get_dtb)(hwaddr addr, const struct arm_boot_info *binfo, int *size); hwaddr loader_start; /* multicore boards that use the default secondary core boot functions * need to put the address of the secondary boot code, the boot reg, @@ -59,6 +63,8 @@ struct arm_boot_info { int is_linux; hwaddr initrd_start; hwaddr initrd_size; +void *dtb_blob; +int dtb_blob_size; The get_dtb function returns the blob and its size via the hook's arguments, so what are these extra fields for? hwaddr entry; }; void arm_load_kernel(ARMCPU *cpu, struct arm_boot_info *info); -- 1.7.9.5 thanks -- PMM
Re: [Qemu-devel] [RFC/PATCH 0/1] cpu hotplug for s390
On Fri, 03 May 2013 09:50:50 -0400 Jason J. Herne jjhe...@linux.vnet.ibm.com wrote: I've done some investigating into using the device_add hmp/qmp command to support hot-plugging cpus on S390. The alternative suggestion was to simply use a new cpu_add hmp/qmp command. device_add accepts all of the same options as the -device command line parameter takes. This would imply that to hot-plug cpu's using device add we would need to allow command line arguments of type -device cpu. All of the implications of this are not currently clear to me. How would this interact with the -smp option, for example, how many cpus are created in this case: qemu -smp 2 -device cpu,id=cpu0 -device cpu,id=cpu1, -device cpu,id=cpu2 Is -smp invalid when cpu devices are specified? We would have to fill the smp_cpus variable after all (cpu) devices have been parsed. I haven't looked using -device in case of x86 it would require some extra work besides planned properties+classes to make it usable. However device_add for your case should be usable and there is no need to use -device with it. -smp x,maxcpus=y will tell board how many CPUs 'x' should be created and 'y' - 'x' how many extra CPUs could be added later. Since device_add requires a QOM object name (driver parameter) we seem to have two choices. 1. device_add cpu 2. device_add s390-cpu But cpu is actually an abstract QOM class and cannot be instantiated by object_new(cpu) as is done in device_add processing. So we need to use s390-cpu. This adds an architecture specific flavor to cpu hotplug. I would think we'd want to avoid that somehow. perhaps we simply translate that parameter during early device_add processing? that is exactly intention to use architecture specific types there so that one could specify what kind/flavor of CPU needs to be added using type names. Another issue is that the s390-cpu QOM object class is a child of main-system-bus. This bus does not support hotplug: sysbus-allow_hotplug=0. In order to implement cpu hotplug we would need to either switch sysbus-allow_hotplug to 1, or the s390-cpu QOM object class would need to move to a bus that supports hotplug. I'm not sure what the implications of either choice would be. For x86 CPU was moved to ICC bus that existed in real hardware, probably there is similar bus for s390. Alternatively you can consider if s390-cpu could be bus-less, since commit 2f7bd829db it's possible device_add bus-less devices. I'm interested in thoughts and comments. Thanks! -- -- Jason J. Herne (jjhe...@linux.vnet.ibm.com) -- Regards, Igor
Re: [Qemu-devel] [libvirt] [PATCH 3/3] gtk: add devices menu to allow changing removable block devices
On Fri, May 03, 2013 at 08:52:59AM -0500, Anthony Liguori wrote: Daniel P. Berrange berra...@redhat.com writes: On Thu, May 02, 2013 at 10:40:06AM -0500, Anthony Liguori wrote: Kevin Wolf kw...@redhat.com writes: + +if (strcmp(type, ide-cd) == 0) { +disk_type = DT_CDROM; +} else if (strcmp(type, isa-fdc) == 0) { +disk_type = DT_FLOPPY; +} else { +disk_type = DT_NORMAL; +} Same thing here, comparing against strings is a hack. Devices should probably have a property that says what kind of device they are. Ack, this is nasty. I would like to eliminate this. There is a type field in BlockInfo but: # @type: This field is returned only for compatibility reasons, it should #not be used (always returns 'unknown') I vaguely remember this happening but I don't remember the specific reason why. I would definitely prefer that we filled out type correctly. I think Markus was involved in this. Markus or Luiz, do you remember the story here? The reason is that BlockInfo is about the backend and it simply doesn't know (ever since we introduced if=none, this was buggy, so we just abandoned it at some point). We would have to ask the device, not the block layer. Yes, this makes sense. We could introduce an interface that all disks implemented that returned information about whether it was a CD-ROM, Floppy, etc. How does libvirt cope with this today? I presume they do something similar to what this patch is doing in terms of hard coding device names. Sorry, not really sure what your question is here - how does libvirt cope with what exactly ? Given a device, how do you figure out if it's a cdrom/floppy/whatever without hard coding a mapping of class name - device type. Pretty sure libvirt just has a class name mapping, right? The only place where we'd ever need todo that is when we reverse engineer a libvirt XML config from a set of QEMU command line args. For that we just look at the if=XXX parameter currently. Our reverse engineering code is currently broken for if=none scenarios, due mostly to our laziness in writing code to parse the corresponding -device arg. Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC/PATCH 0/1] cpu hotplug for s390
Hi, Am 03.05.2013 15:50, schrieb Jason J. Herne: I've done some investigating into using the device_add hmp/qmp command to support hot-plugging cpus on S390. The alternative suggestion was to simply use a new cpu_add hmp/qmp command. A cpu-add QMP command has been merged by now. Using it with qemu-system-s390x machines will return a QMP error at the moment. device_add accepts all of the same options as the -device command line parameter takes. This would imply that to hot-plug cpu's using device add we would need to allow command line arguments of type -device cpu. In theory we do, ever since making the CPU a device, it just didn't fully work yet. For all QOM'ified CPUs (i.e., not x86) it should work crash-free now, but it's untested whether a particular machine copes with it or not. All of the implications of this are not currently clear to me. How would this interact with the -smp option, for example, how many cpus are created in this case: qemu -smp 2 -device cpu,id=cpu0 -device cpu,id=cpu1, -device cpu,id=cpu2 Four, if the correct driver is supplied (error for the above). The -smp option indicates how many CPUs the *machine* instantiates. In addition you are trying to create two further devices, just like other machines create a PCI host bridge and a user might try to add another one. Is -smp invalid when cpu devices are specified? We would have to fill the smp_cpus variable after all (cpu) devices have been parsed. Would we? If so, doing some check of -smp maxcpus and/or updating whatever variable in CPU's realizefn feels more natural to me than some post-whatever hook. Since device_add requires a QOM object name (driver parameter) we seem to have two choices. 1. device_add cpu 2. device_add s390-cpu But cpu is actually an abstract QOM class and cannot be instantiated by object_new(cpu) as is done in device_add processing. So we need to use s390-cpu. This adds an architecture specific flavor to cpu hotplug. I would think we'd want to avoid that somehow. perhaps we simply translate that parameter during early device_add processing? You are saying that based on the current s390 code. Actually it was discussed that s390-cpu should be abstract as well and the type should indicate the actual model - host-s390-cpu, z9-s390-cpu, etc. There were two KVM calls that covered future structure of CPU modelling (socket - core - thread) and roadmap towards vCPU hotplug - see the minutes on the list. The current approach of cpu-add for 1.5 was chosen because the refactoring of CPUArchState is rather cumbersome and taking too long. Another issue is that the s390-cpu QOM object class is a child of main-system-bus. [...] That's not true, it is not on any bus at all - I have attempted to fix device_add for this use case and Igor has just sent a patch for unplug. For x86 we have chosen to introduce the ICC bus to handle hot-adding APIC devices (which were on SysBus before) alongside the CPU. With proper CPU modelling that would not be necessary, but for now it has the advantage of giving us a canonical QOM path to the CPUs for free. Regards, Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH V4 3/5] block: package committing code in qmp_transaction()
On 05/01/2013 08:26 PM, Wenchao Xia wrote: The code is simply moved into a separate function. Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com --- blockdev.c | 19 --- 1 files changed, 12 insertions(+), 7 deletions(-) Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH qom-cpu 6/9] target-i386: Add feature-words property
On 05/03/2013 07:17 AM, Eduardo Habkost wrote: We could, but maybe it would make the interface harder to use and not easier? Even when two feature words are returned in the same CPUID leaf, they are independent and separate feature-words that must be checked individually by libvirt, so I believe returning one feature-word per array-item makes more sense. Having an extra item in the array would make it clear for libvirt that QEMU has a new feature-word that libvirt doesn't know about, and easier to spot than an extra field in an existing array item. Firmly agree - bundling multiple features into one array item is not nice. item[5].CPUID: EAX=7h,ECX=0h What would be the data type of this CPUID field? Are you suggesting returning a string to be parsed manually? Anything that requires parsing to break into pieces on the receiving end implies that it was not correctly represented in JSON in the first place. I'd much rather see it kept as multiple fields. +for (w = 0; w FEATURE_WORDS; w++) { +FeatureWordInfo *wi = feature_word_info[w]; +X86CPUFeatureWordInfo *qwi = word_infos[w]; +qwi-cpuid_input_eax = wi-cpuid_eax; +qwi-has_cpuid_input_ecx = wi-cpuid_needs_ecx; +qwi-cpuid_input_ecx = wi-cpuid_ecx; +qwi-cpuid_register = x86_reg_info_32[wi-cpuid_reg].qapi_enum; Is there way not to use qapi_enum at all and use name instead? Are you suggesting making the qapi interface be string-based instead of using an enum? Why? enum-based is better than string based. That way, when we add introspection in qemu 1.6, libvirt can see what enum values to expect, instead of having an open-ended set of strings with no idea what strings will be present. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 2/2] ARM: Add mach-virt platform
On 30 April 2013 17:04, John Rigby john.ri...@linaro.org wrote: Add mach-virt platform support cooresponding to corresponding I wonder if we should just call this virt ? /arch/arm/mach-virt in kernel tree. For now it is not virtual but instantiates a pl011 uart and and sp804 timer. The uart is need for a console the timer is needed when running without kvm and there is no arch timer. Signed-off-by: John Rigby john.ri...@linaro.org --- hw/arm/Makefile.objs |2 +- hw/arm/mach-virt.c | 337 ++ 2 files changed, 338 insertions(+), 1 deletion(-) create mode 100644 hw/arm/mach-virt.c diff --git a/hw/arm/Makefile.objs b/hw/arm/Makefile.objs index 9e3a06f..0789352 100644 --- a/hw/arm/Makefile.objs +++ b/hw/arm/Makefile.objs @@ -1,5 +1,5 @@ obj-y += boot.o collie.o exynos4_boards.o gumstix.o highbank.o -obj-y += integratorcp.o kzm.o mainstone.o musicpal.o nseries.o +obj-y += integratorcp.o kzm.o mach-virt.o mainstone.o musicpal.o nseries.o obj-y += omap_sx1.o palm.o pic_cpu.o realview.o spitz.o stellaris.o obj-y += tosa.o versatilepb.o vexpress.o xilinx_zynq.o z2.o diff --git a/hw/arm/mach-virt.c b/hw/arm/mach-virt.c new file mode 100644 index 000..b55e4d1 --- /dev/null +++ b/hw/arm/mach-virt.c @@ -0,0 +1,337 @@ +/* + * ARM mach-virt emulation + * + * Copyright (c) 2013 Linaro + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see http://www.gnu.org/licenses/. This comment could also use a brief description of what this is emulating. + * + */ + +#include hw/sysbus.h +#include hw/arm-misc.h +#include hw/primecell.h +#include hw/devices.h +#include net/net.h +#include sysemu/device_tree.h +#include sysemu/sysemu.h +#include hw/boards.h +#include exec/address-spaces.h +#include hw/flash.h +#include libfdt_env.h + +#define ARM_GIC_DIST_BASE 0x2c001000 +#define ARM_GIC_DIST_SIZE 0x1000 +#define ARM_GIC_CPUI_BASE 0x2c002000 +#define ARM_GIC_CPUI_SIZE 0x1000 + +#define GIC_FDT_IRQ_NUM_CELLS 3 + +#define GIC_FDT_IRQ_TYPE_SPI 0 +#define GIC_FDT_IRQ_TYPE_PPI 1 + +#define GIC_FDT_IRQ_FLAGS_EDGE_LO_HI 1 +#define GIC_FDT_IRQ_FLAGS_EDGE_HI_LO 2 +#define GIC_FDT_IRQ_FLAGS_LEVEL_HI 4 +#define GIC_FDT_IRQ_FLAGS_LEVEL_LO 8 + +#define GIC_FDT_IRQ_PPI_CPU_SHIFT 8 +#define GIC_FDT_IRQ_PPI_CPU_MASK (0xff GIC_FDT_IRQ_PPI_CPU_SHIFT) + +#define CPU_NAME_MAX_LEN 16 +static void add_cpu_nodes(void *fdt, const struct arm_boot_info *binfo) +{ +int cpu; + +qemu_devtree_add_subnode(fdt, /cpus); +qemu_devtree_setprop_cell(fdt, /cpus, #address-cells, 0x1); +qemu_devtree_setprop_cell(fdt, /cpus, #size-cells, 0x0); + +for (cpu = 0; cpu binfo-nb_cpus; ++cpu) { +char cpu_name[CPU_NAME_MAX_LEN]; + +snprintf(cpu_name, CPU_NAME_MAX_LEN, /cpus/cpu@%d, cpu); + +qemu_devtree_add_subnode(fdt, cpu_name); +qemu_devtree_setprop_string(fdt, cpu_name, device_type, cpu); +qemu_devtree_setprop_string(fdt, cpu_name, compatible, +arm,cortex-a15); + +if (binfo-nb_cpus 1) { +qemu_devtree_setprop_string(fdt, cpu_name, enable-method, psci); Does this make sense for the non-KVM case? Should we implement PSCI for TCG too? +} + +qemu_devtree_setprop_cell(fdt, cpu_name, reg, cpu); +} +} + +static void add_gic_nodes(void *fdt, uint32_t phandle) +{ +uint64_t reg_prop[] = { +cpu_to_fdt64(ARM_GIC_DIST_BASE), cpu_to_fdt64(ARM_GIC_DIST_SIZE), +cpu_to_fdt64(ARM_GIC_CPUI_BASE), cpu_to_fdt64(ARM_GIC_CPUI_SIZE), +}; + +qemu_devtree_add_subnode(fdt, /intc); +qemu_devtree_setprop_string(fdt, /intc, compatible, +arm,cortex-a15-gic); +qemu_devtree_setprop_cell(fdt, /intc, #interrupt-cells, +GIC_FDT_IRQ_NUM_CELLS); +qemu_devtree_setprop(fdt, /intc, interrupt-controller, NULL, 0); +qemu_devtree_setprop(fdt, /intc, reg, reg_prop, sizeof(reg_prop)); +qemu_devtree_setprop_cell(fdt, /intc, phandle, phandle); +} + +static void add_timer_nodes(void *fdt, const struct arm_boot_info *binfo) +{ +uint32_t cpu_mask = +(((1 binfo-nb_cpus) - 1) GIC_FDT_IRQ_PPI_CPU_SHIFT) + GIC_FDT_IRQ_PPI_CPU_MASK; +cpu_mask = 0xf00; +uint32_t irq_prop[] = { +cpu_to_fdt32(GIC_FDT_IRQ_TYPE_PPI), +cpu_to_fdt32(13), +cpu_to_fdt32(cpu_mask |
Re: [Qemu-devel] [PATCH V4 4/5] block: package rollback code in qmp_transaction()
On 05/01/2013 08:26 PM, Wenchao Xia wrote: Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com --- blockdev.c | 11 --- 1 files changed, 8 insertions(+), 3 deletions(-) Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [Qemu-ppc] [PATCH 2/8] target-ppc: Convert ppc cpu savevm to VMStateDescription
On Fri, May 03, 2013 at 01:29:28PM +0200, Andreas Färber wrote: Am 03.05.2013 03:38, schrieb David Gibson: The savevm code for the powerpc cpu emulation is currently based around the old register_savevm() rather than register_vmstate() method. It's also rather broken, missing some important state on some CPU models. This patch completely rewrites the savevm for target-ppc, using the new VMStateDescription approach. Exactly what needs to be saved in what configurations has been more carefully examined, too. This introduces a new version (5) of the cpu save format. The old load function is retained to support version 4 images. Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- target-ppc/cpu.h |9 +- target-ppc/machine.c | 542 ++ 2 files changed, 460 insertions(+), 91 deletions(-) [...] diff --git a/target-ppc/machine.c b/target-ppc/machine.c index 2d10adb..594fe6a 100644 --- a/target-ppc/machine.c +++ b/target-ppc/machine.c [...] +void cpu_save(QEMUFile *f, void *opaque) +{ +vmstate_save_state(f, vmstate_cpu, opaque); + +} + +int cpu_load(QEMUFile *f, void *opaque, int version_id) +{ +return vmstate_load_state(f, vmstate_cpu, opaque, version_id); +} Please drop cpu_{save,load}() and use the VMStateDescription-based registration mechanism cpu_class_set_vmsd() from PowerPCCPU's instance_init in translate_init.c. I'm pretty certain I CC'ed you on that series... Very likely. But I initially wrote this patch before that, and didn't notice the relevance of the update. Revised version below: From 84125649c8b299e4c56c9c28abd5f0b5aafee40a Mon Sep 17 00:00:00 2001 From: David Gibson da...@gibson.dropbear.id.au Date: Fri, 22 Feb 2013 09:47:00 +1100 Subject: [PATCH] target-ppc: Convert ppc cpu savevm to VMStateDescription The savevm code for the powerpc cpu emulation is currently based around the old register_savevm() rather than register_vmstate() method. It's also rather broken, missing some important state on some CPU models. This patch completely rewrites the savevm for target-ppc, using the new VMStateDescription approach. Exactly what needs to be saved in what configurations has been more carefully examined, too. This introduces a new version (5) of the cpu save format. The old load function is retained to support version 4 images. Signed-off-by: David Gibson da...@gibson.dropbear.id.au --- target-ppc/cpu-qom.h|4 + target-ppc/cpu.h|9 +- target-ppc/machine.c| 531 --- target-ppc/translate_init.c |2 + 4 files changed, 454 insertions(+), 92 deletions(-) diff --git a/target-ppc/cpu-qom.h b/target-ppc/cpu-qom.h index eb03a00..2b96b04 100644 --- a/target-ppc/cpu-qom.h +++ b/target-ppc/cpu-qom.h @@ -102,4 +102,8 @@ PowerPCCPUClass *ppc_cpu_class_by_pvr(uint32_t pvr); void ppc_cpu_do_interrupt(CPUState *cpu); +#ifndef CONFIG_USER_ONLY +extern const struct VMStateDescription vmstate_ppc_cpu; +#endif + #endif diff --git a/target-ppc/cpu.h b/target-ppc/cpu.h index 7cacb56..1809cb3 100644 --- a/target-ppc/cpu.h +++ b/target-ppc/cpu.h @@ -943,7 +943,7 @@ struct CPUPPCState { #if defined(TARGET_PPC64) /* PowerPC 64 SLB area */ ppc_slb_t slb[64]; -int slb_nr; +int32_t slb_nr; #endif /* segment registers */ hwaddr htab_base; @@ -952,11 +952,11 @@ struct CPUPPCState { /* externally stored hash table */ uint8_t *external_htab; /* BATs */ -int nb_BATs; +uint32_t nb_BATs; target_ulong DBAT[2][8]; target_ulong IBAT[2][8]; /* PowerPC TLB registers (for 4xx, e500 and 60x software driven TLBs) */ -int nb_tlb; /* Total number of TLB */ +int32_t nb_tlb; /* Total number of TLB */ int tlb_per_way; /* Speed-up helper: used to avoid divisions at run time */ int nb_ways; /* Number of ways in the TLB set*/ int last_way;/* Last used way used to allocate TLB in a LRU way */ @@ -973,6 +973,7 @@ struct CPUPPCState { /* Other registers */ /* Special purpose registers */ target_ulong spr[1024]; +uint32_t cr; /* Full CR value used during vmsave/load */ ppc_spr_t spr_cb[1024]; /* Altivec registers */ ppc_avr_t avr[32]; @@ -1171,8 +1172,6 @@ static inline CPUPPCState *cpu_init(const char *cpu_model) #define cpu_signal_handler cpu_ppc_signal_handler #define cpu_list ppc_cpu_list -#define CPU_SAVE_VERSION 4 - /* MMU modes definitions */ #define MMU_MODE0_SUFFIX _user #define MMU_MODE1_SUFFIX _kernel diff --git a/target-ppc/machine.c b/target-ppc/machine.c index 2d10adb..b6be2a7 100644 --- a/target-ppc/machine.c +++ b/target-ppc/machine.c @@ -1,94 +1,9 @@ #include hw/hw.h #include hw/boards.h #include sysemu/kvm.h +#include helper_regs.h -void cpu_save(QEMUFile
Re: [Qemu-devel] [PATCH V4 1/5] block: package preparation code in qmp_transaction()
On 05/01/2013 08:26 PM, Wenchao Xia wrote: The code before really committing is moved into a function. Most code are simply moved from qmp_transaction(), except that on fail it s/are/is/ just return now. Other code such as input parsing is not touched, s/return/returns/ to make it easier in review. Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com --- blockdev.c | 139 +--- 1 files changed, 77 insertions(+), 62 deletions(-) Assuming the maintainer can correct commit messages without needing a v5. Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH V4 5/5] block: make all steps in qmp_transaction() as callback
On 05/01/2013 08:26 PM, Wenchao Xia wrote: Make it easier to add other operations to qmp_transaction() by using callbacks, with external snapshots serving as an example implementation of the callbacks. Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com --- blockdev.c | 92 ++- 1 files changed, 71 insertions(+), 21 deletions(-) +/* + * This structure must be arranged as first member in parent type, assuming To me, parent type seems like the wrong relationship - in C++, the parent type is the smaller type, and the child type is the larger type that includes the parent type as its first member. That is, BlkTransationStates IS the parent type, and the comment would read better as first member in child type. + * that compiler will also arrange it to the same address with parent instance. + * Later it will be used in free(). + */ +struct BlkTransactionStates { +BlockdevAction *action; +const BdrvActionOps *ops; +QSIMPLEQ_ENTRY(BlkTransactionStates) entry; +}; /* Now we are going to do the actual pivot. Everything up to this point * is reversible, but we are committed at this point */ Is this comment still correct, now that things are extensible, or should you s/pivot/commit/? If you do respin, I agree that s/rollback/abort/ might be a nicer name for that particular callback. But I'm also quite okay with using the patch as-is without a respin (while I pointed out two possible comment wording changes, they don't make or break the patch for me). Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 09/12] linux-user: Add signal handling for AArch64
On 30 April 2013 07:38, John Rigby john.ri...@linaro.org wrote: +/* set up the stack frame for unwinding */ +err |= __put_user(env-xregs[29], sf-fp); __put_user can't fail so all this collection of error status is unnecessary. (Every address we write with it has to be covered by the lock_user_struct() call which is where we'll find out if it wasn't writable memory. Existing code in signal.c that checks __put_user and __get_user return values is generally cut-n-paste from the Linux kernel, which does do access rights testing on each individual load/store.) +err |= __put_user(env-xregs[30], sf-lr); + +for (i = 0; i 31; i++) { +err |= __put_user(env-xregs[i], sf-uc.tuc_mcontext.regs[i]); +} +err |= __put_user(env-sp, sf-uc.tuc_mcontext.sp); +err |= __put_user(env-pc, sf-uc.tuc_mcontext.pc); +err |= __put_user(env-pstate, sf-uc.tuc_mcontext.pstate); + +for (i = 0; i 32 * 2; i++) { + err |= __get_user(env-vfp.regs[i], aux-fpsimd.vregs[i]); +} + +#if 0 +err |= __get_user(env-fpsr, aux-fpsimd.fpsr); +err |= __get_user(env-fpcr, aux-fpsimd.fpcr); +#endif No #if-0 code, please. thanks -- PMM
Re: [Qemu-devel] [PATCH qom-cpu 6/9] target-i386: Add feature-words property
On 04/22/2013 01:00 PM, Eduardo Habkost wrote: This property will be useful for libvirt, as libvirt already has logic based on low-level feature bits (not feature names), so it will be really easy to convert the current libvirt logic to something using the feature-words property. The property will have two main use cases: - Checking host capabilities, by checking the features of the host CPU model - Checking which features are enabled on each CPU model item[6].cpuid-register: ECX item[6].cpuid-input-eax: 1 item[6].features: 2155880449 item[7].cpuid-register: EDX item[7].cpuid-input-eax: 1 item[7].features: 126614521 I'm guessing the corresponding JSON passed over QMP would look something like: [ ... { cpuid-register: ECX, cpuid-input-eax: 1, features: 2155880449 }, { cpuid-register: EDX, cpuid-input-eax: 1, features: 126614521 } ] which libvirt can reasonably parse. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- Changes v1 - v2: * Merge the non-qapi and qapi patches, to keep series simpler * Use the feature word array series as base, so we don't have to set the feature word values one-by-one in the code * Change type name of property from x86-cpu-feature-words to X86CPUFeatureWordInfo * Remove cpu-qapi-schema.json and simply add the type definitions to qapi-schema.json, to keep the changes simpler * This required compiling qapi-types.o and qapi-visit.o into *-user as well --- .gitignore| 2 ++ Makefile.objs | 7 +- qapi-schema.json | 31 target-i386/cpu.c | 70 +-- 4 files changed, 97 insertions(+), 13 deletions(-) I'm not sure I'm the best person to review cpu.c, but I can at least review the interface from what libvirt plans on using: +++ b/qapi-schema.json @@ -3505,3 +3505,34 @@ '*asl_compiler_rev': 'uint32', '*file': 'str', '*data': 'str' }} + +# @X86CPURegister32 +# +# A X86 32-bit register +# +# Since: 1.5 Yes, I'd still like to get this into 1.5. On some enums, we have called out doc-text for each enum value; but I'm fine with your choice here to omit that. +## +{ 'enum': 'X86CPURegister32', + 'data': [ 'EAX', 'EBX', 'ECX', 'EDX', 'ESP', 'EBP', 'ESI', 'EDI' ] } Any reason you favored ALL-CAPS names? But it doesn't matter to me, as long as we stick to the name once baked into a release. + +## +# @X86CPUFeatureWordInfo +# +# Information about a X86 CPU feature word +# +# @cpuid-input-eax: Input EAX value for CPUID instruction for that feature word +# +# @cpuid-input-ecx: #optional Input ECX value for CPUID instruction for that +# feature word +# +# @cpuid-register: Output register containing the feature bits +# +# @features: value of output register, containing the feature bits +# +# Since: 1.5 +## +{ 'type': 'X86CPUFeatureWordInfo', + 'data': { 'cpuid-input-eax': 'int', +'*cpuid-input-ecx': 'int', +'cpuid-register': 'X86CPURegister32', +'features': 'int' } } Looks reasonable for the interface side of things. +static void x86_cpu_get_feature_words(Object *obj, Visitor *v, void *opaque, + const char *name, Error **errp) Indentation looks off. Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH qom-cpu 0/9] x86: feature words array (v11) + feature-words property
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Am 02.05.2013 21:48, schrieb Eric Blake: On 05/02/2013 01:43 PM, Eduardo Habkost wrote: As mentioned earlier I'd prefer to defer the property design rather than putting it lightly reviewed into 1.5 and living with some ABI. If libvirt urgently needs this info, this series needs to be reviewed and sorted out until the weekend (Hard Freeze on Monday). I consider it an important bugfix for the QEMU+libvirt stack. The current libvirt behavior (checking CPUID directly; not using the enforce flag; and having its own copy of each CPU model definition) is unsafe and may break live-migration silently under many circumstances. I agree that libvirt would very much like to have this in 1.5. How can I help in reviewing things? Apart from the usual QMP considerations that you will know much better than me, I have two concerns here: 1) Polluting the QOM namespace with this dump-all implementation for libvirt and interfering with more fine-grained property getters/setters. 2) Basing its design on current code of which we are not sure yet how it may evolve and having to live with that for ABI stability. Like I said, I hadn't reviewed that part yet, so couldn't pick it up on short notice. If we get it respun and reviewed today, I can (try to) prepare a PULL on Sunday. On Igor's series (latest: v7 from Feb 25) I had more or less nack'ed the attempt to introduce f-* properties due to Anthony asking for verbose QOM property names, so we're in need of a better name, likely something with feature in it, similar to what is being proposed here. I had also argued with Anthony that QOM's object_property_add_bool() should allow us to create a container object for accessing features in a more simple way, such as .../icc/child[0]/cpuid-features/foo rather than f-foo or feature-foo or foo-feature to avoid the constant repetition and an unreadable long list of CPU properties, but the addition of an opaque to support this was turned down. So it boils down to the questions of where do we want to expose which information, how should it be structured and where does/will that information come from. Thanks. Regards, Andreas - -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAEBAgAGBQJRg9CkAAoJEPou0S0+fgE/Mu8P/1FFoXTMawQ2o8np/cjOFEze zv+MJ5DUKZK96PPNoZjsM8y0tmNZ8VT9q578AQuElQiA/AbOaUqEoqL/NB9i9Bqc PBZw7KwgNkH8Mogw7izOmOybKZbshdin9uBxRugG+Xyg5Nk7oMYkTQV8PLHmAgRc LxgeMAJHsPY9LXksCNUbZNblK//EQfP90e7v0fU+ys5xrlCFlCl1xRQd9Cw2QvHd 7gECUSlwOlkHY32BFEn/epqay45uZqlECyGXDqrssg5htLM5McbzKCa1sgdQbuqp HqsO3WdM6jBrse5EApxdoaYmz8Yhl6ls+YOQY+l3DjjhHNcDzxtIqbAK36ErBHFz 9d+NTcXBlGrC0N0L7VZmwLihJ3bT/IIEP7ybLFN/QKHlz4H83pEGftbBpPipqrwq NZWk7Z6IiOKptxNyBKOa04+2DJvlafgwjysfTf5bjEQ+WDTEeMoubIOZiG9bC1bm WdqAC6JzQYTpjT3kqbfxGlV8328N3Z1qrVpRZOevkPHpotaaSDa5VVSCOvj6hdJZ P4L2hq94bskumINJWHZxYEGvrB+6MJfOn73icNSpzyg+2sVw2QVfVAfbe0XfGFag 2JO5sFbl0be8rOh6Y7b2uxltfI1RnGIBmemRQkjP6Z3mynLs7EYKqs8LwpHR0FMm 3oSPrNXELr02m/9eGpsb =tUDY -END PGP SIGNATURE-
Re: [Qemu-devel] [PATCH qom-cpu 7/9] target-i386: Use FeatureWord loop on filter_features_for_kvm()
On 04/22/2013 01:00 PM, Eduardo Habkost wrote: Instead of open-coding the filtering code for each feature word, change the existing code to use the feature_word_info array, that have exactly the same CPUID eax/ecx/register values for each feature word. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- target-i386/cpu.c | 24 +++- 1 file changed, 7 insertions(+), 17 deletions(-) +for (w = 0; w FEATURE_WORDS; w++) { +FeatureWordInfo *wi = feature_word_info[w]; +env-features[w] = kvm_arch_get_supported_cpuid(s, wi-cpuid_eax, +wi-cpuid_ecx, +wi-cpuid_reg); Indentation is unusual, but the resulting alignment is nicer than having 'wi-' flush under 's'. I would have written the call in four lines instead of 3, but that's not essential. Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH qom-cpu 8/9] target-i386: Introduce X86CPU.filtered_features field
On 04/22/2013 01:00 PM, Eduardo Habkost wrote: This field will contain the feature bits that were filtered out because of missing host support. Yes, libvirt would definitely like to know that. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- target-i386/cpu-qom.h | 3 +++ target-i386/cpu.c | 9 ++--- 2 files changed, 9 insertions(+), 3 deletions(-) Reviewed-by: Eric Blake ebl...@redhat.com +++ b/target-i386/cpu.c @@ -1642,9 +1642,12 @@ static void filter_features_for_kvm(X86CPU *cpu) for (w = 0; w FEATURE_WORDS; w++) { FeatureWordInfo *wi = feature_word_info[w]; -env-features[w] = kvm_arch_get_supported_cpuid(s, wi-cpuid_eax, -wi-cpuid_ecx, -wi-cpuid_reg); +uint32_t host_feat = kvm_arch_get_supported_cpuid(s, wi-cpuid_eax, + wi-cpuid_ecx, + wi-cpuid_reg); Alignment is still interesting, but still no impact to the patch review. +uint32_t requested_features = env-features[w]; +env-features[w] = host_feat; +cpu-filtered_features[w] = requested_features ~env-features[w]; } } #endif -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH qom-cpu 5/9] target-i386: Add ECX information to FeatureWordInfo
Am 22.04.2013 21:00, schrieb Eduardo Habkost: FEAT_7_0_EBX uses ECX as input, so we have to take that into account when reporting feature word values. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- target-i386/cpu.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 110ef98..314931e 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -151,8 +151,10 @@ static const char *cpuid_7_0_ebx_feature_name[] = { typedef struct FeatureWordInfo { const char **feat_names; -uint32_t cpuid_eax; /* Input EAX for CPUID */ -int cpuid_reg; /* R_* register constant */ +uint32_t cpuid_eax; /* Input EAX for CPUID */ +bool cpuid_needs_ecx; /* CPUID instruction uses ECX as input */ Why do we need this needs_ field here? eax and reg just seem to be reindented and the comment reworded for reg. It just seems to be passed through to has_cpuid_input_ecx, which I neither see defined nor checked - that data structure already existed elsewhere as such? Andreas +uint32_t cpuid_ecx; /* Input ECX value for CPUID */ +int cpuid_reg;/* output register (R_* constant) */ } FeatureWordInfo; static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { @@ -186,7 +188,9 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { }, [FEAT_7_0_EBX] = { .feat_names = cpuid_7_0_ebx_feature_name, -.cpuid_eax = 7, .cpuid_reg = R_EBX, +.cpuid_eax = 7, +.cpuid_needs_ecx = true, .cpuid_ecx = 0, +.cpuid_reg = R_EBX, }, }; -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH 1/2] Add i.MX I2C device emulator.
Andreas, On 05/02/2013 02:38 PM, Andreas Färber wrote: Hi, Am 01.05.2013 21:53, schrieb Jean-Christophe DUBOIS: Signed-off-by: Jean-Christophe DUBOIS j...@tribudubois.net --- default-configs/arm-softmmu.mak | 2 + hw/i2c/Makefile.objs| 1 + hw/i2c/imx_i2c.c| 374 3 files changed, 377 insertions(+) create mode 100644 hw/i2c/imx_i2c.c Please thread your messages together so they can be reviewed in context. Since you're adding a new I2C device and we have a qtest framework for I2C, please supply an implementation for this device (which will require some constant sharing via header file) and some simple test case for the board you're using it on, to assure it keeps working. What is the target for building qtest? And then how to run it? Could you point me to some documentation please? It doesn't build out of the box for me. JC
Re: [Qemu-devel] [PATCH 1/2] Add i.MX I2C device emulator.
Am 03.05.2013 17:16, schrieb Jean-Christophe DUBOIS: Andreas, On 05/02/2013 02:38 PM, Andreas Färber wrote: Hi, Am 01.05.2013 21:53, schrieb Jean-Christophe DUBOIS: Signed-off-by: Jean-Christophe DUBOIS j...@tribudubois.net --- default-configs/arm-softmmu.mak | 2 + hw/i2c/Makefile.objs| 1 + hw/i2c/imx_i2c.c| 374 3 files changed, 377 insertions(+) create mode 100644 hw/i2c/imx_i2c.c Please thread your messages together so they can be reviewed in context. Since you're adding a new I2C device and we have a qtest framework for I2C, please supply an implementation for this device (which will require some constant sharing via header file) and some simple test case for the board you're using it on, to assure it keeps working. What is the target for building qtest? And then how to run it? make check Andreas Could you point me to some documentation please? It doesn't build out of the box for me. JC -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg
Re: [Qemu-devel] [PATCH qom-cpu 0/9] x86: feature words array (v11) + feature-words property
On Fri, 03 May 2013 16:58:44 +0200 Andreas Färber afaer...@suse.de wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Am 02.05.2013 21:48, schrieb Eric Blake: On 05/02/2013 01:43 PM, Eduardo Habkost wrote: As mentioned earlier I'd prefer to defer the property design rather than putting it lightly reviewed into 1.5 and living with some ABI. If libvirt urgently needs this info, this series needs to be reviewed and sorted out until the weekend (Hard Freeze on Monday). I consider it an important bugfix for the QEMU+libvirt stack. The current libvirt behavior (checking CPUID directly; not using the enforce flag; and having its own copy of each CPU model definition) is unsafe and may break live-migration silently under many circumstances. I agree that libvirt would very much like to have this in 1.5. How can I help in reviewing things? Apart from the usual QMP considerations that you will know much better than me, I have two concerns here: 1) Polluting the QOM namespace with this dump-all implementation for libvirt and interfering with more fine-grained property getters/setters. I think feature-words could be replaced with fine-grained feature properties eventually. 2) Basing its design on current code of which we are not sure yet how it may evolve and having to live with that for ABI stability. Like I said, I hadn't reviewed that part yet, so couldn't pick it up on short notice. If we get it respun and reviewed today, I can (try to) prepare a PULL on Sunday. On Igor's series (latest: v7 from Feb 25) I had more or less nack'ed the attempt to introduce f-* properties due to Anthony asking for I don't recall it being nacked or any other way commented. verbose QOM property names, so we're in need of a better name, likely something with feature in it, similar to what is being proposed here. I had also argued with Anthony that QOM's object_property_add_bool() should allow us to create a container object for accessing features in a more simple way, such as .../icc/child[0]/cpuid-features/foo rather than f-foo or feature-foo or foo-feature to avoid the constant repetition and an unreadable long list of CPU properties, but the addition of an opaque to support this was turned down. what if FeatureWordArray inherits from Object? than it would be trivial to create /icc/child[0]/cpuid-features/ where cpuid-features will be FeatureWordArray and each feature it's child? So it boils down to the questions of where do we want to expose which information, how should it be structured and where does/will that information come from. Thanks. Regards, Andreas - -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -BEGIN PGP SIGNATURE- Version: GnuPG v2.0.19 (GNU/Linux) iQIcBAEBAgAGBQJRg9CkAAoJEPou0S0+fgE/Mu8P/1FFoXTMawQ2o8np/cjOFEze zv+MJ5DUKZK96PPNoZjsM8y0tmNZ8VT9q578AQuElQiA/AbOaUqEoqL/NB9i9Bqc PBZw7KwgNkH8Mogw7izOmOybKZbshdin9uBxRugG+Xyg5Nk7oMYkTQV8PLHmAgRc LxgeMAJHsPY9LXksCNUbZNblK//EQfP90e7v0fU+ys5xrlCFlCl1xRQd9Cw2QvHd 7gECUSlwOlkHY32BFEn/epqay45uZqlECyGXDqrssg5htLM5McbzKCa1sgdQbuqp HqsO3WdM6jBrse5EApxdoaYmz8Yhl6ls+YOQY+l3DjjhHNcDzxtIqbAK36ErBHFz 9d+NTcXBlGrC0N0L7VZmwLihJ3bT/IIEP7ybLFN/QKHlz4H83pEGftbBpPipqrwq NZWk7Z6IiOKptxNyBKOa04+2DJvlafgwjysfTf5bjEQ+WDTEeMoubIOZiG9bC1bm WdqAC6JzQYTpjT3kqbfxGlV8328N3Z1qrVpRZOevkPHpotaaSDa5VVSCOvj6hdJZ P4L2hq94bskumINJWHZxYEGvrB+6MJfOn73icNSpzyg+2sVw2QVfVAfbe0XfGFag 2JO5sFbl0be8rOh6Y7b2uxltfI1RnGIBmemRQkjP6Z3mynLs7EYKqs8LwpHR0FMm 3oSPrNXELr02m/9eGpsb =tUDY -END PGP SIGNATURE- -- Regards, Igor
Re: [Qemu-devel] [PATCH qom-cpu 0/9] x86: feature words array (v11) + feature-words property
On 05/03/2013 08:58 AM, Andreas Färber wrote: I agree that libvirt would very much like to have this in 1.5. How can I help in reviewing things? Apart from the usual QMP considerations that you will know much better than me, I have two concerns here: 1) Polluting the QOM namespace with this dump-all implementation for libvirt and interfering with more fine-grained property getters/setters. Ultimately, it would be nice to have full QOM introspection. We are part way there: we know WHAT properties we can query ('qom-list' exists now), but right now it returns details on the type of a property as a string, rather than as a full-blown JSON representation of the full details of that type. Maybe if we had an additional QMP command that gave back a full JSON representation of any type, so you can feed the 'type':'str' field from ObjectPropertyInfo in 'qom-list' into the new command to get a full idea of what each property contains. With full introspection, we could then have the option of changing the layout of the feature-words and filtered-features properties, but if we do make a change in the future, it would be nice to remain backwards compatible (adding features, rather than removing). 2) Basing its design on current code of which we are not sure yet how it may evolve and having to live with that for ABI stability. Like I said, I hadn't reviewed that part yet, so couldn't pick it up on short notice. If we get it respun and reviewed today, I can (try to) prepare a PULL on Sunday. I guess I see where you are coming from - once we bake in the QOM property name and libvirt starts relying on it, it becomes harder to support the generation of that property in the future if the underlying implementation of how feature bits are represented in qemu changed. But the hope is that we have already sanitized the feature word generation into something that is a lot more maintainable (iterating over a struct of descriptions of how to get at each feature), and where future changes would merely be extending (not deleting) from that struct (and therefore making the corresponding JSON type returned via the QOM property larger). And since these two properties are read-only, extending the JSON struct shouldn't be a problem. On Igor's series (latest: v7 from Feb 25) I had more or less nack'ed the attempt to introduce f-* properties due to Anthony asking for verbose QOM property names, so we're in need of a better name, likely something with feature in it, similar to what is being proposed here. I had also argued with Anthony that QOM's object_property_add_bool() should allow us to create a container object for accessing features in a more simple way, such as .../icc/child[0]/cpuid-features/foo rather than f-foo or feature-foo or foo-feature to avoid the constant repetition and an unreadable long list of CPU properties, but the addition of an opaque to support this was turned down. The problem with adding container objects is that you can only access one feature at a time, instead of all of them in a single array. I guess libvirt would cope with either style, but it is nicer to be able to get everything at once via JSON struct than it is to have to make multiple qom-get calls. So it boils down to the questions of where do we want to expose which information, how should it be structured and where does/will that information come from. Thanks. I guess that decision is up to the qemu maintainers - all I can do is say whether the proposed interface is usable, not whether it is the ideal interface compared to some other layout of where the property is attached. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH qom-cpu 9/9] target-i386: Add filtered-features property to X86CPU
On 04/22/2013 01:00 PM, Eduardo Habkost wrote: This property will contain all the features that were removed from the CPU because they are not supported by the host. This way, libvirt or other management tools can emulate the check/enforce behavior by checking if filtered-properties is all zeroes, before starting the guest. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- target-i386/cpu.c | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) Definitely wanted by libvirt, and I hope it can be included in 1.5. Reviewed-by: Eric Blake ebl...@redhat.com -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH qom-cpu 5/9] target-i386: Add ECX information to FeatureWordInfo
On Fri, May 03, 2013 at 05:16:46PM +0200, Andreas Färber wrote: Am 22.04.2013 21:00, schrieb Eduardo Habkost: FEAT_7_0_EBX uses ECX as input, so we have to take that into account when reporting feature word values. Signed-off-by: Eduardo Habkost ehabk...@redhat.com --- target-i386/cpu.c | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/target-i386/cpu.c b/target-i386/cpu.c index 110ef98..314931e 100644 --- a/target-i386/cpu.c +++ b/target-i386/cpu.c @@ -151,8 +151,10 @@ static const char *cpuid_7_0_ebx_feature_name[] = { typedef struct FeatureWordInfo { const char **feat_names; -uint32_t cpuid_eax; /* Input EAX for CPUID */ -int cpuid_reg; /* R_* register constant */ +uint32_t cpuid_eax; /* Input EAX for CPUID */ +bool cpuid_needs_ecx; /* CPUID instruction uses ECX as input */ Why do we need this needs_ field here? eax and reg just seem to be reindented and the comment reworded for reg. We need it so we know if ECX is used as input on that CPUID leaf, in other words: so we know if cpuid-input-ecx needs to be set on X86CPUFeatureWordInfo or not. cpuid-input-ecx is optional because most CPUID leafs don't depend on input ECX value, only EAX. cpuid_needs_ecx has a similar meaning to KVM_CPUID_FLAG_SIGNIFCANT_INDEX on kvm_cpuid_entry2.flags. It just seems to be passed through to has_cpuid_input_ecx, which I neither see defined nor checked - that data structure already existed elsewhere as such? X86CPUFeatureWordInfo is generated from the QAPI schema. Andreas +uint32_t cpuid_ecx; /* Input ECX value for CPUID */ +int cpuid_reg;/* output register (R_* constant) */ } FeatureWordInfo; static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { @@ -186,7 +188,9 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = { }, [FEAT_7_0_EBX] = { .feat_names = cpuid_7_0_ebx_feature_name, -.cpuid_eax = 7, .cpuid_reg = R_EBX, +.cpuid_eax = 7, +.cpuid_needs_ecx = true, .cpuid_ecx = 0, +.cpuid_reg = R_EBX, }, }; -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg -- Eduardo
Re: [Qemu-devel] [PATCH 1/2] Add i.MX I2C device emulator.
On 05/03/2013 05:20 PM, Andreas Färber wrote: Am 03.05.2013 17:16, schrieb Jean-Christophe DUBOIS: Andreas, On 05/02/2013 02:38 PM, Andreas Färber wrote: Hi, Am 01.05.2013 21:53, schrieb Jean-Christophe DUBOIS: Signed-off-by: Jean-Christophe DUBOIS j...@tribudubois.net --- default-configs/arm-softmmu.mak | 2 + hw/i2c/Makefile.objs| 1 + hw/i2c/imx_i2c.c| 374 3 files changed, 377 insertions(+) create mode 100644 hw/i2c/imx_i2c.c Please thread your messages together so they can be reviewed in context. Since you're adding a new I2C device and we have a qtest framework for I2C, please supply an implementation for this device (which will require some constant sharing via header file) and some simple test case for the board you're using it on, to assure it keeps working. What is the target for building qtest? And then how to run it? make check Thanks Andreas, So I added a libi2c-imx.c file, a ds1338-test.c file ... but when I run make check I get the following. What am I missing? JC $ make check GTESTER tests/check-qdict GTESTER tests/check-qfloat GTESTER tests/check-qint GTESTER tests/check-qstring GTESTER tests/check-qlist GTESTER tests/check-qjson GTESTER tests/test-qmp-output-visitor GTESTER tests/test-qmp-input-visitor GTESTER tests/test-qmp-input-strict GTESTER tests/test-qmp-commands GTESTER tests/test-string-input-visitor GTESTER tests/test-string-output-visitor GTESTER tests/test-coroutine GTESTER tests/test-visitor-serialization GTESTER tests/test-iov GTESTER tests/test-aio GTESTER tests/test-thread-pool GTESTER tests/test-hbitmap GTESTER tests/test-x86-cpuid GTESTER tests/test-xbzrle GTESTER tests/test-cutils GTESTER tests/test-mul64 CCtests/libi2c-imx.o LINK tests/tmp105-test CCtests/ds1338-test.o LINK tests/ds1338-test GTESTER check-qtest-arm Kernel image must be specified Broken pipe make: *** [check-qtest-arm] Error 1 $ Andreas Could you point me to some documentation please? It doesn't build out of the box for me. JC
[Qemu-devel] [PATCH 2/9] qom: add object_property_add_unnamed_child
This interface allows us to add a child property without specifying a name. Instead, a unique name is created and passed back after adding the property. Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- include/qom/object.h | 16 qom/object.c | 25 + 2 files changed, 41 insertions(+) diff --git a/include/qom/object.h b/include/qom/object.h index 86f1e2e..ca0fce8 100644 --- a/include/qom/object.h +++ b/include/qom/object.h @@ -1041,6 +1041,22 @@ void object_property_add_child(Object *obj, const char *name, Object *child, struct Error **errp); /** + * object_property_add_unnamed_child: + * + * @obj: the object to add a property to + * @name: the name of the property + * @child: the child object + * @errp: if an error occurs, a pointer to an area to store the area + * + * Same as object_property_add_child, but will allocate a unique name to + * identify the child property. + * + * Returns: The name assigned to the child property, or NULL on failure. + */ +char *object_property_add_unnamed_child(Object *obj, Object *child, +struct Error **errp); + +/** * object_property_add_link: * @obj: the object to add a property to * @name: the name of the property diff --git a/qom/object.c b/qom/object.c index c932f64..229a9a7 100644 --- a/qom/object.c +++ b/qom/object.c @@ -926,6 +926,31 @@ static void object_finalize_child_property(Object *obj, const char *name, object_unref(child); } +char *object_property_add_unnamed_child(Object *obj, Object *child, Error **errp) +{ +int idx = 0; +bool next_idx_found = false; +char name[64]; +ObjectProperty *prop; + +while (!next_idx_found) { +sprintf(name, unnamed[%d], idx); +QTAILQ_FOREACH(prop, obj-properties, node) { +if (strcmp(name, prop-name) == 0) { +idx++; +break; +} +} +if (!prop) { +next_idx_found = true; +} +} + +object_property_add_child(obj, name, child, errp); + +return error_is_set(errp) ? NULL : g_strdup(name); +} + void object_property_add_child(Object *obj, const char *name, Object *child, Error **errp) { -- 1.7.9.5
[Qemu-devel] [PATCH 1/9] qom: add qom_init_completion
This is similar in concept to realize, though semantics are a bit more open-ended: And object might in some cases need a number of properties to be specified before it can be used/started/etc. This can't always be done via an open-ended new() function, the main example being objects that around created via the command-line by -object. To support these cases we allow a function, -instance_init_completion, to be registered that will be called by the -object constructor, or can be called at the end of new() constructors and such. Signed-off-by: Michael Roth mdr...@linux.vnet.ibm.com --- include/qom/object.h | 19 +++ qom/object.c | 21 + vl.c |2 ++ 3 files changed, 42 insertions(+) diff --git a/include/qom/object.h b/include/qom/object.h index d0f99c5..86f1e2e 100644 --- a/include/qom/object.h +++ b/include/qom/object.h @@ -394,6 +394,11 @@ struct Object * @instance_init: This function is called to initialize an object. The parent * class will have already been initialized so the type is only responsible * for initializing its own members. + * @instance_init_completion: This function is used mainly cases where an + * object has been instantiated via the command-line, and is called once all + * properties specified via command-line have been set for the object. This + * is not called automatically, but manually via @object_init_completion once + * the processing of said properties is completed. * @instance_finalize: This function is called during object destruction. This * is called before the parent @instance_finalize function has been called. * An object should only free the members that are unique to its type in this @@ -429,6 +434,7 @@ struct TypeInfo size_t instance_size; void (*instance_init)(Object *obj); +void (*instance_init_completion)(Object *obj); void (*instance_finalize)(Object *obj); bool abstract; @@ -562,6 +568,19 @@ struct InterfaceClass Object *object_new(const char *typename); /** + * object_init_completion: + * @obj: The object to complete initialization of + * + * In cases where an object is instantiated from a command-line with a number + * of properties specified as parameters (generally via -object), or for cases + * where a new()/helper function is used to pass/set some minimal number of + * properties that are required prior to completion of object initialization, + * this function can be called to mark when that occurs to complete object + * initialization. + */ +void object_init_completion(Object *obj); + +/** * object_new_with_type: * @type: The type of the object to instantiate. * diff --git a/qom/object.c b/qom/object.c index 75e6aac..c932f64 100644 --- a/qom/object.c +++ b/qom/object.c @@ -50,6 +50,7 @@ struct TypeImpl void *class_data; void (*instance_init)(Object *obj); +void (*instance_init_completion)(Object *obj); void (*instance_finalize)(Object *obj); bool abstract; @@ -110,6 +111,7 @@ static TypeImpl *type_register_internal(const TypeInfo *info) ti-class_data = info-class_data; ti-instance_init = info-instance_init; +ti-instance_init_completion = info-instance_init_completion; ti-instance_finalize = info-instance_finalize; ti-abstract = info-abstract; @@ -422,6 +424,25 @@ Object *object_new(const char *typename) return object_new_with_type(ti); } + +static void object_init_completion_with_type(Object *obj, TypeImpl *ti) +{ +if (type_has_parent(ti)) { +object_init_completion_with_type(obj, type_get_parent(ti)); +} + +if (ti-instance_init_completion) { +ti-instance_init_completion(obj); +} +} + +void object_init_completion(Object *obj) +{ +TypeImpl *ti = type_get_by_name(object_get_class(obj)-type-name); + +object_init_completion_with_type(obj, ti); +} + Object *object_dynamic_cast(Object *obj, const char *typename) { if (obj object_class_dynamic_cast(object_get_class(obj), typename)) { diff --git a/vl.c b/vl.c index 6e6225f..d454c86 100644 --- a/vl.c +++ b/vl.c @@ -2831,6 +2831,8 @@ static int object_create(QemuOpts *opts, void *opaque) object_property_add_child(container_get(object_get_root(), /objects), id, obj, NULL); +object_init_completion(obj); + return 0; } -- 1.7.9.5
[Qemu-devel] [RFC 0/9] QContext: QOM class to support multiple event loops
These patches apply on top of qemu.git master, and can also be obtained from: git://github.com/mdroth/qemu.git qcontext OVERVIEW This series introduces a set of QOM classes/interfaces for event registration/handling: QContext and QSource, which are based closely on their GMainContext/GSource GLib counterparts. QContexts can be created via the command-line via -object, and can also be intructed (via -object params/properties) to automatically start a thread/event-loop to handle QSources we attach to them. The reference implementation of QContext is GlibQContext, which uses GMainContext/GSource interfaces underneath the covers, thus we can also attach GSource (and as a result, AioContexts) to it. As part of this series we also port the QEMU main loop to using QContext and drop virtio-blk's dataplane thread in favor of a GlibQContext thread, which virtio-blk dataplane attaches to via it's AioContext's GSource. With these patches in place we can do virtio-blk dataplane assignment like so: qemu ... \ -object glib-qcontext,id=ctx0,threaded=yes -drive file=file.raw,id=drive0,aio=native,cache=none \ -device virtio-blk,drive=drive0,scsi=off,x-data-plane=on,context=ctx0 And also gain the ability to assign a virtio-blk dataplane to the default QContext driven by the QEMU main iothread: qemu ... \ -drive file=file.raw,id=drive0,aio=native,cache=none \ -device virtio-blk,drive=drive0,scsi=off,x-data-plane=on,context=main The latter likely isn't particularly desirable, and the series is in rough shape in general, but the goal of this RFC to demonstrate the approach and get some feedback on how we might handle thread assignments for things like virtio-blk/virtio-net dataplane, and multi-threaded device models general. Input on this would be greatly appreciated. BACKGROUND There has been an outgoing discussion on qemu-devel about what event loop interface to consolidate around for virtio-blk dataplane, threaded virtio-net, and offloading device workloads to seperate threads in general. Currently the main candidates seem to be GLib GSources and AioContext, with virtio-blk/virtio-net dataplane being the use-cases under consideration. virtio-blk: In the case of virtio-blk dataplane, we need to drive virtqueue and AIO events. Since AioContext is used extensively throughout the block layer to drive AIO, it makes sense to re-use it here as we look toward pushing dataplane functionality deeper into the block layer to benefit from things like image format support/snapshots/etc. virtio-net: In the case of Ping Fan's virtio-net dataplane patches (http://thread.gmane.org/gmane.comp.emulators.qemu/196111/focus=196111), we need to drive virtqueue and NetClient peer events (such as TAP devices, or possibly things like slirp in the future). Currently NetClient events rely on IOHandler interfaces such as qemu_set_fd_handler(). These interfaces are global ones that rely on a single IOHandler list serviced by QEMU's main loop. An effort is currently underway to port these to GSources so that can be more easilly attached to other event loops (as opposed to the hooks used for the virtio-net dataplane series). Theoretically, much of the latter (such as TAP devices) can also be done around AioContext with some minor changes, but other NetClient backends such as slirp lend themselves to more open-ended event loop interfaces like GSources. Other devices might also find themselves needing something more open-ended (a somewhat silly but present example being virtio-serial + GSource-driven chardev) QContext: Since the direction for the forseeable future will likely continue to be GSources for some things, AioContext for others, a way to reconcile these approaches would be the age-old approach of adding a layer of abstration on top of the 2 so that we can handle device/backend thread assignments using a general mechanism. Building around this abstration also leaves open the ability to deal with things like locking considerations for real-time support in the future. A reasonable start to modeling abstraction layer this would be the open-ended GMainContext/GSource approach that QEMU relies on already, which is what the QContext/QSource interfaces do (with some minor differences/additions such as QSources storing and opaque instead of the GSource-subclassing approach for GLib). TODO/KNOWN ISSUES - GTK UI causes a crash during certain window refresh events. works fine with VNC though, and no problems with other GSources. - slirp isn't working (will work with Ping Fan's slirp-GSource conversion) - add proper locking - integrate timers into a QSource to make timer-event driveable in seperate QContext event loops - consider a command-line wrapper to -object, such as: -context id,[threaded=yes|no],[backend=default|glib] Makefile.objs|6 +- async.c | 16 +++ hw/block/dataplane/virtio-blk.c | 46 ++- include/block/aio.h |