Re: [Qemu-devel] [PATCH] ui/input.c: replace magic numbers with macros
On 05/16/2013 01:19 PM, Amos Kong wrote: It's clearer to use defined macros than magic numbers. Signed-off-by: Amos Kong ak...@redhat.com --- ui/input.c | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/ui/input.c b/ui/input.c index 8ca1a03..92c44ca 100644 --- a/ui/input.c +++ b/ui/input.c @@ -28,6 +28,7 @@ #include qapi/error.h #include qmp-commands.h #include qapi-types.h +#include ui/keymaps.h struct QEMUPutMouseEntry { QEMUPutMouseEvent *qemu_put_mouse_event; @@ -260,10 +261,10 @@ static void free_keycodes(void) static void release_keys(void *opaque) { while (keycodes_size 0) { -if (keycodes[--keycodes_size] 0x80) { -kbd_put_keycode(0xe0); +if (keycodes[--keycodes_size] SCANCODE_GREY) { +kbd_put_keycode(SCANCODE_EMUL0); } -kbd_put_keycode(keycodes[keycodes_size] | 0x80); +kbd_put_keycode(keycodes[keycodes_size] | SCANCODE_UP); } free_keycodes(); @@ -297,10 +298,10 @@ void qmp_send_key(KeyValueList *keys, bool has_hold_time, int64_t hold_time, return; } -if (keycode 0x80) { -kbd_put_keycode(0xe0); +if (keycode SCANCODE_GREY) { +kbd_put_keycode(SCANCODE_EMUL0); } -kbd_put_keycode(keycode 0x7f); +kbd_put_keycode(keycode SCANCODE_KEYCODEMASK); keycodes = g_realloc(keycodes, sizeof(int) * (keycodes_size + 1)); keycodes[keycodes_size++] = keycode; Looks good to me. Reviewed-by: Lei Li li...@linux.vnet.ibm.com -- Lei
Re: [Qemu-devel] [PATCH v3 0/8] block: drive-backup live backup command
于 2013-5-15 22:34, Stefan Hajnoczi 写道: Note: These patches apply to my block-next tree. You can also grab the code from git here: git://github.com/stefanha/qemu.git block-backup-core This series adds a new QMP command, drive-backup, which takes a point-in-time snapshot of a block device. The snapshot is copied out to a target block device. A simple example is: drive-backup device=virtio0 format=qcow2 target=backup-20130401.qcow2 The original drive-backup blockjob was written by Dietmar Maurer diet...@proxmox.com. He is currently busy but I feel the feature is worth pushing into QEMU since there has been interest. This is my version of his patch, plus the QMP command and qemu-iotests test case. QMP 'transaction' support is included since v3. It adds support for atomic snapshots of multiple block devices. I also added an 'abort' transaction to allow testing of the .abort()/.cleanup() code path. Thanks to Wenchao for making qmp_transaction() extensible. How is this different from block-stream and drive-mirror? - Both block-stream and drive-mirror do not provide immediate point-in-time snapshots. Instead they copy data into a new file and then switch to it. In other words, the point at which the snapshot is taken cannot be controlled directly. drive-backup intercepts guest writes and saves data into the target block device before it is overwritten. The target block device can be a raw image file, backing files are not used to implement this feature. How can drive-backup be used? - The simplest use-case is to copy a point-in-time snapshot to a local file. More advanced users may wish to make the target an NBD URL. The NBD server listening on the other side can process the backup writes any way it wishes. I previously posted an RFC series with a backup server that streamed Dietmar's VMA backup archive format. What's next for drive-backup? - 1. Sync modes like drive-mirror (top, full, none). This makes it possible to preserve the backing file chain. v3: * Rename to drive-backup for consistency with drive-mirror [kwolf] * Add QMP transaction support [kwolf] * Introduce bdrv_add_before_write_cb() to hook writes * Mention 'query-block-jobs' lists job of type 'backup' [eblake] * Rename rwlock to flush_rwlock [kwolf] * Fix space in block/backup.c comment [kwolf] v2: * s/block_backup/block-backup/ in commit message [eblake] * Avoid funny spacing in QMP docs [eblake] * Document query-block-jobs and block-job-cancel usage [eblake] After checking the code, I found it possible to add delta data backup support also, If an additional dirty bitmap was added. Compared with current solution, I think it is doing COW at qemu device level: qemu device | general block layer | virtual format layer | --- | | qcow2 vmdk This will make things complicated when more works comes, a better place for block COW, is under general block layer. Maybe later we can adjust block for it. -- Best Regards Wenchao Xia
Re: [Qemu-devel] tap devices not receiving packets from a bridge
On Thu, May 16, 2013 at 09:24:05AM +0300, Michael S. Tsirkin wrote: On Wed, May 15, 2013 at 12:00:03PM +0100, Nicholas Thomas wrote: Hi again, On Tue, 2013-05-14 at 15:49 +0100, Nicholas Thomas wrote: /sys/devices/virtual/net/t100/tun_flags is 0x5002 - so it looks like IFF_ONE_QUEUE was indeed unset by qemu (which is lacking the patch). It surprises me, but that's probably my fault, rather than qemu's. I've rebuilt 1.4.1 with the IFF_ONE_QUEUE patch and tun_flags is now 0x7002; unfortunately, I'm still seeing this bug, twice in five trials. Symptoms in `ifconfig t100` now differ; overruns stays at 0, and dropped increases monotonically as I send packets. Those packets do appear if I tcpdump t100 on the host, but not if I tcpdump t100 on the guest. I've turned off gro in the guest, which makes no difference, and tried changing the queue sizes (post-hoc) in both guest and host, in the hope of causing them to be emptied out, clearing the condition; again to no effect. The VMs in question are bridged to a large (and busy) VLAN with no ingress filtering to speak of; I guess what's happening is that the transmit queue is filled up by that traffic while the guest is in ipxe, and it never gets out of that state when it happens... so maybe there is still an underlying problem? /Nick Is this with or without vhost-net in host? never mind, I see it's without. Try to enable vhost-net (you'll have to switch to -netdev syntax for that to work) and see if this help. If it does it's likely a qemu bug if not probably a guest bug. -- MST
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On 05/16/2013 12:30 PM, Amos Kong wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com --- hw/input/ps2.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..1cfe055 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,9 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ +int repeat_key; /* keycode to repeat */ } PS2KbdState; typedef struct { @@ -146,6 +149,22 @@ void ps2_queue(void *opaque, int b) s-update_irq(s-update_arg, 1); } +static QEMUTimer *repeat_timer; +static bool auto_repeat; + +static void repeat_ps2_queue(void *opaque) +{ +PS2KbdState *s = opaque; + +if (auto_repeat) { +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_period, + 1000)); +ps2_queue(s-common, s-repeat_key); +} +} + + /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; ps2_queue(s-common, keycode); + +if (auto_repeat) { +s-repeat_key = keycode; +/* delay a while before first repeat */ +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_delay, 1000)); +} } Not familiar with ps2, but don't we need something to make this safe after migration? uint32_t ps2_read_data(void *opaque) @@ -213,6 +242,11 @@ static void ps2_reset_keyboard(PS2KbdState *s) void ps2_write_keyboard(void *opaque, int val) { +/* repeat period/delay table from kernel (drivers/input/keyboard/atkbd.c) */ +const short period[32] = { 33, 37, 42, 46, 50, 54, 58, 63, 67, 75, +83, 92, 100, 109, 116, 125, 133, 149, 167, 182, 200, 217, 232, +250, 270, 303, 333, 370, 400, 435, 470, 500 }; +const short delay[4] = { 250, 500, 750, 1000 }; PS2KbdState *s = (PS2KbdState *)opaque; switch(s-common.write_cmd) { @@ -288,6 +322,11 @@ void ps2_write_keyboard(void *opaque, int val) s-common.write_cmd = -1; break; case KBD_CMD_SET_RATE: + /* Bit0-4 specifies the repeat rate */ +s-repeat_period = period[val 0x1f]; + /* Bit5-6 bit specifies the delay time */ +s-repeat_delay = delay[(val 0x60) 5]; + ps2_queue(s-common, KBD_REPLY_ACK); s-common.write_cmd = -1; break; @@ -536,6 +575,8 @@ static void ps2_kbd_reset(void *opaque) s-scan_enabled = 0; s-translate = 0; s-scancode_set = 0; +s-repeat_period = 92; +s-repeat_delay = 500; } static void ps2_mouse_reset(void *opaque) @@ -660,6 +701,7 @@ void *ps2_kbd_init(void (*update_irq)(void *, int), void *update_arg) vmstate_register(NULL, 0, vmstate_ps2_keyboard, s); qemu_add_kbd_event_handler(ps2_put_keycode, s); qemu_register_reset(ps2_kbd_reset, s); +repeat_timer = qemu_new_timer_ns(vm_clock, repeat_ps2_queue, s); return s; }
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On 16 May 2013 05:30, Amos Kong ak...@redhat.com wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..1cfe055 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,9 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ +int repeat_key; /* keycode to repeat */ } PS2KbdState; typedef struct { @@ -146,6 +149,22 @@ void ps2_queue(void *opaque, int b) s-update_irq(s-update_arg, 1); } +static QEMUTimer *repeat_timer; +static bool auto_repeat; These shouldn't be static -- what would happen on a system with two ps2 keyboard models in it? You need to reset your qemu_timer in the ps2 reset handler, as well; otherwise it could go off unexpectedly after a reset. (Though perhaps not if we're simulating a human with their finger held down on the key...) thanks -- PMM
Re: [Qemu-devel] [Qemu-ppc] [PATCH 11/11] vfio: Add guest side IOMMU support
On Wed, May 15, 2013 at 01:32:50PM +1000, David Gibson wrote: On Tue, May 14, 2013 at 08:51:08PM -0600, Alex Williamson wrote: On Wed, 2013-05-15 at 11:33 +1000, David Gibson wrote: On Tue, May 14, 2013 at 11:15:26AM -0600, Alex Williamson wrote: On Tue, 2013-05-14 at 19:13 +1000, David Gibson wrote: [snip] Trying to make sure I understand why guest_iommus is a list. We can have multiple guest iommus, but in the majority of those cases I would expect only one iommu per VFIOAddressSpace. So if the listener is for this space, we only add the relevant iommu and not all of the iommus in the machine. That's what we do - we only add notifiers for iommu regions that appear within the relevant address space. It's a list because at least theoretically there could be more than one iommu region in the AS, although I don't know of any real cases where that would be true. Actually maybe two guest iommu ranges are common for spapr per space, a 32bit and a 64bit range? I think 64-bit DMA windows on PAPR are usually just mapped to RAM with a fixed offset, rather than having TCEs (page table). That might well introduce some extra complexities in how we mirror that into VFIO, but it's not directly relevant to this point. So, since I wrote that I've heard from Ben that while current machines don't have multiple (translated) iommu windows, this could well be found on future machines. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: Digital signature
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On Thu, May 16, 2013 at 01:30:28PM +0800, li guang wrote: 在 2013-05-16四的 12:30 +0800,Amos Kong写道: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com --- hw/input/ps2.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..1cfe055 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,9 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ +int repeat_key; /* keycode to repeat */ } PS2KbdState; typedef struct { @@ -146,6 +149,22 @@ void ps2_queue(void *opaque, int b) s-update_irq(s-update_arg, 1); } +static QEMUTimer *repeat_timer; +static bool auto_repeat; + +static void repeat_ps2_queue(void *opaque) +{ +PS2KbdState *s = opaque; + Hi, Li guang theoretically, we have to check if the typematic key is in break now, if so, we will not do repeat anymore. You mean key is released? I checked by '~keycode 0x80', stop repeat when key is release. don't you think we have a chance that new key can come in during waiting? When the new key (press) comes, repeat_timer will be modified by qemu_mod_timer(), original repate will be end. +if (auto_repeat) { +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_period, + 1000)); +ps2_queue(s-common, s-repeat_key); +} +} + + /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; ^^^ ps2_queue(s-common, keycode); + +if (auto_repeat) { +s-repeat_key = keycode; +/* delay a while before first repeat */ +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_delay, 1000)); +} } uint32_t ps2_read_data(void *opaque) @@ -213,6 +242,11 @@ static void ps2_reset_keyboard(PS2KbdState *s) void ps2_write_keyboard(void *opaque, int val) { +/* repeat period/delay table from kernel (drivers/input/keyboard/atkbd.c) */ +const short period[32] = { 33, 37, 42, 46, 50, 54, 58, 63, 67, 75, +83, 92, 100, 109, 116, 125, 133, 149, 167, 182, 200, 217, 232, +250, 270, 303, 333, 370, 400, 435, 470, 500 }; +const short delay[4] = { 250, 500, 750, 1000 }; PS2KbdState *s = (PS2KbdState *)opaque; switch(s-common.write_cmd) { @@ -288,6 +322,11 @@ void ps2_write_keyboard(void *opaque, int val) s-common.write_cmd = -1; break; case KBD_CMD_SET_RATE: + /* Bit0-4 specifies the repeat rate */ +s-repeat_period = period[val 0x1f]; + /* Bit5-6 bit specifies the delay time */ +s-repeat_delay = delay[(val 0x60) 5]; s/(val 0x60) 5/(val 0x60) 5 0x4/ ^^ 0x3 ? val 5 0x3 -- Amos.
Re: [Qemu-devel] tap devices not receiving packets from a bridge
On Wed, May 15, 2013 at 12:00:03PM +0100, Nicholas Thomas wrote: Hi again, On Tue, 2013-05-14 at 15:49 +0100, Nicholas Thomas wrote: /sys/devices/virtual/net/t100/tun_flags is 0x5002 - so it looks like IFF_ONE_QUEUE was indeed unset by qemu (which is lacking the patch). It surprises me, but that's probably my fault, rather than qemu's. I've rebuilt 1.4.1 with the IFF_ONE_QUEUE patch and tun_flags is now 0x7002; unfortunately, I'm still seeing this bug, twice in five trials. Symptoms in `ifconfig t100` now differ; overruns stays at 0, and dropped increases monotonically as I send packets. Those packets do appear if I tcpdump t100 on the host, but not if I tcpdump t100 on the guest. I've turned off gro in the guest, which makes no difference, and tried changing the queue sizes (post-hoc) in both guest and host, in the hope of causing them to be emptied out, clearing the condition; again to no effect. The VMs in question are bridged to a large (and busy) VLAN with no ingress filtering to speak of; I guess what's happening is that the transmit queue is filled up by that traffic while the guest is in ipxe, and it never gets out of that state when it happens... so maybe there is still an underlying problem? /Nick Is this with or without vhost-net in host? -- MST
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
在 2013-05-16四的 14:58 +0800,Amos Kong写道: On Thu, May 16, 2013 at 01:30:28PM +0800, li guang wrote: 在 2013-05-16四的 12:30 +0800,Amos Kong写道: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com --- hw/input/ps2.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..1cfe055 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,9 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ +int repeat_key; /* keycode to repeat */ } PS2KbdState; typedef struct { @@ -146,6 +149,22 @@ void ps2_queue(void *opaque, int b) s-update_irq(s-update_arg, 1); } +static QEMUTimer *repeat_timer; +static bool auto_repeat; + +static void repeat_ps2_queue(void *opaque) +{ +PS2KbdState *s = opaque; + Hi, Li guang theoretically, we have to check if the typematic key is in break now, if so, we will not do repeat anymore. You mean key is released? I checked by '~keycode 0x80', stop repeat when key is release. don't you think we have a chance that new key can come in during waiting? When the new key (press) comes, repeat_timer will be modified by qemu_mod_timer(), original repate will be end. +if (auto_repeat) { +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_period, + 1000)); +ps2_queue(s-common, s-repeat_key); +} +} + + /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; ^^^ case: 1. new key pressed 2. enter ps2_put_keycode 3. previous timer timeout 4. enter repeat_ps2_queue 5. put previous keycode in queue 6. back to ps2_put_keycode 7. check auto_repeat so, an obsolete key comes up. can timer interrupt ps2_put_keycode? ps2_queue(s-common, keycode); + +if (auto_repeat) { +s-repeat_key = keycode; +/* delay a while before first repeat */ +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_delay, 1000)); +} } uint32_t ps2_read_data(void *opaque) @@ -213,6 +242,11 @@ static void ps2_reset_keyboard(PS2KbdState *s) void ps2_write_keyboard(void *opaque, int val) { +/* repeat period/delay table from kernel (drivers/input/keyboard/atkbd.c) */ +const short period[32] = { 33, 37, 42, 46, 50, 54, 58, 63, 67, 75, +83, 92, 100, 109, 116, 125, 133, 149, 167, 182, 200, 217, 232, +250, 270, 303, 333, 370, 400, 435, 470, 500 }; +const short delay[4] = { 250, 500, 750, 1000 }; PS2KbdState *s = (PS2KbdState *)opaque; switch(s-common.write_cmd) { @@ -288,6 +322,11 @@ void ps2_write_keyboard(void *opaque, int val) s-common.write_cmd = -1; break; case KBD_CMD_SET_RATE: + /* Bit0-4 specifies the repeat rate */ +s-repeat_period = period[val 0x1f]; + /* Bit5-6 bit specifies the delay time */ +s-repeat_delay = delay[(val 0x60) 5]; s/(val 0x60) 5/(val 0x60) 5 0x4/ ^^ 0x3 ? val 5 0x3 Oh, yes should be ' 0x3'
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On Thu, May 16, 2013 at 07:50:35AM +0100, Peter Maydell wrote: On 16 May 2013 05:30, Amos Kong ak...@redhat.com wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..1cfe055 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,9 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ +int repeat_key; /* keycode to repeat */ } PS2KbdState; typedef struct { @@ -146,6 +149,22 @@ void ps2_queue(void *opaque, int b) s-update_irq(s-update_arg, 1); } +static QEMUTimer *repeat_timer; +static bool auto_repeat; These shouldn't be static -- what would happen on a system with two ps2 keyboard models in it? I will move them to PS2KbdState. For the migrate issue mentioned by jason, the keyboard state (repeat_period/repeat_delay/etc) are configured by guest, it should be saved to vmstate and migrated to dest vm. If we want the unfinished repeat continue, repeat-timer also should be migrated, but another key_timer in ui/input.c for send-key could not be migrated, it means the release event will be lost. Do we need to migrate the keyboard state? You need to reset your qemu_timer in the ps2 reset handler, as well; otherwise it could go off unexpectedly after a reset. (Though perhaps not if we're simulating a human with their finger held down on the key...) Ok thanks -- PMM -- Amos.
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On 05/16/2013 12:30 PM, Amos Kong wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com --- hw/input/ps2.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..1cfe055 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,9 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ +int repeat_key; /* keycode to repeat */ } PS2KbdState; typedef struct { @@ -146,6 +149,22 @@ void ps2_queue(void *opaque, int b) s-update_irq(s-update_arg, 1); } +static QEMUTimer *repeat_timer; +static bool auto_repeat; + +static void repeat_ps2_queue(void *opaque) +{ +PS2KbdState *s = opaque; + +if (auto_repeat) { +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_period, + 1000)); +ps2_queue(s-common, s-repeat_key); +} +} + + /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; Does this check allow to distinguish the difference between auto-repeat and actual repeated entry by the user? ps2_queue(s-common, keycode); + +if (auto_repeat) { +s-repeat_key = keycode; +/* delay a while before first repeat */ +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_delay, 1000)); +} } uint32_t ps2_read_data(void *opaque) @@ -213,6 +242,11 @@ static void ps2_reset_keyboard(PS2KbdState *s) void ps2_write_keyboard(void *opaque, int val) { +/* repeat period/delay table from kernel (drivers/input/keyboard/atkbd.c) */ +const short period[32] = { 33, 37, 42, 46, 50, 54, 58, 63, 67, 75, +83, 92, 100, 109, 116, 125, 133, 149, 167, 182, 200, 217, 232, +250, 270, 303, 333, 370, 400, 435, 470, 500 }; +const short delay[4] = { 250, 500, 750, 1000 }; PS2KbdState *s = (PS2KbdState *)opaque; switch(s-common.write_cmd) { @@ -288,6 +322,11 @@ void ps2_write_keyboard(void *opaque, int val) s-common.write_cmd = -1; break; case KBD_CMD_SET_RATE: + /* Bit0-4 specifies the repeat rate */ +s-repeat_period = period[val 0x1f]; + /* Bit5-6 bit specifies the delay time */ +s-repeat_delay = delay[(val 0x60) 5]; + ps2_queue(s-common, KBD_REPLY_ACK); s-common.write_cmd = -1; break; @@ -536,6 +575,8 @@ static void ps2_kbd_reset(void *opaque) s-scan_enabled = 0; s-translate = 0; s-scancode_set = 0; +s-repeat_period = 92; +s-repeat_delay = 500; } static void ps2_mouse_reset(void *opaque) @@ -660,6 +701,7 @@ void *ps2_kbd_init(void (*update_irq)(void *, int), void *update_arg) vmstate_register(NULL, 0, vmstate_ps2_keyboard, s); qemu_add_kbd_event_handler(ps2_put_keycode, s); qemu_register_reset(ps2_kbd_reset, s); +repeat_timer = qemu_new_timer_ns(vm_clock, repeat_ps2_queue, s); return s; } -- Lei
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On Thu, May 16, 2013 at 03:13:10PM +0800, li guang wrote: 在 2013-05-16四的 14:58 +0800,Amos Kong写道: On Thu, May 16, 2013 at 01:30:28PM +0800, li guang wrote: 在 2013-05-16四的 12:30 +0800,Amos Kong写道: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com Hi, Li guang theoretically, we have to check if the typematic key is in break now, if so, we will not do repeat anymore. You mean key is released? I checked by '~keycode 0x80', stop repeat when key is release. don't you think we have a chance that new key can come in during waiting? When the new key (press) comes, repeat_timer will be modified by qemu_mod_timer(), original repate will be end. +if (auto_repeat) { +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_period, + 1000)); +ps2_queue(s-common, s-repeat_key); +} +} + + /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; ^^^ case: 1. new key pressed 2. enter ps2_put_keycode 3. previous timer timeout I guess it's repeat_timer 4. enter repeat_ps2_queue 5. put previous keycode in queue 6. back to ps2_put_keycode back? repeat_ps_queue() is called in background. 7. check auto_repeat so, an obsolete key comes up. can timer interrupt ps2_put_keycode? no. -- Amos.
Re: [Qemu-devel] [PATCH v3 2/8] block: add basic backup support to block driver
On Thu, May 16, 2013 at 11:27:38AM +0800, Wenchao Xia wrote: +/* See if in-flight requests overlap and wait for them to complete */ +static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job, + int64_t start, + int64_t end) +{ +CowRequest *req; +bool retry; + +do { +retry = false; +QLIST_FOREACH(req, job-inflight_reqs, list) { +if (end req-start start req-end) { +qemu_co_queue_wait(req-wait_queue); +retry = true; +break; +} +} +} while (retry); +} + In my understanding, there will be possible two program routines entering here at same time since it holds read lock instead of write lock, and they may also modify job-inflight_reqs, is it possible a race condition here? I am not sure whether back-ground job will becomes a thread. No, all operations on a BlockDriverState execute in the same event loop thread. Only coroutine synchronization is necessary, which is provided in these patches.
Re: [Qemu-devel] [PATCH v3 3/8] block: add drive-backup QMP command
On Wed, May 15, 2013 at 01:04:55PM -0600, Eric Blake wrote: On 05/15/2013 08:34 AM, Stefan Hajnoczi wrote: @drive-backup Start a point-in-time copy of a block device to a new destination. The status of ongoing drive-backup operations can be checked with query-block-jobs where the BlockJobInfo.type field has the value 'backup'. The operation can be stopped before it has completed using the block-job-cancel command. +++ b/qmp-commands.hx @@ -912,6 +912,12 @@ EQMP }, { +.name = drive-backup, +.args_type = device:B,target:s,speed:i?,mode:s?,format:s?, +.mhandler.cmd_new = qmp_marshal_input_drive_backup, +}, No example? But I'm not going to block this patch because of that. I'll add an example in v4.
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On Thu, May 16, 2013 at 03:23:21PM +0800, Lei Li wrote: On 05/16/2013 12:30 PM, Amos Kong wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; Hi Lei, Does this check allow to distinguish the difference between auto-repeat and actual repeated entry by the user? Actual repeat by user: press event release event press event release event press event release event Auto-repeat example: press event press event press event release event so here we check if it's a press event, only set repeat_timer for press event. When we get release event, we just stop repeat action. ps2_queue(s-common, keycode); + +if (auto_repeat) { +s-repeat_key = keycode; +/* delay a while before first repeat */ +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_delay, 1000)); +} } -- Amos.
Re: [Qemu-devel] [PATCH v3 6/8] blockdev: add DriveBackup transaction
On Wed, May 15, 2013 at 01:13:26PM -0600, Eric Blake wrote: On 05/15/2013 08:34 AM, Stefan Hajnoczi wrote: This patch adds a transactional version of the drive-backup QMP command. It allows atomic snapshots of multiple drives along with automatic cleanup if there is a failure to start one of the backup jobs. Note that QMP events are emitted for block job completion/cancellation and the block job will be listed by query-block-jobs. + +static void drive_backup_abort(BlkTransactionState *common) +{ +DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common); +BlockDriverState *bs = state-bs; + +/* Only cancel if it's the job we started */ +if (bs bs-job bs-job == state-job) { +block_job_cancel_sync(bs-job); +} Question - if starting the job created the target file, should aborting the job unlink() that file so that we aren't polluting the file system? blockdev-snapshot-sync action does not delete the file on abort. +++ b/qapi-schema.json @@ -1609,6 +1609,27 @@ '*mode': 'NewImageMode' } } ## +# @DriveBackup +# +# @device: the name of the device whose writes should be mirrored. +# +# @target: the target of the new image. If the file exists, or if it +# is a device, the existing file/device will be used as the new +# destination. If it does not exist, a new file will be created. +# +# @format: #optional the format of the new destination, default is to +# probe if @mode is 'existing', else the format of the source +# +# @mode: #optional whether and how QEMU should create a new image, default is +#'absolute-paths'. +# +# @speed: #optional the maximum speed, in bytes per second +## Mention Since: 1.6 Will add in v4.
Re: [Qemu-devel] [PATCH v3 0/8] block: drive-backup live backup command
On Thu, May 16, 2013 at 02:16:20PM +0800, Wenchao Xia wrote: After checking the code, I found it possible to add delta data backup support also, If an additional dirty bitmap was added. I've been thinking about this. Incremental backups need to know which blocks have changed, but keeping a persistent dirty bitmap is expensive and unnecessary. Backup applications need to support the full backup case anyway for their first run. Therefore we can keep a best-effort dirty bitmap which is persisted only when the guest is terminated cleanly. If the QEMU process crashes then the on-disk dirty bitmap will be invalid and the backup application needs to do a full backup next time. The advantage of this approach is that we don't need to fdatasync(2) before every guest write operation. Compared with current solution, I think it is doing COW at qemu device level: qemu device | general block layer | virtual format layer | --- | | qcow2 vmdk This will make things complicated when more works comes, a better place for block COW, is under general block layer. Maybe later we can adjust block for it. I don't consider block jobs to be qemu device layer. It sounds like you think the code should be in bdrv_co_do_writev()? The drive-backup operation doesn't really affect the source BlockDriverState, it just needs to intercept writes. Therefore it seems cleaner for the code to live separately (plus we reuse the code for the block job loop which copies out data while the guest is running). Otherwise we would squash all of the blockjob code into block.c and it would be an even bigger mess than it is today :-).
Re: [Qemu-devel] [PATCH v3 7/8] blockdev: add Abort transaction
On Wed, May 15, 2013 at 01:01:15PM -0600, Eric Blake wrote: On 05/15/2013 08:34 AM, Stefan Hajnoczi wrote: +++ b/qapi-schema.json @@ -1630,6 +1630,14 @@ '*mode': 'NewImageMode', '*speed': 'int' } } ## +# @Abort +# +# This action can be used to test transaction failure. +### +{ 'type': 'Abort', + 'data': { } } + Probably should add a Since: 1.6 notation. With that, Will add in v4.
Re: [Qemu-devel] [PATCH v3 1/8] block: add bdrv_add_before_write_cb()
On Wed, May 15, 2013 at 04:42:57PM +0200, Paolo Bonzini wrote: Il 15/05/2013 16:34, Stefan Hajnoczi ha scritto: The bdrv_add_before_write_cb() function installs a callback that is invoked before a write request is processed. This will be used to implement copy-on-write point-in-time snapshots where we need to copy out old data before overwriting it. Perhaps a notifier list that receives the BdrvTrackedRequest? (BTW we should probably remove all the notifier_remove wrappers, they're useless). Nice idea, done in v4. I originally rejected NotifierList because it only has a void * argument, but BdrvRequest has the information we need.
Re: [Qemu-devel] tap devices not receiving packets from a bridge
Hi, On Thu, 2013-05-16 at 09:27 +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 09:24:05AM +0300, Michael S. Tsirkin wrote: Is this with or without vhost-net in host? never mind, I see it's without. Try to enable vhost-net (you'll have to switch to -netdev syntax for that to work) and see if this help. If it does it's likely a qemu bug if not probably a guest bug. Switching to -netdev is non-trivial for me, unfortunately. Anyway, it's definitely a qemu bug - it happens on kernels 3.2 and 3.9 with 1.4.1, but doesn't happen with qemu 0.15.0 or 1.5.0rc1. I'll have a dig through git to see if I can identify the patch that resolves it. It feels-like qemu sometimes stops reading from the tap file descriptor between ipxe exiting and the linux kernel bringing up the network interface, and never recovers from that. /Nick
[Qemu-devel] QMP interface for drive-add (or even blockdev-add)
[ CCed qemu-devel, Stefan and Luiz ] Am 15.05.2013 um 19:43 hat Eric Blake geschrieben: On 05/15/2013 09:58 AM, Kevin Wolf wrote: Not really. Basically what I have in mind is the obvious mapping of the command line arguments (using the downstream command here because upstream doesn't even have a QMP drive_add yet): -drive format=qcow2,id=some_disk,file.driver=nbd,\ file.host=localhost,lazy_refcounts=on would become something like: { execute: __com.redhat_drive_add, arguments: { format: qcow2, id: some_disk, file: { driver: nbd, host: localhost } lazy_refcounts: true, } } Yeah, that might work. More interesting, like you say, would be learning how to know what options are available, and figuring out how to write that into a maintainable JSON spec. Actually, there is one important point to note in this proposal: It's that driver-specific and generic options are mixed in the same namespace. This allows a conversion of one option after another and the API doesn't change after each step. On the other hand, they behave somewhat different (only converted options can be overridden for backing files), so maybe we actually have to switch everything within one release. I'm not sure how well that works with the existing implementation of QAPI etc. but it's what I think makes the most sense from a protocol level perspective. And that should take precedence. By the way, if you have any ideas about how to make driver-specific options discoverable for libvirt, I'd be glad to hear them. Should it be a QMP query command that returns a list of all drivers and what options they support? Or can we somehow handle it with (dynamic) schema introspection? Or would a command line option similar to '-device help' be more helpful than a monitor command because you need to know the valid values before starting qemu? Introspection would work, although having a dedicated command might make the information easier to get at. Libvirt would prefer to get information through QMP, not a command-line option. But don't worry about timing - when libvirt starts, we spawn a dummy qemu just to do QMP queries of everything we need to know (and cache those results), so that when we finally spawn qemu command lines for real, we've already learned from the earlier QMP calls what is available on the command line. Okay, that's already a very helpful information to have. I believe exposing things over QMP is more convenient for qemu as well. Also, the recent addition of query-command-line-options might be extensible - right now, the QAPI type CommandLineParameterType is hard-coded to one of four basic types; but we could extend it to list a fifth type, maybe named 'qapi', and where we can then expose an additional argument stating what QAPI type to introspect for further details. That is, something like: -{execute:query-command-line-options, arguments:{option:drive} } -{return:[ {option:drive, parameters:[ {name:file, type:qapi, qapi:BlockFile}, {name:backing, type:qapi, BlockBackingFile} ]}]} Here you're assuming a different layout than I outlined above. I guess we'd have to directly make parameters the union... -{execute:query-qapi-type, arguments:{type:BlockFile} } -{return:{union:BlockFile, data:{ nbd:BlockFileNBD, qcow2:BlockFileQcow2} } -{execute:query-qapi-type, arguments:{type:BlockFileNBD} } -{return:{type,BlockFileNBD, data:{ port:int, host:str, ... } } which would be enough to know that if I want to open an NBD disk, I can use -drive file=nbd:,file.port=1234,file.host=::1, because the 'file' option to -drive has a union type that I can introspect to learn what additional file.suboptions I can specify when using a given arm of the union. ...but I agree that using a union here may be workable. However, it brings us back to the question asked in a different thread: How to expose which drivers are actually available? I still think we need a dynamic schema for introspection. (But how should doing this dynmically work if QAPI maps this to a C union?) Another thing that we'll probably want from QAPI is some kind of inheritance, like this: { type: BlockFileBase, data: { driver: str, cache: CacheEnum, ... } { type: BlockFileNBD, extends: BlockFileBase, data: { port: *int, host: str, ... }} In working all of this out upstream, we may decide to tweak the command line syntax (after all, how do I know that file.driver controls which arm of the union to take, and therefore which other file.FOO sub-options are available?), so I'm glad we disabled sub-maps in qemu 1.5 until we can match the QMP exposure of submaps. Okay, let's take a step back here. The idea was more or less that you can specify each BlockDriverState by itself in the end, like this:
Re: [Qemu-devel] [PATCH 08/12] qemu-thread: report RCU quiescent states
On Wed, May 15, 2013 at 11:48 PM, Paolo Bonzini pbonz...@redhat.com wrote: Most threads will use mutexes and other sleeping synchronization primitives (condition variables, semaphores, events) periodically. For these threads, the synchronization primitives are natural places to report a quiescent state (possibly an extended one). Signed-off-by: Paolo Bonzini pbonz...@redhat.com --- docs/rcu.txt | 28 util/qemu-thread-posix.c | 30 ++ util/qemu-thread-win32.c | 16 +++- util/rcu.c | 3 --- 4 files changed, 69 insertions(+), 8 deletions(-) diff --git a/docs/rcu.txt b/docs/rcu.txt index 4e7cde3..d249ebf 100644 --- a/docs/rcu.txt +++ b/docs/rcu.txt @@ -168,6 +168,34 @@ of quiescent states, i.e. points where no RCU read-side critical section can be active. All threads created with qemu_thread_create participate in the RCU mechanism and need to annotate such points. +Luckily, in most cases no manual annotation is needed, because waiting +on condition variables (qemu_cond_wait), semaphores (qemu_sem_wait, +qemu_sem_timedwait) or events (qemu_event_wait) implicitly marks the thread +as quiescent for the whole duration of the wait. (There is an exception +for semaphore waits with a zero timeout). + Why not the same rule for zero timeout? +Manual annotation is still needed in the following cases: + +- threads that spend their sleeping time in the kernel, for example + in a call to select(), poll() or WaitForMultipleObjects(). The QEMU + I/O thread is an example of this case. + +- threads that perform a lot of I/O. In QEMU, the workers used for + aio=thread are an example of this case (see aio_worker in block/raw-*). + +- threads that run continuously until they exit. The migration thread + is an example of this case. + +Regarding the second case, note that the workers run in the QEMU thread +pool. The thread pool uses semaphores for synchronization, hence it does +report quiescent states periodically. However, in some cases (e.g. NFS +mounted with the hard option) the workers can take an arbitrarily long +amount of time. When this happens, synchronize_rcu() will not exit and +call_rcu() callbacks will be delayed arbitrarily. It is therefore a +good idea to mark I/O system calls as quiescence points in the worker +functions. + + Marking quiescent states is done with the following three APIs: void rcu_quiescent_state(void); diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c index 2df3382..f1f325a 100644 --- a/util/qemu-thread-posix.c +++ b/util/qemu-thread-posix.c @@ -119,7 +119,9 @@ void qemu_cond_wait(QemuCond *cond, QemuMutex *mutex) { int err; +rcu_thread_offline(); err = pthread_cond_wait(cond-cond, mutex-lock); +rcu_thread_online(); if (err) error_exit(err, __func__); } @@ -212,6 +214,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms) int rc; struct timespec ts; +if (ms) { +rcu_thread_offline(); +} + #if defined(__APPLE__) || defined(__NetBSD__) compute_abs_deadline(ts, ms); pthread_mutex_lock(sem-lock); @@ -227,7 +233,10 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms) } } pthread_mutex_unlock(sem-lock); -return (rc == ETIMEDOUT ? -1 : 0); +if (rc == ETIMEDOUT) { +rc == -1; +} + #else if (ms = 0) { /* This is cheaper than sem_timedwait. */ @@ -235,7 +244,7 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms) rc = sem_trywait(sem-sem); } while (rc == -1 errno == EINTR); if (rc == -1 errno == EAGAIN) { -return -1; +goto out; } } else { compute_abs_deadline(ts, ms); @@ -243,18 +252,25 @@ int qemu_sem_timedwait(QemuSemaphore *sem, int ms) rc = sem_timedwait(sem-sem, ts); } while (rc == -1 errno == EINTR); if (rc == -1 errno == ETIMEDOUT) { -return -1; +goto out; } } if (rc 0) { error_exit(errno, __func__); } -return 0; #endif + +out: +if (ms) { +rcu_thread_offline(); s/offline/online/ Regards, Pingfan +} +return rc; } void qemu_sem_wait(QemuSemaphore *sem) { +rcu_thread_offline(); + #if defined(__APPLE__) || defined(__NetBSD__) pthread_mutex_lock(sem-lock); --sem-count; @@ -272,6 +288,8 @@ void qemu_sem_wait(QemuSemaphore *sem) error_exit(errno, __func__); } #endif + +rcu_thread_online(); } #ifdef __linux__ @@ -380,7 +398,11 @@ void qemu_event_wait(QemuEvent *ev) return; } } +rcu_thread_offline(); futex_wait(ev, EV_BUSY); +rcu_thread_online(); +} else { +rcu_quiescent_state(); } } diff
[Qemu-devel] [PATCH v4 0/8] block: drive-backup live backup command
Note: These patches apply to my block-next tree. You can also grab the code from git here: git://github.com/stefanha/qemu.git block-backup-core This series adds a new QMP command, drive-backup, which takes a point-in-time snapshot of a block device. The snapshot is copied out to a target block device. A simple example is: drive-backup device=virtio0 format=qcow2 target=backup-20130401.qcow2 The original drive-backup blockjob was written by Dietmar Maurer diet...@proxmox.com. He is currently busy but I feel the feature is worth pushing into QEMU since there has been interest. This is my version of his patch, plus the QMP command and qemu-iotests test case. QMP 'transaction' support is included since v3. It adds support for atomic snapshots of multiple block devices. I also added an 'abort' transaction to allow testing of the .abort()/.cleanup() code path. Thanks to Wenchao for making qmp_transaction() extensible. How is this different from block-stream and drive-mirror? - Both block-stream and drive-mirror do not provide immediate point-in-time snapshots. Instead they copy data into a new file and then switch to it. In other words, the point at which the snapshot is taken cannot be controlled directly. drive-backup intercepts guest writes and saves data into the target block device before it is overwritten. The target block device can be a raw image file, backing files are not used to implement this feature. How can drive-backup be used? - The simplest use-case is to copy a point-in-time snapshot to a local file. More advanced users may wish to make the target an NBD URL. The NBD server listening on the other side can process the backup writes any way it wishes. I previously posted an RFC series with a backup server that streamed Dietmar's VMA backup archive format. What's next for drive-backup? - 1. Sync modes like drive-mirror (top, full, none). This makes it possible to preserve the backing file chain. v4: * Use notifier lists and BdrvTrackedRequest instead of custom callbacks [bonzini] * Add drive-backup QMP example JSON [eblake] * Add Since: 1.6 to QMP schema changes [eblake] v3: * Rename to drive-backup for consistency with drive-mirror [kwolf] * Add QMP transaction support [kwolf] * Introduce bdrv_add_before_write_cb() to hook writes * Mention 'query-block-jobs' lists job of type 'backup' [eblake] * Rename rwlock to flush_rwlock [kwolf] * Fix space in block/backup.c comment [kwolf] v2: * s/block_backup/block-backup/ in commit message [eblake] * Avoid funny spacing in QMP docs [eblake] * Document query-block-jobs and block-job-cancel usage [eblake] Dietmar Maurer (1): block: add basic backup support to block driver Stefan Hajnoczi (7): block: add bdrv_add_before_write_notifier() block: add drive-backup QMP command qemu-iotests: add 055 drive-backup test case blockdev: rename BlkTransactionStates to singular blockdev: add DriveBackup transaction blockdev: add Abort transaction qemu-iotests: test 'drive-backup' transaction in 055 block.c| 18 ++- block/Makefile.objs| 1 + block/backup.c | 283 blockdev.c | 264 +++--- include/block/block_int.h | 38 - qapi-schema.json | 69 - qmp-commands.hx| 37 + tests/qemu-iotests/055 | 348 + tests/qemu-iotests/055.out | 5 + tests/qemu-iotests/group | 1 + 10 files changed, 1000 insertions(+), 64 deletions(-) create mode 100644 block/backup.c create mode 100755 tests/qemu-iotests/055 create mode 100644 tests/qemu-iotests/055.out -- 1.8.1.4
[Qemu-devel] [PATCH v4 3/8] block: add drive-backup QMP command
@drive-backup Start a point-in-time copy of a block device to a new destination. The status of ongoing drive-backup operations can be checked with query-block-jobs where the BlockJobInfo.type field has the value 'backup'. The operation can be stopped before it has completed using the block-job-cancel command. @device: the name of the device whose writes should be mirrored. @target: the target of the new image. If the file exists, or if it is a device, the existing file/device will be used as the new destination. If it does not exist, a new file will be created. @format: #optional the format of the new destination, default is to probe if @mode is 'existing', else the format of the source @mode: #optional whether and how QEMU should create a new image, default is 'absolute-paths'. @speed: #optional the maximum speed, in bytes per second Returns: nothing on success If @device is not a valid block device, DeviceNotFound Since 1.6 Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 92 qapi-schema.json | 32 qmp-commands.hx | 37 +++ 3 files changed, 161 insertions(+) diff --git a/blockdev.c b/blockdev.c index d1ec99a..c7a5e1e 100644 --- a/blockdev.c +++ b/blockdev.c @@ -1353,6 +1353,98 @@ void qmp_block_commit(const char *device, drive_get_ref(drive_get_by_blockdev(bs)); } +void qmp_drive_backup(const char *device, const char *target, + bool has_format, const char *format, + bool has_mode, enum NewImageMode mode, + bool has_speed, int64_t speed, + Error **errp) +{ +BlockDriverState *bs; +BlockDriverState *target_bs; +BlockDriver *proto_drv; +BlockDriver *drv = NULL; +Error *local_err = NULL; +int flags; +uint64_t size; +int ret; + +if (!has_speed) { +speed = 0; +} +if (!has_mode) { +mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS; +} + +bs = bdrv_find(device); +if (!bs) { +error_set(errp, QERR_DEVICE_NOT_FOUND, device); +return; +} + +if (!bdrv_is_inserted(bs)) { +error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); +return; +} + +if (!has_format) { +format = mode == NEW_IMAGE_MODE_EXISTING ? NULL : bs-drv-format_name; +} +if (format) { +drv = bdrv_find_format(format); +if (!drv) { +error_set(errp, QERR_INVALID_BLOCK_FORMAT, format); +return; +} +} + +if (bdrv_in_use(bs)) { +error_set(errp, QERR_DEVICE_IN_USE, device); +return; +} + +flags = bs-open_flags | BDRV_O_RDWR; + +proto_drv = bdrv_find_protocol(target); +if (!proto_drv) { +error_set(errp, QERR_INVALID_BLOCK_FORMAT, format); +return; +} + +bdrv_get_geometry(bs, size); +size *= 512; +if (mode != NEW_IMAGE_MODE_EXISTING) { +assert(format drv); +bdrv_img_create(target, format, +NULL, NULL, NULL, size, flags, local_err, false); +} + +if (error_is_set(local_err)) { +error_propagate(errp, local_err); +return; +} + +target_bs = bdrv_new(); +ret = bdrv_open(target_bs, target, NULL, flags, drv); + +if (ret 0) { +bdrv_delete(target_bs); +error_set(errp, QERR_OPEN_FILE_FAILED, target); +return; +} + +backup_start(bs, target_bs, speed, block_job_cb, bs, local_err); +if (local_err != NULL) { +bdrv_delete(target_bs); +error_propagate(errp, local_err); +return; +} + +/* Grab a reference so hotplug does not delete the BlockDriverState from + * underneath us. + */ +drive_get_ref(drive_get_by_blockdev(bs)); +} + #define DEFAULT_MIRROR_BUF_SIZE (10 20) void qmp_drive_mirror(const char *device, const char *target, diff --git a/qapi-schema.json b/qapi-schema.json index 98cd81c..e716522 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1730,6 +1730,38 @@ '*speed': 'int' } } ## +# @drive-backup +# +# Start a point-in-time copy of a block device to a new destination. The +# status of ongoing drive-backup operations can be checked with +# query-block-jobs where the BlockJobInfo.type field has the value 'backup'. +# The operation can be stopped before it has completed using the +# block-job-cancel command. +# +# @device: the name of the device whose writes should be mirrored. +# +# @target: the target of the new image. If the file exists, or if it +# is a device, the existing file/device will be used as the new +# destination. If it does not exist, a new file will be created. +# +# @format: #optional the format of the new destination, default is to +# probe if @mode is 'existing', else the format of the source +# +# @mode: #optional whether and how QEMU
Re: [Qemu-devel] Para-Virtualized Clock Usage
On Wed, May 15, 2013 at 08:40:25PM +, Joji Mekkattuparamban (joji) wrote: Hi Gleb, On a related note, does Qemu have an option to emulate RDTSC? QEMU or KVM? With KVM guest executes RDTSC directly on CPU. Thanks, Joji. -Original Message- From: Gleb Natapov [mailto:g...@redhat.com] Sent: Wednesday, April 24, 2013 11:57 PM To: Joji Mekkattuparamban (joji) Cc: Marcelo Tosatti; qemu-devel@nongnu.org; k...@vger.kernel.org Subject: Re: [Qemu-devel] Para-Virtualized Clock Usage On Thu, Apr 25, 2013 at 12:28:35AM +, Joji Mekkattuparamban (joji) wrote: Thank you Gleb and Marcelo. I will migrate the API using gettimeofday. Is there any dependency on the QEMU or the Guest? If the host supports pvclock and the guest invokes gettimeofday, would the pvclock be automatically used? Or do I require a patch in either the Qemu or the guest kernel? Guest and host kernel should be at least 3.8. IIRC there is not QEMU version dependency. Thanks! Joji. -Original Message- From: Marcelo Tosatti [mailto:mtosa...@redhat.com] Sent: Wednesday, April 24, 2013 1:28 AM To: Gleb Natapov Cc: Joji Mekkattuparamban (joji); qemu-devel@nongnu.org; k...@vger.kernel.org Subject: Re: [Qemu-devel] Para-Virtualized Clock Usage On Tue, Apr 23, 2013 at 08:52:16AM +0300, Gleb Natapov wrote: On Mon, Apr 22, 2013 at 04:58:01PM +, Joji Mekkattuparamban (joji) wrote: Greetings, I have a SMP guest application, running on the 2.6.27 Linux kernel. The application, originally written for bare metal, makes extensive use of the TSC, by directly invoking rdtsc from the user space for timestamp purposes. While running on KVM (RHEL version 6.3), we are running into TSC issues on some hardware. As a solution, I am considering migrating to the pvclock. I am wondering if there is an example for migrating from TSC to the pvclock. Any pointers? Wrong list, you should ask KVM (copied). Recent kernels have pvclock vdso support which means that gettimeofday() uses it without entering the kernel. Marcelo? -- Gleb. Converting application to make use of gettimeofday() should be the best way to make use of pvclock, yes. -- Gleb. -- To unsubscribe from this list: send the line unsubscribe kvm in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- Gleb.
[Qemu-devel] [PATCH v4 7/8] blockdev: add Abort transaction
The Abort action can be used to test QMP 'transaction' failure. Add it as the last action to exercise the .abort() and .cleanup() code paths for all previous actions. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 15 +++ qapi-schema.json | 13 - 2 files changed, 27 insertions(+), 1 deletion(-) diff --git a/blockdev.c b/blockdev.c index c386bb6..fcce219 100644 --- a/blockdev.c +++ b/blockdev.c @@ -973,6 +973,16 @@ static void drive_backup_abort(BlkTransactionState *common) } } +static void abort_prepare(BlkTransactionState *common, Error **errp) +{ +error_setg(errp, Transaction aborted using Abort action); +} + +static void abort_commit(BlkTransactionState *common) +{ +assert(false); /* this action never succeeds */ +} + static const BdrvActionOps actions[] = { [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = { .instance_size = sizeof(ExternalSnapshotState), @@ -986,6 +996,11 @@ static const BdrvActionOps actions[] = { .commit = drive_backup_commit, .abort = drive_backup_abort, }, +[TRANSACTION_ACTION_KIND_ABORT] = { +.instance_size = sizeof(BlkTransactionState), +.prepare = abort_prepare, +.commit = abort_commit, +}, }; /* diff --git a/qapi-schema.json b/qapi-schema.json index 114ae50..ac7bb0b 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1632,6 +1632,16 @@ '*mode': 'NewImageMode', '*speed': 'int' } } ## +# @Abort +# +# This action can be used to test transaction failure. +# +# Since: 1.6 +### +{ 'type': 'Abort', + 'data': { } } + +## # @TransactionAction # # A discriminated record of operations that can be performed with @@ -1640,7 +1650,8 @@ { 'union': 'TransactionAction', 'data': { 'blockdev-snapshot-sync': 'BlockdevSnapshot', - 'drive-backup': 'DriveBackup' + 'drive-backup': 'DriveBackup', + 'abort': 'Abort' } } ## -- 1.8.1.4
[Qemu-devel] [PATCH v4 2/8] block: add basic backup support to block driver
From: Dietmar Maurer diet...@proxmox.com backup_start() creates a block job that copies a point-in-time snapshot of a block device to a target block device. We call backup_do_cow() for each write during backup. That function reads the original data from the block device before it gets overwritten. The data is then written to the target device. Currently backup cluster size is hardcoded to 65536 bytes. [I made a number of changes to Dietmar's original patch and folded them in to make code review easy. Here is the full list: * Drop BackupDumpFunc interface in favor of a target block device * Detect zero clusters with buffer_is_zero() * Don't write zero clusters to the target * Use 0 delay instead of 1us, like other block jobs * Unify creation/start functions into backup_start() * Simplify cleanup, free bitmap in backup_run() instead of cb * function * Use HBitmap to avoid duplicating bitmap code * Use bdrv_getlength() instead of accessing -total_sectors * directly * Delete the backup.h header file, it is no longer necessary * Move ./backup.c to block/backup.c * Remove #ifdefed out code * Coding style and whitespace cleanups * Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks * Keep our own in-flight CowRequest list instead of using block.c tracked requests. This means a little code duplication but is much simpler than trying to share the tracked requests list and use the backup block size. -- stefanha] Signed-off-by: Dietmar Maurer diet...@proxmox.com Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block/Makefile.objs | 1 + block/backup.c| 283 ++ include/block/block_int.h | 16 +++ 3 files changed, 300 insertions(+) create mode 100644 block/backup.c diff --git a/block/Makefile.objs b/block/Makefile.objs index 5f0358a..88bd101 100644 --- a/block/Makefile.objs +++ b/block/Makefile.objs @@ -20,5 +20,6 @@ endif common-obj-y += stream.o common-obj-y += commit.o common-obj-y += mirror.o +common-obj-y += backup.o $(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS) diff --git a/block/backup.c b/block/backup.c new file mode 100644 index 000..5438e26 --- /dev/null +++ b/block/backup.c @@ -0,0 +1,283 @@ +/* + * QEMU backup + * + * Copyright (C) 2013 Proxmox Server Solutions + * + * Authors: + * Dietmar Maurer (diet...@proxmox.com) + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include stdio.h +#include errno.h +#include unistd.h + +#include block/block.h +#include block/block_int.h +#include block/blockjob.h +#include qemu/ratelimit.h + +#define DEBUG_BACKUP 0 + +#define DPRINTF(fmt, ...) \ +do { \ +if (DEBUG_BACKUP) { \ +fprintf(stderr, backup: fmt, ## __VA_ARGS__); \ +} \ +} while (0) + +#define BACKUP_CLUSTER_BITS 16 +#define BACKUP_CLUSTER_SIZE (1 BACKUP_CLUSTER_BITS) +#define BACKUP_BLOCKS_PER_CLUSTER (BACKUP_CLUSTER_SIZE / BDRV_SECTOR_SIZE) + +#define SLICE_TIME 1ULL /* ns */ + +typedef struct CowRequest { +int64_t start; +int64_t end; +QLIST_ENTRY(CowRequest) list; +CoQueue wait_queue; /* coroutines blocked on this request */ +} CowRequest; + +typedef struct BackupBlockJob { +BlockJob common; +BlockDriverState *target; +RateLimit limit; +CoRwlock flush_rwlock; +uint64_t sectors_read; +HBitmap *bitmap; +QLIST_HEAD(, CowRequest) inflight_reqs; +} BackupBlockJob; + +/* See if in-flight requests overlap and wait for them to complete */ +static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job, + int64_t start, + int64_t end) +{ +CowRequest *req; +bool retry; + +do { +retry = false; +QLIST_FOREACH(req, job-inflight_reqs, list) { +if (end req-start start req-end) { +qemu_co_queue_wait(req-wait_queue); +retry = true; +break; +} +} +} while (retry); +} + +/* Keep track of an in-flight request */ +static void cow_request_begin(CowRequest *req, BackupBlockJob *job, + int64_t start, int64_t end) +{ +req-start = start; +req-end = end; +qemu_co_queue_init(req-wait_queue); +QLIST_INSERT_HEAD(job-inflight_reqs, req, list); +} + +/* Forget about a completed request */ +static void cow_request_end(CowRequest *req) +{ +QLIST_REMOVE(req, list); +qemu_co_queue_restart_all(req-wait_queue); +} + +static int coroutine_fn backup_do_cow(BlockDriverState *bs, + int64_t sector_num, int nb_sectors) +{ +BackupBlockJob *job = (BackupBlockJob *)bs-job; +CowRequest cow_request; +struct iovec iov; +QEMUIOVector bounce_qiov; +void *bounce_buffer = NULL; +int ret = 0; +
[Qemu-devel] [PATCH v4 5/8] blockdev: rename BlkTransactionStates to singular
The QMP 'transaction' command keeps a list of in-flight transactions. The transaction state structure is called BlkTransactionStates even though it only deals with a single transaction. The only plural thing is the linked list of transaction states. I find it confusing to call the single structure States. This patch renames it to State, just like BlockDriverState is singular. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 104 ++--- 1 file changed, 52 insertions(+), 52 deletions(-) diff --git a/blockdev.c b/blockdev.c index c7a5e1e..b6109da 100644 --- a/blockdev.c +++ b/blockdev.c @@ -780,7 +780,7 @@ void qmp_blockdev_snapshot_sync(const char *device, const char *snapshot_file, /* New and old BlockDriverState structs for group snapshots */ -typedef struct BlkTransactionStates BlkTransactionStates; +typedef struct BlkTransactionState BlkTransactionState; /* Only prepare() may fail. In a single transaction, only one of commit() or abort() will be called, clean() will always be called if it present. */ @@ -788,13 +788,13 @@ typedef struct BdrvActionOps { /* Size of state struct, in bytes. */ size_t instance_size; /* Prepare the work, must NOT be NULL. */ -void (*prepare)(BlkTransactionStates *common, Error **errp); +void (*prepare)(BlkTransactionState *common, Error **errp); /* Commit the changes, must NOT be NULL. */ -void (*commit)(BlkTransactionStates *common); +void (*commit)(BlkTransactionState *common); /* Abort the changes on fail, can be NULL. */ -void (*abort)(BlkTransactionStates *common); +void (*abort)(BlkTransactionState *common); /* Clean up resource in the end, can be NULL. */ -void (*clean)(BlkTransactionStates *common); +void (*clean)(BlkTransactionState *common); } BdrvActionOps; /* @@ -802,20 +802,20 @@ typedef struct BdrvActionOps { * that compiler will also arrange it to the same address with parent instance. * Later it will be used in free(). */ -struct BlkTransactionStates { +struct BlkTransactionState { TransactionAction *action; const BdrvActionOps *ops; -QSIMPLEQ_ENTRY(BlkTransactionStates) entry; +QSIMPLEQ_ENTRY(BlkTransactionState) entry; }; /* external snapshot private data */ -typedef struct ExternalSnapshotStates { -BlkTransactionStates common; +typedef struct ExternalSnapshotState { +BlkTransactionState common; BlockDriverState *old_bs; BlockDriverState *new_bs; -} ExternalSnapshotStates; +} ExternalSnapshotState; -static void external_snapshot_prepare(BlkTransactionStates *common, +static void external_snapshot_prepare(BlkTransactionState *common, Error **errp) { BlockDriver *proto_drv; @@ -826,8 +826,8 @@ static void external_snapshot_prepare(BlkTransactionStates *common, const char *new_image_file; const char *format = qcow2; enum NewImageMode mode = NEW_IMAGE_MODE_ABSOLUTE_PATHS; -ExternalSnapshotStates *states = - DO_UPCAST(ExternalSnapshotStates, common, common); +ExternalSnapshotState *state = + DO_UPCAST(ExternalSnapshotState, common, common); TransactionAction *action = common-action; /* get parameters */ @@ -849,30 +849,30 @@ static void external_snapshot_prepare(BlkTransactionStates *common, return; } -states-old_bs = bdrv_find(device); -if (!states-old_bs) { +state-old_bs = bdrv_find(device); +if (!state-old_bs) { error_set(errp, QERR_DEVICE_NOT_FOUND, device); return; } -if (!bdrv_is_inserted(states-old_bs)) { +if (!bdrv_is_inserted(state-old_bs)) { error_set(errp, QERR_DEVICE_HAS_NO_MEDIUM, device); return; } -if (bdrv_in_use(states-old_bs)) { +if (bdrv_in_use(state-old_bs)) { error_set(errp, QERR_DEVICE_IN_USE, device); return; } -if (!bdrv_is_read_only(states-old_bs)) { -if (bdrv_flush(states-old_bs)) { +if (!bdrv_is_read_only(state-old_bs)) { +if (bdrv_flush(state-old_bs)) { error_set(errp, QERR_IO_ERROR); return; } } -flags = states-old_bs-open_flags; +flags = state-old_bs-open_flags; proto_drv = bdrv_find_protocol(new_image_file); if (!proto_drv) { @@ -883,8 +883,8 @@ static void external_snapshot_prepare(BlkTransactionStates *common, /* create new image w/backing file */ if (mode != NEW_IMAGE_MODE_EXISTING) { bdrv_img_create(new_image_file, format, -states-old_bs-filename, -states-old_bs-drv-format_name, +state-old_bs-filename, +state-old_bs-drv-format_name, NULL, -1, flags, local_err, false); if (error_is_set(local_err)) { error_propagate(errp,
[Qemu-devel] [PATCH v4 4/8] qemu-iotests: add 055 drive-backup test case
Testing drive-backup is similar to image streaming and drive mirroring. This test case is based on 041. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- tests/qemu-iotests/055 | 230 + tests/qemu-iotests/055.out | 5 + tests/qemu-iotests/group | 1 + 3 files changed, 236 insertions(+) create mode 100755 tests/qemu-iotests/055 create mode 100644 tests/qemu-iotests/055.out diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055 new file mode 100755 index 000..bc2eebf --- /dev/null +++ b/tests/qemu-iotests/055 @@ -0,0 +1,230 @@ +#!/usr/bin/env python +# +# Tests for drive-backup +# +# Copyright (C) 2013 Red Hat, Inc. +# +# Based on 041. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program. If not, see http://www.gnu.org/licenses/. +# + +import time +import os +import iotests +from iotests import qemu_img, qemu_io + +test_img = os.path.join(iotests.test_dir, 'test.img') +target_img = os.path.join(iotests.test_dir, 'target.img') + +class DriveBackupTestCase(iotests.QMPTestCase): +'''Abstract base class for drive-backup test cases''' + +def assert_no_active_backups(self): +result = self.vm.qmp('query-block-jobs') +self.assert_qmp(result, 'return', []) + +def cancel_and_wait(self, drive='drive0'): +'''Cancel a block job and wait for it to finish''' +result = self.vm.qmp('block-job-cancel', device=drive) +self.assert_qmp(result, 'return', {}) + +cancelled = False +while not cancelled: +for event in self.vm.get_qmp_events(wait=True): +if event['event'] == 'BLOCK_JOB_COMPLETED' or \ + event['event'] == 'BLOCK_JOB_CANCELLED': +self.assert_qmp(event, 'data/type', 'backup') +self.assert_qmp(event, 'data/device', drive) +cancelled = True + +self.assert_no_active_backups() + +def complete_and_wait(self): +completed = False +while not completed: +for event in self.vm.get_qmp_events(wait=True): +if event['event'] == 'BLOCK_JOB_COMPLETED': +self.assert_qmp(event, 'data/type', 'backup') +self.assert_qmp(event, 'data/device', 'drive0') +self.assert_qmp(event, 'data/offset', self.image_len) +self.assert_qmp(event, 'data/len', self.image_len) +completed = True +self.assert_no_active_backups() + +def compare_images(self, img1, img2): +try: +qemu_img('convert', '-f', iotests.imgfmt, '-O', 'raw', img1, img1 + '.raw') +qemu_img('convert', '-f', iotests.imgfmt, '-O', 'raw', img2, img2 + '.raw') +file1 = open(img1 + '.raw', 'r') +file2 = open(img2 + '.raw', 'r') +return file1.read() == file2.read() +finally: +if file1 is not None: +file1.close() +if file2 is not None: +file2.close() +try: +os.remove(img1 + '.raw') +except OSError: +pass +try: +os.remove(img2 + '.raw') +except OSError: +pass + +class TestSingleDrive(DriveBackupTestCase): +image_len = 120 * 1024 * 1024 # MB + +def setUp(self): +qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSingleDrive.image_len)) +self.vm = iotests.VM().add_drive(test_img) +self.vm.launch() + +def tearDown(self): +self.vm.shutdown() +os.remove(test_img) +try: +os.remove(target_img) +except OSError: +pass + +def test_complete(self): +self.assert_no_active_backups() + +result = self.vm.qmp('drive-backup', device='drive0', + target=target_img) +self.assert_qmp(result, 'return', {}) + +self.complete_and_wait() +self.vm.shutdown() +self.assertTrue(self.compare_images(test_img, target_img), +'target image does not match source after backup') + +def test_cancel(self): +self.assert_no_active_backups() + +result = self.vm.qmp('drive-backup', device='drive0', + target=target_img) +self.assert_qmp(result, 'return', {}) + +self.cancel_and_wait() + +
[Qemu-devel] [PATCH v4 6/8] blockdev: add DriveBackup transaction
This patch adds a transactional version of the drive-backup QMP command. It allows atomic snapshots of multiple drives along with automatic cleanup if there is a failure to start one of the backup jobs. Note that QMP events are emitted for block job completion/cancellation and the block job will be listed by query-block-jobs. @DriveBackup @device: the name of the device whose writes should be mirrored. @target: the target of the new image. If the file exists, or if it is a device, the existing file/device will be used as the new destination. If it does not exist, a new file will be created. @format: #optional the format of the new destination, default is to probe if @mode is 'existing', else the format of the source @mode: #optional whether and how QEMU should create a new image, default is 'absolute-paths'. @speed: #optional the maximum speed, in bytes per second Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- blockdev.c | 53 + qapi-schema.json | 26 +- 2 files changed, 78 insertions(+), 1 deletion(-) diff --git a/blockdev.c b/blockdev.c index b6109da..c386bb6 100644 --- a/blockdev.c +++ b/blockdev.c @@ -926,6 +926,53 @@ static void external_snapshot_abort(BlkTransactionState *common) } } +typedef struct DriveBackupState { +BlkTransactionState common; +BlockDriverState *bs; +BlockJob *job; +} DriveBackupState; + +static void drive_backup_prepare(BlkTransactionState *common, Error **errp) +{ +DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common); +DriveBackup *backup; +Error *local_err = NULL; + +assert(common-action-kind == TRANSACTION_ACTION_KIND_DRIVE_BACKUP); +backup = common-action-drive_backup; + +qmp_drive_backup(backup-device, backup-target, + backup-has_format, backup-format, + backup-has_mode, backup-mode, + backup-has_speed, backup-speed, + local_err); +if (error_is_set(local_err)) { +error_propagate(errp, local_err); +state-bs = NULL; +state-job = NULL; +return; +} + +state-bs = bdrv_find(backup-device); +state-job = state-bs-job; +} + +static void drive_backup_commit(BlkTransactionState *common) +{ +/* Block job has started, nothing to do here */ +} + +static void drive_backup_abort(BlkTransactionState *common) +{ +DriveBackupState *state = DO_UPCAST(DriveBackupState, common, common); +BlockDriverState *bs = state-bs; + +/* Only cancel if it's the job we started */ +if (bs bs-job bs-job == state-job) { +block_job_cancel_sync(bs-job); +} +} + static const BdrvActionOps actions[] = { [TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC] = { .instance_size = sizeof(ExternalSnapshotState), @@ -933,6 +980,12 @@ static const BdrvActionOps actions[] = { .commit = external_snapshot_commit, .abort = external_snapshot_abort, }, +[TRANSACTION_ACTION_KIND_DRIVE_BACKUP] = { +.instance_size = sizeof(DriveBackupState), +.prepare = drive_backup_prepare, +.commit = drive_backup_commit, +.abort = drive_backup_abort, +}, }; /* diff --git a/qapi-schema.json b/qapi-schema.json index e716522..114ae50 100644 --- a/qapi-schema.json +++ b/qapi-schema.json @@ -1609,6 +1609,29 @@ '*mode': 'NewImageMode' } } ## +# @DriveBackup +# +# @device: the name of the device whose writes should be mirrored. +# +# @target: the target of the new image. If the file exists, or if it +# is a device, the existing file/device will be used as the new +# destination. If it does not exist, a new file will be created. +# +# @format: #optional the format of the new destination, default is to +# probe if @mode is 'existing', else the format of the source +# +# @mode: #optional whether and how QEMU should create a new image, default is +#'absolute-paths'. +# +# @speed: #optional the maximum speed, in bytes per second +# +# Since: 1.6 +## +{ 'type': 'DriveBackup', + 'data': { 'device': 'str', 'target': 'str', '*format': 'str', +'*mode': 'NewImageMode', '*speed': 'int' } } + +## # @TransactionAction # # A discriminated record of operations that can be performed with @@ -1616,7 +1639,8 @@ ## { 'union': 'TransactionAction', 'data': { - 'blockdev-snapshot-sync': 'BlockdevSnapshot' + 'blockdev-snapshot-sync': 'BlockdevSnapshot', + 'drive-backup': 'DriveBackup' } } ## -- 1.8.1.4
[Qemu-devel] [PATCH v4 1/8] block: add bdrv_add_before_write_notifier()
The bdrv_add_before_write_notifier() function installs a callback that is invoked before a write request is processed. This will be used to implement copy-on-write point-in-time snapshots where we need to copy out old data before overwriting it. Note that BdrvTrackedRequest is moved to block_int.h since it is passed to .notify() functions. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- block.c | 18 -- include/block/block_int.h | 22 +- 2 files changed, 29 insertions(+), 11 deletions(-) diff --git a/block.c b/block.c index 3f87489..591e312 100644 --- a/block.c +++ b/block.c @@ -308,6 +308,7 @@ BlockDriverState *bdrv_new(const char *device_name) } bdrv_iostatus_disable(bs); notifier_list_init(bs-close_notifiers); +notifier_list_init(bs-before_write_notifiers); return bs; } @@ -1839,16 +1840,6 @@ int bdrv_commit_all(void) return 0; } -struct BdrvTrackedRequest { -BlockDriverState *bs; -int64_t sector_num; -int nb_sectors; -bool is_write; -QLIST_ENTRY(BdrvTrackedRequest) list; -Coroutine *co; /* owner, used for deadlock detection */ -CoQueue wait_queue; /* coroutines blocked on this request */ -}; - /** * Remove an active request from the tracked requests list * @@ -2619,6 +2610,8 @@ static int coroutine_fn bdrv_co_do_writev(BlockDriverState *bs, tracked_request_begin(req, bs, sector_num, nb_sectors, true); +notifier_list_notify(bs-before_write_notifiers, req); + if (flags BDRV_REQ_ZERO_WRITE) { ret = bdrv_co_do_write_zeroes(bs, sector_num, nb_sectors); } else { @@ -4883,3 +4876,8 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs) /* Currently BlockDriverState always uses the main loop AioContext */ return qemu_get_aio_context(); } + +void bdrv_add_before_write_notifier(BlockDriverState *bs, Notifier *notifier) +{ +notifier_list_add(bs-before_write_notifiers, notifier); +} diff --git a/include/block/block_int.h b/include/block/block_int.h index 6078dd3..a498fb0 100644 --- a/include/block/block_int.h +++ b/include/block/block_int.h @@ -58,7 +58,16 @@ #define BLOCK_OPT_LAZY_REFCOUNTSlazy_refcounts #define BLOCK_OPT_ADAPTER_TYPE adapter_type -typedef struct BdrvTrackedRequest BdrvTrackedRequest; +typedef struct BdrvTrackedRequest { +BlockDriverState *bs; +int64_t sector_num; +int nb_sectors; +bool is_write; +QLIST_ENTRY(BdrvTrackedRequest) list; +Coroutine *co; /* owner, used for deadlock detection */ +CoQueue wait_queue; /* coroutines blocked on this request */ +} BdrvTrackedRequest; + typedef struct BlockIOLimit { int64_t bps[3]; @@ -247,6 +256,9 @@ struct BlockDriverState { NotifierList close_notifiers; +/* Callback before write request is processed */ +NotifierList before_write_notifiers; + /* number of in-flight copy-on-read requests */ unsigned int copy_on_read_in_flight; @@ -298,6 +310,14 @@ void bdrv_set_io_limits(BlockDriverState *bs, BlockIOLimit *io_limits); /** + * bdrv_add_before_write_notifier: + * + * Register a callback that is invoked before write requests are processed but + * after any throttling or waiting for overlapping requests. + */ +void bdrv_add_before_write_notifier(BlockDriverState *bs, Notifier *notifier); + +/** * bdrv_get_aio_context: * * Returns: the currently bound #AioContext -- 1.8.1.4
[Qemu-devel] [PATCH v4 8/8] qemu-iotests: test 'drive-backup' transaction in 055
Ensure that the 'drive-backup' transaction works and check that a transaction abort works properly. Signed-off-by: Stefan Hajnoczi stefa...@redhat.com --- tests/qemu-iotests/055 | 118 + tests/qemu-iotests/055.out | 4 +- 2 files changed, 120 insertions(+), 2 deletions(-) diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055 index bc2eebf..d61993f 100755 --- a/tests/qemu-iotests/055 +++ b/tests/qemu-iotests/055 @@ -226,5 +226,123 @@ class TestSetSpeed(DriveBackupTestCase): self.cancel_and_wait() +class TestSingleTransaction(DriveBackupTestCase): +image_len = 120 * 1024 * 1024 # MB + +def setUp(self): +qemu_img('create', '-f', iotests.imgfmt, test_img, str(TestSingleDrive.image_len)) +self.vm = iotests.VM().add_drive(test_img) +self.vm.launch() + +def tearDown(self): +self.vm.shutdown() +os.remove(test_img) +try: +os.remove(target_img) +except OSError: +pass + +def test_complete(self): +self.assert_no_active_backups() + +result = self.vm.qmp('transaction', actions=[{ +'type': 'drive-backup', +'data': { 'device': 'drive0', + 'target': target_img }, +} +]) +self.assert_qmp(result, 'return', {}) + +self.complete_and_wait() +self.vm.shutdown() +self.assertTrue(self.compare_images(test_img, target_img), +'target image does not match source after backup') + +def test_cancel(self): +self.assert_no_active_backups() + +result = self.vm.qmp('transaction', actions=[{ +'type': 'drive-backup', +'data': { 'device': 'drive0', + 'target': target_img }, +} +]) +self.assert_qmp(result, 'return', {}) + +self.cancel_and_wait() + +def test_pause(self): +self.assert_no_active_backups() + +result = self.vm.qmp('transaction', actions=[{ +'type': 'drive-backup', +'data': { 'device': 'drive0', + 'target': target_img }, +} +]) +self.assert_qmp(result, 'return', {}) + +result = self.vm.qmp('block-job-pause', device='drive0') +self.assert_qmp(result, 'return', {}) + +time.sleep(1) +result = self.vm.qmp('query-block-jobs') +offset = self.dictpath(result, 'return[0]/offset') + +time.sleep(1) +result = self.vm.qmp('query-block-jobs') +self.assert_qmp(result, 'return[0]/offset', offset) + +result = self.vm.qmp('block-job-resume', device='drive0') +self.assert_qmp(result, 'return', {}) + +self.complete_and_wait() +self.vm.shutdown() +self.assertTrue(self.compare_images(test_img, target_img), +'target image does not match source after backup') + +def test_medium_not_found(self): +result = self.vm.qmp('transaction', actions=[{ +'type': 'drive-backup', +'data': { 'device': 'ide1-cd0', + 'target': target_img }, +} +]) +self.assert_qmp(result, 'error/class', 'GenericError') + +def test_image_not_found(self): +result = self.vm.qmp('transaction', actions=[{ +'type': 'drive-backup', +'data': { 'device': 'drive0', + 'mode': 'existing', + 'target': target_img }, +} +]) +self.assert_qmp(result, 'error/class', 'GenericError') + +def test_device_not_found(self): +result = self.vm.qmp('transaction', actions=[{ +'type': 'drive-backup', +'data': { 'device': 'nonexistent', + 'mode': 'existing', + 'target': target_img }, +} +]) +self.assert_qmp(result, 'error/class', 'DeviceNotFound') + +def test_abort(self): +result = self.vm.qmp('transaction', actions=[{ +'type': 'drive-backup', +'data': { 'device': 'nonexistent', + 'mode': 'existing', + 'target': target_img }, +}, { +'type': 'Abort', +'data': {}, +} +]) +self.assert_qmp(result, 'error/class', 'GenericError') +self.assert_no_active_backups() + if __name__ == '__main__': iotests.main(supported_fmts=['raw', 'qcow2']) diff --git a/tests/qemu-iotests/055.out b/tests/qemu-iotests/055.out index 594c16f..96961ed 100644 --- a/tests/qemu-iotests/055.out +++ b/tests/qemu-iotests/055.out @@ -1,5 +1,5 @@ - +... -- -Ran 8 tests +Ran 15 tests OK -- 1.8.1.4
Re: [Qemu-devel] tap devices not receiving packets from a bridge
On Thu, May 16, 2013 at 09:20:55AM +0100, Nicholas Thomas wrote: Hi, On Thu, 2013-05-16 at 09:27 +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 09:24:05AM +0300, Michael S. Tsirkin wrote: Is this with or without vhost-net in host? never mind, I see it's without. Try to enable vhost-net (you'll have to switch to -netdev syntax for that to work) and see if this help. If it does it's likely a qemu bug if not probably a guest bug. Switching to -netdev is non-trivial for me, unfortunately. Interesting. Why is that? Anyway, it's definitely a qemu bug - it happens on kernels 3.2 and 3.9 with 1.4.1, but doesn't happen with qemu 0.15.0 or 1.5.0rc1. I'll have a dig through git to see if I can identify the patch that resolves it. It feels-like qemu sometimes stops reading from the tap file descriptor between ipxe exiting and the linux kernel bringing up the network interface, and never recovers from that. /Nick You can try to bisect, yes.
Re: [Qemu-devel] [PATCH 08/12] qemu-thread: report RCU quiescent states
Il 16/05/2013 10:33, liu ping fan ha scritto: +Luckily, in most cases no manual annotation is needed, because waiting +on condition variables (qemu_cond_wait), semaphores (qemu_sem_wait, +qemu_sem_timedwait) or events (qemu_event_wait) implicitly marks the thread +as quiescent for the whole duration of the wait. (There is an exception +for semaphore waits with a zero timeout). + Why not the same rule for zero timeout? Because they are not really doing synchronization, they are basically an atomic_dec_if_not_zero on the semaphore count. Paolo
Re: [Qemu-devel] tap devices not receiving packets from a bridge
Am 16.05.2013 10:40, schrieb Michael S. Tsirkin: On Thu, May 16, 2013 at 09:20:55AM +0100, Nicholas Thomas wrote: Hi, On Thu, 2013-05-16 at 09:27 +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 09:24:05AM +0300, Michael S. Tsirkin wrote: Is this with or without vhost-net in host? never mind, I see it's without. Try to enable vhost-net (you'll have to switch to -netdev syntax for that to work) and see if this help. If it does it's likely a qemu bug if not probably a guest bug. Switching to -netdev is non-trivial for me, unfortunately. Interesting. Why is that? Anyway, it's definitely a qemu bug - it happens on kernels 3.2 and 3.9 with 1.4.1, but doesn't happen with qemu 0.15.0 or 1.5.0rc1. I'll have a dig through git to see if I can identify the patch that resolves it. It feels-like qemu sometimes stops reading from the tap file descriptor between ipxe exiting and the linux kernel bringing up the network interface, and never recovers from that. /Nick You can try to bisect, yes. It would be good to bisect this. I would appreciate it. I have a similar problem with rtl8139 (without vhost-net), but I was unable to reproduce yet. Thanks, Peter
Re: [Qemu-devel] [Bug 1179731] [NEW] is networking broken on windows hosts?
Il 16/05/2013 07:52, TeLeMan ha scritto: The patch is working on 134a03e0b3d34b01b68107104c525c3bff1211d4 and is not working from cbff4b342b000a7642125dbdabf61113e05eee44. Thanks. Fabien or Stefan, can you take a look? TeLeMan, can you post the exact patches that you tested on those two commits? (To recap, there are at least two bugs. The first is fixed by the patch at http://article.gmane.org/gmane.comp.emulators.qemu/211333/raw, the second is introduced by patch cbff4b3, main-loop: switch to g_poll() on POSIX hosts, 2013-02-20). Paolo -- SUN OF A BEACH On Wed, May 15, 2013 at 4:37 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 15/05/2013 03:38, TeLeMan ha scritto: On Tue, May 14, 2013 at 7:55 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 14/05/2013 13:48, Paolo Bonzini ha scritto: Il 14/05/2013 13:39, TeLeMan ha scritto: On Tue, May 14, 2013 at 6:46 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 14/05/2013 12:24, TeLeMan ha scritto: On Tue, May 14, 2013 at 3:51 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Tue, May 14, 2013 at 12:02:24AM -, therock247uk wrote: just wondering as i just compiled the latest git and qemu goes into none responding mode when i try to do any networking stuff on guests (both linux and windows) Works for me on qemu.git/master on Linux: $ git rev-parse HEAD b087143b4d010451208264b7c841436aafe1cbb1 $ x86_64-softmmu/qemu-system-x86_64 -m 1024 -enable-kvm -cpu host \ -drive if=virtio,cache=none,file=test.img Please include more information, like the QEMU command-line and commit ID. Stefan This regression occurs on the Windows host. SLIRP hangs in sorecvfrom(). Can you bisect it? Paolo The first break is the commit 5e3bc735d93dd23f074b5116fd11e1ad8cd4962f. But it need more packets than HEAD to reproduce this regression. Please check if this partial revert of that commit fixes it: Yeah, this should work... WSAEventSelect is edge-triggered and the event will not be signaled if the socket handler does not consume all the data in the socket buffer. Unfortunately, it does not work. Ok... as you can see the patch is just moving a block of code just before g_main_context_prepare(context, max_priority). Can you please try doing the same on top of these six commits: 134a03e0b3d34b01b68107104c525c3bff1211d4 cbff4b342b000a7642125dbdabf61113e05eee44 48ce11ff972c733afaed3e2a2613a2e56081ec92 8917c3bdba37d6fe4393db0fad3fabbde9530d6b a3e4b4a8091cc4fcf7cb619570c72c54c2d6a6e9 9cbaacf999b01b27dc3a22502705178057af66de Paolo
Re: [Qemu-devel] [PATCH] ide-test: Fix endianness problems
On Wed, May 15, 2013 at 03:00:39PM +0200, Kevin Wolf wrote: @@ -355,6 +364,17 @@ static void test_bmdma_teardown(void) ide_test_quit(); } +static void string_cpu_to_be16(uint16_t *s, size_t bytes) +{ +g_assert((bytes 1) == 0); +bytes /= 2; + +while (bytes--) { +*s = cpu_to_be16(*s); +s++; +} +} + static void test_identify(void) { uint8_t data; @@ -389,10 +409,12 @@ static void test_identify(void) assert_bit_clear(data, BSY | DF | ERR | DRQ); /* Check serial number/version in the buffer */ -ret = memcmp(buf[10], ettsidks, 20); +string_cpu_to_be16(buf[10], 20); +ret = memcmp(buf[10], testdisk, 20); g_assert(ret == 0); -ret = memcmp(buf[23], evsroi n, 8); +string_cpu_to_be16(buf[23], 8); +ret = memcmp(buf[23], version , 8); It would have been simpler to specify string_cpu_to_be16() length in elements instead of bytes. Then you can drop the assertion and conversion. Not a problem though. Anthony: Please take this patch without a pull request. I think me sending pull requests for a single late-rc fix doesn't add value. Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Re: [Qemu-devel] [Bug 1094564] Re: images used as scsi disks not readable (qemu-system-arm, macos 10.8)
On 15 May 2013 21:18, Peter Maydell peter.mayd...@linaro.org wrote: I suspect that if you configure on linux with --with-coroutine=sigaltstack you might then find they both behave the same (MacOSX can't do the ucontext coroutines we default to on linux). They don't, so it's a MacOSX specific issue of some kind. PS: you don't need -global versatile_pci.broken-irq-mapping=1 in the command line because we do correctly autodetect and handle your kernel now. thanks -- PMM
Re: [Qemu-devel] [RFC PATCH 2/2] mem: prepare address_space listener rcu style
Better: rcu_read_lock(); do { pgtbl = d-cur_pgtbl; smp_rmb(); root = d-cur_root; smp_rmb(); /* d-cur_pgtbl == d-next_pgtbl only during an update. */ } while (pgtbl == d-next_pgtbl); ... rcu_read_unlock(); And in the writer: old_pgtbl = d-cur_pgtbl; /* Point to the new page table, tell readers cur_root is invalid. */ smp_wmb(); d-cur_pgtbl = d-next_pgtbl; /* Write the root before updating the page table. */ smp_wmb(); d-cur_root = d-next_root; /* Write cur_root before telling readers it is valid. */ smp_wmb(); d-next_pgtbl = NULL; /* Drop reference once readers will be done with it. */ call_rcu(page_table_unref, old_pgtbl, rcu); Paolo
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On 05/16/2013 03:35 PM, Amos Kong wrote: On Thu, May 16, 2013 at 03:23:21PM +0800, Lei Li wrote: On 05/16/2013 12:30 PM, Amos Kong wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; Hi Lei, Does this check allow to distinguish the difference between auto-repeat and actual repeated entry by the user? Actual repeat by user: press event release event press event release event press event release event Auto-repeat example: press event press event press event release event On what platform? AFAIK, the Auto-repeat event is like below on some GTK-based environments, keydown keypress keyup keydown keypress keyup| ... as reference link: https://developer.mozilla.org/zh-CN/docs/DOM/KeyboardEvent And on Xwindows: keypress keyrelease keypress keyrelease ... as reference link: http://www.ypass.net/blog/2009/06/detecting-xlibs-keyboard-auto-repeat-functionality-and-how-to-fix-it/ This would cause it's hard to distinguish them. But looks like the links above is a little out of time, and I am not sure if the auto-repeat behaviour on such platforms has been changed. :) | so here we check if it's a press event, only set repeat_timer for press event. When we get release event, we just stop repeat action. ps2_queue(s-common, keycode); + +if (auto_repeat) { +s-repeat_key = keycode; +/* delay a while before first repeat */ +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_delay, 1000)); +} } -- Lei
Re: [Qemu-devel] [PATCH] ide-test: Fix endianness problems
Am 16.05.2013 um 11:04 hat Stefan Hajnoczi geschrieben: Anthony: Please take this patch without a pull request. I think me sending pull requests for a single late-rc fix doesn't add value. Reviewed-by: Stefan Hajnoczi stefa...@redhat.com It's already merged. Kevin
Re: [Qemu-devel] [PATCH] qemu-io: Fix 'map' output
On Wed, May 15, 2013 at 01:47:12PM +0200, Kevin Wolf wrote: +static int map_is_allocated(int64_t sector_num, int64_t nb_sectors, int64_t *pnum) +{ +int num, num_checked; +int ret, firstret; + +num_checked = MIN(nb_sectors, INT_MAX); +ret = bdrv_is_allocated(bs, sector_num, num_checked, num); +if (ret 0) { +return ret; +} + +firstret = ret; +*pnum = num; + +while (nb_sectors 0 ret == firstret) { +sector_num += num; +nb_sectors -= num; + +num_checked = MIN(nb_sectors, INT_MAX); +ret = bdrv_is_allocated(bs, sector_num, num_checked, num); +if (ret == firstret) { +*pnum += num; +} else { +break; +} The break makes ret == firstret redundant above. I suggest just while (nb_sectors 0) { ... } which is easier to read. Also, if you respin the patch please tweak the commit message. Coalesce 'map' output is more specific than Fix 'map' output - unless this really fixes a bug which you didn't mention in the commit description.
Re: [Qemu-devel] [PATCH] qemu-io: Fix 'map' output
Am 16.05.2013 um 11:14 hat Stefan Hajnoczi geschrieben: On Wed, May 15, 2013 at 01:47:12PM +0200, Kevin Wolf wrote: +static int map_is_allocated(int64_t sector_num, int64_t nb_sectors, int64_t *pnum) +{ +int num, num_checked; +int ret, firstret; + +num_checked = MIN(nb_sectors, INT_MAX); +ret = bdrv_is_allocated(bs, sector_num, num_checked, num); +if (ret 0) { +return ret; +} + +firstret = ret; +*pnum = num; + +while (nb_sectors 0 ret == firstret) { +sector_num += num; +nb_sectors -= num; + +num_checked = MIN(nb_sectors, INT_MAX); +ret = bdrv_is_allocated(bs, sector_num, num_checked, num); +if (ret == firstret) { +*pnum += num; +} else { +break; +} The break makes ret == firstret redundant above. I suggest just while (nb_sectors 0) { ... } which is easier to read. Okay. I wasn't sure which was better. Don't know though how it came that I have both checks now... Also, if you respin the patch please tweak the commit message. Coalesce 'map' output is more specific than Fix 'map' output - unless this really fixes a bug which you didn't mention in the commit description. I'll change the title. It makes different formats behave the same even if they work in different granularities. I think QED was bitten by this in qemu-iotests somwhere because it could give different results than qcow2, possibly also dependent on timing. Maybe I should mention that as well in the commit message. Kevin
Re: [Qemu-devel] [PATCH v2 for-1.5 0/4] qcow2: Catch some L1 table index overflows
On Wed, May 15, 2013 at 10:28:04AM +0200, Stefan Hajnoczi wrote: On Tue, May 14, 2013 at 04:14:32PM +0200, Kevin Wolf wrote: Kevin Wolf (4): qcow2: Catch some L1 table index overflows block: Add hint to -EFBIG error message qcow2.py: Subcommand for changing header fields qemu-iotests: Try creating huge qcow2 image block.c | 8 -- block/qcow2-cluster.c| 23 -- block/qcow2.c| 13 -- block/qcow2.h| 5 ++-- tests/qemu-iotests/054 | 58 tests/qemu-iotests/054.out | 10 tests/qemu-iotests/common.rc | 2 +- tests/qemu-iotests/group | 1 + tests/qemu-iotests/qcow2.py | 17 + 9 files changed, 122 insertions(+), 15 deletions(-) create mode 100755 tests/qemu-iotests/054 create mode 100644 tests/qemu-iotests/054.out Thanks, applied Patches 2 3 to my block-next tree: https://github.com/stefanha/qemu/commits/block-next This should have said Patches 3 4 :). Stefan
Re: [Qemu-devel] [Bug 1179731] [NEW] is networking broken on windows hosts?
On Thu, May 16, 2013 at 4:59 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 16/05/2013 07:52, TeLeMan ha scritto: The patch is working on 134a03e0b3d34b01b68107104c525c3bff1211d4 and is not working from cbff4b342b000a7642125dbdabf61113e05eee44. Thanks. Fabien or Stefan, can you take a look? TeLeMan, can you post the exact patches that you tested on those two commits? (To recap, there are at least two bugs. The first is fixed by the patch at http://article.gmane.org/gmane.comp.emulators.qemu/211333/raw, the second is introduced by patch cbff4b3, main-loop: switch to g_poll() on POSIX hosts, 2013-02-20). Paolo The patch for cbff4b342b000a7642125dbdabf61113e05eee44: diff --git a/main-loop.c b/main-loop.c index 489b27c..1558e4b 100644 --- a/main-loop.c +++ b/main-loop.c @@ -454,6 +454,22 @@ static int os_host_main_loop_wait(uint32_t timeout) return ret; } +gpollfds_from_select(); +FD_ZERO(rfds); +FD_ZERO(wfds); +FD_ZERO(xfds); +nfds = pollfds_fill(gpollfds, rfds, wfds, xfds); +if (nfds = 0) { +select_ret = select(nfds + 1, rfds, wfds, xfds, tv0); +if (select_ret != 0) { +timeout = 0; +} +if (select_ret 0) { +pollfds_poll(gpollfds, nfds, rfds, wfds, xfds); +} +} +gpollfds_to_select(select_ret); + g_main_context_prepare(context, max_priority); n_poll_fds = g_main_context_query(context, max_priority, poll_timeout, poll_fds, ARRAY_SIZE(poll_fds)); @@ -493,21 +509,6 @@ static int os_host_main_loop_wait(uint32_t timeout) /* This back-and-forth between GPollFDs and select(2) is temporary. We'll * drop it in a couple of patches, I promise :). */ -gpollfds_from_select(); -FD_ZERO(rfds); -FD_ZERO(wfds); -FD_ZERO(xfds); -nfds = pollfds_fill(gpollfds, rfds, wfds, xfds); -if (nfds = 0) { -select_ret = select(nfds + 1, rfds, wfds, xfds, tv0); -if (select_ret != 0) { -timeout = 0; -} -if (select_ret 0) { -pollfds_poll(gpollfds, nfds, rfds, wfds, xfds); -} -} -gpollfds_to_select(select_ret); return select_ret || g_poll_ret; } The patch for 134a03e0b3d34b01b68107104c525c3bff1211d4 : diff --git a/main-loop.c b/main-loop.c index d0d8fe4..7cdd969 100644 --- a/main-loop.c +++ b/main-loop.c @@ -346,6 +346,13 @@ static int os_host_main_loop_wait(uint32_t timeout) return ret; } +if (nfds = 0) { +select_ret = select(nfds + 1, rfds, wfds, xfds, tv0); +if (select_ret != 0) { +timeout = 0; +} +} + g_main_context_prepare(context, max_priority); n_poll_fds = g_main_context_query(context, max_priority, poll_timeout, poll_fds, ARRAY_SIZE(poll_fds)); @@ -382,12 +389,6 @@ static int os_host_main_loop_wait(uint32_t timeout) * improve socket latency. */ -if (nfds = 0) { -select_ret = select(nfds + 1, rfds, wfds, xfds, tv0); -if (select_ret != 0) { -timeout = 0; -} -} return select_ret || g_poll_ret; } -- SUN OF A BEACH On Wed, May 15, 2013 at 4:37 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 15/05/2013 03:38, TeLeMan ha scritto: On Tue, May 14, 2013 at 7:55 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 14/05/2013 13:48, Paolo Bonzini ha scritto: Il 14/05/2013 13:39, TeLeMan ha scritto: On Tue, May 14, 2013 at 6:46 PM, Paolo Bonzini pbonz...@redhat.com wrote: Il 14/05/2013 12:24, TeLeMan ha scritto: On Tue, May 14, 2013 at 3:51 PM, Stefan Hajnoczi stefa...@gmail.com wrote: On Tue, May 14, 2013 at 12:02:24AM -, therock247uk wrote: just wondering as i just compiled the latest git and qemu goes into none responding mode when i try to do any networking stuff on guests (both linux and windows) Works for me on qemu.git/master on Linux: $ git rev-parse HEAD b087143b4d010451208264b7c841436aafe1cbb1 $ x86_64-softmmu/qemu-system-x86_64 -m 1024 -enable-kvm -cpu host \ -drive if=virtio,cache=none,file=test.img Please include more information, like the QEMU command-line and commit ID. Stefan This regression occurs on the Windows host. SLIRP hangs in sorecvfrom(). Can you bisect it? Paolo The first break is the commit 5e3bc735d93dd23f074b5116fd11e1ad8cd4962f. But it need more packets than HEAD to reproduce this regression. Please check if this partial revert of that commit fixes it: Yeah, this should work... WSAEventSelect is edge-triggered and the event will not be signaled if the socket handler does not consume all the data in the socket buffer. Unfortunately, it does not work. Ok... as you can see the patch is just moving a block of code just before g_main_context_prepare(context, max_priority). Can you please try doing the same on top of these six commits:
Re: [Qemu-devel] [PATCH 1/2] qcow2: free allocated cluster on fail in qcow2_write_snapshots()
On Wed, May 15, 2013 at 04:43:38PM +0800, Wenchao Xia wrote: Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com --- block/qcow2-snapshot.c |2 ++ 1 files changed, 2 insertions(+), 0 deletions(-) diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c index 992a5c8..a6065a9 100644 --- a/block/qcow2-snapshot.c +++ b/block/qcow2-snapshot.c @@ -268,6 +268,8 @@ static int qcow2_write_snapshots(BlockDriverState *bs) return 0; fail: +/* free the new snapshot table */ +qcow2_free_clusters(bs, snapshots_offset, snapshots_size); return ret; } snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size); offset = snapshots_offset; if (offset 0) { return offset; } ret = bdrv_flush(bs); if (ret 0) { return ret; } For completeness the bdrv_flush() return ret should be change to a goto fail so that we cover all failure cases.
Re: [Qemu-devel] [PATCH 2/2] qcow2: cancel the modification on fail in qcow2_snapshot_create()
On Wed, May 15, 2013 at 04:43:39PM +0800, Wenchao Xia wrote: Signed-off-by: Wenchao Xia xiaw...@linux.vnet.ibm.com --- block/qcow2-snapshot.c | 11 +++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c index a6065a9..ad030f9 100644 --- a/block/qcow2-snapshot.c +++ b/block/qcow2-snapshot.c @@ -410,6 +410,17 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info) #endif return 0; +restore_refcount: +ret = qcow2_update_snapshot_refcount(bs, s-l1_table_offset, + s-l1_size, -1); +if (ret 0) { +/* Nothing can be done none now, need image check later */ +error_report(qcow2: Error in restoring refcount in snapshot); +} + +dealloc_cluster: +qcow2_free_clusters(bs, sn-l1_table_offset, sn-l1_size); + fail: g_free(sn-id_str); g_free(sn-name); This patch is missing goto restore_refcount and goto dealloc_cluster.
[Qemu-devel] QEMU NUMA and memory allocation problem
Hi, We just met a problem of QEMU memory allocation. Here is the description: On my host, I have two nodes, # numactl -H available: 2 nodes (0-1) node 0 cpus: 0 2 node 0 size: 4010 MB node 0 free: 3021 MB node 1 cpus: 1 3 node 1 size: 4030 MB node 1 free: 2881 MB node distances: node 0 1 0: 10 20 1: 20 10 I created a guest using the following XML: ... memory unit='KiB'1048576/memory currentMemory unit='KiB'1048576/currentMemory vcpu placement='static'2/vcpu cputune vcpupin vcpu='0' cpuset='2'/ vcpupin vcpu='1' cpuset='3'/ /cputune numatune memory mode='strict' nodeset='0-1'/ /numatune cpu topology sockets='2' cores='1' threads='1'/ numa cell cpus='0' memory='524288'/ cell cpus='1' memory='524288'/ /numa /cpu ... As you can see, I assigned 1G memory to this guest, pined vcpu0 to the host CPU 2, it's in host node0, pined vcpu1 to the host CPU 3 that is in host node1. The guest also has two nodes, each node contains 512M memory. Now, I started the guest, then printed the host numa state : # numactl -H available: 2 nodes (0-1) node 0 cpus: 0 2 node 0 size: 4010 MB node 0 free: 2647 MB === freecell of node0 node 1 cpus: 1 3 node 1 size: 4030 MB node 1 free: 2746 MB node distances: node 0 1 0: 10 20 1: 20 10 Then I tried to allocate memory from guest node0 using the following code: #include memory.h #include numa.h #define MEM (1024*1024*300) int main(void) { char *p = numa_alloc_onnode(MEM, 0); memset(p, 0, MEM); sleep(1000); numa_free(p, MEM); return 0; } And printed the host numa state, it shows that this 300M memory is allocated from host node0, # numactl -H available: 2 nodes (0-1) node 0 cpus: 0 2 node 0 size: 4010 MB node 0 free: 2345 MB= reduced ~300M node 1 cpus: 1 3 node 1 size: 4030 MB node 1 free: 2767 MB node distances: node 0 1 0: 10 20 1: 20 10 Then, I tried the same method to allocate 300M memory from guest node1, and printed the host numa state: # numactl -H available: 2 nodes (0-1) node 0 cpus: 0 2 node 0 size: 4010 MB node 0 free: 2059 MB=== reduced ~300M node 1 cpus: 1 3 node 1 size: 4030 MB node 1 free: 2767 MB=== no change node distances: node 0 1 0: 10 20 1: 20 10 To see that this 300M memory is allocated from host node0 again, but not host node1 as I expected. We think that QEMU can't handled this numa memory allocation well, and it will cause the cross node memory access performance regression. Any thoughts? Or, am I missing something? Thanks, Wanlong Gao
Re: [Qemu-devel] [PATCH 02/12] qemu-thread: add QemuEvent
On Wed, May 15, 2013 at 05:48:47PM +0200, Paolo Bonzini wrote: This emulates Win32 manual-reset events using futexes or conditional variables. Typical ways to use them are with multi-producer, single-consumer data structures, to test for a complex condition whose elements come from different threads: for (;;) { qemu_event_reset(ev); ... test complex condition ... if (condition is true) { break; } qemu_event_wait(ev); } Alternatively: ... compute condition ... if (condition) { do { qemu_event_wait(ev); qemu_event_reset(ev); ... compute condition ... } while(condition); qemu_event_set(ev); } QemuEvent provides a very fast userspace path in the common case when no other thread is waiting, or the event is not changing state. It is used to report RCU quiescent states to the thread calling synchronize_rcu (the latter being the single consumer), and to report call_rcu invocations to the thread that receives them. It would be nice to describe the need for the Linux futex code. pthread mutex/condvars are implemented in terms of futexes already, so how much benefit is there - I thought they stay in userspace in the non-contended case too? Stefan
Re: [Qemu-devel] [PATCH 2/2] pc: reject do pc_acpi_init if acpi_enabled is false
On 05/16/13 02:14, li guang wrote: 在 2013-05-15三的 15:44 +0200,Laszlo Ersek写道: On 05/15/13 10:54, li guang wrote: 在 2013-05-15三的 10:38 +0200,Paolo Bonzini写道: Il 15/05/2013 06:01, liguang ha scritto: Signed-off-by: liguang lig.f...@cn.fujitsu.com --verbose, please. What problem does this patch fix? Oh, sorry to be lazy ... QEMU's option '-no-acpi' seems does not play a correct role, even started with this, ACPI tables will also be embedded into BIOS, and there's no different between with or without it for q35, as i can see. here, I'm assuming '-no-acpi' is to disable ACPI. -no-acpi disables a block of code in pc_init1() [hw/i386/pc_piix.c], namely piix4_pm_init() and smbus_eeprom_init(). pc_acpi_init() loads / exports a default DSDT for the boot firmware. If the -acpitable switch is passed, then that code doesn't run. Yes, if '-acpitable' '-no-acpi' passed, I think QEMU is valid to override '-no-acpi' by '-acpitable'. I think these flags are orthogonal. As far as I can see, the former turns off a piece of hardware emulation (like support for suspend / hibernate and the System Control Interrupt, and more). The latter enables the user to provide custom ACPI tables for SeaBIOS (or other boot firmware) in place of the default DSDT. It seems that originally acpi_enabled (-no-acpi) used to control both purposes, see commit 6515b20: the call to acpi_bios_init() used to depend on acpi_enabled != 0. However the two goals got separated with commit a5954d5 apparently. I think disabling PM but keeping the default DSDT from SeaBIOS is a valid use case; the DSDT seems to contain a bunch of non-PM functionality (see src/acpi-dsdt*.dsl in SeaBIOS). The -no-acpi switch is likely a misnomer (it should say -no-acpi-pm or some such), but in any case I believe it should not prevent exporting the DSDT. what's the purpose of only disable PM by '-no-acpi'? I mentioned the src/acpi-dsdt*.dsl files in SeaBIOS that get built into the default DSDT. With -no-acpi only, qemu keeps at least the following working: - the \DBUG ACPI method [src/acpi-dsdt-dbug.dsl], - definition of the \_SB\HPET (PNP0103) device [src/acpi-dsdt-hpet.dsl], - descriptions (interrupts, io-ports) of ISA devices (RTC, keyboard, mouse, floppy disk controller, parport, serial port) [src/acpi-dsdt-isa.dsl], - description of the PCI IO windows (PCI host bridge windows) [src/acpi-dsdt-pci-crs.dsl], whose lack can mess up VGA display for example (see the pci=nocrs kernel option), - the PCI interrupt routing table [acpi-dsdt.dsl]. Again, I believe -no-acpi has probably become a misnomer by now and it means -no-acpi-pm in reality. can we use -no-acpi to disable whole ACPI? I cannot *prove* it would be wrong, but I feel insecure about it, considering the commit history and the non-PM-related functionality in the default DSDT. Currently a user may expect the above contents to be present in ACPI tables, and only PM (plus maybe hotplug) not to work, when passing -no-acpi. Anyway I'm no authority on this -- anyone, feel free to chime in. I'm certainly not NACKing the patch, I just have concerns. Thanks, Laszlo
Re: [Qemu-devel] [Bug 1179731] [NEW] is networking broken on windows hosts?
On 05/16/2013 10:59 AM, Paolo Bonzini wrote: Il 16/05/2013 07:52, TeLeMan ha scritto: The patch is working on 134a03e0b3d34b01b68107104c525c3bff1211d4 and is not working from cbff4b342b000a7642125dbdabf61113e05eee44. Thanks. Fabien or Stefan, can you take a look? Unfortunately I don't have time to investigate these days. -- Fabien Chouteau
Re: [Qemu-devel] [PATCH v3 00/11] qapi: add support for lists of native types
On 05/15/13 21:13, mdroth wrote: On Wed, May 15, 2013 at 02:05:58PM -0400, Luiz Capitulino wrote: On Wed, 15 May 2013 12:42:24 -0500 mdroth mdr...@linux.vnet.ibm.com wrote: The only way I've managed to reproduce this is by having a stale qapi-types.h hanging around in $SRC_DIR while i'm building in a different $BUILD_DIR. Can you confirm that's not what's happening here? Yes, it was :( I had an old qapi-types.h in $SRC_DIR, but was building in a different $BUILD_DIR. I am really sorry for having wasted your time on this. I've applied this series to qmp-next branch, but I have no idea how I ended up with a qapi-types.h file in $SRC_DIR. I have an alias to build qemu that I use for several months now... No problem, only thought to check that scenario because it happens to me all the time :) Side question: what's a common use case for a separate $BUILD_DIR? I just use git clean -fdx in $SRC_DIR instead of make clean. The need to reconfigure (and to regenerate my tags file from scratch) is a small price to pay for the peace of mind. Thanks Laszlo
Re: [Qemu-devel] [PATCH v3 00/11] qapi: add support for lists of native types
On 16 May 2013 11:38, Laszlo Ersek ler...@redhat.com wrote: Side question: what's a common use case for a separate $BUILD_DIR? I just use git clean -fdx in $SRC_DIR instead of make clean. The need to reconfigure (and to regenerate my tags file from scratch) is a small price to pay for the peace of mind. If you're in the habit of building for more than one config (eg native x86 plus cross-compiled for ARM it's nice not to have to blow away the first set of object files to build the second. -- PMM
Re: [Qemu-devel] [PATCH 02/12] qemu-thread: add QemuEvent
Il 16/05/2013 12:15, Stefan Hajnoczi ha scritto: On Wed, May 15, 2013 at 05:48:47PM +0200, Paolo Bonzini wrote: This emulates Win32 manual-reset events using futexes or conditional variables. Typical ways to use them are with multi-producer, single-consumer data structures, to test for a complex condition whose elements come from different threads: for (;;) { qemu_event_reset(ev); ... test complex condition ... if (condition is true) { break; } qemu_event_wait(ev); } Alternatively: ... compute condition ... if (condition) { do { qemu_event_wait(ev); qemu_event_reset(ev); ... compute condition ... } while(condition); qemu_event_set(ev); } QemuEvent provides a very fast userspace path in the common case when no other thread is waiting, or the event is not changing state. It is used to report RCU quiescent states to the thread calling synchronize_rcu (the latter being the single consumer), and to report call_rcu invocations to the thread that receives them. It would be nice to describe the need for the Linux futex code. pthread mutex/condvars are implemented in terms of futexes already, so how much benefit is there - I thought they stay in userspace in the non-contended case too? Yes, but they are still measurably slower, around 20%. I don't have around the program I wrote for QemuEvent, because I did the measurement ~2 years ago. However, here is one that tests a similar synchronization primitive (not exactly the same as QemuEvent). You can run it like this: $ ./a.out -c 100 10 4 40 # with mutex/condvar 4 child processes, avg think time 100 msec Avg event distance 10 msec. Running for 40 sec waits1404 0.663 us (fast) slow waits 152 signal 3890 0.784 us fast path 0 (this is bogus) $ ./a.out 100 10 4 40 # with futex 4 child processes, avg think time 100 msec Avg event distance 10 msec. Running for 40 sec waits1383 0.635 us (fast) slow waits 147 signal 3924 0.640 us fast path 3791 Paolo #define _GNU_SOURCE 1 #include pthread.h #include limits.h #include string.h #include unistd.h #include stdio.h #include stdlib.h #include time.h #include math.h #ifdef __linux__ #include sys/syscall.h #include linux/futex.h #define HAVE_FUTEX 1 #endif #ifdef _WIN32 #include windows.h #include mmsystem.h #else #define HAVE_AFFINITY 1 #endif typedef struct EvCounter { int ctr; int waiters; int fast_signal; pthread_mutex_t lock; pthread_cond_t cond; } EvCounter; typedef int EvCounterState; void evcounter_get(EvCounterState *state, EvCounter *evcounter) { *state = evcounter-ctr; } #ifdef HAVE_FUTEX #define futex(...) syscall(__NR_futex, __VA_ARGS__) void _evcounter_init(EvCounter *evcounter) { evcounter-ctr = evcounter-waiters = 0; } int _evcounter_wait(EvCounterState *state, EvCounter *evcounter) { int fast = 1; __sync_fetch_and_add(evcounter-waiters, 1); while (*state == evcounter-ctr) { futex(evcounter-ctr, FUTEX_WAIT_PRIVATE, *state, NULL); fast = 0; } __sync_fetch_and_add(evcounter-waiters, -1); *state = evcounter-ctr; return fast; } void _evcounter_signal(EvCounter *evcounter) { __sync_fetch_and_add(evcounter-ctr, 1); if (evcounter-waiters != 0) { futex(evcounter-ctr, FUTEX_WAKE_PRIVATE, INT_MAX); } else { evcounter-fast_signal++; } } #endif void _evcounter_init_cond(EvCounter *evcounter) { evcounter-ctr = 0; pthread_mutex_init (evcounter-lock, NULL); pthread_cond_init (evcounter-cond, NULL); } int _evcounter_wait_cond(EvCounterState *state, EvCounter *evcounter) { int fast = 1; pthread_mutex_lock(evcounter-lock); while (*state == evcounter-ctr) { pthread_cond_wait(evcounter-cond, evcounter-lock); fast = 0; } pthread_mutex_unlock(evcounter-lock); *state = evcounter-ctr; return fast; } void _evcounter_signal_cond(EvCounter *evcounter) { pthread_mutex_lock(evcounter-lock); evcounter-ctr++; pthread_cond_broadcast(evcounter-cond); pthread_mutex_unlock(evcounter-lock); } void (*evcounter_init)(EvCounter *); int (*evcounter_wait)(EvCounterState *, EvCounter *); void (*evcounter_signal)(EvCounter *); EvCounter evc; volatile int stop = 0; int affinity = 0; long t_wait; long n_slow_wait; long n_wait; long t_signal; long n_signal; #define CLOCK_RES (sizeof(long) == 8 ? 10LL : 100LL) static inline long long getclk() { #ifndef _WIN32 struct timespec ts; clock_gettime(CLOCK_MONOTONIC, ts); return (ts.tv_sec * 10LL + ts.tv_nsec) / (10LL / CLOCK_RES); #else static LARGE_INTEGER freq, init; LARGE_INTEGER counter; if (!init.QuadPart) {
[Qemu-devel] [PATCH v2 0/2] mac programming over macvtap
This patchset introduces a QMP event and a monitor command. The event is used to notify management when mac-table configuration is changed by guest. Management can use the new monitor command to query mac-table information, and sync the changes to macvtap devices. There maybe exist an uncontrollable delay, guests normally expect rx-mode updates immediately, but it's another separate issue, we can investigate it after Libvirt work is done. Patches are based on Michael's patchset [1], you can also find patches in my github [2]. [1] [PATCH v3 00/11] qapi: add support for lists of native types [2] https://github.com/kongove/qemu/tree/01-query-mactable v2: add argument to filter mac-table info of single nic (Stefan) update the document add event notification Amos Kong (2): net: introduce MAC_TABLE_CHANGED event net: introduce command to query mac-table information QMP/qmp-events.txt| 14 + hmp-commands.hx | 2 ++ hmp.c | 71 hmp.h | 1 + hw/net/virtio-net.c | 75 +++ include/monitor/monitor.h | 1 + include/net/net.h | 2 ++ monitor.c | 9 ++ net/net.c | 38 qapi-schema.json | 57 +++ qmp-commands.hx | 53 + 11 files changed, 323 insertions(+) -- 1.8.1.4
[Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} + DEVICE_TRAY_MOVED - diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index bed0822..a9b8f53 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -21,6 +21,8 @@ #include hw/virtio/virtio-net.h #include net/vhost_net.h #include hw/virtio/virtio-bus.h +#include qapi/qmp/qjson.h +#include monitor/monitor.h #define VIRTIO_NET_VM_VERSION11 @@ -395,6 +397,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, { uint8_t on; size_t s; +QObject *event_data; s = iov_to_buf(iov, iov_cnt, 0, on, sizeof(on)); if (s != sizeof(on)) { @@ -417,6 +420,10 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } @@ -425,6 +432,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, { struct virtio_net_ctrl_mac mac_data; size_t s; +QObject *event_data; if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { if (iov_size(iov, iov_cnt) != sizeof(n-mac)) { @@ -497,6 +505,10 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 1a6cfcf..e88c70f 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -40,6 +40,7 @@ typedef enum MonitorEvent { QEVENT_BLOCK_JOB_ERROR, QEVENT_BLOCK_JOB_READY, QEVENT_DEVICE_DELETED, +QEVENT_MAC_TABLE_CHANGED, QEVENT_DEVICE_TRAY_MOVED, QEVENT_SUSPEND, QEVENT_SUSPEND_DISK, diff --git a/monitor.c b/monitor.c index 62aaebe..9e51545 100644 --- a/monitor.c +++ b/monitor.c @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { [QEVENT_BLOCK_JOB_ERROR] = BLOCK_JOB_ERROR, [QEVENT_BLOCK_JOB_READY] = BLOCK_JOB_READY, [QEVENT_DEVICE_DELETED] = DEVICE_DELETED, +[QEVENT_MAC_TABLE_CHANGED] = MAC_TABLE_CHANGED, [QEVENT_DEVICE_TRAY_MOVED] = DEVICE_TRAY_MOVED, [QEVENT_SUSPEND] = SUSPEND, [QEVENT_SUSPEND_DISK] = SUSPEND_DISK, -- 1.8.1.4
[Qemu-devel] [PATCH v2 2/2] net: introduce command to query mac-table information
We want to implement mac programming over macvtap through Libvirt. The previous patch adds QMP event to notify management of mac-table change. This patch adds a monitor command to query rx mode information of mac-tables. (qemu) info mac-table vnet0 vnet0: \ promisc: on \ allmulti: off \ alluni: off \ nomulti: off \ nouni: off \ nobcast: off \ multi_overflow: off \ uni_overflow: off \ multicast: 01:00:5e:00:00:01 33:33:00:00:00:01 33:33:ff:12:34:56 Signed-off-by: Amos Kong ak...@redhat.com --- hmp-commands.hx | 2 ++ hmp.c | 71 + hmp.h | 1 + hw/net/virtio-net.c | 63 +++ include/net/net.h | 2 ++ monitor.c | 8 ++ net/net.c | 38 qapi-schema.json| 57 ++ qmp-commands.hx | 53 +++ 9 files changed, 295 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 9cea415..e5c1b14 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1639,6 +1639,8 @@ show qdev device model list show roms @item info tpm show the TPM device +@item info mac-table [net client name] +show the mac-table information for all nics (or for the given nic) @end table ETEXI diff --git a/hmp.c b/hmp.c index 4fb76ec..3e19df0 100644 --- a/hmp.c +++ b/hmp.c @@ -653,6 +653,77 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict) qapi_free_TPMInfoList(info_list); } +void hmp_info_mac_table(Monitor *mon, const QDict *qdict) +{ +MacTableInfoList *table_list, *entry; +strList *str_entry; +bool has_name = qdict_haskey(qdict, name); +const char *name = NULL; + +if (has_name) { +name = qdict_get_str(qdict, name); +} + +table_list = qmp_query_mac_table(has_name, name, NULL); +entry = table_list; + +while (entry) { +monitor_printf(mon, %s:\n, entry-value-name); +if (entry-value-has_promisc) { +monitor_printf(mon, \\ promisc: %s\n, + entry-value-promisc ? on : off); +} +if (entry-value-has_allmulti) { +monitor_printf(mon, \\ allmulti: %s\n, + entry-value-allmulti ? on : off); +} +if (entry-value-has_alluni) { +monitor_printf(mon, \\ alluni: %s\n, + entry-value-alluni ? on : off); +} +if (entry-value-has_nomulti) { +monitor_printf(mon, \\ nomulti: %s\n, + entry-value-nomulti ? on : off); +} +if (entry-value-has_nouni) { +monitor_printf(mon, \\ nouni: %s\n, + entry-value-nouni ? on : off); +} +if (entry-value-has_nobcast) { +monitor_printf(mon, \\ nobcast: %s\n, + entry-value-nobcast ? on : off); +} +if (entry-value-has_multi_overflow) { +monitor_printf(mon, \\ multi_overflow: %s\n, + entry-value-multi_overflow ? on : off); +} +if (entry-value-has_uni_overflow) { +monitor_printf(mon, \\ uni_overflow: %s\n, + entry-value-uni_overflow ? on : off); +} + +if (entry-value-has_unicast) { +str_entry = entry-value-unicast; +monitor_printf(mon, \\ unicast:\n); +while (str_entry) { +monitor_printf(mon, %s\n, str_entry-value); +str_entry = str_entry-next; +} +} +if (entry-value-has_multicast) { +str_entry = entry-value-multicast; +monitor_printf(mon, \\ multicast:\n); +while (str_entry) { +monitor_printf(mon, %s\n, str_entry-value); +str_entry = str_entry-next; +} +} + +entry = entry-next; +} +qapi_free_MacTableInfoList(table_list); +} + void hmp_quit(Monitor *mon, const QDict *qdict) { monitor_suspend(mon); diff --git a/hmp.h b/hmp.h index 95fe76e..278c722 100644 --- a/hmp.h +++ b/hmp.h @@ -37,6 +37,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict); void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); +void hmp_info_mac_table(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict *qdict); diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index a9b8f53..e4b2358 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -194,6 +194,68 @@ static void virtio_net_set_link_status(NetClientState *nc) virtio_net_set_status(vdev, vdev-status); } +static MacTableInfo
[Qemu-devel] Cross-Compiling Qemu for Aarch64?
Hello Everyone, I am currently trying to compile qemu for Aarch64 but so far I haven't been able to configure qemu for this purpose. My first objective is to just configure and cross-compile qemu for Aarch64, which is currently blocked by the qemu's dependency on cross-compiled glib-2.12. For example, when I use the following configure command: ./configure --cross-prefix=aarch64-linux-gnu- --target-list=arm-softmmu --enable-fdt --static I get the following error: ERROR: glib-2.12 required to compile QEMU This is a well-known dependency and the following pages are linked to this issue (directly or indirectly) and I have tried all of them without any success: https://bugs.launchpad.net/linaro-oe/+bug/1097561 http://people.debian.org/~wookey/bootstrap.html http://wiki.xen.org/wiki/Xen_ARM_with_Virtualization_Extensions/CrossCompiling https://wiki.linaro.org/Platform/DevPlatform/CrossCompile/arm64bootstrap I have also tried the latest git repository from John Rigby but I get the same error: https://git.linaro.org/gitweb?p=people/jcrigby/qemu-aarch64.git;a=summary Anyways, I want to know how we can resolve this dependency ? Some of you might suggest to cross-compile the glib-2.12 from sources found at: http://www.linuxfromscratch.org/blfs/view/6.3/general/glib2.html http://www.gtk.org/api/2.6/glib/glib-cross-compiling.html but even this option doesn't work for me as the apparent lacking support for Aarch64 in glib-2.12. I have used the following configure command: ./configure --prefix=/my/prefix/path --host=aarch64-linux-gnu --cache-file=aarch64.cache with the following aarch64.cache file contents: glib_cv_long_long_format=I64 glib_cv_stack_grows=no Any pointers and/or directions for a possible solution will be highly appreciated. Thanks in advance, Hamayun
Re: [Qemu-devel] [PATCH RFC 00/13] qemu: generate acpi tables for the guest
On Mon, May 13, 2013 at 03:38:51PM -0500, Anthony Liguori wrote: I don't think it's a good idea to move BIOS functionality in QEMU. We don't frequently add firmware or chipsets so it seems like we're optimizing for an uncommon scenario here. Regards, Anthony Liguori By the way, on x86 we already load some ACPI tables through fw_cfg: hw/i386/pc_piix.c:pc_acpi_init(acpi-dsdt.aml); hw/i386/pc_q35.c:pc_acpi_init(q35-acpi-dsdt.aml); so putting all other tables in one place will just make things more consistent. Yes, ACPI is messy, and pc-specific, but IMO that's life: it's better to give guests as much help as we can if it's messy, not push the messiness out for guests to handle. -- MST
Re: [Qemu-devel] tap devices not receiving packets from a bridge
On Thu, 2013-05-16 at 11:40 +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 09:20:55AM +0100, Nicholas Thomas wrote: Hi, On Thu, 2013-05-16 at 09:27 +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 09:24:05AM +0300, Michael S. Tsirkin wrote: Is this with or without vhost-net in host? never mind, I see it's without. Try to enable vhost-net (you'll have to switch to -netdev syntax for that to work) and see if this help. If it does it's likely a qemu bug if not probably a guest bug. Switching to -netdev is non-trivial for me, unfortunately. Interesting. Why is that? Our setup is bond0 - vlanX - bridgeX - [ tap devices ] and we do all that outside of qemu at the moment, specifying -net tap,ifname=... - we also run some processes on the TAP interface and insert a bunch of ebtables rules between creating it and starting qemu. Duplicating that with -net bridge seemed close to impossible, and -netdev tap was throwing EBUSY from /dev/net/tun. I guess our external magic should be using ,fd= instead. Anyway, it's definitely a qemu bug - it happens on kernels 3.2 and 3.9 with 1.4.1, but doesn't happen with qemu 0.15.0 or 1.5.0rc1. I'll have a dig through git to see if I can identify the patch that resolves it. It feels-like qemu sometimes stops reading from the tap file descriptor between ipxe exiting and the linux kernel bringing up the network interface, and never recovers from that. /Nick You can try to bisect, yes. Work have decided to accept 1.5.0 when it arrives instead, so I'm afraid I won't be working on this after all. /Nick
Re: [Qemu-devel] Cross-Compiling Qemu for Aarch64?
On 16 May 2013 12:09, Mian M. Hamayun m.hama...@virtualopensystems.com wrote: Hello Everyone, I am currently trying to compile qemu for Aarch64 but so far I haven't been able to configure qemu for this purpose. My first objective is to just configure and cross-compile qemu for Aarch64, which is currently blocked by the qemu's dependency on cross-compiled glib-2.12. For example, when I use the following configure command: ./configure --cross-prefix=aarch64-linux-gnu- --target-list=arm-softmmu --enable-fdt --static I get the following error: ERROR: glib-2.12 required to compile QEMU This is really a crossbuilding for aarch64 rather than a QEMU issue; in general for this kind of thing I suspect you're more likely to find people who know the answer on the Linaro mailing lists, eg linaro-...@lists.linaro.org. You've clearly already found at least some resources, so your best bet for getting this question answered is to make a focused post to linaro-dev saying specifically I tried X (as suggested in url Y) to get a version of glib for aarch64 but it didn't work with error message Z. Anyways, I want to know how we can resolve this dependency ? Some of you might suggest to cross-compile the glib-2.12 When the message says 2.12 it doesn't mean exactly and only that version, it means 2.12 or any later version. You're probably more likely to be successful with something more recent. That said, I wouldn't try a cross-compile completely from scratch from upstream sources personally. You really want to be working with whatever is your cross toolchain's recommended method for getting dependencies and building against them. This can differ a lot depending on how your cross environment is supposed to work. thanks -- PMM
Re: [Qemu-devel] [PATCH 03/12] rcu: add rcu library
On Wed, May 15, 2013 at 05:48:48PM +0200, Paolo Bonzini wrote: +RCU PATTERNS + + +Many patterns using read-writer locks translate directly to RCU, with +the advantages of higher scalability and deadlock immunity. + +In general, RCU can be used whenever it is possible to create a new +version of a data structure every time the updater runs. This may +sound like a very strict restriction, however: + +- the updater does not mean everything that writes to a data structure, + but rather everything that involves a reclamation step. See the + array example below + +- in some cases, creating a new version of a data structure may actually + be very cheap. For example, modifying the next pointer of a singly + linked list is effectively creating a new version of the list. + + +them however are worth noting. ?
Re: [Qemu-devel] QEMU NUMA and memory allocation problem
Il 16/05/2013 11:50, Wanlong Gao ha scritto: To see that this 300M memory is allocated from host node0 again, but not host node1 as I expected. We think that QEMU can't handled this numa memory allocation well, and it will cause the cross node memory access performance regression. Any thoughts? Or, am I missing something? Pinning memory to host NUMA nodes is not implemented. Something like AutoNUMA would be able to balance the memory the right way. Paolo
Re: [Qemu-devel] tap devices not receiving packets from a bridge
On Thu, May 16, 2013 at 12:27:52PM +0100, Nicholas Thomas wrote: On Thu, 2013-05-16 at 11:40 +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 09:20:55AM +0100, Nicholas Thomas wrote: Hi, On Thu, 2013-05-16 at 09:27 +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 09:24:05AM +0300, Michael S. Tsirkin wrote: Is this with or without vhost-net in host? never mind, I see it's without. Try to enable vhost-net (you'll have to switch to -netdev syntax for that to work) and see if this help. If it does it's likely a qemu bug if not probably a guest bug. Switching to -netdev is non-trivial for me, unfortunately. Interesting. Why is that? Our setup is bond0 - vlanX - bridgeX - [ tap devices ] and we do all that outside of qemu at the moment, specifying -net tap,ifname=... - we also run some processes on the TAP interface and insert a bunch of ebtables rules between creating it and starting qemu. Duplicating that with -net bridge seemed close to impossible, and -netdev tap was throwing EBUSY from /dev/net/tun. I guess our external magic should be using ,fd= instead. I'm not sure what's wrong with -netdev tap. You don't have to use fd=, you can specify ifname= with netdev as well. Here's what I use: -net nic,model=virtio,netdev=foo -netdev tap,id=foo,ifname=msttap0,script=/home/mst/ifup,downscript=no,vhost=on the netdev/id pair above is almost the same as vlan=20 in your example, except there's always exactly one frontend and one backend in the netdev case, vlans let you connect more than 2 devices. Anyway, it's definitely a qemu bug - it happens on kernels 3.2 and 3.9 with 1.4.1, but doesn't happen with qemu 0.15.0 or 1.5.0rc1. I'll have a dig through git to see if I can identify the patch that resolves it. It feels-like qemu sometimes stops reading from the tap file descriptor between ipxe exiting and the linux kernel bringing up the network interface, and never recovers from that. /Nick You can try to bisect, yes. Work have decided to accept 1.5.0 when it arrives instead, so I'm afraid I won't be working on this after all. /Nick
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} + It seems clear that if management wants to know about RX filter changes, it also cares about RX mode changes. So what's the plan for RX mode changes? Want to add more events or extend this one? I'd like to see how it all works together. DEVICE_TRAY_MOVED - diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index bed0822..a9b8f53 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -21,6 +21,8 @@ #include hw/virtio/virtio-net.h #include net/vhost_net.h #include hw/virtio/virtio-bus.h +#include qapi/qmp/qjson.h +#include monitor/monitor.h #define VIRTIO_NET_VM_VERSION11 @@ -395,6 +397,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, { uint8_t on; size_t s; +QObject *event_data; s = iov_to_buf(iov, iov_cnt, 0, on, sizeof(on)); if (s != sizeof(on)) { @@ -417,6 +420,10 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } @@ -425,6 +432,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, { struct virtio_net_ctrl_mac mac_data; size_t s; +QObject *event_data; if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { if (iov_size(iov, iov_cnt) != sizeof(n-mac)) { @@ -497,6 +505,10 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 1a6cfcf..e88c70f 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -40,6 +40,7 @@ typedef enum MonitorEvent { QEVENT_BLOCK_JOB_ERROR, QEVENT_BLOCK_JOB_READY, QEVENT_DEVICE_DELETED, +QEVENT_MAC_TABLE_CHANGED, QEVENT_DEVICE_TRAY_MOVED, QEVENT_SUSPEND, QEVENT_SUSPEND_DISK, diff --git a/monitor.c b/monitor.c index 62aaebe..9e51545 100644 --- a/monitor.c +++ b/monitor.c @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { [QEVENT_BLOCK_JOB_ERROR] = BLOCK_JOB_ERROR, [QEVENT_BLOCK_JOB_READY] = BLOCK_JOB_READY, [QEVENT_DEVICE_DELETED] = DEVICE_DELETED, +[QEVENT_MAC_TABLE_CHANGED] = MAC_TABLE_CHANGED, [QEVENT_DEVICE_TRAY_MOVED] = DEVICE_TRAY_MOVED, [QEVENT_SUSPEND] = SUSPEND, [QEVENT_SUSPEND_DISK] = SUSPEND_DISK, -- 1.8.1.4
Re: [Qemu-devel] [PATCH] qemu-io: Fix 'map' output
On Thu, May 16, 2013 at 11:24:01AM +0200, Kevin Wolf wrote: Am 16.05.2013 um 11:14 hat Stefan Hajnoczi geschrieben: On Wed, May 15, 2013 at 01:47:12PM +0200, Kevin Wolf wrote: +static int map_is_allocated(int64_t sector_num, int64_t nb_sectors, int64_t *pnum) +{ +int num, num_checked; +int ret, firstret; + +num_checked = MIN(nb_sectors, INT_MAX); +ret = bdrv_is_allocated(bs, sector_num, num_checked, num); +if (ret 0) { +return ret; +} + +firstret = ret; +*pnum = num; + +while (nb_sectors 0 ret == firstret) { +sector_num += num; +nb_sectors -= num; + +num_checked = MIN(nb_sectors, INT_MAX); +ret = bdrv_is_allocated(bs, sector_num, num_checked, num); +if (ret == firstret) { +*pnum += num; +} else { +break; +} The break makes ret == firstret redundant above. I suggest just while (nb_sectors 0) { ... } which is easier to read. Okay. I wasn't sure which was better. Don't know though how it came that I have both checks now... Also, if you respin the patch please tweak the commit message. Coalesce 'map' output is more specific than Fix 'map' output - unless this really fixes a bug which you didn't mention in the commit description. I'll change the title. It makes different formats behave the same even if they work in different granularities. I think QED was bitten by this in qemu-iotests somwhere because it could give different results than qcow2, possibly also dependent on timing. Maybe I should mention that as well in the commit message. Yes, please. I didn't think of that.
[Qemu-devel] Qemu for ARM and MRS/MSR banked registers instructions
Hello, Did anybody pointed out that there may be problems with Qemu decoding these MRS/MSR banked registers ? In my code, I do several mrs %r0, sp_usr mrs %r0, lr_usr from SVC mode or IRQ mode, and the result I get is CPSR in r0 I took a look in Qemu - translate.c and op_helper.c, and it seem there is a function to access usr/fiq banked registers (get_user_reg) but the call hierarchy for this function is not very clear. About the same goes for MSR banked registers. Is it a known limitation or am I make mistakes in my code ? Thanks François
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} + DEVICE_TRAY_MOVED - diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index bed0822..a9b8f53 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -21,6 +21,8 @@ #include hw/virtio/virtio-net.h #include net/vhost_net.h #include hw/virtio/virtio-bus.h +#include qapi/qmp/qjson.h +#include monitor/monitor.h #define VIRTIO_NET_VM_VERSION11 @@ -395,6 +397,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, { uint8_t on; size_t s; +QObject *event_data; s = iov_to_buf(iov, iov_cnt, 0, on, sizeof(on)); if (s != sizeof(on)) { @@ -417,6 +420,10 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } Sorry, pls ignore my previous mail, I see you actually emit this on rx mode change as well. I find the name misleading or at least it mislead me :) RX_FILTER_CHANGED? @@ -425,6 +432,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, { struct virtio_net_ctrl_mac mac_data; size_t s; +QObject *event_data; if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { if (iov_size(iov, iov_cnt) != sizeof(n-mac)) { @@ -497,6 +505,10 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 1a6cfcf..e88c70f 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -40,6 +40,7 @@ typedef enum MonitorEvent { QEVENT_BLOCK_JOB_ERROR, QEVENT_BLOCK_JOB_READY, QEVENT_DEVICE_DELETED, +QEVENT_MAC_TABLE_CHANGED, QEVENT_DEVICE_TRAY_MOVED, QEVENT_SUSPEND, QEVENT_SUSPEND_DISK, diff --git a/monitor.c b/monitor.c index 62aaebe..9e51545 100644 --- a/monitor.c +++ b/monitor.c @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { [QEVENT_BLOCK_JOB_ERROR] = BLOCK_JOB_ERROR, [QEVENT_BLOCK_JOB_READY] = BLOCK_JOB_READY, [QEVENT_DEVICE_DELETED] = DEVICE_DELETED, +[QEVENT_MAC_TABLE_CHANGED] = MAC_TABLE_CHANGED, [QEVENT_DEVICE_TRAY_MOVED] = DEVICE_TRAY_MOVED, [QEVENT_SUSPEND] = SUSPEND, [QEVENT_SUSPEND_DISK] = SUSPEND_DISK, -- 1.8.1.4
Re: [Qemu-devel] [PATCH 7/7] block: dump to monitor for bdrv_snapshot_dump() and bdrv_image_info_dump()
On Thu, 16 May 2013 10:22:09 +0800 Wenchao Xia xiaw...@linux.vnet.ibm.com wrote: 于 2013-5-15 20:28, Luiz Capitulino 写道: On Wed, 15 May 2013 10:10:37 +0800 Wenchao Xia xiaw...@linux.vnet.ibm.com wrote: 于 2013-5-6 21:22, Luiz Capitulino 写道: On Mon, 06 May 2013 10:09:43 +0800 Wenchao Xia xiaw...@linux.vnet.ibm.com wrote: 于 2013-5-3 10:51, Wenchao Xia 写道: 于 2013-5-2 20:02, Luiz Capitulino 写道: On Thu, 02 May 2013 10:05:08 +0800 Wenchao Xia xiaw...@linux.vnet.ibm.com wrote: 于 2013-4-30 3:05, Luiz Capitulino 写道: On Fri, 26 Apr 2013 16:46:57 +0200 Stefan Hajnoczi stefa...@gmail.com wrote: On Fri, Apr 26, 2013 at 05:31:15PM +0800, Wenchao Xia wrote: @@ -2586,10 +2585,12 @@ void do_info_snapshots(Monitor *mon, const QDict *qdict) } if (total 0) { -monitor_printf(mon, %s\n, bdrv_snapshot_dump(buf, sizeof(buf), NULL)); +bdrv_snapshot_dump(NULL); +monitor_printf(mon, \n); Luiz: any issue with mixing monitor_printf(mon) and monitor_vprintf(cur_mon) calls? I guess there was a reason for explicitly passing mon instead of relying on cur_mon. where are they being mixed? bdrv_snapshot_dump() used a global variable cur_mon inside, instead of let caller pass in a explicit montior* mon, I guess that is the question. I'd have to see the code to tell, but yes, what Stefan described is the best practice for the Monitor. I think this would not be a problem until qemu wants more than one human monitor console, and then we may require a data structure to tell where to output the string: stdout, *mon, or even stderr, and error_printf() also need to be changed. Luiz, what is your idea? I'd like to respin v2 if no issues for it. As I said before, I'd have to see the code to tell. But answering your comment, the code does support multiple monitors. Hi Luiz, Sorry to ask again, do you think method above is OK now, waiting for your confirm. Can you point me to the code in question? Sure, it is + +/* + * Print to current monitor if we have one, else to stdout. It is similar with + * error_printf(). + * TODO just like error_vprintf() + */ +void message_printf(const char *fmt, ...) +{ +va_list ap; + +va_start(ap, fmt); +if (cur_mon) { +monitor_vprintf(cur_mon, fmt, ap); +} else { +vfprintf(stdout, fmt, ap); +} +va_end(ap); +} This function used global variable cur_mon instead of input parameter, similar to error_printf(). Why do you need it? Why can't you you use error_printf() for example?
Re: [Qemu-devel] [PATCH v2 2/2] net: introduce command to query mac-table information
On Thu, May 16, 2013 at 07:07:25PM +0800, Amos Kong wrote: We want to implement mac programming over macvtap through Libvirt. The previous patch adds QMP event to notify management of mac-table change. This patch adds a monitor command to query rx mode information of mac-tables. (qemu) info mac-table vnet0 vnet0: \ promisc: on \ allmulti: off \ alluni: off \ nomulti: off \ nouni: off \ nobcast: off \ multi_overflow: off \ uni_overflow: off \ multicast: 01:00:5e:00:00:01 33:33:00:00:00:01 33:33:ff:12:34:56 Signed-off-by: Amos Kong ak...@redhat.com It's an interesting example. There are no unicast addresses at all? I'm guessing you are missing the main MAC. --- hmp-commands.hx | 2 ++ hmp.c | 71 + hmp.h | 1 + hw/net/virtio-net.c | 63 +++ include/net/net.h | 2 ++ monitor.c | 8 ++ net/net.c | 38 qapi-schema.json| 57 ++ qmp-commands.hx | 53 +++ 9 files changed, 295 insertions(+) diff --git a/hmp-commands.hx b/hmp-commands.hx index 9cea415..e5c1b14 100644 --- a/hmp-commands.hx +++ b/hmp-commands.hx @@ -1639,6 +1639,8 @@ show qdev device model list show roms @item info tpm show the TPM device +@item info mac-table [net client name] +show the mac-table information for all nics (or for the given nic) @end table ETEXI diff --git a/hmp.c b/hmp.c index 4fb76ec..3e19df0 100644 --- a/hmp.c +++ b/hmp.c @@ -653,6 +653,77 @@ void hmp_info_tpm(Monitor *mon, const QDict *qdict) qapi_free_TPMInfoList(info_list); } +void hmp_info_mac_table(Monitor *mon, const QDict *qdict) +{ +MacTableInfoList *table_list, *entry; +strList *str_entry; +bool has_name = qdict_haskey(qdict, name); +const char *name = NULL; + +if (has_name) { +name = qdict_get_str(qdict, name); +} + +table_list = qmp_query_mac_table(has_name, name, NULL); +entry = table_list; + +while (entry) { +monitor_printf(mon, %s:\n, entry-value-name); +if (entry-value-has_promisc) { +monitor_printf(mon, \\ promisc: %s\n, + entry-value-promisc ? on : off); +} +if (entry-value-has_allmulti) { +monitor_printf(mon, \\ allmulti: %s\n, + entry-value-allmulti ? on : off); +} +if (entry-value-has_alluni) { +monitor_printf(mon, \\ alluni: %s\n, + entry-value-alluni ? on : off); +} +if (entry-value-has_nomulti) { +monitor_printf(mon, \\ nomulti: %s\n, + entry-value-nomulti ? on : off); +} +if (entry-value-has_nouni) { +monitor_printf(mon, \\ nouni: %s\n, + entry-value-nouni ? on : off); +} +if (entry-value-has_nobcast) { +monitor_printf(mon, \\ nobcast: %s\n, + entry-value-nobcast ? on : off); +} +if (entry-value-has_multi_overflow) { +monitor_printf(mon, \\ multi_overflow: %s\n, + entry-value-multi_overflow ? on : off); +} +if (entry-value-has_uni_overflow) { +monitor_printf(mon, \\ uni_overflow: %s\n, + entry-value-uni_overflow ? on : off); +} + +if (entry-value-has_unicast) { +str_entry = entry-value-unicast; +monitor_printf(mon, \\ unicast:\n); +while (str_entry) { +monitor_printf(mon, %s\n, str_entry-value); +str_entry = str_entry-next; +} +} +if (entry-value-has_multicast) { +str_entry = entry-value-multicast; +monitor_printf(mon, \\ multicast:\n); +while (str_entry) { +monitor_printf(mon, %s\n, str_entry-value); +str_entry = str_entry-next; +} +} + +entry = entry-next; +} +qapi_free_MacTableInfoList(table_list); +} + void hmp_quit(Monitor *mon, const QDict *qdict) { monitor_suspend(mon); diff --git a/hmp.h b/hmp.h index 95fe76e..278c722 100644 --- a/hmp.h +++ b/hmp.h @@ -37,6 +37,7 @@ void hmp_info_balloon(Monitor *mon, const QDict *qdict); void hmp_info_pci(Monitor *mon, const QDict *qdict); void hmp_info_block_jobs(Monitor *mon, const QDict *qdict); void hmp_info_tpm(Monitor *mon, const QDict *qdict); +void hmp_info_mac_table(Monitor *mon, const QDict *qdict); void hmp_quit(Monitor *mon, const QDict *qdict); void hmp_stop(Monitor *mon, const QDict *qdict); void hmp_system_reset(Monitor *mon, const QDict
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, 16 May 2013 15:17:45 +0300 Michael S. Tsirkin m...@redhat.com wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} + DEVICE_TRAY_MOVED - diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index bed0822..a9b8f53 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -21,6 +21,8 @@ #include hw/virtio/virtio-net.h #include net/vhost_net.h #include hw/virtio/virtio-bus.h +#include qapi/qmp/qjson.h +#include monitor/monitor.h #define VIRTIO_NET_VM_VERSION11 @@ -395,6 +397,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, { uint8_t on; size_t s; +QObject *event_data; s = iov_to_buf(iov, iov_cnt, 0, on, sizeof(on)); if (s != sizeof(on)) { @@ -417,6 +420,10 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } Sorry, pls ignore my previous mail, I see you actually emit this on rx mode change as well. I find the name misleading or at least it mislead me :) RX_FILTER_CHANGED? @@ -425,6 +432,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, { struct virtio_net_ctrl_mac mac_data; size_t s; +QObject *event_data; if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { if (iov_size(iov, iov_cnt) != sizeof(n-mac)) { @@ -497,6 +505,10 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? We have an API for that, look at monitor_protocol_event_init(). diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 1a6cfcf..e88c70f 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -40,6 +40,7 @@ typedef enum MonitorEvent { QEVENT_BLOCK_JOB_ERROR, QEVENT_BLOCK_JOB_READY, QEVENT_DEVICE_DELETED, +QEVENT_MAC_TABLE_CHANGED, QEVENT_DEVICE_TRAY_MOVED, QEVENT_SUSPEND, QEVENT_SUSPEND_DISK, diff --git a/monitor.c b/monitor.c index 62aaebe..9e51545 100644 --- a/monitor.c +++ b/monitor.c @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { [QEVENT_BLOCK_JOB_ERROR] = BLOCK_JOB_ERROR, [QEVENT_BLOCK_JOB_READY] = BLOCK_JOB_READY, [QEVENT_DEVICE_DELETED] = DEVICE_DELETED, +[QEVENT_MAC_TABLE_CHANGED] = MAC_TABLE_CHANGED, [QEVENT_DEVICE_TRAY_MOVED] = DEVICE_TRAY_MOVED, [QEVENT_SUSPEND] = SUSPEND, [QEVENT_SUSPEND_DISK] = SUSPEND_DISK, -- 1.8.1.4
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On Thu, May 16, 2013 at 05:11:59PM +0800, Lei Li wrote: On 05/16/2013 03:35 PM, Amos Kong wrote: On Thu, May 16, 2013 at 03:23:21PM +0800, Lei Li wrote: On 05/16/2013 12:30 PM, Amos Kong wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; Hi Lei, Does this check allow to distinguish the difference between auto-repeat and actual repeated entry by the user? Actual repeat by user: press event release event press event release event press event release event Auto-repeat example: press event press event press event release event Hi Lei, On what platform? Fedora 18 @ thinkpad t430s [root@t430s amos]# showkey (hold 'a') akeycode 30 press aaakeycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press aakeycode 30 press keycode 30 press keycode 30 release (one release event in the end) Qemu VM (rhel6, using vnc/ SDL) can get same result. AFAIK, the Auto-repeat event is like below on some GTK-based environments, keydown keypress keyup keydown keypress keyup| ... as reference link: https://developer.mozilla.org/zh-CN/docs/DOM/KeyboardEvent = Auto-repeat handling (it's also mentioned in mozilla page) When a key is pressed and held down, it begins to auto-repeat. This results in a sequence of events similar to the following being dispatched: keydown keypress keydown keypress repeating until the user releases the key keyup (only one up event in the end) And on Xwindows: keypress keyrelease keypress keyrelease ... as reference link: http://www.ypass.net/blog/2009/06/detecting-xlibs-keyboard-auto-repeat-functionality-and-how-to-fix-it/ Just what we’d expect, a bunch of KeyPress Events and one KeyRelease event. But that’s not how it works in X. So pppR is expected. From: http://www.computer-engineering.org/ps2keyboard/ When you press and hold down a key, that key becomes typematic, which means the keyboard will _keep_ sending that key's _make code_ until the key is released or another key is pressed. (repeatedly send the make code, only one break code in the end) This would cause it's hard to distinguish them. But looks like the links above is a little out of time, and I am not sure if the auto-repeat behaviour on such platforms has been changed. :) | Both of them works, but the repeated release events are redundant (no risk). -- Amos.
Re: [Qemu-devel] Qemu for ARM and MRS/MSR banked registers instructions
On 16 May 2013 13:15, François Legal francois.le...@thom.fr.eu.org wrote: Did anybody pointed out that there may be problems with Qemu decoding these MRS/MSR banked registers ? In my code, I do several mrs %r0, sp_usr mrs %r0, lr_usr from SVC mode or IRQ mode, and the result I get is CPSR in r0 I took a look in Qemu - translate.c and op_helper.c, and it seem there is a function to access usr/fiq banked registers (get_user_reg) but the call hierarchy for this function is not very clear. The MSR/MRS to/from banked register instructions are for ARMv7VE only (ie only cores with the Virtualization Extensions). Although QEMU implements a Cortex-A15 model, our A15 model doesn't support virtualization, and so we don't implement these virtualization only instructions. It's a bug that we don't UNDEF on them, but QEMU's instruction decoding has generally been rather looser than it should be. thanks -- PMM
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, May 16, 2013 at 08:24:03AM -0400, Luiz Capitulino wrote: On Thu, 16 May 2013 15:17:45 +0300 Michael S. Tsirkin m...@redhat.com wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} + DEVICE_TRAY_MOVED - diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index bed0822..a9b8f53 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -21,6 +21,8 @@ #include hw/virtio/virtio-net.h #include net/vhost_net.h #include hw/virtio/virtio-bus.h +#include qapi/qmp/qjson.h +#include monitor/monitor.h #define VIRTIO_NET_VM_VERSION11 @@ -395,6 +397,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, { uint8_t on; size_t s; +QObject *event_data; s = iov_to_buf(iov, iov_cnt, 0, on, sizeof(on)); if (s != sizeof(on)) { @@ -417,6 +420,10 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } Sorry, pls ignore my previous mail, I see you actually emit this on rx mode change as well. I find the name misleading or at least it mislead me :) RX_FILTER_CHANGED? @@ -425,6 +432,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, { struct virtio_net_ctrl_mac mac_data; size_t s; +QObject *event_data; if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { if (iov_size(iov, iov_cnt) != sizeof(n-mac)) { @@ -497,6 +505,10 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? We have an API for that, look at monitor_protocol_event_init(). You mean monitor_protocol_event_throttle? So what happens if guest triggers more MAC changes per second? diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 1a6cfcf..e88c70f 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -40,6 +40,7 @@ typedef enum MonitorEvent { QEVENT_BLOCK_JOB_ERROR, QEVENT_BLOCK_JOB_READY, QEVENT_DEVICE_DELETED, +QEVENT_MAC_TABLE_CHANGED, QEVENT_DEVICE_TRAY_MOVED, QEVENT_SUSPEND, QEVENT_SUSPEND_DISK, diff --git a/monitor.c b/monitor.c index 62aaebe..9e51545 100644 --- a/monitor.c +++ b/monitor.c @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { [QEVENT_BLOCK_JOB_ERROR] = BLOCK_JOB_ERROR, [QEVENT_BLOCK_JOB_READY] = BLOCK_JOB_READY, [QEVENT_DEVICE_DELETED] = DEVICE_DELETED, +[QEVENT_MAC_TABLE_CHANGED] = MAC_TABLE_CHANGED, [QEVENT_DEVICE_TRAY_MOVED] = DEVICE_TRAY_MOVED, [QEVENT_SUSPEND] = SUSPEND, [QEVENT_SUSPEND_DISK] = SUSPEND_DISK, -- 1.8.1.4
Re: [Qemu-devel] [PATCH] ide-test: Fix endianness problems
Applied. Thanks. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH for-1.5 0/2] virtio-net: fix netclient id and type.
Applied. Thanks. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH for-1.5 0/3] hw/pci-host/versatile: Fix issues with newer kernels
Applied. Thanks. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH] vnc: Make ledstate comparison before modifiers updated
Applied. Thanks. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH for-1.5] configure: Detect uuid on MacOSX (fixes compile failure)
Applied. Thanks. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH for-1.5 0/2]
Applied. Thanks. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, 16 May 2013 15:45:17 +0300 Michael S. Tsirkin m...@redhat.com wrote: On Thu, May 16, 2013 at 08:24:03AM -0400, Luiz Capitulino wrote: On Thu, 16 May 2013 15:17:45 +0300 Michael S. Tsirkin m...@redhat.com wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} + DEVICE_TRAY_MOVED - diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c index bed0822..a9b8f53 100644 --- a/hw/net/virtio-net.c +++ b/hw/net/virtio-net.c @@ -21,6 +21,8 @@ #include hw/virtio/virtio-net.h #include net/vhost_net.h #include hw/virtio/virtio-bus.h +#include qapi/qmp/qjson.h +#include monitor/monitor.h #define VIRTIO_NET_VM_VERSION11 @@ -395,6 +397,7 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, { uint8_t on; size_t s; +QObject *event_data; s = iov_to_buf(iov, iov_cnt, 0, on, sizeof(on)); if (s != sizeof(on)) { @@ -417,6 +420,10 @@ static int virtio_net_handle_rx_mode(VirtIONet *n, uint8_t cmd, return VIRTIO_NET_ERR; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } Sorry, pls ignore my previous mail, I see you actually emit this on rx mode change as well. I find the name misleading or at least it mislead me :) RX_FILTER_CHANGED? @@ -425,6 +432,7 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, { struct virtio_net_ctrl_mac mac_data; size_t s; +QObject *event_data; if (cmd == VIRTIO_NET_CTRL_MAC_ADDR_SET) { if (iov_size(iov, iov_cnt) != sizeof(n-mac)) { @@ -497,6 +505,10 @@ static int virtio_net_handle_mac(VirtIONet *n, uint8_t cmd, n-mac_table.multi_overflow = 1; } +event_data = qobject_from_jsonf({ 'name': %s }, n-netclient_name); +monitor_protocol_event(QEVENT_MAC_TABLE_CHANGED, event_data); +qobject_decref(event_data); + return VIRTIO_NET_OK; } This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? We have an API for that, look at monitor_protocol_event_init(). You mean monitor_protocol_event_throttle? So what happens if guest triggers more MAC changes per second? The QMP client will receive the newest one. diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h index 1a6cfcf..e88c70f 100644 --- a/include/monitor/monitor.h +++ b/include/monitor/monitor.h @@ -40,6 +40,7 @@ typedef enum MonitorEvent { QEVENT_BLOCK_JOB_ERROR, QEVENT_BLOCK_JOB_READY, QEVENT_DEVICE_DELETED, +QEVENT_MAC_TABLE_CHANGED, QEVENT_DEVICE_TRAY_MOVED, QEVENT_SUSPEND, QEVENT_SUSPEND_DISK, diff --git a/monitor.c b/monitor.c index 62aaebe..9e51545 100644 --- a/monitor.c +++ b/monitor.c @@ -490,6 +490,7 @@ static const char *monitor_event_names[] = { [QEVENT_BLOCK_JOB_ERROR] = BLOCK_JOB_ERROR, [QEVENT_BLOCK_JOB_READY] = BLOCK_JOB_READY, [QEVENT_DEVICE_DELETED] = DEVICE_DELETED, +[QEVENT_MAC_TABLE_CHANGED] = MAC_TABLE_CHANGED, [QEVENT_DEVICE_TRAY_MOVED] = DEVICE_TRAY_MOVED, [QEVENT_SUSPEND] = SUSPEND, [QEVENT_SUSPEND_DISK] = SUSPEND_DISK, -- 1.8.1.4
Re: [Qemu-devel] [PATCH buildfix for-1.5] qemu-common: Resolve vector build breakes for AltiVec
Applied. Thanks. Regards, Anthony Liguori
Re: [Qemu-devel] [RFC] block-trace Low Level Command Supporting Disk Introspection
[...] From my point of view, what I'm missing here is how would I use it. Ideally I'd like to issue some QMP commands which would set up the point-in-time snapshot, and then connect to this snapshot over (eg) NBD, then when I'm done, send some more QMP commands to tear down the snapshot. I think this document would be better with one or more examples showing how this would be used. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming blog: http://rwmj.wordpress.com Fedora now supports 80 OCaml packages (the OPEN alternative to F#)
Re: [Qemu-devel] [ANNOUNCE] QEMU 1.5.0-rc2 is now available
On Wed, May 15, 2013 at 06:53:47PM -0500, Anthony Liguori wrote: Hi, On behalf of the QEMU Team, I'd like to announce the availability of the third release candidate for the QEMU 1.5 release. This release is meant for testing purposes and should not be used in a production environment. http://wiki.qemu.org/download/qemu-1.5.0-rc2.tar.bz2 You can help improve the quality of the QEMU 1.5 release by testing this release and reporting bugs on Launchpad: Sorry to chime in on this so late in the cycle, but I just noticed what seems to be a pretty serious problem with migration to/from 1.4. This is the failure for 1.4 - 1.5-rc2 (qemu) migrate unix:/tmp/migrate.sock Unknown savevm section or instance ':00:03.0/virtio-net' 0 Configuration: source: v14/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L v14-bios -M pc-i440fx-1.4 -m 512M -kernel boot/vmlinuz-x86_64 -initrd boot/test-initramfs-x86_64.img.gz -vga std -append seed=1234 -drive file=disk1.img,if=virtio -drive file=disk2.img,if=virtio -net nic,model=virtio -net user -monitor unix:/tmp/vm-hmp.sock,server,nowait -qmp unix:/tmp/vm-qmp.sock,server,nowait -vnc :100 target: v15rc2/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L temp-bios -M pc-i440fx-1.4 -m 512M -kernel boot/vmlinuz-x86_64 -initrd boot/test-initramfs-x86_64.img.gz -vga std -append seed=1234 -drive file=disk1.img,if=virtio -drive file=disk2.img,if=virtio -net nic,model=virtio -net user -incoming unix:/tmp/migrate.sock -monitor unix:/tmp/vm-hmp-incoming.sock,server,nowait -qmp unix:/tmp/vm-qmp-incoming.sock,server,nowait -vnc :101 QEMU 1.4.0 monitor - type 'help' for more information This seems to have been introduced with the virtio refactoring: commit e37da3945fa2fde161e1b217f937fc318c4b7639 Author: KONRAD Frederic fred.kon...@greensocs.com Date: Thu Apr 11 16:29:58 2013 +0200 virtio-net-pci: switch to the new API. Here the virtio-net-pci is modified for the new API. The device virtio-net-pci extends virtio-pci. It creates and connects a virtio-net-device during the init. The properties are not changed. Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Tested-by: Cornelia Huck cornelia.h...@de.ibm.com Message-id: 1365690602-22729-4-git-send-email-fred.kon...@greensocs.com Signed-off-by: Anthony Liguori aligu...@us.ibm.com And if we roll that back, we have similar failures for virtio-blk, and most likely the other virtio devices touched by the refactoring. The issue seems to be a change the way section id strings are generated in vmstate_register(). In v1.4.x we had: se-instance_id: 0, se-idstr: :00:03.0/virtio-net In v1.5.0-rc2 we have: se-instance_id: 0, se-idstr: virtio-net This seems to be due to the fact that these devices now sit on a TYPE_VIRTIO_BUS that has no implementation of TYPE_BUS's get_dev_path() interface, which is what savevm uses to calculate the id prefix for se-idstr. Prior to the refactoring, the device sat on a TYPE_PCI_BUS which used pcibus_get_dev_path() to calculate this. I'm not sure what the best fix is for this. I looking at implementing get_dev_path() for TYPE_VIRTIO_BUS, but to maintain migration compatibility we'd end up baking in PCI-specific stuff which from what I gather is exactly what we were trying to avoid there. I think adding a compat string property to TYPE_VIRTIO_DEVICE and having that get set somewhere like virtio_bus_plug_device() is a better approach, but vmstate_register() gets call during TYPE_VIRTIO_DEVICE init which I think happens before then. Still looking at it but if someone more familiar with this code has some ideas or wants to whip up a patch please jump right in. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH 3/3] tcg/aarch64: implement new TCG target for aarch64
On 14.05.2013 17:19, Richard Henderson wrote: On 05/14/2013 05:25 AM, Peter Maydell wrote: Yes, I agree. I could not find an image which triggered that code path for register rotation amounts. Try PPC : rlwmn will generate a rotl (as will other insns). rlwmn will only generate constant rotations; at issue are variable rotations. Those ought be be generatable with current sources and x86 32-bit or 64-bit rotate insns though. That cleanup was done during this release cycle, so if Claudio's testing was on a previous release... Indeed, I was able to test that codepath today after rebasing on current QEMU. We are working on a new patchset that tries to incorporate the changes discussed up to now. Thanks, Claudio
Re: [Qemu-devel] [ANNOUNCE] QEMU 1.5.0-rc2 is now available
Il 16/05/2013 16:21, mdroth ha scritto: commit e37da3945fa2fde161e1b217f937fc318c4b7639 Author: KONRAD Frederic fred.kon...@greensocs.com Date: Thu Apr 11 16:29:58 2013 +0200 virtio-net-pci: switch to the new API. Here the virtio-net-pci is modified for the new API. The device virtio-net-pci extends virtio-pci. It creates and connects a virtio-net-device during the init. The properties are not changed. Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Tested-by: Cornelia Huck cornelia.h...@de.ibm.com Message-id: 1365690602-22729-4-git-send-email-fred.kon...@greensocs.com Signed-off-by: Anthony Liguori aligu...@us.ibm.com And if we roll that back, we have similar failures for virtio-blk, and most likely the other virtio devices touched by the refactoring. The issue seems to be a change the way section id strings are generated in vmstate_register(). In v1.4.x we had: se-instance_id: 0, se-idstr: :00:03.0/virtio-net In v1.5.0-rc2 we have: se-instance_id: 0, se-idstr: virtio-net This seems to be due to the fact that these devices now sit on a TYPE_VIRTIO_BUS that has no implementation of TYPE_BUS's get_dev_path() interface, which is what savevm uses to calculate the id prefix for se-idstr. Prior to the refactoring, the device sat on a TYPE_PCI_BUS which used pcibus_get_dev_path() to calculate this. I'm not sure what the best fix is for this. I looking at implementing get_dev_path() for TYPE_VIRTIO_BUS, but to maintain migration compatibility we'd end up baking in PCI-specific stuff which from what I gather is exactly what we were trying to avoid there. I think get_dev_path for TYPE_VIRTIO_BUS could simply forward to the parent device's parent bus. Paolo I think adding a compat string property to TYPE_VIRTIO_DEVICE and having that get set somewhere like virtio_bus_plug_device() is a better approach, but vmstate_register() gets call during TYPE_VIRTIO_DEVICE init which I think happens before then. Still looking at it but if someone more familiar with this code has some ideas or wants to whip up a patch please jump right in. Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On 05/16/2013 05:07 AM, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} Is it worth trying to also provide details about the change as part of the event, to avoid having to do a round-trip query- command just to learn what the new values are? -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On 05/16/2013 06:17 AM, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? Or use rate-limiting, similar to what we have done for other guest-triggered events (such as BALLOON_CHANGE), where management can then tweak the maximum frequency at which it is willing to receive events. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, May 16, 2013 at 08:56:42AM -0600, Eric Blake wrote: On 05/16/2013 05:07 AM, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. Signed-off-by: Amos Kong ak...@redhat.com --- QMP/qmp-events.txt| 14 ++ hw/net/virtio-net.c | 12 include/monitor/monitor.h | 1 + monitor.c | 1 + 4 files changed, 28 insertions(+) diff --git a/QMP/qmp-events.txt b/QMP/qmp-events.txt index 92fe5fb..24d62df 100644 --- a/QMP/qmp-events.txt +++ b/QMP/qmp-events.txt @@ -154,6 +154,20 @@ Data: path: /machine/peripheral/virtio-net-pci-0 }, timestamp: { seconds: 1265044230, microseconds: 450486 } } +MAC_TABLE_CHANGED +- + +Emitted mac-table configuration is changed by the guest. + +Data: + +- name: net client name (json-string) + +{ event: MAC_TABLE_CHANGED, + data: { name: vnet0 }, + timestamp: { seconds: 1368697518, microseconds: 326866 }} +} Is it worth trying to also provide details about the change as part of the event, to avoid having to do a round-trip query- command just to learn what the new values are? -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org That really depends on the device though. Some give you incremental add/delete mac commands, others might let you replace the whole rx filter in one go. So if yes I'd say we should dump the whole table not what changed. -- MST
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
Il 16/05/2013 06:30, Amos Kong ha scritto: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com --- hw/input/ps2.c | 42 ++ 1 file changed, 42 insertions(+) diff --git a/hw/input/ps2.c b/hw/input/ps2.c index 3412079..1cfe055 100644 --- a/hw/input/ps2.c +++ b/hw/input/ps2.c @@ -94,6 +94,9 @@ typedef struct { int translate; int scancode_set; /* 1=XT, 2=AT, 3=PS/2 */ int ledstate; +int repeat_period; /* typematic period, ms */ +int repeat_delay; /* typematic delay, ms */ These have to be migrated in a subsection. +int repeat_key; /* keycode to repeat */ } PS2KbdState; typedef struct { @@ -146,6 +149,22 @@ void ps2_queue(void *opaque, int b) s-update_irq(s-update_arg, 1); } +static QEMUTimer *repeat_timer; In theory, this timer and repeat_key also need to be migrated. But perhaps not migrating anything is fine. Migration will then stop the autorepeat, which is not a particularly bad thing and may even avoid stuck keys. However, please add a comment, and move the timer into Ps2KbdState instead of having a global. +static bool auto_repeat; See below about removing this other global. +static void repeat_ps2_queue(void *opaque) +{ +PS2KbdState *s = opaque; + +if (auto_repeat) { +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_period, + 1000)); +ps2_queue(s-common, s-repeat_key); +} +} + + /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; ps2_queue(s-common, keycode); + +if (auto_repeat) { +s-repeat_key = keycode; +/* delay a while before first repeat */ +qemu_mod_timer(repeat_timer, qemu_get_clock_ns(vm_clock) + + muldiv64(get_ticks_per_sec(), s-repeat_delay, 1000)); +} Please del_timer here so you can remove the if above. Paolo } uint32_t ps2_read_data(void *opaque) @@ -213,6 +242,11 @@ static void ps2_reset_keyboard(PS2KbdState *s) void ps2_write_keyboard(void *opaque, int val) { +/* repeat period/delay table from kernel (drivers/input/keyboard/atkbd.c) */ +const short period[32] = { 33, 37, 42, 46, 50, 54, 58, 63, 67, 75, +83, 92, 100, 109, 116, 125, 133, 149, 167, 182, 200, 217, 232, +250, 270, 303, 333, 370, 400, 435, 470, 500 }; +const short delay[4] = { 250, 500, 750, 1000 }; PS2KbdState *s = (PS2KbdState *)opaque; switch(s-common.write_cmd) { @@ -288,6 +322,11 @@ void ps2_write_keyboard(void *opaque, int val) s-common.write_cmd = -1; break; case KBD_CMD_SET_RATE: + /* Bit0-4 specifies the repeat rate */ +s-repeat_period = period[val 0x1f]; + /* Bit5-6 bit specifies the delay time */ +s-repeat_delay = delay[(val 0x60) 5]; + ps2_queue(s-common, KBD_REPLY_ACK); s-common.write_cmd = -1; break; @@ -536,6 +575,8 @@ static void ps2_kbd_reset(void *opaque) s-scan_enabled = 0; s-translate = 0; s-scancode_set = 0; +s-repeat_period = 92; +s-repeat_delay = 500; } static void ps2_mouse_reset(void *opaque) @@ -660,6 +701,7 @@ void *ps2_kbd_init(void (*update_irq)(void *, int), void *update_arg) vmstate_register(NULL, 0, vmstate_ps2_keyboard, s); qemu_add_kbd_event_handler(ps2_put_keycode, s); qemu_register_reset(ps2_kbd_reset, s); +repeat_timer = qemu_new_timer_ns(vm_clock, repeat_ps2_queue, s); return s; }
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, May 16, 2013 at 08:58:38AM -0600, Eric Blake wrote: On 05/16/2013 06:17 AM, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? Or use rate-limiting, similar to what we have done for other guest-triggered events (such as BALLOON_CHANGE), where management can then tweak the maximum frequency at which it is willing to receive events. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org I'm not sure how would management set the rate though, and any throttling here might hurt the guest, unlike the balloon. OTOH what I proposed kind of moderates itself automatically. -- MST
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
Il 16/05/2013 08:58, Amos Kong ha scritto: theoretically, we have to check if the typematic key is in break now, if so, we will not do repeat anymore. You mean key is released? I checked by '~keycode 0x80', stop repeat when key is release. BTW, !(keycode 0x80) is more readable. Paolo
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, May 16, 2013 at 06:03:26PM +0300, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 08:58:38AM -0600, Eric Blake wrote: On 05/16/2013 06:17 AM, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? Or use rate-limiting, similar to what we have done for other guest-triggered events (such as BALLOON_CHANGE), where management can then tweak the maximum frequency at which it is willing to receive events. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org I'm not sure how would management set the rate though, and any throttling here might hurt the guest, unlike the balloon. OTOH what I proposed kind of moderates itself automatically. To clarify the issue: - guest might be changing macs a lot not because it is malicious, but because it has a reason to do it. delaying the filter update for such a guest would drop lots of packets. To clarify what I am proposing: - on info mac-table - clear flag - on mac-table change - test and set flag if was not set - send event to management if was set - do not send event This way management does not get events faster than it can handle them. -- MST
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
Il 16/05/2013 22:37, Amos Kong ha scritto: On Thu, May 16, 2013 at 05:11:59PM +0800, Lei Li wrote: On 05/16/2013 03:35 PM, Amos Kong wrote: On Thu, May 16, 2013 at 03:23:21PM +0800, Lei Li wrote: On 05/16/2013 12:30 PM, Amos Kong wrote: Guest driver sets repeat rate and delay time by KBD_CMD_SET_RATE, but ps2 backend doesn't process it and no auto-repeat implementation. This patch adds support of auto-repeat feature. Guest ps2 driver sets autorepeat to fastest possible in reset, period: 250ms, delay: 33ms Tested by 'sendkey' monitor command. referenced: http://www.computer-engineering.org/ps2keyboard/ Signed-off-by: Amos Kong ak...@redhat.com /* keycode is expressed as follow: bit 7- 0 key pressed, 1 = key released @@ -167,7 +186,17 @@ static void ps2_put_keycode(void *opaque, int keycode) keycode = ps2_raw_keycode_set3[keycode 0x7f]; } } + +/* only auto-repeat press event */ +auto_repeat = ~keycode 0x80; Hi Lei, Does this check allow to distinguish the difference between auto-repeat and actual repeated entry by the user? Actual repeat by user: press event release event press event release event press event release event Auto-repeat example: press event press event press event release event Hi Lei, On what platform? Fedora 18 @ thinkpad t430s [root@t430s amos]# showkey (hold 'a') akeycode 30 press aaakeycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press keycode 30 press aakeycode 30 press keycode 30 press keycode 30 release (one release event in the end) Qemu VM (rhel6, using vnc/ SDL) can get same result. AFAIK, the Auto-repeat event is like below on some GTK-based environments, keydown keypress keyup keydown keypress keyup| ... as reference link: https://developer.mozilla.org/zh-CN/docs/DOM/KeyboardEvent = Auto-repeat handling (it's also mentioned in mozilla page) When a key is pressed and held down, it begins to auto-repeat. This results in a sequence of events similar to the following being dispatched: keydown keypress keydown keypress repeating until the user releases the key keyup (only one up event in the end) And on Xwindows: keypress keyrelease keypress keyrelease ... as reference link: http://www.ypass.net/blog/2009/06/detecting-xlibs-keyboard-auto-repeat-functionality-and-how-to-fix-it/ Just what we’d expect, a bunch of KeyPress Events and one KeyRelease event. But that’s not how it works in X. ... In XWindows, you get a KeyRelease for every KeyPress Event. In X, it looks something like this: PRPRPRPRPRPRPRPR Can you test your patch with all of VNC, SDL and GTK+? Paolo
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On 05/16/2013 09:03 AM, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 08:58:38AM -0600, Eric Blake wrote: On 05/16/2013 06:17 AM, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? Or use rate-limiting, similar to what we have done for other guest-triggered events (such as BALLOON_CHANGE), where management can then tweak the maximum frequency at which it is willing to receive events. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org I'm not sure how would management set the rate though, and any throttling here might hurt the guest, unlike the balloon. If I understand how memballoon throttling works, throttling is NOT guest-visible; it merely controls how frequently the guest can trigger an event to the host. The host always gets the latest guest status, but only after a timeout has occurred since the last event sent (therefore, 2 back-to-back changes mean that the second event isn't sent until the timeout elapses; 3 back-to-back means that the 2nd is dropped and only the first and third changes get sent, with the 3rd waiting until after the timeout). That is, not all changes reach the host, the first change always happens immediately, but subsequent changes may be deferred until the timeout elapses, but the host will eventually see the final change, and no slower than the frequency it configures for the throttling. Or are you are arguing that if the guest makes a request, but the host waits until a second has elapsed before it even gets the event to act on the request, then the guest has suffered a performance loss? OTOH what I proposed kind of moderates itself automatically. Your approach (no more events until the host has acknowleged) has a potential problem if the host misses the event (because of a libvirtd restart, for example - but then again, on a libvirtd restart, libvirt should probably query current state to get itself back in sync); and also means that the host sees stale event data if subsequent events were squelched because the host hasn't reacted to the first event yet. The existing throttling approach ensures that if the event includes latest guest information, then the host doesn't even have to do do a query, and is guaranteed that reacting to the final event will always see the most recent request. But most importantly, if the existing throttling works, why do we have to invent a one-off approach for this event instead of reusing existing code? -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On Thu, May 16, 2013 at 09:12:36AM -0600, Eric Blake wrote: On 05/16/2013 09:03 AM, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 08:58:38AM -0600, Eric Blake wrote: On 05/16/2013 06:17 AM, Michael S. Tsirkin wrote: On Thu, May 16, 2013 at 07:07:24PM +0800, Amos Kong wrote: Introduce this new QMP event to notify management after guest changes mac-table configuration. This makes it easy for guest to flood management with spurious events. How about we set a flag after this, and avoid sending any more events until management queries the filter status? Or use rate-limiting, similar to what we have done for other guest-triggered events (such as BALLOON_CHANGE), where management can then tweak the maximum frequency at which it is willing to receive events. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org I'm not sure how would management set the rate though, and any throttling here might hurt the guest, unlike the balloon. If I understand how memballoon throttling works, throttling is NOT guest-visible; it merely controls how frequently the guest can trigger an event to the host. The host always gets the latest guest status, but only after a timeout has occurred since the last event sent (therefore, 2 back-to-back changes mean that the second event isn't sent until the timeout elapses; 3 back-to-back means that the 2nd is dropped and only the first and third changes get sent, with the 3rd waiting until after the timeout). That is, not all changes reach the host, the first change always happens immediately, but subsequent changes may be deferred until the timeout elapses, but the host will eventually see the final change, and no slower than the frequency it configures for the throttling. Or are you are arguing that if the guest makes a request, but the host waits until a second has elapsed before it even gets the event to act on the request, then the guest has suffered a performance loss? Yes, that's what I'm saying. OTOH what I proposed kind of moderates itself automatically. Your approach (no more events until the host has acknowleged) has a potential problem if the host misses the event (because of a libvirtd restart, for example - but then again, on a libvirtd restart, libvirt should probably query current state to get itself back in sync); exactly and also means that the host sees stale event data if subsequent events were squelched because the host hasn't reacted to the first event yet. So, let's not send any data in the event. Amos's patch does exafctly that. The existing throttling approach ensures that if the event includes latest guest information, then the host doesn't even have to do do a query, and is guaranteed that reacting to the final event will always see the most recent request. But most importantly, if the existing throttling works, why do we have to invent a one-off approach for this event instead of reusing existing code? Because of the 1st issue above. A large delay because we exceed an arbitrary throttling rate would be bad for the guest. Contrast with delay in e.g. device delete event. The throttling mechanism is good for events that host cares about, not for events that guest cares about. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org
Re: [Qemu-devel] [ANNOUNCE] QEMU 1.5.0-rc2 is now available
On 16/05/2013 16:21, mdroth wrote: On Wed, May 15, 2013 at 06:53:47PM -0500, Anthony Liguori wrote: Hi, On behalf of the QEMU Team, I'd like to announce the availability of the third release candidate for the QEMU 1.5 release. This release is meant for testing purposes and should not be used in a production environment. http://wiki.qemu.org/download/qemu-1.5.0-rc2.tar.bz2 You can help improve the quality of the QEMU 1.5 release by testing this release and reporting bugs on Launchpad: Sorry to chime in on this so late in the cycle, but I just noticed what seems to be a pretty serious problem with migration to/from 1.4. This is the failure for 1.4 - 1.5-rc2 (qemu) migrate unix:/tmp/migrate.sock Unknown savevm section or instance ':00:03.0/virtio-net' 0 Configuration: source: v14/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L v14-bios -M pc-i440fx-1.4 -m 512M -kernel boot/vmlinuz-x86_64 -initrd boot/test-initramfs-x86_64.img.gz -vga std -append seed=1234 -drive file=disk1.img,if=virtio -drive file=disk2.img,if=virtio -net nic,model=virtio -net user -monitor unix:/tmp/vm-hmp.sock,server,nowait -qmp unix:/tmp/vm-qmp.sock,server,nowait -vnc :100 target: v15rc2/x86_64-softmmu/qemu-system-x86_64 -enable-kvm -L temp-bios -M pc-i440fx-1.4 -m 512M -kernel boot/vmlinuz-x86_64 -initrd boot/test-initramfs-x86_64.img.gz -vga std -append seed=1234 -drive file=disk1.img,if=virtio -drive file=disk2.img,if=virtio -net nic,model=virtio -net user -incoming unix:/tmp/migrate.sock -monitor unix:/tmp/vm-hmp-incoming.sock,server,nowait -qmp unix:/tmp/vm-qmp-incoming.sock,server,nowait -vnc :101 QEMU 1.4.0 monitor - type 'help' for more information This seems to have been introduced with the virtio refactoring: commit e37da3945fa2fde161e1b217f937fc318c4b7639 Author: KONRAD Frederic fred.kon...@greensocs.com Date: Thu Apr 11 16:29:58 2013 +0200 virtio-net-pci: switch to the new API. Here the virtio-net-pci is modified for the new API. The device virtio-net-pci extends virtio-pci. It creates and connects a virtio-net-device during the init. The properties are not changed. Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com Tested-by: Cornelia Huck cornelia.h...@de.ibm.com Message-id: 1365690602-22729-4-git-send-email-fred.kon...@greensocs.com Signed-off-by: Anthony Liguori aligu...@us.ibm.com And if we roll that back, we have similar failures for virtio-blk, and most likely the other virtio devices touched by the refactoring. The issue seems to be a change the way section id strings are generated in vmstate_register(). In v1.4.x we had: se-instance_id: 0, se-idstr: :00:03.0/virtio-net In v1.5.0-rc2 we have: se-instance_id: 0, se-idstr: virtio-net This seems to be due to the fact that these devices now sit on a TYPE_VIRTIO_BUS that has no implementation of TYPE_BUS's get_dev_path() interface, which is what savevm uses to calculate the id prefix for se-idstr. Prior to the refactoring, the device sat on a TYPE_PCI_BUS which used pcibus_get_dev_path() to calculate this. I'm not sure what the best fix is for this. I looking at implementing get_dev_path() for TYPE_VIRTIO_BUS, but to maintain migration compatibility we'd end up baking in PCI-specific stuff which from what I gather is exactly what we were trying to avoid there. I think adding a compat string property to TYPE_VIRTIO_DEVICE and having that get set somewhere like virtio_bus_plug_device() is a better approach, but vmstate_register() gets call during TYPE_VIRTIO_DEVICE init which I think happens before then. Still looking at it but if someone more familiar with this code has some ideas or wants to whip up a patch please jump right in. Sorry for that. Have you made progress? I'm trying to add get_dev_path function to virtio-pci-bus in virtio-pci.c as Paolo suggests. How do you get those instance_id to check It's working? Fred Regards, Anthony Liguori
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
On 16 May 2013 16:09, Paolo Bonzini pbonz...@redhat.com wrote: ... In XWindows, you get a KeyRelease for every KeyPress Event. In X, it looks something like this: PRPRPRPRPRPRPRPR Shouldn't we be abstracting this platform difference out in the ui layer, rather than having to deal with it in the ps2 device model? That is, we should define what our key-repeat model is for the QEMU keyboard-event-handler API, and then make sure all our UI frontends (gtk, sdl, cocoa, etc) do what we require... thanks -- PMM
Re: [Qemu-devel] [PATCH v2 1/2] net: introduce MAC_TABLE_CHANGED event
On 05/16/2013 09:17 AM, Michael S. Tsirkin wrote: The existing throttling approach ensures that if the event includes latest guest information, then the host doesn't even have to do do a query, and is guaranteed that reacting to the final event will always see the most recent request. But most importantly, if the existing throttling works, why do we have to invent a one-off approach for this event instead of reusing existing code? Because of the 1st issue above. A large delay because we exceed an arbitrary throttling rate would be bad for the guest. Contrast with delay in e.g. device delete event. The throttling mechanism is good for events that host cares about, not for events that guest cares about. Alright, your argument has me convinced :) Looks like we DO want to react to the guest as fast as possible, for less missed traffic in the guest, but also without overwhelming the host with events. -- Eric Blake eblake redhat com+1-919-301-3266 Libvirt virtualization library http://libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [PATCH] ps2: add support of auto-repeat
Il 16/05/2013 17:17, Peter Maydell ha scritto: On 16 May 2013 16:09, Paolo Bonzini pbonz...@redhat.com wrote: ... In XWindows, you get a KeyRelease for every KeyPress Event. In X, it looks something like this: PRPRPRPRPRPRPRPR Shouldn't we be abstracting this platform difference out in the ui layer, rather than having to deal with it in the ps2 device model? That is, we should define what our key-repeat model is for the QEMU keyboard-event-handler API, and then make sure all our UI frontends (gtk, sdl, cocoa, etc) do what we require... Yes, I am asking Amos to check which of our frontends comply. It needs to be checked in the host, because Linux guests emulate autorepeat anyway. Or you can test with FreeDOS. Paolo