Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On Sat, Jul 18, 2015 at 10:21:28PM +0200, Paolo Bonzini wrote:
> The change from v2 is very small: notify_me is decremented as soon as
> ppoll returns.
>
> Paolo
>
> v1->v2  Split some changes to the tests to a separate patch
>         Fix commit message [Laszlo]
>         Clarify do...while loop in aio-win32.c [Kevin]
>
> v2->v3  Decrement notify_me a little earlier in aio-posix.c.
>
> Paolo Bonzini (3):
>   tests: remove irrelevant assertions from test-aio
>   aio-win32: reorganize polling loop
>   AioContext: fix broken ctx->dispatching optimization
>
>  aio-posix.c             | 18 +---
>  aio-win32.c             | 41 ++
>  async.c                 | 21 +-
>  docs/aio_notify.promela | 77 +
>  include/block/aio.h     | 29 +++
>  tests/test-aio.c        | 26 +++--
>  6 files changed, 97 insertions(+), 115 deletions(-)

Thanks, applied to my block tree:

https://github.com/stefanha/qemu/commits/block

Stefan
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
I tested this patch series, plus 'AioContext: fix missing wakeups due to
event_notifier_test_and_clear', overnight, and there were no failures
after something like 2500 iterations.  So:

Tested-by: Richard W.M. Jones rjo...@redhat.com

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
[Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
The change from v2 is very small: notify_me is decremented as soon as
ppoll returns.

Paolo

v1->v2  Split some changes to the tests to a separate patch
        Fix commit message [Laszlo]
        Clarify do...while loop in aio-win32.c [Kevin]

v2->v3  Decrement notify_me a little earlier in aio-posix.c.

Paolo Bonzini (3):
  tests: remove irrelevant assertions from test-aio
  aio-win32: reorganize polling loop
  AioContext: fix broken ctx->dispatching optimization

 aio-posix.c             | 18 +---
 aio-win32.c             | 41 ++
 async.c                 | 21 +-
 docs/aio_notify.promela | 77 +
 include/block/aio.h     | 29 +++
 tests/test-aio.c        | 26 +++--
 6 files changed, 97 insertions(+), 115 deletions(-)

-- 
2.4.3
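The mechanism named in the changelog - notify_me, decremented as soon as
ppoll returns - can be modelled outside QEMU in a few dozen lines.  The
sketch below is a simplification under stated assumptions, not QEMU's
aio-posix.c: the Toy* names are invented, a raw Linux eventfd stands in
for EventNotifier, C11 atomics stand in for QEMU's atomic helpers, and
the even/odd bit used on the GSource path is omitted.

    /* Toy model of the notify_me handshake; a hedged sketch, not
     * QEMU's code.  Requires Linux (eventfd) and C11 atomics. */
    #include <poll.h>
    #include <stdatomic.h>
    #include <stdint.h>
    #include <sys/eventfd.h>
    #include <unistd.h>

    typedef struct {
        atomic_int notify_me;  /* nonzero while a poller may be sleeping */
        int efd;               /* the "event notifier": an eventfd */
    } ToyCtx;

    /* Producer side: only pay for a wakeup syscall when somebody might
     * actually be blocked in poll(). */
    static void toy_notify(ToyCtx *ctx)
    {
        if (atomic_load(&ctx->notify_me)) {
            uint64_t one = 1;
            (void)write(ctx->efd, &one, sizeof(one));
        }
    }

    /* Event-loop side. */
    static int toy_poll_blocking(ToyCtx *ctx)
    {
        struct pollfd pfd = { .fd = ctx->efd, .events = POLLIN };
        int ret;

        atomic_fetch_add(&ctx->notify_me, 2);  /* "I am about to sleep" */
        ret = poll(&pfd, 1, -1);
        /* The v2->v3 change in miniature: stop advertising as soon as
         * the poll syscall returns, before any dispatching. */
        atomic_fetch_sub(&ctx->notify_me, 2);
        return ret;
    }

    int main(void)
    {
        ToyCtx ctx = { .efd = eventfd(0, 0) };

        toy_notify(&ctx);                      /* skipped: no sleeper  */
        atomic_fetch_add(&ctx.notify_me, 2);
        toy_notify(&ctx);                      /* eventfd written      */
        atomic_fetch_sub(&ctx.notify_me, 2);
        return toy_poll_blocking(&ctx) == 1 ? 0 : 1;  /* wakes at once */
    }

The point of the protocol is visible in toy_notify(): when no poller has
advertised itself, scheduling work costs no syscall at all, and the
v2->v3 tweak shrinks the window during which notifiers still pay for a
poller that has already woken up.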
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On 17/07/2015 15:28, Marc Zyngier wrote:
>> Marc, does it ring any bell?
>
> Well, this is an example of a guest accessing non-memory using an
> instruction that we cannot safely emulate - not an IO accessor (load
> multiple, for example). In this case, we kill the guest (we could as
> well inject a fault).

Yup, I later found this in the dmesg:

[1875219.903969] kvm [14843]: load/store instruction decoding not implemented

> This vcpu seems to be accessing 0x9000018 (the mmio structure points
> there), but I can't immediately relate it to the content of the
> registers.

0x9000018 is the pl011, which makes some sense.  Have you ever seen a
corrupted register dump?  The guest RAM goes from 0x40000000 to
0xbfffffff, so SP is pointing outside memory.

PC=bf671210 SP=c1f0
X00=0a003e70 X01= X02=bf680548 X03=0030
X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a
X08=bf716858 X09=0090 X10= X11=0046
X12=a007e03a X13= X14= X15=
X16=bf716df0 X17= X18= X19=bf6f5f18
X20= X21= X22= X23=
X24= X25= X26= X27=
X28= X29= X30=
PSTATE=6305 (flags -ZC-)

> If the register dump is not corrupted, execution went in the weeds...

I don't have the guest anymore but it's just firmware so the memory
contents are well reproducible:

0xbf671200:  a97d6ffa   ldmdbge  sp!, {r1, r3, r4, r5, r6, r7, r8, r9, sl, fp, sp, lr}^
0xbf671204:  a97e77fc   ldmdbge  lr!, {r2, r3, r4, r5, r6, r7, r8, r9, sl, ip, sp, lr}^
0xbf671208:  f85f03fe   undefined instruction 0xf85f03fe
0xbf67120c:  910803ff   strdls   r0, [r8, -pc]
0xbf671210:  ad7007e0   ldclge   7, cr0, [r0, #-896]!
0xbf671214:  ad710fe2   ldclge   15, cr0, [r1, #-904]!
0xbf671218:  ad7217e4   ldclge   7, cr1, [r2, #-912]!
0xbf67121c:  ad731fe6   ldclge   15, cr1, [r3, #-920]!
0xbf671220:  ad7427e8   ldclge   7, cr2, [r4, #-928]!
0xbf671224:  ad752fea   ldclge   15, cr2, [r5, #-936]!

> What looks a bit weird is that all the registers are clamped to 32bit,
> but the PSTATE indicates it is running in 64bit (EL1h, which makes the
> PC being utterly wrong).

The PC can be okay since UEFI code doesn't really use virtual
addressing, but the other registers are weird indeed.

> What are you running there?

Just firmware.  It's a UEFI reboot loop (as soon as you get to the UEFI
shell QEMU exits and the script starts a new one).  The kernel is an old
3.19 one, I'll upgrade and retest.

Paolo
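For readers following the arithmetic: the decimal constants in the
kvm_run dump quoted later in the thread line up with Paolo's reading of
the fault address.  A reader's decode, not part of the thread; the only
outside facts used are KVM_EXIT_UNKNOWN == 0 from <linux/kvm.h> and the
PL011 UART layout on QEMU's aarch64 "virt" board.

    /* Decode of the numbers in the dump: 150994968 is mmio.phys_addr. */
    #include <stdio.h>

    int main(void)
    {
        unsigned long phys_addr = 150994968;   /* from the kvm_run dump */

        /* exit_reason = 0 in the dump is KVM_EXIT_UNKNOWN. */
        printf("phys_addr    = 0x%lx\n", phys_addr);              /* 0x9000018 */
        printf("PL011 offset = 0x%lx\n", phys_addr - 0x9000000UL); /* 0x18 = UARTFR */
        printf("mmio.len = 4, is_write = 0: a 32-bit read\n");
        return 0;
    }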
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On Fri, 17 Jul 2015 15:04:27 +0100
Paolo Bonzini pbonz...@redhat.com wrote:

> On 17/07/2015 15:28, Marc Zyngier wrote:
>>> Marc, does it ring any bell?
>>
>> Well, this is an example of a guest accessing non-memory using an
>> instruction that we cannot safely emulate - not an IO accessor (load
>> multiple, for example). In this case, we kill the guest (we could as
>> well inject a fault).
>
> Yup, I later found this in the dmesg:
>
> [1875219.903969] kvm [14843]: load/store instruction decoding not implemented
>
>> This vcpu seems to be accessing 0x9000018 (the mmio structure points
>> there), but I can't immediately relate it to the content of the
>> registers.
>
> 0x9000018 is the pl011, which makes some sense.  Have you ever seen a
> corrupted register dump?  The guest RAM goes from 0x40000000 to
> 0xbfffffff, so SP is pointing outside memory.

I've never seen such a corruption - so far.

> PC=bf671210 SP=c1f0
> X00=0a003e70 X01= X02=bf680548 X03=0030
> X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a
> X08=bf716858 X09=0090 X10= X11=0046
> X12=a007e03a X13= X14= X15=
> X16=bf716df0 X17= X18= X19=bf6f5f18
> X20= X21= X22= X23=
> X24= X25= X26= X27=
> X28= X29= X30=
> PSTATE=6305 (flags -ZC-)
>
>> If the register dump is not corrupted, execution went in the weeds...
>
> I don't have the guest anymore but it's just firmware so the memory
> contents are well reproducible:
>
> 0xbf671200:  a97d6ffa   ldmdbge  sp!, {r1, r3, r4, r5, r6, r7, r8, r9, sl, fp, sp, lr}^
> 0xbf671204:  a97e77fc   ldmdbge  lr!, {r2, r3, r4, r5, r6, r7, r8, r9, sl, ip, sp, lr}^
> 0xbf671208:  f85f03fe   undefined instruction 0xf85f03fe
> 0xbf67120c:  910803ff   strdls   r0, [r8, -pc]
> 0xbf671210:  ad7007e0   ldclge   7, cr0, [r0, #-896]!
> 0xbf671214:  ad710fe2   ldclge   15, cr0, [r1, #-904]!
> 0xbf671218:  ad7217e4   ldclge   7, cr1, [r2, #-912]!
> 0xbf67121c:  ad731fe6   ldclge   15, cr1, [r3, #-920]!
> 0xbf671220:  ad7427e8   ldclge   7, cr2, [r4, #-928]!
> 0xbf671224:  ad752fea   ldclge   15, cr2, [r5, #-936]!

But that's all 32bit code, and your guest was running in 64bit. What
does it look like as A64?

>> What looks a bit weird is that all the registers are clamped to
>> 32bit, but the PSTATE indicates it is running in 64bit (EL1h, which
>> makes the PC being utterly wrong).
>
> The PC can be okay since UEFI code doesn't really use virtual
> addressing, but the other registers are weird indeed.

It definitely looks like something tramped on your register file.  KVM
doesn't do that at all (we use the whole AArch64 register file
irrespective of the execution state).

Is your UEFI guest 32 or 64bit?

        M.
-- 
Jazz is not dead. It just smells funny.
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On Fri, Jul 17, 2015 at 02:48:40PM +0100, Marc Zyngier wrote:
> Still: there is nothing in the registers that remotely points to that
> area. X0 is the closest, but it'd take a big negative offset to get
> there.
>
> Is that a Linux kernel? or something else?

You're sure it's not this one?
https://bugzilla.redhat.com/show_bug.cgi?id=1194366

That was caused by ftrace screwing up guest memory, so it was
effectively running random code.  It is also fixed (by you in fact).

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On Fri, 17 Jul 2015 14:53:06 +0100
Richard W.M. Jones rjo...@redhat.com wrote:

> On Fri, Jul 17, 2015 at 02:48:40PM +0100, Marc Zyngier wrote:
>> Still: there is nothing in the registers that remotely points to that
>> area. X0 is the closest, but it'd take a big negative offset to get
>> there.
>>
>> Is that a Linux kernel? or something else?
>
> You're sure it's not this one?
> https://bugzilla.redhat.com/show_bug.cgi?id=1194366
>
> That was caused by ftrace screwing up guest memory, so it was
> effectively running random code.  It is also fixed (by you in fact).

Don't think so.  The bug you quote was the guest kernel being buggy,
and touching non-memory space.  This new issue seems different - this
is not a Linux kernel, by the look of it.

        M.
-- 
Jazz is not dead. It just smells funny.
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On Fri, Jul 17, 2015 at 11:30:38AM +0200, Paolo Bonzini wrote:
> error: kvm run failed Function not implemented
> PC=bf671210 SP=c1f0
> X00=0a003e70 X01= X02=bf680548 X03=0030
> X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a
> X08=bf716858 X09=0090 X10= X11=0046
> X12=a007e03a X13= X14= X15=
> X16=bf716df0 X17= X18= X19=bf6f5f18
> X20= X21= X22= X23=
> X24= X25= X26= X27=
> X28= X29= X30=
> PSTATE=6305 (flags -ZC-)

Vaguely reminiscent of this bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1194366
(See comment 7 in particular)

> diff --git a/aio-posix.c b/aio-posix.c
> index 268d14d..d2011d0 100644
> --- a/aio-posix.c
> +++ b/aio-posix.c
> @@ -273,6 +273,13 @@ bool aio_poll(AioContext *ctx, bool blocking)
>          aio_context_acquire(ctx);
>      }
>  
> +    /* This should be optimized... */
> +    event_notifier_test_and_clear(&ctx->notifier);
> +
> +    if (blocking) {
> +        atomic_sub(&ctx->notify_me, 2);
> +    }
> +
>      /* if we have any readable fds, dispatch event */
>      if (ret > 0) {
>          for (i = 0; i < npfd; i++) {
> @@ -283,10 +290,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
>      npfd = 0;
>      ctx->walking_handlers--;
>  
> -    if (blocking) {
> -        atomic_sub(&ctx->notify_me, 2);
> -    }
> -
>      /* Run dispatch even if there were no readable fds to run timers */
>      if (aio_dispatch(ctx)) {
>          progress = true;
> diff --git a/aio-win32.c b/aio-win32.c
> index 2bfd5f8..33809fd 100644
> --- a/aio-win32.c
> +++ b/aio-win32.c
> @@ -326,6 +326,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
>          if (timeout) {
>              aio_context_acquire(ctx);
>          }
> +
> +        /* This should be optimized... */
> +        event_notifier_test_and_clear(&ctx->notifier);
> +
>          if (blocking) {
>              assert(first);
>              atomic_sub(&ctx->notify_me, 2);
> diff --git a/async.c b/async.c
> index 9204907..120e183 100644
> --- a/async.c
> +++ b/async.c
> @@ -202,6 +202,9 @@ aio_ctx_check(GSource *source)
>      AioContext *ctx = (AioContext *) source;
>      QEMUBH *bh;
>  
> +    /* This should be optimized... */
> +    event_notifier_test_and_clear(&ctx->notifier);
> +
>      atomic_and(&ctx->notify_me, ~1);
>      for (bh = ctx->first_bh; bh; bh = bh->next) {
>          if (!bh->deleted && bh->scheduled) {
> @@ -280,6 +280,10 @@ static void aio_rfifolock_cb(void *opaque)
>      aio_notify(opaque);
>  }
>  
> +static void event_notifier_dummy_cb(EventNotifier *e)
> +{
> +}
> +
>  AioContext *aio_context_new(Error **errp)
>  {
>      int ret;
> @@ -292,7 +296,7 @@ AioContext *aio_context_new(Error **errp)
>          return NULL;
>      }
>      g_source_set_can_recurse(&ctx->source, true);
> -    aio_set_event_notifier(ctx, &ctx->notifier, NULL);
> +    aio_set_event_notifier(ctx, &ctx->notifier, event_notifier_dummy_cb);
>      ctx->thread_pool = NULL;
>      qemu_mutex_init(&ctx->bh_lock);
>      rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);

With this patch, I've got to 500 iterations without seeing the error.
Still testing it however ...

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine.  Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
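Part of why the placement of those calls is so delicate: EventNotifier
is an eventfd on Linux, and a "test and clear" is a read that drains
every pending wakeup at once, not just one.  A hedged sketch of that
semantic (the helper name and the EFD_NONBLOCK flag are assumptions; the
drain-everything behaviour is how eventfd(2) defines a read):

    #include <stdbool.h>
    #include <stdint.h>
    #include <unistd.h>

    /* Sketch only: one read consumes *all* prior event_notifier_set()
     * calls, so a clear executed at the wrong moment can swallow a
     * wakeup raised for work that nothing has looked at yet. */
    static bool sketch_test_and_clear(int efd /* eventfd, EFD_NONBLOCK */)
    {
        uint64_t count;

        /* false (EAGAIN) if no set happened; true after draining the
         * whole counter in one go. */
        return read(efd, &count, sizeof(count)) == (ssize_t)sizeof(count);
    }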
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On 17/07/2015 14:58, Richard W.M. Jones wrote:
> On Fri, Jul 17, 2015 at 11:30:38AM +0200, Paolo Bonzini wrote:
>> error: kvm run failed Function not implemented
>> PC=bf671210 SP=c1f0
>> X00=0a003e70 X01= X02=bf680548 X03=0030
>> X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a
>> X08=bf716858 X09=0090 X10= X11=0046
>> X12=a007e03a X13= X14= X15=
>> X16=bf716df0 X17= X18= X19=bf6f5f18
>> X20= X21= X22= X23=
>> X24= X25= X26= X27=
>> X28= X29= X30=
>> PSTATE=6305 (flags -ZC-)
>
> Vaguely reminiscent of this bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1194366
> (See comment 7 in particular)

Must be it, I was using an old kernel.  Thanks!

Paolo

>> diff --git a/aio-posix.c b/aio-posix.c
>> index 268d14d..d2011d0 100644
>> --- a/aio-posix.c
>> +++ b/aio-posix.c
>> @@ -273,6 +273,13 @@ bool aio_poll(AioContext *ctx, bool blocking)
>>          aio_context_acquire(ctx);
>>      }
>>  
>> +    /* This should be optimized... */
>> +    event_notifier_test_and_clear(&ctx->notifier);
>> +
>> +    if (blocking) {
>> +        atomic_sub(&ctx->notify_me, 2);
>> +    }
>> +
>>      /* if we have any readable fds, dispatch event */
>>      if (ret > 0) {
>>          for (i = 0; i < npfd; i++) {
>> @@ -283,10 +290,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
>>      npfd = 0;
>>      ctx->walking_handlers--;
>>  
>> -    if (blocking) {
>> -        atomic_sub(&ctx->notify_me, 2);
>> -    }
>> -
>>      /* Run dispatch even if there were no readable fds to run timers */
>>      if (aio_dispatch(ctx)) {
>>          progress = true;
>> diff --git a/aio-win32.c b/aio-win32.c
>> index 2bfd5f8..33809fd 100644
>> --- a/aio-win32.c
>> +++ b/aio-win32.c
>> @@ -326,6 +326,10 @@ bool aio_poll(AioContext *ctx, bool blocking)
>>          if (timeout) {
>>              aio_context_acquire(ctx);
>>          }
>> +
>> +        /* This should be optimized... */
>> +        event_notifier_test_and_clear(&ctx->notifier);
>> +
>>          if (blocking) {
>>              assert(first);
>>              atomic_sub(&ctx->notify_me, 2);
>> diff --git a/async.c b/async.c
>> index 9204907..120e183 100644
>> --- a/async.c
>> +++ b/async.c
>> @@ -202,6 +202,9 @@ aio_ctx_check(GSource *source)
>>      AioContext *ctx = (AioContext *) source;
>>      QEMUBH *bh;
>>  
>> +    /* This should be optimized... */
>> +    event_notifier_test_and_clear(&ctx->notifier);
>> +
>>      atomic_and(&ctx->notify_me, ~1);
>>      for (bh = ctx->first_bh; bh; bh = bh->next) {
>>          if (!bh->deleted && bh->scheduled) {
>> @@ -280,6 +280,10 @@ static void aio_rfifolock_cb(void *opaque)
>>      aio_notify(opaque);
>>  }
>>  
>> +static void event_notifier_dummy_cb(EventNotifier *e)
>> +{
>> +}
>> +
>>  AioContext *aio_context_new(Error **errp)
>>  {
>>      int ret;
>> @@ -292,7 +296,7 @@ AioContext *aio_context_new(Error **errp)
>>          return NULL;
>>      }
>>      g_source_set_can_recurse(&ctx->source, true);
>> -    aio_set_event_notifier(ctx, &ctx->notifier, NULL);
>> +    aio_set_event_notifier(ctx, &ctx->notifier, event_notifier_dummy_cb);
>>      ctx->thread_pool = NULL;
>>      qemu_mutex_init(&ctx->bh_lock);
>>      rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
>
> With this patch, I've got to 500 iterations without seeing the error.
> Still testing it however ...
>
> Rich.
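A side note on the async.c hunk above: swapping NULL for
event_notifier_dummy_cb is not cosmetic.  Assuming aio_set_fd_handler's
long-standing QEMU convention that all-NULL handlers deregister an fd,
an empty but non-NULL callback is what keeps ctx->notifier in the polled
set now that the actual clearing happens in aio_poll()/aio_ctx_check()
themselves.  A sketch of the idiom, with the EventNotifier type stubbed
so the fragment stands alone:

    /* Stub type standing in for QEMU's EventNotifier. */
    typedef struct EventNotifier EventNotifier;

    static void event_notifier_dummy_cb(EventNotifier *e)
    {
        /* Intentionally empty: the explicit
         * event_notifier_test_and_clear() calls do the draining;
         * registering this callback merely keeps the fd polled. */
        (void)e;
    }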
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On 17/07/2015 06:44, Paolo Bonzini wrote:
> On 16/07/2015 21:05, Richard W.M. Jones wrote:
>> Sorry to spoil things, but I'm still seeing this bug, although it is
>> now a lot less frequent with your patch.  I would estimate it happens
>> more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
>> with qemu.git + the v2 patch series.  It's the exact same hang in
>> both cases.  Is it possible that this patch doesn't completely close
>> any race?  Still, it is an improvement, so there is that.
>
> Would seem at first glance like a different bug.  Interestingly,
> adding some tracing (qemu_clock_get_ns) makes the bug more likely: now
> it reproduces in about 10 tries.  Of course :) adding other kinds of
> tracing instead make it go away again (50 tries).
>
> Perhaps this:
>
>    i/o thread         vcpu thread                   worker thread
>    ---------------------------------------------------------------
>    lock_iothread
>    notify_me = 1
>    ...
>    unlock_iothread
>                       lock_iothread
>                       notify_me = 3
>                       ppoll
>                       notify_me = 1
>                                                     bh->scheduled = 1
>                                                     event_notifier_set
>                       event_notifier_test_and_clear
>    ppoll
>    ^^ hang
>
> In the exact shape above, it doesn't seem too likely to happen, but
> perhaps there's another simpler case.  Still, the bug exists.
>
> The above is not really related to notify_me.  Here the notification
> is not being optimized away!  So I wonder if this one has been there
> forever.
>
> Fam suggested putting the event_notifier_test_and_clear before
> aio_bh_poll(), but it does not work.  I'll look more closely.
>
> However, an unconditional event_notifier_test_and_clear is pretty
> expensive.  On one hand, obviously correctness comes first.  On the
> other hand, an expensive operation at the wrong place can mask the
> race very easily; I'll let the fix run for a while, but I'm not sure
> if a successful test really says anything useful.

So it may not be useful, but still successful test is successful. :)

The following patch, which also includes the delta between v2 and v3 of
this series, survived 674 reboots before hitting a definitely unrelated
problem:

  error: kvm run failed Function not implemented
  PC=bf671210 SP=c1f0
  X00=0a003e70 X01= X02=bf680548 X03=0030
  X04=bbb5fc18 X05=004b7770 X06=bf721930 X07=009a
  X08=bf716858 X09=0090 X10= X11=0046
  X12=a007e03a X13= X14= X15=
  X16=bf716df0 X17= X18= X19=bf6f5f18
  X20= X21= X22= X23=
  X24= X25= X26= X27=
  X28= X29= X30=
  PSTATE=6305 (flags -ZC-)

For the record, this is the kvm_run struct:

  $6 = {request_interrupt_window = 0 '\000', padding1 = "\000\000\000\000\000\000",
    exit_reason = 0, ready_for_interrupt_injection = 0 '\000',
    if_flag = 0 '\000', flags = 0, cr8 = 0, apic_base = 0, {hw = {
        hardware_exit_reason = 150994968}, fail_entry = {
        hardware_entry_failure_reason = 150994968}, ex = {
        exception = 150994968, error_code = 0}, io = {direction = 24 '\030',
        size = 0 '\000', port = 2304, count = 0, data_offset = 144},
      debug = {arch = {<No data fields>}}, mmio = {phys_addr = 150994968,
        data = "\220\000\000\000\000\000\000", len = 4, is_write = 0 '\000'},
      hypercall = {nr = 150994968, args = {144, 4, 0, 0, 0, 0}, ret = 0,
        longmode = 0, pad = 0}, tpr_access = {rip = 150994968,
        is_write = 144, pad = 0}, s390_sieic = {icptcode = 24 '\030',
        ipa = 2304, ipb = 0}, s390_reset_flags = 150994968,
      s390_ucontrol = {trans_exc_code = 150994968, pgm_code = 144},
      dcr = {dcrn = 150994968, data = 0, is_write = 144 '\220'},
      internal = {suberror = 150994968, ndata = 0, data = {144, 4,
          0 <repeats 14 times>}}, osi = {gprs = {150994968, 144, 4,
          0 <repeats 29 times>}}, papr_hcall = {nr = 150994968, ret = 144,
        args = {4, 0, 0, 0, 0, 0, 0, 0, 0}}, s390_tsch = {
        subchannel_id = 24, subchannel_nr = 2304, io_int_parm = 0,
        io_int_word = 144, ipb = 0, dequeued = 4 '\004'}, epr = {
        epr = 150994968}, system_event = {type = 150994968, flags = 144},
      s390_stsi = {addr = 150994968, ar = 144 '\220', reserved = 0 '\000',
        fc = 0 '\000', sel1 = 0 '\000', sel2 = 0},
      padding = "\030\000\000\t\000\000\000\000\220\000\000\000\000\000\000\000\004",
        '\000' <repeats 238 times>}, kvm_valid_regs = 0, kvm_dirty_regs = 0,
    s = {regs = {<No data fields>}, padding = '\000' <repeats 2047 times>}}

Marc, does it ring any bell?

Paolo

diff --git a/aio-posix.c
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On 16/07/2015 21:05, Richard W.M. Jones wrote:
> Sorry to spoil things, but I'm still seeing this bug, although it is
> now a lot less frequent with your patch.
>
> I would estimate it happens more often than 1 in 5 runs with qemu.git,
> and probably 1 in 200 runs with qemu.git + the v2 patch series.
>
> It's the exact same hang in both cases.
>
> Is it possible that this patch doesn't completely close any race?
> Still, it is an improvement, so there is that.

I would guess instead that there are two separate bugs, but it's not
impossible that it's still there.

Paolo
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On 17/07/2015 00:06, Paolo Bonzini wrote:
> On 16/07/2015 21:05, Richard W.M. Jones wrote:
>> Sorry to spoil things, but I'm still seeing this bug, although it is
>> now a lot less frequent with your patch.  I would estimate it happens
>> more often than 1 in 5 runs with qemu.git, and probably 1 in 200 runs
>> with qemu.git + the v2 patch series.  It's the exact same hang in
>> both cases.  Is it possible that this patch doesn't completely close
>> any race?  Still, it is an improvement, so there is that.
>
> I would guess instead that there are two separate bugs, but it's not
> impossible that it's still there.

Reproduced after ~80 runs...

Paolo
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On 16.07.2015 at 11:56, Paolo Bonzini wrote:
> Apart from an additional assertion this is exactly the same code as
> v1, but split across three patches so that the important one focuses
> on the optimization.
>
> Paolo
>
> v1->v2  Split some changes to the tests to a separate patch
>         Fix commit message [Laszlo]
>         Clarify do...while loop in aio-win32.c [Kevin]
>
> Paolo Bonzini (3):
>   tests: remove irrelevant assertions from test-aio
>   aio-win32: reorganize polling loop
>   AioContext: fix broken ctx->dispatching optimization

Reviewed-by: Kevin Wolf kw...@redhat.com
[Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
Apart from an additional assertion this is exactly the same code as v1,
but split across three patches so that the important one focuses on the
optimization.

Paolo

v1->v2  Split some changes to the tests to a separate patch
        Fix commit message [Laszlo]
        Clarify do...while loop in aio-win32.c [Kevin]

Paolo Bonzini (3):
  tests: remove irrelevant assertions from test-aio
  aio-win32: reorganize polling loop
  AioContext: fix broken ctx->dispatching optimization

 aio-posix.c             | 19 ++--
 aio-win32.c             | 41 ++
 async.c                 | 21 +-
 docs/aio_notify.promela | 77 +
 include/block/aio.h     | 29 +++
 tests/test-aio.c        | 26 +++--
 6 files changed, 98 insertions(+), 115 deletions(-)

-- 
2.4.3

diff from v1:

diff --git a/aio-win32.c b/aio-win32.c
index ae7c6cf..9d6c12f 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -312,7 +312,13 @@ bool aio_poll(AioContext *ctx, bool blocking)
     ctx->walking_handlers--;
 
     first = true;
-    /* wait until next event */
+    /* ctx->notifier is always registered. */
+    assert(count > 0);
+
+    /* Multiple iterations, all of them non-blocking except the first,
+     * may be necessary to process all pending events.  After the first
+     * WaitForMultipleObjects call ctx->notify_me will be decremented.
+     */
     do {
         HANDLE event;
         int ret;
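The do...while comment added for Kevin describes a loop of the shape
below.  This is a sketch for orientation only, with invented variable
names; the real aio-win32.c loop also dispatches handlers and maintains
the handle list itself.

    #include <windows.h>
    #include <stdbool.h>

    /* Shape of the polling loop the new comment documents (sketch). */
    static void poll_loop_sketch(HANDLE *events, DWORD count, DWORD timeout)
    {
        bool first = true;

        do {
            /* Only the first iteration may block; later ones use a zero
             * timeout and merely drain already-signaled handles. */
            DWORD ret = WaitForMultipleObjects(count, events, FALSE,
                                               first ? timeout : 0);
            first = false;

            if (ret == WAIT_TIMEOUT || ret == WAIT_FAILED) {
                break;
            }
            /* Stand-in for "dispatch the handler": drop the signaled
             * handle so the loop ends once everything pending is drained. */
            events[ret - WAIT_OBJECT_0] = events[--count];
        } while (count > 0);
    }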
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
Sorry for the very long delay in replying to this.  I wanted to be
absolutely sure I was reproducing the bug.

Unfortunately I'm only able to reproduce the bug with qemu 2.3.0 (both
the version in Fedora Rawhide and the tagged v2.3.0 from git).  I cannot
currently reproduce it at all with upstream qemu from git.

The patches understandably only apply to upstream qemu from git, and
have quite a few tricky conflicts with v2.3.0.

I'll keep trying on this one.  It may be that the window for the bug to
reproduce with qemu.git has got smaller.

Rich.

--

Notes on how I try to reproduce this:

(1) Using Fedora Rawhide aarch64

(2) libguestfs checked out and compiled from git

(3) kraxel's edk2.git-aarch64-0-20150713.b1115.g2ad9cf3.noarch

(4) heisenscsi.pl (attached).

$ export LIBGUESTFS_HV=/path/to/qemu/aarch64-softmmu/qemu-system-aarch64
$ while true; do echo .; ./run ./heisenscsi.pl ; done

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

[Attachment: heisenscsi.pl - Perl program]
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
Sorry to spoil things, but I'm still seeing this bug, although it is now
a lot less frequent with your patch.

I would estimate it happens more often than 1 in 5 runs with qemu.git,
and probably 1 in 200 runs with qemu.git + the v2 patch series.

It's the exact same hang in both cases.

Is it possible that this patch doesn't completely close any race?
Still, it is an improvement, so there is that.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org
Re: [Qemu-devel] [PATCH v2 0/3] AioContext: ctx->dispatching is dead, all hail ctx->notify_me
On 16/07/2015 21:05, Richard W.M. Jones wrote:
> Sorry to spoil things, but I'm still seeing this bug, although it is
> now a lot less frequent with your patch.
>
> I would estimate it happens more often than 1 in 5 runs with qemu.git,
> and probably 1 in 200 runs with qemu.git + the v2 patch series.
>
> It's the exact same hang in both cases.
>
> Is it possible that this patch doesn't completely close any race?
> Still, it is an improvement, so there is that.

Would seem at first glance like a different bug.  Interestingly, adding
some tracing (qemu_clock_get_ns) makes the bug more likely: now it
reproduces in about 10 tries.  Of course :) adding other kinds of
tracing instead make it go away again (50 tries).

Perhaps this:

   i/o thread         vcpu thread                   worker thread
   ---------------------------------------------------------------
   lock_iothread
   notify_me = 1
   ...
   unlock_iothread
                      lock_iothread
                      notify_me = 3
                      ppoll
                      notify_me = 1
                                                    bh->scheduled = 1
                                                    event_notifier_set
                      event_notifier_test_and_clear
   ppoll
   ^^ hang

In the exact shape above, it doesn't seem too likely to happen, but
perhaps there's another simpler case.  Still, the bug exists.

The above is not really related to notify_me.  Here the notification is
not being optimized away!  So I wonder if this one has been there
forever.

Fam suggested putting the event_notifier_test_and_clear before
aio_bh_poll(), but it does not work.  I'll look more closely.

However, an unconditional event_notifier_test_and_clear is pretty
expensive.  On one hand, obviously correctness comes first.  On the
other hand, an expensive operation at the wrong place can mask the race
very easily; I'll let the fix run for a while, but I'm not sure if a
successful test really says anything useful.

Paolo
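Distilling the timeline above and the patch that later survived 674
reboots into one ordering rule - a sketch with placeholder names, and,
as Paolo cautions above, a placement that is correct-first rather than
optimized:

    #include <stdbool.h>

    /* Placeholder declarations so the fragment stands alone; they
     * correspond loosely to ppoll(), event_notifier_test_and_clear()
     * and aio_bh_poll() in the thread. */
    struct loop_ctx;
    void wait_for_events(struct loop_ctx *ctx);
    void drain_notifier(struct loop_ctx *ctx);
    bool run_scheduled_bottom_halves(struct loop_ctx *ctx);

    bool loop_iteration_sketch(struct loop_ctx *ctx)
    {
        wait_for_events(ctx);

        /* Drain FIRST: a bottom half scheduled after this point
         * re-arms the notifier, so the next wait still wakes up... */
        drain_notifier(ctx);

        /* ...then scan.  Draining after the scan can consume a wakeup
         * for work the scan never saw - the hang in the timeline. */
        return run_scheduled_bottom_halves(ctx);
    }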