Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-14 Thread Alberto Garcia
On Mon 14 Jan 2019 05:31:17 PM CET, Stefan Hajnoczi  wrote:
> On Mon, Jan 14, 2019 at 05:26:48PM +0100, Alberto Garcia wrote:
>> On Mon 14 Jan 2019 05:15:25 PM CET, Stefan Hajnoczi wrote:
>> >> > I've been able to reproduce this in an iotest, please see v2 of this
>> >> > series.
>> >> 
>> >> That iotest doesn't crash for me :-?
>> >
>> > Does my iotest pass for you?
>> 
>> Yes, it does. I'm trying to figure out why because if I run the QMP
>> commands by hand then it does crash.
>
> I ran the iotest 20 times on my machine and it segfaulted every time
> (with the fix not yet applied).

Yeah I can also reproduce it all the time if I run it by hand...

I was debugging it and although I don't know why this is different when
I run it through tests/qemu-iotests/check, here's why it doesn't crash:

After the ThrottleGroupMember is unregistered and its BlockBackend is
destroyed, the throttle_group_co_restart_queue() coroutine takes
control.

The first thing that it does is lock tgm->throttled_reqs_lock. It turns
out that although this memory has been freed (it's part of the
BlockBackend struct) it is still accessible but contains pure
gargabe. 'Garbage' here means that the mutex counter contains some
random value != 0, so the thread waits, it doesn't have a chance to
crash the process, and QEMU shuts down cleanly.

So if my understanding is correct QEMU can be shut down when there are
iothreads waiting for a mutex. Is that something that we should be
worried about?

Berto



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-14 Thread Stefan Hajnoczi
On Mon, Jan 14, 2019 at 05:26:48PM +0100, Alberto Garcia wrote:
> On Mon 14 Jan 2019 05:15:25 PM CET, Stefan Hajnoczi wrote:
> >> > I've been able to reproduce this in an iotest, please see v2 of this
> >> > series.
> >> 
> >> That iotest doesn't crash for me :-?
> >
> > Does my iotest pass for you?
> 
> Yes, it does. I'm trying to figure out why because if I run the QMP
> commands by hand then it does crash.

I ran the iotest 20 times on my machine and it segfaulted every time
(with the fix not yet applied).

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-14 Thread Alberto Garcia
On Mon 14 Jan 2019 05:15:25 PM CET, Stefan Hajnoczi wrote:
>> > I've been able to reproduce this in an iotest, please see v2 of this
>> > series.
>> 
>> That iotest doesn't crash for me :-?
>
> Does my iotest pass for you?

Yes, it does. I'm trying to figure out why because if I run the QMP
commands by hand then it does crash.

Berto



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-14 Thread Stefan Hajnoczi
On Mon, Jan 14, 2019 at 03:40:39PM +0100, Alberto Garcia wrote:
> On Mon 14 Jan 2019 02:35:53 PM CET, Stefan Hajnoczi wrote:
> > On Fri, Jan 11, 2019 at 03:14:08PM +0100, Alberto Garcia wrote:
> >> On Fri 11 Jan 2019 02:24:16 PM CET, Kevin Wolf wrote:
> >> >> >> The following QMP command leads to a crash when iothreads are used:
> >> >> >>
> >> >> >>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }
> >> >> >
> >> >> > How did you reproduce this? Do you have a test case?
> >> >> 
> >> >> Ok, I finally reproduced it, the patch looks good to me.
> >> >
> >> > Can it be turned into a qemu-iotests case or is it too complicated for
> >> > that?
> >> 
> >> I can reproduce the problem reliably with this:
> >> 
> >> { "execute": "qmp_capabilities" }
> >> { "execute": "blockdev-add",
> >>   "arguments": {"driver": "null-co", "node-name": "hd0"}}
> >> { "execute": "object-add",
> >>   "arguments": {"qom-type": "iothread", "id": "iothread0"}}
> >> { "execute": "device_add",
> >>   "arguments": {"id": "scsi0", "driver": "virtio-scsi-pci",
> >> "iothread": "iothread0"}}
> >> { "execute": "device_add",
> >>   "arguments": {"id": "scsi-hd0", "driver": "scsi-hd", "drive": "hd0"}}
> >> { "execute": "block_set_io_throttle",
> >>   "arguments": {"id": "scsi-hd0", "bps": 0, "bps_rd": 0, "bps_wr": 0,
> >> "iops": 1000, "iops_rd": 0, "iops_wr": 0}}
> >> { "execute": "device_del", "arguments": {"id": "scsi-hd0"}}
> >> 
> >> But this doesn't crash if I put it in an iotest.
> >
> > I've been able to reproduce this in an iotest, please see v2 of this
> > series.
> 
> That iotest doesn't crash for me :-?

Does my iotest pass for you?

On my machine the diff fails because signal 11 killed QEMU (and this
prints a warning to the test output).

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-14 Thread Alberto Garcia
On Mon 14 Jan 2019 02:35:53 PM CET, Stefan Hajnoczi wrote:
> On Fri, Jan 11, 2019 at 03:14:08PM +0100, Alberto Garcia wrote:
>> On Fri 11 Jan 2019 02:24:16 PM CET, Kevin Wolf wrote:
>> >> >> The following QMP command leads to a crash when iothreads are used:
>> >> >>
>> >> >>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }
>> >> >
>> >> > How did you reproduce this? Do you have a test case?
>> >> 
>> >> Ok, I finally reproduced it, the patch looks good to me.
>> >
>> > Can it be turned into a qemu-iotests case or is it too complicated for
>> > that?
>> 
>> I can reproduce the problem reliably with this:
>> 
>> { "execute": "qmp_capabilities" }
>> { "execute": "blockdev-add",
>>   "arguments": {"driver": "null-co", "node-name": "hd0"}}
>> { "execute": "object-add",
>>   "arguments": {"qom-type": "iothread", "id": "iothread0"}}
>> { "execute": "device_add",
>>   "arguments": {"id": "scsi0", "driver": "virtio-scsi-pci",
>> "iothread": "iothread0"}}
>> { "execute": "device_add",
>>   "arguments": {"id": "scsi-hd0", "driver": "scsi-hd", "drive": "hd0"}}
>> { "execute": "block_set_io_throttle",
>>   "arguments": {"id": "scsi-hd0", "bps": 0, "bps_rd": 0, "bps_wr": 0,
>> "iops": 1000, "iops_rd": 0, "iops_wr": 0}}
>> { "execute": "device_del", "arguments": {"id": "scsi-hd0"}}
>> 
>> But this doesn't crash if I put it in an iotest.
>
> I've been able to reproduce this in an iotest, please see v2 of this
> series.

That iotest doesn't crash for me :-?

Berto



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-14 Thread Stefan Hajnoczi
On Fri, Jan 11, 2019 at 03:14:08PM +0100, Alberto Garcia wrote:
> On Fri 11 Jan 2019 02:24:16 PM CET, Kevin Wolf wrote:
> >> >> The following QMP command leads to a crash when iothreads are used:
> >> >>
> >> >>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }
> >> >
> >> > How did you reproduce this? Do you have a test case?
> >> 
> >> Ok, I finally reproduced it, the patch looks good to me.
> >
> > Can it be turned into a qemu-iotests case or is it too complicated for
> > that?
> 
> I can reproduce the problem reliably with this:
> 
> { "execute": "qmp_capabilities" }
> { "execute": "blockdev-add",
>   "arguments": {"driver": "null-co", "node-name": "hd0"}}
> { "execute": "object-add",
>   "arguments": {"qom-type": "iothread", "id": "iothread0"}}
> { "execute": "device_add",
>   "arguments": {"id": "scsi0", "driver": "virtio-scsi-pci",
> "iothread": "iothread0"}}
> { "execute": "device_add",
>   "arguments": {"id": "scsi-hd0", "driver": "scsi-hd", "drive": "hd0"}}
> { "execute": "block_set_io_throttle",
>   "arguments": {"id": "scsi-hd0", "bps": 0, "bps_rd": 0, "bps_wr": 0,
> "iops": 1000, "iops_rd": 0, "iops_wr": 0}}
> { "execute": "device_del", "arguments": {"id": "scsi-hd0"}}
> 
> But this doesn't crash if I put it in an iotest.

I've been able to reproduce this in an iotest, please see v2 of this
series.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-11 Thread Paolo Bonzini
On 09/01/19 12:01, Stefan Hajnoczi wrote:
>  g_free(data);
> +
> +tgm->restart_pending--;
> +aio_wait_kick();
>  }
>  
>  static void throttle_group_restart_queue(ThrottleGroupMember *tgm, bool 
> is_write)
> @@ -430,6 +433,8 @@ static void 
> throttle_group_restart_queue(ThrottleGroupMember *tgm, bool is_write
>   * be no timer pending on this tgm at this point */
>  assert(!timer_pending(tgm->throttle_timers.timers[is_write]));
>  
> +tgm->restart_pending++;
> +
>  co = qemu_coroutine_create(throttle_group_restart_queue_entry, rd);
>  aio_co_enter(tgm->aio_context, co);
>  }
> @@ -538,6 +543,7 @@ void throttle_group_register_tgm(ThrottleGroupMember *tgm,
>  
>  tgm->throttle_state = ts;
>  tgm->aio_context = ctx;
> +tgm->restart_pending = 0;
>  
>  qemu_mutex_lock(>lock);
>  /* If the ThrottleGroup is new set this ThrottleGroupMember as the token 
> */
> @@ -584,6 +590,9 @@ void throttle_group_unregister_tgm(ThrottleGroupMember 
> *tgm)
>  return;
>  }
>  
> +/* Wait for throttle_group_restart_queue_entry() coroutines to finish */
> +AIO_WAIT_WHILE(tgm->aio_context, tgm->restart_pending > 0);
> +

Could you change this to atomic_inc/dec, and atomic_read here?  It would
be nice to avoid more uses of the AioContext lock.

Paolo



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-11 Thread Alberto Garcia
On Fri 11 Jan 2019 02:24:16 PM CET, Kevin Wolf wrote:
>> >> The following QMP command leads to a crash when iothreads are used:
>> >>
>> >>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }
>> >
>> > How did you reproduce this? Do you have a test case?
>> 
>> Ok, I finally reproduced it, the patch looks good to me.
>
> Can it be turned into a qemu-iotests case or is it too complicated for
> that?

I can reproduce the problem reliably with this:

{ "execute": "qmp_capabilities" }
{ "execute": "blockdev-add",
  "arguments": {"driver": "null-co", "node-name": "hd0"}}
{ "execute": "object-add",
  "arguments": {"qom-type": "iothread", "id": "iothread0"}}
{ "execute": "device_add",
  "arguments": {"id": "scsi0", "driver": "virtio-scsi-pci",
"iothread": "iothread0"}}
{ "execute": "device_add",
  "arguments": {"id": "scsi-hd0", "driver": "scsi-hd", "drive": "hd0"}}
{ "execute": "block_set_io_throttle",
  "arguments": {"id": "scsi-hd0", "bps": 0, "bps_rd": 0, "bps_wr": 0,
"iops": 1000, "iops_rd": 0, "iops_wr": 0}}
{ "execute": "device_del", "arguments": {"id": "scsi-hd0"}}

But this doesn't crash if I put it in an iotest.

Berto



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-11 Thread Kevin Wolf
Am 11.01.2019 um 14:19 hat Alberto Garcia geschrieben:
> On Wed 09 Jan 2019 04:34:10 PM CET, Alberto Garcia wrote:
> > On Wed 09 Jan 2019 12:01:44 PM CET, Stefan Hajnoczi wrote:
> >> The following QMP command leads to a crash when iothreads are used:
> >>
> >>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }
> >
> > How did you reproduce this? Do you have a test case?
> 
> Ok, I finally reproduced it, the patch looks good to me.

Can it be turned into a qemu-iotests case or is it too complicated for
that?

Kevin



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-11 Thread Alberto Garcia
On Wed 09 Jan 2019 04:34:10 PM CET, Alberto Garcia wrote:
> On Wed 09 Jan 2019 12:01:44 PM CET, Stefan Hajnoczi wrote:
>> The following QMP command leads to a crash when iothreads are used:
>>
>>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }
>
> How did you reproduce this? Do you have a test case?

Ok, I finally reproduced it, the patch looks good to me.

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-09 Thread Alberto Garcia
On Wed 09 Jan 2019 12:01:44 PM CET, Stefan Hajnoczi wrote:
> The following QMP command leads to a crash when iothreads are used:
>
>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }

How did you reproduce this? Do you have a test case?

Berto



Re: [Qemu-devel] [PATCH] throttle-groups: fix restart coroutine iothread race

2019-01-09 Thread Alberto Garcia
On Wed 09 Jan 2019 12:01:44 PM CET, Stefan Hajnoczi wrote:
> The following QMP command leads to a crash when iothreads are used:
>
>   { 'execute': 'device_del', 'arguments': {'id': 'data'} }

I was trying to reproduce this and I found this crashing in master:

$ qemu-system-x86_64 -enable-kvm -qmp stdio -display none
{ "execute": "qmp_capabilities" }
{ "execute": "blockdev-add", "arguments": {"driver": "qcow2", "file": 
{"driver": "file", "filename": "hd.qcow2"}, "node-name": "hd0"}}
{ "execute": "object-add", "arguments": {"qom-type": "iothread", "id": 
"iothread0"}}
{ "execute": "x-blockdev-set-iothread", "arguments": {"node-name": "hd0", 
"iothread": "iothread0"}}
{ "execute": "device_add", "arguments": {"id": "virtio0", "driver": 
"virtio-blk-pci", "drive": "hd0"}}

Thread 1 "qemu-system-x86" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  [...] in __GI_abort () at abort.c:89
#2  [...] in error_exit () at util/qemu-thread-posix.c:36
#3  [...] in qemu_mutex_unlock_impl () at util/qemu-thread-posix.c:96
#4  [...] in aio_context_release () at util/async.c:516
#5  [...] in blk_prw () at block/block-backend.c:1262
#6  [...] in blk_pread () at block/block-backend.c:1424
#7  [...] in blk_pread_unthrottled () at block/block-backend.c:1279
#8  [...] in guess_disk_lchs () at hw/block/hd-geometry.c:71
#9  [...] in hd_geometry_guess () at hw/block/hd-geometry.c:136
#10 [...] in blkconf_geometry () at hw/block/block.c:99
#11 [...] in virtio_blk_device_realize () at hw/block/virtio-blk.c:944
#12 [...] in virtio_device_realize () at hw/virtio/virtio.c:2538
[...]

Berto