Il 11/02/2013 13:29, Kevin Wolf ha scritto:
> The bug is not in this function but in process_incoming_migration(). It
> should never blindly enter a coroutine which is in an unknown state.

process_incoming_migration() is the function that _creates_ the
coroutine and enters it for the first time, so there shouldn't be any
problem there.

The function we're looking at should be, as you write correctly,
enter_migration_coroutine().

> If it can reenter here, it can reenter anywhere in block drivers, and
> adding a workaround to one place that just yields again if it got an
> early reentrance doesn't fix the real bug.
> 
> Which is the yield that corresponds to the enter in
> enter_migration_coroutine()? We need to add some state that can be used
> to make sure the enter happens only for this specific yield.

In this case the corresponding yield is in qemu-coroutine-io.c, and it
is driven by read-availability of the migration file descriptor.  So it
is possible to test before entering the coroutine, but really ugly (not
just because it would cost one extra system call).

But why is enter_migration_coroutine() being called at all?  bdrv_write
should be a synchronous call, it should run its own qemu_aio_wait()
event loop and that loop doesn't include the migration file descriptor.

Paolo

Reply via email to