On 12/21/2015 04:05 PM, Max Reitz wrote:
>> The situation is even worse than I have feared.
>>
> Thanks for finding this!
>
> Well, if qcow2_get_specific_info() is the only place that can actually
> cause issues in that case (i.e. calling some QMP function which uses the
> qcow2 image while the incoming_migration coroutine yields before the
> image has been fully reopened), I think the simplest way to fix this
> would be to just return NULL from qcow2_get_specific_info() in the else
> branch (which currently aborts), adding a comment how we can end up there.
>
> However, it seems hard to believe this is the only problematic path...
> If the coroutine can yield between the BDRVQcow2State getting memset()
> to 0 and qcow2_open() having initialized it again, then any QMP command
> which makes use of the qcow2 image should fail (not necessarily
> gracefully) at that point.

I wonder if Kevin's patch will help us pinpoint culprits:

https://lists.gnu.org/archive/html/qemu-devel/2015-12/msg04096.html

>
> I feel like qcow2_invalidate_cache() is the root cause. Basically
> completely reopening the qcow2 image just feels wrong. Maybe we could at
> least atomically replace the BDRVQcow2State, i.e. call qcow2_open()
> first (but with a new BDRVQcow2State), then swap out the BDRVQcow2State
> in the BDS, and then call qcow2_close() on the old BDRVQcow2State. That
> doesn't feel right still, but it should fix this issue, at least.
>
> Max
>

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
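
For illustration only, the "open a new state first, swap, then close the old one" idea quoted above could look roughly like the toy sketch below. None of this is QEMU code: ToyFormatState, ToyBlockDriverState, toy_open(), toy_close() and toy_invalidate_cache() are hypothetical stand-ins for BDRVQcow2State, the BDS, qcow2_open(), qcow2_close() and qcow2_invalidate_cache(), and the real functions have different signatures and far more state to carry over. The only point shown is ordering: the state reachable from the BDS is never zeroed or half-initialized, so anything that looks at it while the reopen is in flight still sees the old, valid state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for BDRVQcow2State; NOT the real QEMU structure. */
typedef struct ToyFormatState {
    bool initialized;   /* true once toy_open() has completed */
    int generation;     /* which "reopen" produced this state */
} ToyFormatState;

/* Hypothetical stand-in for the BDS, holding a pointer to the format state. */
typedef struct ToyBlockDriverState {
    ToyFormatState *opaque;
} ToyBlockDriverState;

/* Stand-in for qcow2_open(): fills a fresh state, returns 0 on success. */
static int toy_open(ToyFormatState *s, int generation, bool simulate_failure)
{
    if (simulate_failure) {
        return -1;
    }
    s->generation = generation;
    s->initialized = true;
    return 0;
}

/* Stand-in for qcow2_close(): tears down a state that is no longer visible. */
static void toy_close(ToyFormatState *s)
{
    s->initialized = false;
}

/*
 * Reopen by building a complete new state off to the side, then swapping
 * the pointer. On failure the old state stays installed and untouched.
 */
static int toy_invalidate_cache(ToyBlockDriverState *bs, int generation,
                                bool simulate_failure)
{
    ToyFormatState *fresh = calloc(1, sizeof(*fresh));
    ToyFormatState *old;

    if (!fresh) {
        return -1;
    }
    if (toy_open(fresh, generation, simulate_failure) < 0) {
        free(fresh);
        return -1;              /* bs->opaque was never modified */
    }

    old = bs->opaque;
    bs->opaque = fresh;         /* single pointer swap, no zeroed window */

    if (old) {
        toy_close(old);
        free(old);
    }
    return 0;
}
```

In this toy version a failed reopen leaves the previous state in place, which the memset()-then-reopen ordering cannot offer. Whether a plain pointer swap is sufficient in the real block layer is a separate question that this sketch does not answer.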
