Hi Kevin,

On Wed, Oct 23, 2019 at 05:37:49PM +0200, Kevin Wolf wrote:
> > qcow2_cache_do_get() requires that s->lock is locked because it can
> > yield between picking a cache entry and actually taking ownership of it
> > by setting offset and increasing the reference count.
> >
> > Add an assertion to make sure the caller really holds the lock. The
> > function can be called outside of coroutine context, where bdrv_pread
> > and flushes become synchronous operations. The lock cannot and need not
> > be taken in this case.
>
> I'm still running tests to see if any other code paths trigger the
> assertion, but image creation calls this without the lock held (which is
> harmless because nobody else knows about the image so there won't be
> concurrent requests). The following patch is needed additionally to make
> image creation work with the new assertion.

I can confirm that with all four patches the corruption no longer occurs
as of commit 69f47505ee66afaa513305de0c1895a224e52c45.

With only 3/3 (qcow2: Fix corruption bug in
qcow2_detect_metadata_preallocation()) removed, the assertion triggers
after a few seconds, leaving behind a few leaked clusters but no errors
in the image:

(qemu) qemu-system-x86_64: qemu/include/qemu/coroutine.h:175:
qemu_co_mutex_assert_locked: Assertion `mutex->locked && mutex->holder
== qemu_coroutine_self()' failed.
Aborted (core dumped)

$ qemu-img check qtest.qcow2
Leaked cluster 169257 refcount=3 reference=2
Leaked cluster 172001 refcount=1 reference=0
Leaked cluster 172002 refcount=1 reference=0
Leaked cluster 172003 refcount=1 reference=0
Leaked cluster 172004 refcount=1 reference=0
Leaked cluster 172005 refcount=1 reference=0
Leaked cluster 172006 refcount=1 reference=0
Leaked cluster 172007 refcount=1 reference=0
Leaked cluster 172008 refcount=1 reference=0
Leaked cluster 172009 refcount=1 reference=0
Leaked cluster 172010 refcount=1 reference=0
Leaked cluster 172011 refcount=1 reference=0
Leaked cluster 172012 refcount=1 reference=0

13 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
255525/327680 = 77.98% allocated, 3.22% fragmented, 0.00% compressed clusters
Image end offset: 17106403328

I was going to test with master as well but got overtaken by v2. Will
move on to testing v2 now. :)

Series:
Tested-by: Michael Weiser <michael.wei...@gmx.de>

No biggie, but if there's a chance, could you switch my address to the
above?
--
Thanks,
Michael