Hi Kevin,

On Wed, Oct 23, 2019 at 05:37:49PM +0200, Kevin Wolf wrote:
> > qcow2_cache_do_get() requires that s->lock is locked because it can
> > yield between picking a cache entry and actually taking ownership of it
> > by setting offset and increasing the reference count.
> >
> > Add an assertion to make sure the caller really holds the lock. The
> > function can be called outside of coroutine context, where bdrv_pread
> > and flushes become synchronous operations. The lock cannot and need not
> > be taken in this case.
>
> I'm still running tests to see if any other code paths trigger the
> assertion, but image creation calls this without the lock held (which is
> harmless because nobody else knows about the image so there won't be
> concurrent requests). The following patch is needed additionally to make
> image creation work with the new assertion.

I can confirm that with all four patches the corruption no longer occurs
as of commit 69f47505ee66afaa513305de0c1895a224e52c45.

With only 3/3 (qcow2: Fix corruption bug in
qcow2_detect_metadata_preallocation()) removed, the assertion triggers
after a few seconds, leaving behind a few leaked clusters but no errors
in the image:

(qemu) qemu-system-x86_64: qemu/include/qemu/coroutine.h:175:
qemu_co_mutex_assert_locked: Assertion `mutex->locked && mutex->holder
== qemu_coroutine_self()' failed.
Aborted (core dumped)

$ qemu-img check qtest.qcow2
Leaked cluster 169257 refcount=3 reference=2
Leaked cluster 172001 refcount=1 reference=0
Leaked cluster 172002 refcount=1 reference=0
Leaked cluster 172003 refcount=1 reference=0
Leaked cluster 172004 refcount=1 reference=0
Leaked cluster 172005 refcount=1 reference=0
Leaked cluster 172006 refcount=1 reference=0
Leaked cluster 172007 refcount=1 reference=0
Leaked cluster 172008 refcount=1 reference=0
Leaked cluster 172009 refcount=1 reference=0
Leaked cluster 172010 refcount=1 reference=0
Leaked cluster 172011 refcount=1 reference=0
Leaked cluster 172012 refcount=1 reference=0

13 leaked clusters were found on the image.
This means waste of disk space, but no harm to data.
255525/327680 = 77.98% allocated, 3.22% fragmented, 0.00% compressed clusters
Image end offset: 17106403328

I was going to test with master as well but got overtaken by v2. Will
move on to testing v2 now. :)

Series:
Tested-by: Michael Weiser <michael.wei...@gmx.de>

No biggie, but if there's a chance, could you switch my address to the
above?
--
Thanks,
Michael