On 02.09.20 19:42, Alberto Garcia wrote: > When a write request needs to allocate new clusters (or change the L2 > bitmap of existing ones) a QCowL2Meta structure is created so the L2 > metadata can be later updated and any copy-on-write can be performed > if necessary. > > A write request can span a region consisting of an arbitrary > combination of previously unallocated and allocated clusters, and if > the unallocated ones can be put contiguous to the existing ones then > QEMU will do so in order to minimize the number of write operations. > > In practice this means that a write request has not just one but a > number of QCowL2Meta structures. All of them are added to the > cluster_allocs list that is stored in BDRVQcow2State and is used to > detect overlapping requests. After the write request finishes all its > associated QCowL2Meta are removed from that list. calculate_l2_meta() > takes care of creating and putting those structures in the list, and > qcow2_handle_l2meta() takes care of removing them. > > The problem is that the error path in handle_alloc() also tries to > remove an item in that list, a remnant from the time when this was > handled there (that code would not even be correct anymore because > it only removes one struct and not all the ones from the same write > request). > > This can trigger a double removal of the same item from the list, > causing a crash. This is not easy to reproduce in practice because > it requires that do_alloc_cluster_offset() fails after a successful > previous allocation during the same write request, but it can be > reproduced with the included test case. > > As a last thing, this patch also removes the condition that > l2meta->nb_clusters is not 0 in qcow2_handle_l2meta(). This was only > necessary when that structure was allocated in the stack in order > to detect if it had been initialized or not. This changed in commit > f50f88b9fe and nowadays nb_clusters is guaranteed to be > 0.
On my search for /(\.|->)nb_clusters/ to verify this, I noticed this comment for qcow2_alloc_cluster_offset(): * If the cluster was already allocated, m->nb_clusters is set to 0 and * other fields in m are meaningless. Should that be dropped, too? > Signed-off-by: Alberto Garcia <be...@igalia.com> > --- > block/qcow2-cluster.c | 3 -- > block/qcow2.c | 4 +- > tests/qemu-iotests/305 | 75 ++++++++++++++++++++++++++++++++++++++ > tests/qemu-iotests/305.out | 16 ++++++++ > tests/qemu-iotests/group | 1 + > 5 files changed, 93 insertions(+), 6 deletions(-) > create mode 100755 tests/qemu-iotests/305 > create mode 100644 tests/qemu-iotests/305.out [...] > diff --git a/tests/qemu-iotests/305 b/tests/qemu-iotests/305 > new file mode 100755 > index 0000000000..6de3180c17 > --- /dev/null > +++ b/tests/qemu-iotests/305 > @@ -0,0 +1,75 @@ [...] > +_supported_fmt qcow2 > +_supported_proto file > +_supported_os Linux > +_unsupported_imgopts cluster_size refcount_bits extended_l2 data_file, too, because the refcounts work differently there. (As in: Not at all for data clusters.) Also, compat(=0.10), because that wouldn’t allow -o refcount_bits=64. > + > +refcount_table_offset=$((0x400)) I would like to suggest $(peek_file_be "$TEST_IMG" 48 8), to set an example for future generations; but not strictly necessary, of course. O:) Anyway, at least with the _unsupported_imgopts line completed: Reviewed-by: Max Reitz <mre...@redhat.com>
signature.asc
Description: OpenPGP digital signature