On 10/31/23 18:33, Hanna Czenczek wrote: > (Sorry, opened another reply window, forgot I already had one open...) > > On 20.10.23 23:56, Andrey Drobyshev wrote: >> This commit makes the discard operation work on the subcluster level >> rather than cluster level. It introduces discard_l2_subclusters() >> function and makes use of it in qcow2 discard implementation, much like >> it's done with zero_in_l2_slice() / zero_l2_subclusters(). It also >> changes the qcow2 driver pdiscard_alignment to subcluster_size. That >> way subcluster-aligned discards lead to actual fallocate(PUNCH_HOLE) >> operation and free host disk space. >> >> This feature will let us gain additional disk space on guest >> TRIM/discard requests, especially when using large enough clusters >> (1M, 2M) with subclusters enabled. >> >> Signed-off-by: Andrey Drobyshev <andrey.drobys...@virtuozzo.com> >> --- >> block/qcow2-cluster.c | 100 ++++++++++++++++++++++++++++++++++++++++-- >> block/qcow2.c | 8 ++-- >> 2 files changed, 101 insertions(+), 7 deletions(-) >> >> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c >> index 7c6fa5524c..cf40f2dc12 100644 >> --- a/block/qcow2-cluster.c >> +++ b/block/qcow2-cluster.c >> @@ -2042,6 +2042,74 @@ discard_in_l2_slice(BlockDriverState *bs, >> uint64_t offset, uint64_t nb_clusters, >> return nb_clusters; >> } >> +static int coroutine_fn GRAPH_RDLOCK >> +discard_l2_subclusters(BlockDriverState *bs, uint64_t offset, >> + uint64_t nb_subclusters, >> + enum qcow2_discard_type type, >> + bool full_discard, >> + SubClusterRangeInfo *pscri) >> +{ >> + BDRVQcow2State *s = bs->opaque; >> + uint64_t new_l2_bitmap, l2_bitmap_mask; >> + int ret, sc = offset_to_sc_index(s, offset); >> + SubClusterRangeInfo scri = { 0 }; >> + >> + if (!pscri) { >> + ret = get_sc_range_info(bs, offset, nb_subclusters, &scri); >> + if (ret < 0) { >> + goto out; >> + } >> + } else { >> + scri = *pscri; > > Allowing to takes this from the caller sounds dangerous, considering we > need to track who takes care of freeing scri.l2_slice. > >> + } >> + >> + l2_bitmap_mask = QCOW_OFLAG_SUB_ALLOC_RANGE(sc, sc + >> nb_subclusters); >> + new_l2_bitmap = scri.l2_bitmap; >> + new_l2_bitmap &= ~l2_bitmap_mask; >> + >> + /* >> + * If there're no allocated subclusters left, we might as well >> discard >> + * the entire cluster. That way we'd also update the refcount >> table. >> + */ >> + if (!(new_l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC)) { > > What if there are subclusters in the cluster that are marked as zero, > outside of the discarded range? It sounds wrong to apply a discard with > either full_discard set or cleared to them. >
Hmmm, then I think before falling to this path we should either: 1. Make sure full_discard=false. That is directly implied from what you said in your other mail: discarding with full_discard=false is equivalent to zeroizing; 2. Check that no subclusters within this cluster are explicitly marked as zeroes. 3. Check that either 1) or 2) is true (if ((1) || (2))). For now I like option 3) better. >> + return discard_in_l2_slice(bs, >> + QEMU_ALIGN_DOWN(offset, >> s->cluster_size), >> + 1, type, full_discard); >> + } >> + >> + /* >> + * Full discard means we fall through to the backing file, thus >> we only >> + * need to mark the subclusters as deallocated. > > I think it also means we need to clear the zero bits. > Yes, that seems right. > Hanna > > [...]