On 10/31/23 18:33, Hanna Czenczek wrote:
> (Sorry, opened another reply window, forgot I already had one open...)
> 
> On 20.10.23 23:56, Andrey Drobyshev wrote:
>> This commit makes the discard operation work on the subcluster level
>> rather than cluster level.  It introduces discard_l2_subclusters()
>> function and makes use of it in qcow2 discard implementation, much like
>> it's done with zero_in_l2_slice() / zero_l2_subclusters().  It also
>> changes the qcow2 driver pdiscard_alignment to subcluster_size.  That
>> way subcluster-aligned discards lead to actual fallocate(PUNCH_HOLE)
>> operation and free host disk space.
>>
>> This feature will let us gain additional disk space on guest
>> TRIM/discard requests, especially when using large enough clusters
>> (1M, 2M) with subclusters enabled.
>>
>> Signed-off-by: Andrey Drobyshev <andrey.drobys...@virtuozzo.com>
>> ---
>>   block/qcow2-cluster.c | 100 ++++++++++++++++++++++++++++++++++++++++--
>>   block/qcow2.c         |   8 ++--
>>   2 files changed, 101 insertions(+), 7 deletions(-)
>>
>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>> index 7c6fa5524c..cf40f2dc12 100644
>> --- a/block/qcow2-cluster.c
>> +++ b/block/qcow2-cluster.c
>> @@ -2042,6 +2042,74 @@ discard_in_l2_slice(BlockDriverState *bs,
>> uint64_t offset, uint64_t nb_clusters,
>>       return nb_clusters;
>>   }
>>   +static int coroutine_fn GRAPH_RDLOCK
>> +discard_l2_subclusters(BlockDriverState *bs, uint64_t offset,
>> +                       uint64_t nb_subclusters,
>> +                       enum qcow2_discard_type type,
>> +                       bool full_discard,
>> +                       SubClusterRangeInfo *pscri)
>> +{
>> +    BDRVQcow2State *s = bs->opaque;
>> +    uint64_t new_l2_bitmap, l2_bitmap_mask;
>> +    int ret, sc = offset_to_sc_index(s, offset);
>> +    SubClusterRangeInfo scri = { 0 };
>> +
>> +    if (!pscri) {
>> +        ret = get_sc_range_info(bs, offset, nb_subclusters, &scri);
>> +        if (ret < 0) {
>> +            goto out;
>> +        }
>> +    } else {
>> +        scri = *pscri;
> 
> Allowing to takes this from the caller sounds dangerous, considering we
> need to track who takes care of freeing scri.l2_slice.
> 
>> +    }
>> +
>> +    l2_bitmap_mask = QCOW_OFLAG_SUB_ALLOC_RANGE(sc, sc +
>> nb_subclusters);
>> +    new_l2_bitmap = scri.l2_bitmap;
>> +    new_l2_bitmap &= ~l2_bitmap_mask;
>> +
>> +    /*
>> +     * If there're no allocated subclusters left, we might as well
>> discard
>> +     * the entire cluster.  That way we'd also update the refcount
>> table.
>> +     */
>> +    if (!(new_l2_bitmap & QCOW_L2_BITMAP_ALL_ALLOC)) {
> 
> What if there are subclusters in the cluster that are marked as zero,
> outside of the discarded range?  It sounds wrong to apply a discard with
> either full_discard set or cleared to them.
> 

Hmmm, then I think before falling to this path we should either:

1. Make sure full_discard=false.  That is directly implied from what you
said in your other mail: discarding with full_discard=false is
equivalent to zeroizing;
2. Check that no subclusters within this cluster are explicitly marked
as zeroes.
3. Check that either 1) or 2) is true (if ((1) || (2))).

For now I like option 3) better.

>> +        return discard_in_l2_slice(bs,
>> +                                   QEMU_ALIGN_DOWN(offset,
>> s->cluster_size),
>> +                                   1, type, full_discard);
>> +    }
>> +
>> +    /*
>> +     * Full discard means we fall through to the backing file, thus
>> we only
>> +     * need to mark the subclusters as deallocated.
> 
> I think it also means we need to clear the zero bits.
>

Yes, that seems right.

> Hanna
> 
> [...]

Reply via email to