Hello Hanna,

Sorry for the delay and thanks for your thorough and detailed review.

On 10/31/23 17:53, Hanna Czenczek wrote:
> On 20.10.23 23:56, Andrey Drobyshev wrote:
>> This helper simply obtains the l2 table parameters of the cluster which
>> contains the given subclusters range.  Right now this info is being
>> obtained and used by zero_l2_subclusters().  As we're about to introduce
>> the subclusters discard operation, this helper would let us avoid code
>> duplication.
>>
>> Also introduce struct SubClusterRangeInfo, which would contain all the
>> needed params.
>>
>> Signed-off-by: Andrey Drobyshev <andrey.drobys...@virtuozzo.com>
>> ---
>>   block/qcow2-cluster.c | 90 +++++++++++++++++++++++++++++--------------
>>   1 file changed, 62 insertions(+), 28 deletions(-)
>>
>> diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c
>> index 904f00d1b3..8801856b93 100644
>> --- a/block/qcow2-cluster.c
>> +++ b/block/qcow2-cluster.c
>> @@ -32,6 +32,13 @@
>>   #include "qemu/memalign.h"
>>   #include "trace.h"
>>   +typedef struct SubClusterRangeInfo {
>> +    uint64_t *l2_slice;
> 
> We should document that this is a strong reference that must be returned
> via qcow2_cache_put().  Maybe you could also define a clean-up function
> using G_DEFINE_AUTO_CLEANUP_CLEAR_FUNC() that does this, allowing this
> type to be used with g_auto().
> 
>> +    int l2_index;
>> +    uint64_t l2_entry;
>> +    uint64_t l2_bitmap;
>> +} SubClusterRangeInfo;
>> +
>>   int coroutine_fn qcow2_shrink_l1_table(BlockDriverState *bs,
>>                                          uint64_t exact_size)
>>   {
>> @@ -1892,6 +1899,50 @@ again:
>>       return 0;
>>   }
>>   +static int get_sc_range_info(BlockDriverState *bs, uint64_t offset,
>> +                             unsigned nb_subclusters,
>> +                             SubClusterRangeInfo *scri)
> 
> It would be good to have documentation for this function, for example
> that it only works on a single cluster, i.e. that the range denoted by
> @offset and @nb_subclusters must not cross a cluster boundary, and that
> @offset must be aligned to subclusters.
> 

I figured out those restrictions are derived from the number of asserts
in the function body itself, much like it's done in
zero_l2_subclusters() and friends.  But of course I don't mind
documenting this.

> In general, it is unclear to me at this point what this function does.

The sole purpose of this function is to avoid the code duplication,
since both zero_l2_subclusters() (pre-existing) and
discard_l2_subclusters() (newly introduced) need to obtain the same info
about the subclusters range they're working with.
 
> OK, it gets the SCRI for all subclusters in the cluster at @offset (this
> is what its name implies), but then it also has this loop that checks
> whether there are compressed clusters among the @nb_subclusters.

Look at the pre-existing implementation of zero_l2_subclusters(): it
also checks that the subcluster range we're dealing with isn't contained
within a compressed cluster (otherwise there's no point zeroizing
individual subclusters).  So the logic isn't changed here.  The only
reason I decided to loop through the subclusters (instead of calling
qcow2_get_cluster_type() for the whole cluster) is so that I could
detect invalid subclusters and return -EINVAL right away.

> It has a comment about being unable to zero/discard subclusters in compressed
> clusters, but the function name says nothing about this scope of
> zeroing/discarding.
> 

Maybe rename it then to stress out that we're dealing with the regular
subclusters only? get_normal_sc_range_info()?

>> +{
>> +    BDRVQcow2State *s = bs->opaque;
>> +    int ret, sc_cleared = 0, sc_index = offset_to_sc_index(s, offset);
>> +    QCow2SubclusterType sctype;
>> +
>> +    /* Here we only work with the subclusters within single cluster. */
>> +    assert(nb_subclusters > 0 && nb_subclusters <
>> s->subclusters_per_cluster);
>> +    assert(sc_index + nb_subclusters <= s->subclusters_per_cluster);
>> +    assert(offset_into_subcluster(s, offset) == 0);
>> +
>> +    ret = get_cluster_table(bs, offset, &scri->l2_slice,
>> &scri->l2_index);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +
>> +    scri->l2_entry = get_l2_entry(s, scri->l2_slice, scri->l2_index);
>> +    scri->l2_bitmap = get_l2_bitmap(s, scri->l2_slice, scri->l2_index);
>> +
>> +    do {
>> +        qcow2_get_subcluster_range_type(bs, scri->l2_entry,
>> scri->l2_bitmap,
>> +                                        sc_index, &sctype);
> 
> I think there’s a `ret = ` missing here.
> 

Of course there is, thanks for catching this.

>> +        if (ret < 0) {
>> +            return ret;
>> +        }
>> +
>> +        switch (sctype) {
>> +        case QCOW2_SUBCLUSTER_COMPRESSED:
>> +            /* We cannot partially zeroize/discard compressed
>> clusters. */
>> +            return -ENOTSUP;
>> +        case QCOW2_SUBCLUSTER_INVALID:
>> +            return -EINVAL;
>> +        default:
>> +            break;
>> +        }
>> +
>> +        sc_cleared += ret;
>> +    } while (sc_cleared < nb_subclusters);
>> +
>> +    return 0;
>> +}
>> +
>>   /*
>>    * This discards as many clusters of nb_clusters as possible at once
>> (i.e.
>>    * all clusters in the same L2 slice) and returns the number of
>> discarded
>> @@ -2097,44 +2148,27 @@ zero_l2_subclusters(BlockDriverState *bs,
>> uint64_t offset,
>>                       unsigned nb_subclusters)
>>   {
>>       BDRVQcow2State *s = bs->opaque;
>> -    uint64_t *l2_slice;
>> -    uint64_t old_l2_bitmap, l2_bitmap;
>> -    int l2_index, ret, sc = offset_to_sc_index(s, offset);
>> +    uint64_t new_l2_bitmap;
>> +    int ret, sc = offset_to_sc_index(s, offset);
>> +    SubClusterRangeInfo scri = { 0 };
>>   -    /* For full clusters use zero_in_l2_slice() instead */
>> -    assert(nb_subclusters > 0 && nb_subclusters <
>> s->subclusters_per_cluster);
>> -    assert(sc + nb_subclusters <= s->subclusters_per_cluster);
>> -    assert(offset_into_subcluster(s, offset) == 0);
>> -
>> -    ret = get_cluster_table(bs, offset, &l2_slice, &l2_index);
>> +    ret = get_sc_range_info(bs, offset, nb_subclusters, &scri);
>>       if (ret < 0) {
>> -        return ret;
>> -    }
>> -
>> -    switch (qcow2_get_cluster_type(bs, get_l2_entry(s, l2_slice,
>> l2_index))) {
>> -    case QCOW2_CLUSTER_COMPRESSED:
>> -        ret = -ENOTSUP; /* We cannot partially zeroize compressed
>> clusters */
>>           goto out;
>> -    case QCOW2_CLUSTER_NORMAL:
>> -    case QCOW2_CLUSTER_UNALLOCATED:
>> -        break;
>> -    default:
>> -        g_assert_not_reached();
>>       }
>>   -    old_l2_bitmap = l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index);
>> -
>> -    l2_bitmap |=  QCOW_OFLAG_SUB_ZERO_RANGE(sc, sc + nb_subclusters);
>> -    l2_bitmap &= ~QCOW_OFLAG_SUB_ALLOC_RANGE(sc, sc + nb_subclusters);
>> +    new_l2_bitmap = scri.l2_bitmap;
>> +    new_l2_bitmap |=  QCOW_OFLAG_SUB_ZERO_RANGE(sc, sc +
>> nb_subclusters);
>> +    new_l2_bitmap &= ~QCOW_OFLAG_SUB_ALLOC_RANGE(sc, sc +
>> nb_subclusters);
>>   -    if (old_l2_bitmap != l2_bitmap) {
>> -        set_l2_bitmap(s, l2_slice, l2_index, l2_bitmap);
>> -        qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_slice);
>> +    if (new_l2_bitmap != scri.l2_bitmap) {
>> +        set_l2_bitmap(s, scri.l2_slice, scri.l2_index, new_l2_bitmap);
>> +        qcow2_cache_entry_mark_dirty(s->l2_table_cache, scri.l2_slice);
>>       }
>>         ret = 0;
>>   out:
>> -    qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice);
>> +    qcow2_cache_put(s->l2_table_cache, (void **) &scri.l2_slice);
> 
> qcow2_cache_put() doesn’t look like it’s safe to be called if the table
> is NULL (i.e. `scri.l2_slice == NULL`).  We can get here if
> get_sc_range_info() fails, so that may happen.  I think you should either:
> 
> (A) Implement G_DEFINE_AUTO_CLEANUP_CLEAR_FUNC() for the SCRI type so it
> isn’t an issue (at least I assume that could solve it), or
> 
> (B) Call qcow2_cache_put() here only if `scri.l2_slice != NULL`.
> 

Since in patch 5 I add the code path zero_l2_subclusters() ->
discard_l2_subclusters(), option (A) would only be valid if we manage to
skip SCRI param on this call.  If I understand correctly, you're
suggesting adding a destructor for the SCRI type so that we call
qcow2_cache_put() in one place and one place only when this structure
goes out of the current scope.  But even in this case we'd have to
perform the '!= NULL' check in that destructor, the only benefit is that
we do this in one place.

> In any case, I think it also makes sense to have get_sc_range_info()
> only return a valid table if it returns success, i.e. if there’s any
> error in get_sc_range_info(), it should clean up `scri.l2_slice` itself.
> 

Yes, this seems like a sane behaviour.

> Hanna
> 
>>         return ret;
>>   }
> 


Reply via email to