Re: [PATCH v5 20/31] qcow2: Add subcluster support to qcow2_get_host_offset()

2020-05-08 Thread Alberto Garcia
On Wed 06 May 2020 07:55:29 PM CEST, Eric Blake wrote:
> On 5/5/20 12:38 PM, Alberto Garcia wrote:
>> The logic of this function remains pretty much the same, except that
>> it uses count_contiguous_subclusters(), which combines the logic of
>> count_contiguous_clusters() / count_contiguous_clusters_unallocated()
>> and checks individual subclusters.
>
> Maybe mention that qcow2_cluster_to_subcluster_type() is now inlined 
> into its lone remaining caller.

Ok.

>> +static int count_contiguous_subclusters(BlockDriverState *bs, int 
>> nb_clusters,
>> +unsigned sc_index, uint64_t 
>> *l2_slice,
>> +int l2_index)
>>   {
>
>> +
>> +assert(type != QCOW2_SUBCLUSTER_INVALID); /* The caller should check 
>> this */
>> +assert(l2_index + nb_clusters <= s->l2_size);
>> +
>> +if (type == QCOW2_SUBCLUSTER_COMPRESSED) {
>> +/* Compressed clusters are always processed one by one */
>> +return s->subclusters_per_cluster - sc_index;
>
> Should this assert(sc_index == 0)?

No. The documentation of the function says "Compressed clusters are
always processed one by one but for the purpose of this count they are
treated as if they were divided into subclusters of size
s->subcluster_size".

Let's say we have a compressed cluster at offset 0 (size 64k) and we
perform a read request with offset=32k, size=8k.

qcow2_co_preadv_part() calls qcow2_get_host_offset() and asks "How many
bytes up to 8k can we read in one go at offset 32k?".

The offset is 32k so the first subcluster is #16. And this function
(count_contiguous_subclusters()) is asked "how many contiguous
subclusters do we have starting at subcluster #16?" and the answer is
32 - 16 = 16 subclusters.

qcow2_get_host_offset() only needs 8k/2k = 4 subclusters, so it tells
the original caller (qcow2_co_preadv_part()) "you can read all 8k in one
go and the subcluster type is _COMPRESSED".

>>   for (i = 0; i < nb_clusters; i++) {
>> -uint64_t entry = get_l2_entry(s, l2_slice, l2_index + i);
>> -QCow2ClusterType type = qcow2_get_cluster_type(bs, entry);
>> -
>> -if (type != wanted_type) {
>> -break;
>> +l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
>> +l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
>> +if (check_offset && expected_offset != (l2_entry & 
>> L2E_OFFSET_MASK)) {
>> +return count;
>> +}
>> +for (j = (i == 0) ? sc_index : 0; j < s->subclusters_per_cluster; 
>> j++) {
>> +if (qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, j) != 
>> type) {
>> +return count;
>> +}
>
> This really is checking that sub-clusters have the exact same type.

Correct.

>> @@ -604,24 +604,17 @@ int qcow2_get_host_offset(BlockDriverState *bs, 
>> uint64_t offset,
>>   ret = -EIO;
>>   goto fail;
>>   }
>> -/* Compressed clusters can only be processed one by one */
>> -c = 1;
>>   *host_offset = l2_entry & L2E_COMPRESSED_OFFSET_SIZE_MASK;
>>   break;
>> -case QCOW2_CLUSTER_ZERO_PLAIN:
>> -case QCOW2_CLUSTER_UNALLOCATED:
>> -/* how many empty clusters ? */
>> -c = count_contiguous_clusters_unallocated(bs, nb_clusters,
>> -  l2_slice, l2_index, type);
>
> The old code was counting how many contiguous clusters has similar 
> types, coalescing ranges of two different cluster types into one 
> nb_clusters result.

Not really. The old code had three different cases in the switch()
block:

- Compressed clusters: return always 1.

- Unallocated / zero_plain: count the number of clusters of the exact
  same type (one of the parameters was 'wanted_type').

- Normal / zero_alloc: count the number of clusters of type normal or
  zero_alloc that are contiguous on the image file **but stop** if the
  QCOW_OFLAG_ZERO flag changes. In other words, count contiguous
  clusters of the same type.

The new code simply merges all three cases in the same function. It
could be done even without having subclusters.

Plus, the old function was also returning the cluster type, so it had to
be the same one for the whole region.

>>   if (offset_into_cluster(s, host_cluster_offset)) {
>>   qcow2_signal_corruption(bs, true, -1, -1,
>>   "Cluster allocation offset %#"
>> @@ -647,9 +640,11 @@ int qcow2_get_host_offset(BlockDriverState *bs, 
>> uint64_t offset,
>>   abort();
>>   }
>>   
>> +sc = count_contiguous_subclusters(bs, nb_clusters, sc_index,
>> +  l2_slice, l2_index);
>
> But the new code is stopping at the first different subcluster type, 
> rather than trying to coalesce as many compatible types into one larger 
> nb_clusters.  When coupled with patch 19, that factors into my concern 
> over whether patch 19 needed to 

Re: [PATCH v5 20/31] qcow2: Add subcluster support to qcow2_get_host_offset()

2020-05-06 Thread Eric Blake

On 5/5/20 12:38 PM, Alberto Garcia wrote:

The logic of this function remains pretty much the same, except that
it uses count_contiguous_subclusters(), which combines the logic of
count_contiguous_clusters() / count_contiguous_clusters_unallocated()
and checks individual subclusters.



Maybe mention that qcow2_cluster_to_subcluster_type() is now inlined 
into its lone remaining caller.



Signed-off-by: Alberto Garcia 
---
  block/qcow2.h |  38 +---
  block/qcow2-cluster.c | 141 --
  2 files changed, 82 insertions(+), 97 deletions(-)




+++ b/block/qcow2-cluster.c
@@ -376,66 +376,58 @@ fail:



+static int count_contiguous_subclusters(BlockDriverState *bs, int nb_clusters,
+unsigned sc_index, uint64_t *l2_slice,
+int l2_index)
  {



+
+assert(type != QCOW2_SUBCLUSTER_INVALID); /* The caller should check this 
*/
+assert(l2_index + nb_clusters <= s->l2_size);
+
+if (type == QCOW2_SUBCLUSTER_COMPRESSED) {
+/* Compressed clusters are always processed one by one */
+return s->subclusters_per_cluster - sc_index;


Should this assert(sc_index == 0)?


  for (i = 0; i < nb_clusters; i++) {
-uint64_t entry = get_l2_entry(s, l2_slice, l2_index + i);
-QCow2ClusterType type = qcow2_get_cluster_type(bs, entry);
-
-if (type != wanted_type) {
-break;
+l2_entry = get_l2_entry(s, l2_slice, l2_index + i);
+l2_bitmap = get_l2_bitmap(s, l2_slice, l2_index + i);
+if (check_offset && expected_offset != (l2_entry & L2E_OFFSET_MASK)) {
+return count;
+}
+for (j = (i == 0) ? sc_index : 0; j < s->subclusters_per_cluster; j++) 
{
+if (qcow2_get_subcluster_type(bs, l2_entry, l2_bitmap, j) != type) 
{
+return count;
+}


This really is checking that sub-clusters have the exact same type.


@@ -604,24 +604,17 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t 
offset,
  ret = -EIO;
  goto fail;
  }
-/* Compressed clusters can only be processed one by one */
-c = 1;
  *host_offset = l2_entry & L2E_COMPRESSED_OFFSET_SIZE_MASK;
  break;
-case QCOW2_CLUSTER_ZERO_PLAIN:
-case QCOW2_CLUSTER_UNALLOCATED:
-/* how many empty clusters ? */
-c = count_contiguous_clusters_unallocated(bs, nb_clusters,
-  l2_slice, l2_index, type);


The old code was counting how many contiguous clusters has similar 
types, coalescing ranges of two different cluster types into one 
nb_clusters result.



+case QCOW2_SUBCLUSTER_ZERO_PLAIN:
+case QCOW2_SUBCLUSTER_UNALLOCATED_PLAIN:
  *host_offset = 0;
  break;
-case QCOW2_CLUSTER_ZERO_ALLOC:
-case QCOW2_CLUSTER_NORMAL: {
+case QCOW2_SUBCLUSTER_ZERO_ALLOC:
+case QCOW2_SUBCLUSTER_NORMAL:
+case QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC: {
  uint64_t host_cluster_offset = l2_entry & L2E_OFFSET_MASK;
  *host_offset = host_cluster_offset + offset_in_cluster;
-/* how many allocated clusters ? */
-c = count_contiguous_clusters(bs, nb_clusters, s->cluster_size,
-  l2_slice, l2_index, QCOW_OFLAG_ZERO);


and here coalescing three different cluster types into one nb_clusters 
result.



  if (offset_into_cluster(s, host_cluster_offset)) {
  qcow2_signal_corruption(bs, true, -1, -1,
  "Cluster allocation offset %#"
@@ -647,9 +640,11 @@ int qcow2_get_host_offset(BlockDriverState *bs, uint64_t 
offset,
  abort();
  }
  
+sc = count_contiguous_subclusters(bs, nb_clusters, sc_index,

+  l2_slice, l2_index);


But the new code is stopping at the first different subcluster type, 
rather than trying to coalesce as many compatible types into one larger 
nb_clusters.  When coupled with patch 19, that factors into my concern 
over whether patch 19 needed to check for INVALID clusters in the 
middle, or whether its fail: label was unreachable.  But it also means 
that you are potentially fragmenting the write in more places (every 
time a subcluster status changes) rather than coalescing similar status, 
the way the old code used to operate.


I don't think the extra fragmentation causes any correctness issues, but 
it may cause performance issues.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org