Re: [Qemu-devel] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Eric Blake

On 03/13/2018 01:48 PM, Alberto Garcia wrote:

On Tue 13 Mar 2018 07:23:36 PM CET, Eric Blake wrote:

+*refcount_cache_size =
+MIN(combined_cache_size, min_refcount_cache);


but here, if combined_cache_size is smaller than min_refcount_cache,


+*l2_cache_size = combined_cache_size - *refcount_cache_size;


then l2_cache_size is set to a negative value.


No, it's set to 0.

If combined == 4k and min_refcount == 256, then

refcount_cache_size = MIN(4k, 256k) // 4k
l2_cache_size = 4k - 4k; // 0


Ah. Mental breakdown on my part in trying to compute (x - MIN()).



Then the caller ensures that it's always set to the minimum (as it did
with the previous code).


So the caller will use larger than the requested limits if the requested 
limits are too small, and we are okay with calculations resulting in 0 
here.  All right, thanks for stepping me through my error; you're good 
to go with:


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Alberto Garcia
On Tue 13 Mar 2018 07:23:36 PM CET, Eric Blake wrote:
>> +*refcount_cache_size =
>> +MIN(combined_cache_size, min_refcount_cache);
>
> but here, if combined_cache_size is smaller than min_refcount_cache,
>
>> +*l2_cache_size = combined_cache_size - *refcount_cache_size;
>
> then l2_cache_size is set to a negative value.

No, it's set to 0.

If combined == 4k and min_refcount == 256, then

   refcount_cache_size = MIN(4k, 256k) // 4k
   l2_cache_size = 4k - 4k; // 0

Then the caller ensures that it's always set to the minimum (as it did
with the previous code).

Berto



Re: [Qemu-devel] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Eric Blake

On 03/13/2018 10:02 AM, Alberto Garcia wrote:

The L2 and refcount caches have default sizes that can be overriden
using the l2-cache-size and refcount-cache-size (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.





However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.


I like the reasoning given here.

I'd count this as a bugfix, safe even during freeze (but it's ultimately 
the maintainer's call)




Signed-off-by: Alberto Garcia 
---
  block/qcow2.c  | 31 +++
  block/qcow2.h  |  4 
  tests/qemu-iotests/137.out |  2 +-
  3 files changed, 20 insertions(+), 17 deletions(-)



+++ b/block/qcow2.c
@@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  } else if (refcount_cache_size_set) {
  *l2_cache_size = combined_cache_size - *refcount_cache_size;
  } else {
-*refcount_cache_size = combined_cache_size
- / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-*l2_cache_size = combined_cache_size - *refcount_cache_size;


In the old code, refcount_cache_size and l2_cache_size are both set to 
fractions of the combined size, so both are positive (even if 
combined_cache_size is too small for the minimums required)



+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+uint64_t min_refcount_cache =
+(uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+/* Assign as much memory as possible to the L2 cache, and
+ * use the remainder for the refcount cache */
+if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+*l2_cache_size = max_l2_cache;
+*refcount_cache_size = combined_cache_size - *l2_cache_size;
+} else {
+*refcount_cache_size =
+MIN(combined_cache_size, min_refcount_cache);


but here, if combined_cache_size is smaller than min_refcount_cache,


+*l2_cache_size = combined_cache_size - *refcount_cache_size;


then l2_cache_size is set to a negative value.

I think you're missing bounds validations that the combined cache size 
is large enough for the minimums required.  Or maybe a slight tweak, if 
it is okay for one of the two caches to be sized at 0 (that is, if 
combined_cache_size is too small for refcount, can it instead be given 
100% to the l2 cache and let refcount be uncached)?



+}
  }
  } else {
-if (!l2_cache_size_set && !refcount_cache_size_set) {
+if (!l2_cache_size_set) {
  *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
   (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
   * s->cluster_size);
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!l2_cache_size_set) {
-*l2_cache_size = *refcount_cache_size
-   * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!refcount_cache_size_set) {
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+}
+if (!refcount_cache_size_set) {
+*refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
  }
  }
  

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-devel] [PATCH 1/2] qcow2: Give the refcount cache the minimum possible size by default

2018-03-13 Thread Alberto Garcia
The L2 and refcount caches have default sizes that can be overriden
using the l2-cache-size and refcount-cache-size (an additional
parameter named cache-size sets the combined size of both caches).

Unless forced by one of the aforementioned parameters, QEMU will set
the unspecified sizes so that the L2 cache is 4 times larger than the
refcount cache.

This is based on the premise that the refcount metadata needs to be
only a fourth of the L2 metadata to cover the same amount of disk
space. This is incorrect for two reasons:

 a) The amount of disk covered by an L2 table depends solely on the
cluster size, but in the case of a refcount block it depends on
the cluster size *and* the width of each refcount entry.
The 4/1 ratio is only valid with 16-bit entries (the default).

 b) When we talk about disk space and L2 tables we are talking about
guest space (L2 tables map guest clusters to host clusters),
whereas refcount blocks are used for host clusters (including
L1/L2 tables and the refcount blocks themselves). On a fully
populated (and uncompressed) qcow2 file, image size > virtual size
so there are more refcount entries than L2 entries.

Problem (a) could be fixed by adjusting the algorithm to take into
account the refcount entry width. Problem (b) could be fixed by
increasing a bit the refcount cache size to account for the clusters
used for qcow2 metadata.

However this patch takes a completely different approach and instead
of keeping a ratio between both cache sizes it assigns as much as
possible to the L2 cache and the remainder to the refcount cache.

The reason is that L2 tables are used for every single I/O request
from the guest and the effect of increasing the cache is significant
and clearly measurable. Refcount blocks are however only used for
cluster allocation and internal snapshots and in practice are accessed
sequentially in most cases, so the effect of increasing the cache is
negligible (even when doing random writes from the guest).

So, make the refcount cache as small as possible unless the user
explicitly asks for a larger one.

Signed-off-by: Alberto Garcia 
---
 block/qcow2.c  | 31 +++
 block/qcow2.h  |  4 
 tests/qemu-iotests/137.out |  2 +-
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7472af6931..8342b0186f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -802,23 +802,30 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-*refcount_cache_size = combined_cache_size
- / (DEFAULT_L2_REFCOUNT_SIZE_RATIO + 1);
-*l2_cache_size = combined_cache_size - *refcount_cache_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
+uint64_t min_refcount_cache =
+(uint64_t) MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+
+/* Assign as much memory as possible to the L2 cache, and
+ * use the remainder for the refcount cache */
+if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
+*l2_cache_size = max_l2_cache;
+*refcount_cache_size = combined_cache_size - *l2_cache_size;
+} else {
+*refcount_cache_size =
+MIN(combined_cache_size, min_refcount_cache);
+*l2_cache_size = combined_cache_size - *refcount_cache_size;
+}
 }
 } else {
-if (!l2_cache_size_set && !refcount_cache_size_set) {
+if (!l2_cache_size_set) {
 *l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!l2_cache_size_set) {
-*l2_cache_size = *refcount_cache_size
-   * DEFAULT_L2_REFCOUNT_SIZE_RATIO;
-} else if (!refcount_cache_size_set) {
-*refcount_cache_size = *l2_cache_size
- / DEFAULT_L2_REFCOUNT_SIZE_RATIO;
+}
+if (!refcount_cache_size_set) {
+*refcount_cache_size = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
 }
 }
 
diff --git a/block/qcow2.h b/block/qcow2.h
index ccb92a9696..cdf41055ae 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -77,10 +77,6 @@
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
 #define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
 
-/* The refblock cache needs only a fourth of the L2 cache size to cover as many
- * clusters */