Re: [Qemu-block] [PATCH] qcow2: Fix the calculation of the maximum L2 cache size

2019-08-18 Thread Leonid Bloch
Thanks, Berto!

On 8/16/19 3:17 PM, Alberto Garcia wrote:
> The size of the qcow2 L2 cache defaults to 32 MB, which can be easily
> larger than the maximum amount of L2 metadata that the image can have.
> For example: with 64 KB clusters the user would need a qcow2 image
> with a virtual size of 256 GB in order to have 32 MB of L2 metadata.
> 
> Because of that, since commit b749562d9822d14ef69c9eaa5f85903010b86c30
> we forbid the L2 cache to become larger than the maximum amount of L2
> metadata for the image, calculated using this formula:
> 
>  uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
> 
> The problem with this formula is that the result should be rounded up
> to the cluster size because an L2 table on disk always takes one full
> cluster.
> 
> For example, a 1280 MB qcow2 image with 64 KB clusters needs exactly
> 160 KB of L2 metadata, but we need 192 KB on disk (3 clusters) even if
> the last 32 KB of those are not going to be used.
> 
> However QEMU rounds the numbers down and only creates 2 cache tables
> (128 KB), which is not enough for the image.
> 
> A quick test doing 4KB random writes on a 1280 MB image gives me
> around 500 IOPS, while with the correct cache size I get 16K IOPS.
> 
> Signed-off-by: Alberto Garcia 
> ---
>   block/qcow2.c | 6 +-
>   1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index 039bdc2f7e..865839682c 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -826,7 +826,11 @@ static void read_cache_sizes(BlockDriverState *bs, 
> QemuOpts *opts,
>   bool l2_cache_entry_size_set;
>   int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
>   uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
> -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
> +uint64_t max_l2_entries = DIV_ROUND_UP(virtual_disk_size, 
> s->cluster_size);
> +/* An L2 table is always one cluster in size so the max cache size
> + * should be a multiple of the cluster size. */
> +uint64_t max_l2_cache = ROUND_UP(max_l2_entries * sizeof(uint64_t),
> + s->cluster_size);
>   
>   combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
>   l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
> 


Re: [Qemu-block] [PATCH 0/1] block: Eliminate the S_1KiB, S_2KiB, ... macros

2019-01-13 Thread Leonid Bloch
On 1/11/19 9:14 PM, Markus Armbruster wrote:
> Back in September, Leonid Block added a whole bunch of macros (commit

* Bloch. :)

> 540b8492618) to improve readability of qcow2.h a bit (commit
> b6a95c6d100).  He later used them to fix the "vdi" driver's parameter
> cluster_size's default value (commit 3dd5b8f4718).  He has now
> proposed a further patch[1] to auto-generate these macros.  That patch
> feels overengineered to me.
> 
> On closer examination, I found I dislike the macros before his new
> patch.  So did Eric Blake.
> 
> The macros exist because the common KiB, MiB, ... macros aren't usable
> when you need a literal rather than a constant expression.
> stringify() does, and we use it to define the QemuOpts default value.
> 
> Eric proposed to improve QemuOpts to accept integer default values,
> too[2].  Before I review that patch series, I want to establish a
> "stupidest solution that can possibly work" baseline.  And that's what
> this patch is.
> 
> [1] [PATCH v2 0/1] include: Auto-generate the sizes lookup table
> Message-ID: <20190103213320.2653-1-lbl...@janustech.com>
> 
> [2] [PATCH v3 0/6] include: Auto-generate the sizes lookup table
> Message-Id: <20190110191901.5082-1-ebl...@redhat.com>
> 
> Markus Armbruster (1):
>block: Eliminate the S_1KiB, S_2KiB, ... macros
> 
>   block/qcow2.h| 10 +++---
>   block/vdi.c  |  3 +-
>   include/qemu/units.h | 73 
>   3 files changed, 7 insertions(+), 79 deletions(-)
> 


Re: [Qemu-block] [PATCH 1/1] block: Eliminate the S_1KiB, S_2KiB, ... macros

2019-01-13 Thread Leonid Bloch
Hi,

On 1/11/19 9:14 PM, Markus Armbruster wrote:
> We define 54 macros for the powers of two >= 1024.  We use six, in six
> macro definitions.  Four of them could just as well use the common MiB
> macro, so do that.  The remaining two can't, because they get passed
> to stringify.  Replace the macro by the literal number there.
> Slightly harder to read in one instance (1048576 vs. S_1MiB), so add a
> comment there.  The other instance is a wash: 65536 vs S_64KiB.  65536
> has been good enough for more than seven years there.
> 
> This effectively reverts commit 540b8492618 and 1240ac558d3.
> 
> Signed-off-by: Markus Armbruster 
> ---
>   block/qcow2.h| 10 +++---
>   block/vdi.c  |  3 +-
>   include/qemu/units.h | 73 
>   3 files changed, 7 insertions(+), 79 deletions(-)
> 
> diff --git a/block/qcow2.h b/block/qcow2.h
> index a98d24500b..2380cbfb41 100644
> --- a/block/qcow2.h
> +++ b/block/qcow2.h
> @@ -50,11 +50,11 @@
>   
>   /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
>* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
> -#define QCOW_MAX_REFTABLE_SIZE S_8MiB
> +#define QCOW_MAX_REFTABLE_SIZE (8 * MiB)
>   
>   /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
>* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
> -#define QCOW_MAX_L1_SIZE S_32MiB
> +#define QCOW_MAX_L1_SIZE (32 * MiB)
>   
>   /* Allow for an average of 1k per snapshot table entry, should be plenty of
>* space for snapshot names and IDs */
> @@ -81,15 +81,15 @@
>   #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
>   
>   #ifdef CONFIG_LINUX
> -#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
>   #define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
>   #else
> -#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
> +#define DEFAULT_L2_CACHE_MAX_SIZE (8 * MiB)
>   /* Cache clean interval is currently available only on Linux, so must be 0 
> */
>   #define DEFAULT_CACHE_CLEAN_INTERVAL 0
>   #endif
>   
> -#define DEFAULT_CLUSTER_SIZE S_64KiB
> +#define DEFAULT_CLUSTER_SIZE 65536

/* Note: can't use 64 * KiB here, because it's passed to stringify() */

Otherwise fine with me. The other solutions (including mine) indeed seem 
overengineered compared to this.

Leonid.

>   
>   #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
>   #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
> diff --git a/block/vdi.c b/block/vdi.c
> index 2380daa583..bf1d19dd68 100644
> --- a/block/vdi.c
> +++ b/block/vdi.c
> @@ -85,7 +85,8 @@
>   #define BLOCK_OPT_STATIC "static"
>   
>   #define SECTOR_SIZE 512
> -#define DEFAULT_CLUSTER_SIZE S_1MiB
> +#define DEFAULT_CLUSTER_SIZE 1048576
> +/* Note: can't use 1 * MiB, because it's passed to stringify() */
>   
>   #if defined(CONFIG_VDI_DEBUG)
>   #define VDI_DEBUG 1
> diff --git a/include/qemu/units.h b/include/qemu/units.h
> index 1c959d182e..692db3fbb2 100644
> --- a/include/qemu/units.h
> +++ b/include/qemu/units.h
> @@ -17,77 +17,4 @@
>   #define PiB (INT64_C(1) << 50)
>   #define EiB (INT64_C(1) << 60)
>   
> -/*
> - * The following lookup table is intended to be used when a literal string of
> - * the number of bytes is required (for example if it needs to be 
> stringified).
> - * It can also be used for generic shortcuts of power-of-two sizes.
> - * This table is generated using the AWK script below:
> - *
> - *  BEGIN {
> - *  suffix="KMGTPE";
> - *  for(i=10; i<64; i++) {
> - *  val=2**i;
> - *  s=substr(suffix, int(i/10), 1);
> - *  n=2**(i%10);
> - *  pad=21-int(log(n)/log(10));
> - *  printf("#define S_%d%siB %*d\n", n, s, pad, val);
> - *  }
> - *  }
> - */
> -
> -#define S_1KiB  1024
> -#define S_2KiB  2048
> -#define S_4KiB  4096
> -#define S_8KiB  8192
> -#define S_16KiB16384
> -#define S_32KiB32768
> -#define S_64KiB65536
> -#define S_128KiB  131072
> -#define S_256KiB  262144
> -#define S_512KiB  524288
> -#define S_1MiB   1048576
> -#define S_2MiB   2097152
> -#define S_4MiB   4194304
> -#define S_8MiB   8388608
> -#define S_16MiB 16777216
> -#define S_32MiB 33554432
> -#define S_64MiB 67108864
> -#define S_128MiB   134217728
> -#define S_256MiB   268435456
> -#define S_512MiB   536870912
> -#define S_1GiB1073741824
> -#define S_2GiB2147483648
> -#define S_4GiB4294967296
> -#define S_8GiB8589934592
> -#define S_16GiB  17179869184
> -#define S_32GiB  34359738368
> -#define S_64GiB  68719476736
> -#define S_128GiB137438953472
> -#define S_256GiB274877906944
> -#define S_512GiB549755813888
> -#define S_1TiB 

Re: [Qemu-block] [PATCH v3 1/6] qemu-option: Allow integer defaults

2019-01-11 Thread Leonid Bloch
On 1/11/19 6:23 PM, Eric Blake wrote:
> On 1/11/19 8:14 AM, Leonid Bloch wrote:
>> On 1/10/19 9:18 PM, Eric Blake wrote:
>>> Set the framework up for declaring integer options with an integer
>>> default, instead of our current insane approach of requiring the
>>> default value to be given as a string (which then has to be reparsed
>>> at every use that wants a number).  git grep '[^.]def_value_str' says
>>> that we have done a good job of NOT abusing the internal field
>>> outside of the implementation in qemu-option.c; therefore, it is not
>>> too hard to audit that all code can either handle the new integer
>>> defaults correctly or abort because a caller violated constraints.
>>> Sadly, we DO have a constraint that qemu_opt_get() should not be
>>> called on an option that has an integer type, because we have no
>>> where to stash a cached const char * result; but callers that want
>>> an integer should be using qemu_opt_get_number() and friends
>>> anyways.
>>>
>>> Signed-off-by: Eric Blake 
>>> ---
>>>include/qemu/option.h | 12 
>>>util/qemu-option.c| 69 +--
>>>2 files changed, 72 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/include/qemu/option.h b/include/qemu/option.h
>>> index 844587cab39..46b80d5a6e1 100644
>>> --- a/include/qemu/option.h
>>> +++ b/include/qemu/option.h
>>> @@ -46,6 +46,18 @@ typedef struct QemuOptDesc {
>>>const char *name;
>>>enum QemuOptType type;
>>>const char *help;
>>> +/*
>>> + * For QEMU_OPT_STRING: Leave def_value_int 0, and set def_value_str
>>> + * to a default value or leave NULL for no default.
>>> + *
>>> + * For other types: Initialize at most non-zero def_value_int or a
>>> + * parseable def_value_str for a default (must use a string for an
>>> + * explicit default of 0, although an implicit default generally
>>> + * works).  If setting def_value_int, calling qemu_opt_get() on
>>> + * that option will abort(); instead, call qemu_opt_get_del() or a
>>> + * typed getter.
>>> + */
>>> +uint64_t def_value_int;
>>>const char *def_value_str;
>>>} QemuOptDesc;
>>>
> 
>>
>> If I understand correctly, the number will still be compiled into the
>> binary as an expression string, but will be printed correctly during
>> runtime?
> 
> Anyone that uses .def_value_int will compile into the binary as an
> 8-byte integer (regardless of how that number was spelled in C, either
> as a single constant, or as a constant expression), and NOT as a decimal
> string.  String conversions for code that asks for a string will happen
> by a runtime printf() (ideally, such code is limited to the case of
> printing out information during help output).  Code that is smart enough
> to request a number now gets the default value directly, rather than the
> old way of having to strtoll() decode a decimal string back into a
> number.  No one should ever be using .def_value_str = stringify(macro),
> when they can instead just use .def_value_int = macro (which is what the
> next patch fixes).

Yes, you're right. Thanks for the explanation!


Re: [Qemu-block] [PATCH v3 1/6] qemu-option: Allow integer defaults

2019-01-11 Thread Leonid Bloch
On 1/10/19 9:18 PM, Eric Blake wrote:
> Set the framework up for declaring integer options with an integer
> default, instead of our current insane approach of requiring the
> default value to be given as a string (which then has to be reparsed
> at every use that wants a number).  git grep '[^.]def_value_str' says
> that we have done a good job of NOT abusing the internal field
> outside of the implementation in qemu-option.c; therefore, it is not
> too hard to audit that all code can either handle the new integer
> defaults correctly or abort because a caller violated constraints.
> Sadly, we DO have a constraint that qemu_opt_get() should not be
> called on an option that has an integer type, because we have no
> where to stash a cached const char * result; but callers that want
> an integer should be using qemu_opt_get_number() and friends
> anyways.
> 
> Signed-off-by: Eric Blake 
> ---
>   include/qemu/option.h | 12 
>   util/qemu-option.c| 69 +--
>   2 files changed, 72 insertions(+), 9 deletions(-)
> 
> diff --git a/include/qemu/option.h b/include/qemu/option.h
> index 844587cab39..46b80d5a6e1 100644
> --- a/include/qemu/option.h
> +++ b/include/qemu/option.h
> @@ -46,6 +46,18 @@ typedef struct QemuOptDesc {
>   const char *name;
>   enum QemuOptType type;
>   const char *help;
> +/*
> + * For QEMU_OPT_STRING: Leave def_value_int 0, and set def_value_str
> + * to a default value or leave NULL for no default.
> + *
> + * For other types: Initialize at most non-zero def_value_int or a
> + * parseable def_value_str for a default (must use a string for an
> + * explicit default of 0, although an implicit default generally
> + * works).  If setting def_value_int, calling qemu_opt_get() on
> + * that option will abort(); instead, call qemu_opt_get_del() or a
> + * typed getter.
> + */
> +uint64_t def_value_int;
>   const char *def_value_str;
>   } QemuOptDesc;
> 
> diff --git a/util/qemu-option.c b/util/qemu-option.c
> index de42e2a406a..06c4e8102a8 100644
> --- a/util/qemu-option.c
> +++ b/util/qemu-option.c
> @@ -321,7 +321,8 @@ const char *qemu_opt_get(QemuOpts *opts, const char *name)
>   opt = qemu_opt_find(opts, name);
>   if (!opt) {
>   const QemuOptDesc *desc = find_desc_by_name(opts->list->desc, name);
> -if (desc && desc->def_value_str) {
> +if (desc) {
> +assert(!desc->def_value_int);
>   return desc->def_value_str;
>   }
>   }
> @@ -364,8 +365,22 @@ char *qemu_opt_get_del(QemuOpts *opts, const char *name)
>   opt = qemu_opt_find(opts, name);
>   if (!opt) {
>   desc = find_desc_by_name(opts->list->desc, name);
> -if (desc && desc->def_value_str) {
> -str = g_strdup(desc->def_value_str);
> +if (desc) {
> +if (desc->def_value_str) {
> +str = g_strdup(desc->def_value_str);
> +} else if (desc->def_value_int) {
> +switch (desc->type) {
> +case QEMU_OPT_BOOL:
> +str = g_strdup("on");
> +break;
> +case QEMU_OPT_NUMBER:
> +case QEMU_OPT_SIZE:
> +str = g_strdup_printf("%" PRId64, desc->def_value_int);
> +break;
> +default:
> +abort();
> +}
> +}
>   }
>   return str;
>   }
> @@ -400,8 +415,15 @@ static bool qemu_opt_get_bool_helper(QemuOpts *opts, 
> const char *name,
>   opt = qemu_opt_find(opts, name);
>   if (opt == NULL) {
>   const QemuOptDesc *desc = find_desc_by_name(opts->list->desc, name);
> -if (desc && desc->def_value_str) {
> -parse_option_bool(name, desc->def_value_str, , _abort);
> +if (desc) {
> +if (desc->def_value_int) {
> +assert(desc->type != QEMU_OPT_STRING);
> +return true;
> +}
> +if (desc->def_value_str) {
> +parse_option_bool(name, desc->def_value_str, ,
> +  _abort);
> +}
>   }
>   return ret;
>   }
> @@ -436,8 +458,15 @@ static uint64_t qemu_opt_get_number_helper(QemuOpts 
> *opts, const char *name,
>   opt = qemu_opt_find(opts, name);
>   if (opt == NULL) {
>   const QemuOptDesc *desc = find_desc_by_name(opts->list->desc, name);
> -if (desc && desc->def_value_str) {
> -parse_option_number(name, desc->def_value_str, , 
> _abort);
> +if (desc) {
> +if (desc->def_value_int) {
> +assert(desc->type != QEMU_OPT_STRING);
> +return desc->def_value_int;
> +}
> +if (desc->def_value_str) {
> +parse_option_number(name, desc->def_value_str, ,
> +

Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] include: Auto-generate the sizes lookup table

2019-01-10 Thread Leonid Bloch
On 1/10/19 2:51 PM,  wrote:
> Leonid Bloch  writes:
> 
>> Hi,
>>
>> On 1/8/19 2:20 PM, Kevin Wolf wrote:
>>> Am 03.01.2019 um 22:33 hat Leonid Bloch geschrieben:
>>>> The lookup table for power-of-two sizes is now auto-generated during the
>>>> build, and not hard-coded into the units.h file.
>>>>
>>>> This partially reverts commit 540b8492618eb.
>>>>
>>>> Signed-off-by: Leonid Bloch 
>>>
>>> During a downstream review, Max found a problem with the table that we
>>> could fix while we're touching it:
>>>
>>>   Upstream: All >= S_2GiB are not valid ints.  (qemu assumes that
>>>   sizeof(int) == 4, right?)  So S_2GiB should be 2147483648u and all
>>>   above should be ...ull or better UINT64_C().
> 
> So their (integer) type can vary.  Whether that's a problem depends on
> how the macros are used.
> 
>> But the initial reasoning for this table was to have a pure number
>> there. If there will be strings like "2147483648u/ull" or
>> "UINT64_C(...)" there, they will be stringified, literally, and will
>> appear as such inside the binary.
> 
> "Macro that expands to an integer constant that stringify() turns into a
> decimal number" is a rather peculiar contract.
> 
>>If specifying the unit64 type is
>> really needed, one can always use, e.g., 2 * GiB, from units.h.
> 
> Right now I suspect these S_ macros should generally be avoided.
> 
> We have 54 of them.  I count six uses:
> 
>  block/qcow2.h:#define QCOW_MAX_REFTABLE_SIZE S_8MiB
>  block/qcow2.h:#define QCOW_MAX_L1_SIZE S_32MiB
>  block/qcow2.h:#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
>  block/qcow2.h:#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
>  block/qcow2.h:#define DEFAULT_CLUSTER_SIZE S_64KiB
>  block/vdi.c:#define DEFAULT_CLUSTER_SIZE S_1MiB
> 
> Which if them truly need stringify-able integers?
> 

These two:

block/qcow2.h:#define DEFAULT_CLUSTER_SIZE
block/vdi.c:#define DEFAULT_CLUSTER_SIZE.


Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] include: Auto-generate the sizes lookup table

2019-01-10 Thread Leonid Bloch
On 1/10/19 2:40 PM, Markus Armbruster wrote:
> Leonid Bloch  writes:
> 
>> Hi,
>>
>> On 1/8/19 11:31 AM, Markus Armbruster wrote:
>>> I'd leave it hard-coded.  Replacing a few trivial defines by an arguably
>>> less trivial script doesn't feel like an improvement.  In this case, it
>>> doesn't even save lines.
>>
>> As I've said, I'm fine with that. The autogeneration sounds the right
>> thing to do here, as the table is autogenertaed anyway, but indeed it is
>> already there, explained, and even the script for generating it appears
>> in the comments for reproducibility.
>>
>>>
>>> I'm not sure I'd use shell for this, but since you already wrote it and
>>> it works...
>>>
>>
>> If you've noticed, the original script was in AWK. But to be as generic
>> as possible, I didn't write the generation script in AWK because even
>> AWK is not guaranteed to be installed on the build system. The only
>> interpreted language that is guaranteed to be there is shell (most basic
>> shell) because .configure itself needs it.
> 
> Well,
> 
>  $ grep -w awk configure
>  maj=`libgcrypt-config --version | awk -F . '{print $1}'`
>  min=`libgcrypt-config --version | awk -F . '{print $2}'`
> 
> The build also requires Python.

Not for critical things as this. I mean it can still build OK without 
Python.

> 
> [...]
> 


Re: [Qemu-block] [PATCH v2 1/1] include: Auto-generate the sizes lookup table

2019-01-10 Thread Leonid Bloch
Hi,

On 1/8/19 2:20 PM, Kevin Wolf wrote:
> Am 03.01.2019 um 22:33 hat Leonid Bloch geschrieben:
>> The lookup table for power-of-two sizes is now auto-generated during the
>> build, and not hard-coded into the units.h file.
>>
>> This partially reverts commit 540b8492618eb.
>>
>> Signed-off-by: Leonid Bloch 
> 
> During a downstream review, Max found a problem with the table that we
> could fix while we're touching it:
> 
>  Upstream: All >= S_2GiB are not valid ints.  (qemu assumes that
>  sizeof(int) == 4, right?)  So S_2GiB should be 2147483648u and all
>  above should be ...ull or better UINT64_C().

But the initial reasoning for this table was to have a pure number 
there. If there will be strings like "2147483648u/ull" or 
"UINT64_C(...)" there, they will be stringified, literally, and will 
appear as such inside the binary. If specifying the unit64 type is 
really needed, one can always use, e.g., 2 * GiB, from units.h.

Leonid.


Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] include: Auto-generate the sizes lookup table

2019-01-10 Thread Leonid Bloch
Hi,

On 1/8/19 11:31 AM, Markus Armbruster wrote:
> I'd leave it hard-coded.  Replacing a few trivial defines by an arguably
> less trivial script doesn't feel like an improvement.  In this case, it
> doesn't even save lines.

As I've said, I'm fine with that. The autogeneration sounds the right 
thing to do here, as the table is autogenertaed anyway, but indeed it is 
already there, explained, and even the script for generating it appears 
in the comments for reproducibility.

> 
> I'm not sure I'd use shell for this, but since you already wrote it and
> it works...
> 

If you've noticed, the original script was in AWK. But to be as generic 
as possible, I didn't write the generation script in AWK because even 
AWK is not guaranteed to be installed on the build system. The only 
interpreted language that is guaranteed to be there is shell (most basic 
shell) because .configure itself needs it.

Sure, the script could be prettier and shorter, but I wanted to keep it 
compatible with the most basic shell, not only Bash, and without needing 
external programs.

Eric - thanks for the comment about 'local' - I will get rid of it if we 
decide to include this patch.

> 
> Unchecked use of command-line arguments is not nice:
> 
>  $ scripts/gen-sizes.sh
>  scripts/gen-sizes.sh: line 64: : No such file or directory
>  scripts/gen-sizes.sh: line 65: : No such file or directory
>  scripts/gen-sizes.sh: line 66: : No such file or directory
> 
> You should error out if $# -ne 1.  But in such a simple script, I'd
> dispense with arguments and print to stdout.  Matter of taste.
> Rejecting $# -ne 0 is still nice then.
> 

Agree, thanks!

Leonid.
___


Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] include: Auto-generate the sizes lookup table

2019-01-04 Thread Leonid Bloch
On 1/4/19 12:31 PM, Alberto Garcia wrote:
> On Thu 03 Jan 2019 10:42:30 PM CET, Eric Blake wrote:
> 
>> In my view, code generators make sense when used on code that is
>> expected to change over time (a good example is QAPI because we add
>> new commands every release; other places might be a generator to help
>> deal with syscall handlers since newer kernels can add a syscall; or
>> even the fact that we have used some powerful GNU make textual
>> processing to make it easy to add files to particular subsets of the
>> build with as few lines edited as possible), where the goal is that
>> the generator gives you both a compact representation that is easier
>> to edit, and that the expansion from the generator ensures that
>> repetitive boilerplate is formed without typos.  In short, if a
>> generator results in a net reduction in lines of edited source in
>> relation to the lines it produces, AND if the source that gets
>> regenerated is likely to change, then it makes total sense to spend
>> time on the generator.  But when the amount of effort to write a
>> generator costs as much as just hard-coding the list outright,
>> especially when the list is not going to change (there really aren't
>> any other powers of 2 within 64 bits), I'm not sure a generator adds
>> any value.
> 
> I agree with Eric.

Fine with me. I just thought that in the previous conversation 
(https://patchwork.kernel.org/patch/10666975/#22302435) we have agreed 
that I'll send this patch. I've sent v2 already with Phil's suggestions 
included, please feel free to pull it if desired.

Leonid.

> 
> Berto
> 


[Qemu-block] [PATCH v2 0/1] include: Auto-generate the sizes lookup table

2019-01-03 Thread Leonid Bloch
Following the conversations here:
https://patchwork.kernel.org/patch/10665157
and here:
https://patchwork.kernel.org/patch/10666975

Making the lookup table for power-of-two sizes auto-generated, instead
of being hard-coded into the units.h file.

I'm not sure if the changes I've made to Makefile here are "standard".
Please correct me if that's not the case.

Sorry it took so much time - I was busy with something completely
different.


Regards,
Leonid.

---

Differences from v1:
* The generated header moved from $(SRC_PATH)/include/qemu/sizes.h to
  pow2_sizes.h in the build directory.
* The commit message mentions the commit which is partially reverted by
  the current commit.
* Small changes in the comments of the generated header file.

Leonid Bloch (1):
  include: Auto-generate the sizes lookup table

 .gitignore   |  1 +
 Makefile |  5 +++
 block/qcow2.h|  2 +-
 block/vdi.c  |  1 +
 include/qemu/units.h | 73 
 scripts/gen-sizes.sh | 66 +++
 6 files changed, 74 insertions(+), 74 deletions(-)
 create mode 100755 scripts/gen-sizes.sh

-- 
2.20.1




[Qemu-block] [PATCH v2 1/1] include: Auto-generate the sizes lookup table

2019-01-03 Thread Leonid Bloch
The lookup table for power-of-two sizes is now auto-generated during the
build, and not hard-coded into the units.h file.

This partially reverts commit 540b8492618eb.

Signed-off-by: Leonid Bloch 
---
 .gitignore   |  1 +
 Makefile |  5 +++
 block/qcow2.h|  2 +-
 block/vdi.c  |  1 +
 include/qemu/units.h | 73 
 scripts/gen-sizes.sh | 66 +++
 6 files changed, 74 insertions(+), 74 deletions(-)
 create mode 100755 scripts/gen-sizes.sh

diff --git a/.gitignore b/.gitignore
index 0430257313..721a7f4454 100644
--- a/.gitignore
+++ b/.gitignore
@@ -59,6 +59,7 @@
 /qemu-version.h
 /qemu-version.h.tmp
 /module_block.h
+/pow2_sizes.h
 /scsi/qemu-pr-helper
 /vhost-user-scsi
 /vhost-user-blk
diff --git a/Makefile b/Makefile
index dd53965f77..db72786ccb 100644
--- a/Makefile
+++ b/Makefile
@@ -122,6 +122,8 @@ endif
 
 GENERATED_FILES += module_block.h
 
+GENERATED_FILES += pow2_sizes.h
+
 TRACE_HEADERS = trace-root.h $(trace-events-subdirs:%=%/trace.h)
 TRACE_SOURCES = trace-root.c $(trace-events-subdirs:%=%/trace.c)
 TRACE_DTRACE =
@@ -499,6 +501,9 @@ ifdef CONFIG_MPATH
 scsi/qemu-pr-helper$(EXESUF): LIBS += -ludev -lmultipath -lmpathpersist
 endif
 
+pow2_sizes.h: $(SRC_PATH)/scripts/gen-sizes.sh
+   $(call quiet-command,sh $(SRC_PATH)/scripts/gen-sizes.sh $@,"GEN","$@")
+
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > 
$@,"GEN","$@")
 
diff --git a/block/qcow2.h b/block/qcow2.h
index a98d24500b..f5fa419ae7 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,7 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
-#include "qemu/units.h"
+#include "pow2_sizes.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
diff --git a/block/vdi.c b/block/vdi.c
index 2380daa583..06d7335b3e 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -51,6 +51,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/units.h"
+#include "pow2_sizes.h"
 #include "qapi/error.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-block-core.h"
diff --git a/include/qemu/units.h b/include/qemu/units.h
index 1c959d182e..692db3fbb2 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,77 +17,4 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
-/*
- * The following lookup table is intended to be used when a literal string of
- * the number of bytes is required (for example if it needs to be stringified).
- * It can also be used for generic shortcuts of power-of-two sizes.
- * This table is generated using the AWK script below:
- *
- *  BEGIN {
- *  suffix="KMGTPE";
- *  for(i=10; i<64; i++) {
- *  val=2**i;
- *  s=substr(suffix, int(i/10), 1);
- *  n=2**(i%10);
- *  pad=21-int(log(n)/log(10));
- *  printf("#define S_%d%siB %*d\n", n, s, pad, val);
- *  }
- *  }
- */
-
-#define S_1KiB  1024
-#define S_2KiB  2048
-#define S_4KiB  4096
-#define S_8KiB  8192
-#define S_16KiB16384
-#define S_32KiB32768
-#define S_64KiB65536
-#define S_128KiB  131072
-#define S_256KiB  262144
-#define S_512KiB  524288
-#define S_1MiB   1048576
-#define S_2MiB   2097152
-#define S_4MiB   4194304
-#define S_8MiB   8388608
-#define S_16MiB 16777216
-#define S_32MiB 33554432
-#define S_64MiB 67108864
-#define S_128MiB   134217728
-#define S_256MiB   268435456
-#define S_512MiB   536870912
-#define S_1GiB1073741824
-#define S_2GiB2147483648
-#define S_4GiB4294967296
-#define S_8GiB8589934592
-#define S_16GiB  17179869184
-#define S_32GiB  34359738368
-#define S_64GiB  68719476736
-#define S_128GiB137438953472
-#define S_256GiB274877906944
-#define S_512GiB549755813888
-#define S_1TiB 1099511627776
-#define S_2TiB 219902322
-#define S_4TiB 4398046511104
-#define S_8TiB 8796093022208
-#define S_16TiB   17592186044416
-#define S_32TiB   35184372088832
-#define S_64TiB   70368744177664
-#define S_128TiB 140737488355328
-#define S_256TiB 281474976710656
-#define S_512TiB 562949953421312
-#define S_1PiB  1125899906842624
-#define S_2PiB  2251799813685248
-#define S_4PiB  4503599627370496
-#define S_8PiB  9007199254740992
-#define S_16PiB18014398509481984
-#define S_32PiB36028797018963968
-#define S_64PiB72057594037927936
-#define S_128PiB  14

Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] include: Auto-generate the sizes lookup table

2019-01-03 Thread Leonid Bloch
Hi,

On 1/4/19 12:04 AM, Eric Blake wrote:
> On 1/3/19 2:57 PM, Leonid Bloch wrote:
> 
>>> I have to say that I'm not very convinced of the benefits of replacing a
>>> set of trivial numeric macros with a longer and harder to read shell
>>> script accompanied by changes to the build system.
>>
>> I think that the benefit is that the script is easily verifiable,
>> whereas if someone would like to verify the table, they will need to
>> generate it themselves. Also, this table is automatically generated
>> anyway, so it only makes sense to generate it during the build.
> 
> But no one is submitting patches to active modify the table.  The work
> has already been done once to generate it, and the reviewers have
> already verified it; at this point, no further changes are likely to
> happen (other than my pipe dream of entirely getting rid of the table in
> favor of using runtime generation of human-friendly default strings is
> added to QemuOpts instead).  If the table were not already in git, then
> generating it at build time might make sense; but at this stage in the
> game, you're slowing down every build to regenerate something that is
> already correct.
> 

OK, that's a good point. But still, I think that you agree that it would 
be more correct to generate it during the build, right? So now it is 
there already and it works. But isn't it worth it to make it more correct?

Leonid.


Re: [Qemu-block] [PATCH 1/1] include: Auto-generate the sizes lookup table

2019-01-03 Thread Leonid Bloch
Hi Phil,

Thanks for your review! Very valid points. All taken, except the naming 
(I think "pow2_sizes.h" would be better than "lookup_table_sizes.h", 
no?) and the "generated" directory - I agree that it's a good idea, but 
I think it requires a separate patch which moves the generated files there.

Regards,
Leonid.

On 1/3/19 12:33 PM, Philippe Mathieu-Daudé wrote:
> 
> So as 'module_block.h', why not simply name it 'lookup_table_sizes.h'?
> 
> Actually, we could move those in a generated/ subdirectory, this
> would make the source file more explicit:
> 
> #include "generated/module_block.h"
> 
> Kevin, what do you think?
> 


Re: [Qemu-block] [PATCH 1/1] include: Auto-generate the sizes lookup table

2019-01-03 Thread Leonid Bloch
Hi Berto,

On 1/3/19 12:19 PM, Alberto Garcia wrote:
> On Wed 02 Jan 2019 12:09:05 PM CET, Leonid Bloch wrote:
>> +print_sizes() {
>> +local p=10
>> +while [ ${p} -lt 64 ]
>> +do
>> +local pad=' '
>> +local n=$((p % 10))
>> +n=$((1 << n))
>> +[ $((n / 100)) -eq 0 ] && pad='  '
>> +[ $((n / 10)) -eq 0 ] && pad='   '
>> +local suff=$((p / 10))
>> +printf "#define S_%u%s%s%20u\n" ${n} "$(size_suffix ${suff})" \
>> +"${pad}" $((1 << p))
>> +p=$((p + 1))
>> +done
>> +}
> 
> I have to say that I'm not very convinced of the benefits of replacing a
> set of trivial numeric macros with a longer and harder to read shell
> script accompanied by changes to the build system.

I think that the benefit is that the script is easily verifiable, 
whereas if someone would like to verify the table, they will need to 
generate it themselves. Also, this table is automatically generated 
anyway, so it only makes sense to generate it during the build.


Leonid.

> 
> Berto
> 


[Qemu-block] [PATCH 1/1] include: Auto-generate the sizes lookup table

2019-01-02 Thread Leonid Bloch
The lookup table for power-of-two sizes is now auto-generated during the
build, and not hard-coded into the units.h file.

Signed-off-by: Leonid Bloch 
---
 .gitignore   |  1 +
 Makefile |  5 +++
 block/qcow2.h|  2 +-
 block/vdi.c  |  1 +
 include/qemu/units.h | 73 
 scripts/gen-sizes.sh | 66 +++
 6 files changed, 74 insertions(+), 74 deletions(-)
 create mode 100755 scripts/gen-sizes.sh

diff --git a/.gitignore b/.gitignore
index 0430257313..5945220303 100644
--- a/.gitignore
+++ b/.gitignore
@@ -59,6 +59,7 @@
 /qemu-version.h
 /qemu-version.h.tmp
 /module_block.h
+/include/qemu/sizes.h
 /scsi/qemu-pr-helper
 /vhost-user-scsi
 /vhost-user-blk
diff --git a/Makefile b/Makefile
index dd53965f77..435d92821b 100644
--- a/Makefile
+++ b/Makefile
@@ -122,6 +122,8 @@ endif
 
 GENERATED_FILES += module_block.h
 
+GENERATED_FILES += $(SRC_PATH)/include/qemu/sizes.h
+
 TRACE_HEADERS = trace-root.h $(trace-events-subdirs:%=%/trace.h)
 TRACE_SOURCES = trace-root.c $(trace-events-subdirs:%=%/trace.c)
 TRACE_DTRACE =
@@ -499,6 +501,9 @@ ifdef CONFIG_MPATH
 scsi/qemu-pr-helper$(EXESUF): LIBS += -ludev -lmultipath -lmpathpersist
 endif
 
+$(SRC_PATH)/include/qemu/sizes.h: $(SRC_PATH)/scripts/gen-sizes.sh
+   $(call quiet-command,sh $(SRC_PATH)/scripts/gen-sizes.sh $@,"GEN","$@")
+
 qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > 
$@,"GEN","$@")
 
diff --git a/block/qcow2.h b/block/qcow2.h
index a98d24500b..41d14baa17 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,7 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
-#include "qemu/units.h"
+#include "qemu/sizes.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
diff --git a/block/vdi.c b/block/vdi.c
index 2380daa583..62dafe0cc6 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -51,6 +51,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/units.h"
+#include "qemu/sizes.h"
 #include "qapi/error.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-block-core.h"
diff --git a/include/qemu/units.h b/include/qemu/units.h
index 1c959d182e..692db3fbb2 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,77 +17,4 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
-/*
- * The following lookup table is intended to be used when a literal string of
- * the number of bytes is required (for example if it needs to be stringified).
- * It can also be used for generic shortcuts of power-of-two sizes.
- * This table is generated using the AWK script below:
- *
- *  BEGIN {
- *  suffix="KMGTPE";
- *  for(i=10; i<64; i++) {
- *  val=2**i;
- *  s=substr(suffix, int(i/10), 1);
- *  n=2**(i%10);
- *  pad=21-int(log(n)/log(10));
- *  printf("#define S_%d%siB %*d\n", n, s, pad, val);
- *  }
- *  }
- */
-
-#define S_1KiB  1024
-#define S_2KiB  2048
-#define S_4KiB  4096
-#define S_8KiB  8192
-#define S_16KiB16384
-#define S_32KiB32768
-#define S_64KiB65536
-#define S_128KiB  131072
-#define S_256KiB  262144
-#define S_512KiB  524288
-#define S_1MiB   1048576
-#define S_2MiB   2097152
-#define S_4MiB   4194304
-#define S_8MiB   8388608
-#define S_16MiB 16777216
-#define S_32MiB 33554432
-#define S_64MiB 67108864
-#define S_128MiB   134217728
-#define S_256MiB   268435456
-#define S_512MiB   536870912
-#define S_1GiB1073741824
-#define S_2GiB2147483648
-#define S_4GiB4294967296
-#define S_8GiB8589934592
-#define S_16GiB  17179869184
-#define S_32GiB  34359738368
-#define S_64GiB  68719476736
-#define S_128GiB137438953472
-#define S_256GiB274877906944
-#define S_512GiB549755813888
-#define S_1TiB 1099511627776
-#define S_2TiB 219902322
-#define S_4TiB 4398046511104
-#define S_8TiB 8796093022208
-#define S_16TiB   17592186044416
-#define S_32TiB   35184372088832
-#define S_64TiB   70368744177664
-#define S_128TiB 140737488355328
-#define S_256TiB 281474976710656
-#define S_512TiB 562949953421312
-#define S_1PiB  1125899906842624
-#define S_2PiB  2251799813685248
-#define S_4PiB  4503599627370496
-#define S_8PiB  9007199254740992
-#define S_16PiB18014398509481984
-#define S_32PiB36028797018963968
-#define S_64PiB72057594037927936
-#define S_128PiB  14

[Qemu-block] [PATCH 0/1] include: Auto-generate the sizes lookup table

2019-01-02 Thread Leonid Bloch
Following the conversations here:
https://patchwork.kernel.org/patch/10665157
and here:
https://patchwork.kernel.org/patch/10666975

Making the lookup table for power-of-two sizes auto-generated, instead
of being hard-coded into the units.h file.

I'm not sure if the changes I've made to Makefile here are "standard".
Please correct me if that's not the case.

Sorry it took so much time - I was busy with something completely
different.


Regards,
Leonid.

Leonid Bloch (1):
  include: Auto-generate the sizes lookup table

 .gitignore   |  1 +
 Makefile |  5 +++
 block/qcow2.h|  2 +-
 block/vdi.c  |  1 +
 include/qemu/units.h | 73 
 scripts/gen-sizes.sh | 66 +++
 6 files changed, 74 insertions(+), 74 deletions(-)
 create mode 100755 scripts/gen-sizes.sh

-- 
2.20.1




Re: [Qemu-block] [PATCH v2 1/1] include: Add a comment to explain the origin of sizes' lookup table

2018-11-06 Thread Leonid Bloch
Hi Phil, Hi Eric,

(Eric, for some reason you weren't CC'd to this thread - sorry.)

On 11/5/18 5:58 PM, Philippe Mathieu-Daudé wrote:
> Hi Leonid,
> 
> On 4/11/18 19:07, Leonid Bloch wrote:
>> The lookup table for power-of-two sizes was added in commit 540b8492618eb
>> for the purpose of having convenient shortcuts for these sizes in cases
>> when the literal number has to be present at compile time, and
>> expressions as '(1 * KiB)' can not be used. One such case is the
>> stringification of sizes. Beyond that, it is convenient to use these
>> shortcuts for all power-of-two sizes, even if they don't have to be
>> literal numbers.
>>
>> Despite its convenience, this table introduced 55 lines of "dumb" code,
>> the purpose and origin of which are obscure without reading the message
>> of the commit which introduced it. This patch fixes that by adding a
>> comment to the code itself with a brief explanation for the reasoning
>> behind this table. This comment includes the short AWK script that
>> generated the table, so that anyone who's interested could make sure
>> that the values in it are correct (otherwise these values look as if
>> they were typed manually).
>>
>> Signed-off-by: Leonid Bloch 
>> ---
>>   include/qemu/units.h | 18 ++
>>   1 file changed, 18 insertions(+)
>>
>> diff --git a/include/qemu/units.h b/include/qemu/units.h
>> index 68a7758650..1c959d182e 100644
>> --- a/include/qemu/units.h
>> +++ b/include/qemu/units.h
>> @@ -17,6 +17,24 @@
>>   #define PiB (INT64_C(1) << 50)
>>   #define EiB (INT64_C(1) << 60)
>> +/*
>> + * The following lookup table is intended to be used when a literal 
>> string of
>> + * the number of bytes is required (for example if it needs to be 
>> stringified).
>> + * It can also be used for generic shortcuts of power-of-two sizes.
> 
> ^ ok
> 
> v this part is not useful in the tree IMHO.
> 
>> + * This table is generated using the AWK script below:
>> + *
>> + *  BEGIN {
>> + *  suffix="KMGTPE";
>> + *  for(i=10; i<64; i++) {
>> + *  val=2**i;
>> + *  s=substr(suffix, int(i/10), 1);
>> + *  n=2**(i%10);
>> + *  pad=21-int(log(n)/log(10));
>> + *  printf("#define S_%d%siB %*d\n", n, s, pad, val);
>> + *  }
>> + *  }
>> + */
> 
> If it is generated and the stringified are also generated, why not 
> generate this once via the ./configure, since it is used by machines and 
> not by humans?

You beat me to it! I was just thinking about that last evening. Indeed 
it's not so elegant to put generated code in a file that is intended for 
human handling. Generating by ./configure would be prettier.
A runtime solution that interprets hard-coded expression strings, like 
Eric suggested, can also be a possibility, but it seems more complicated 
than just generating these at the configuration stage. Plus one gets the 
S_* constants that can be used in many places, not only where an 
explicit number is needed. What do you think?

A question though: if it to be generated by ./configure, where do you 
suggest to put the generated values? And also: is it OK to assume 
there's AWK (or another scripting language) on the system for generating 
these?

Thanks,

Leonid.


[Qemu-block] [PATCH v2 0/1] vdi: Use a literal number of bytes for

2018-11-04 Thread Leonid Bloch
Please see the commit message for the rationale.

Difference from v1:
* Format string is fixed.

Leonid Bloch (1):
  vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE

 block/vdi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

-- 
2.17.1




[Qemu-block] [PATCH v2 1/1] vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE

2018-11-04 Thread Leonid Bloch
If an expression is used to define DEFAULT_CLUSTER_SIZE, when compiled,
it will be embedded as a literal expression in the binary (as the
default value) because it is stringified to mark the size of the default
value. Now this is fixed by using a defined number to define this value.

Signed-off-by: Leonid Bloch 
---
 block/vdi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/vdi.c b/block/vdi.c
index 6555cffb88..d996793f1c 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -85,7 +85,7 @@
 #define BLOCK_OPT_STATIC "static"
 
 #define SECTOR_SIZE 512
-#define DEFAULT_CLUSTER_SIZE (1 * MiB)
+#define DEFAULT_CLUSTER_SIZE S_1MiB
 
 #if defined(CONFIG_VDI_DEBUG)
 #define VDI_DEBUG 1
@@ -432,7 +432,7 @@ static int vdi_open(BlockDriverState *bs, QDict *options, 
int flags,
 goto fail;
 } else if (header.block_size != DEFAULT_CLUSTER_SIZE) {
 error_setg(errp, "unsupported VDI image (block size %" PRIu32
- " is not %" PRIu64 ")",
+ " is not %" PRIu32 ")",
header.block_size, DEFAULT_CLUSTER_SIZE);
 ret = -ENOTSUP;
 goto fail;
-- 
2.17.1




[Qemu-block] [PATCH v2 1/1] include: Add a comment to explain the origin of sizes' lookup table

2018-11-04 Thread Leonid Bloch
The lookup table for power-of-two sizes was added in commit 540b8492618eb
for the purpose of having convenient shortcuts for these sizes in cases
when the literal number has to be present at compile time, and
expressions as '(1 * KiB)' can not be used. One such case is the
stringification of sizes. Beyond that, it is convenient to use these
shortcuts for all power-of-two sizes, even if they don't have to be
literal numbers.

Despite its convenience, this table introduced 55 lines of "dumb" code,
the purpose and origin of which are obscure without reading the message
of the commit which introduced it. This patch fixes that by adding a
comment to the code itself with a brief explanation for the reasoning
behind this table. This comment includes the short AWK script that
generated the table, so that anyone who's interested could make sure
that the values in it are correct (otherwise these values look as if
they were typed manually).

Signed-off-by: Leonid Bloch 
---
 include/qemu/units.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/include/qemu/units.h b/include/qemu/units.h
index 68a7758650..1c959d182e 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,6 +17,24 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
+/*
+ * The following lookup table is intended to be used when a literal string of
+ * the number of bytes is required (for example if it needs to be stringified).
+ * It can also be used for generic shortcuts of power-of-two sizes.
+ * This table is generated using the AWK script below:
+ *
+ *  BEGIN {
+ *  suffix="KMGTPE";
+ *  for(i=10; i<64; i++) {
+ *  val=2**i;
+ *  s=substr(suffix, int(i/10), 1);
+ *  n=2**(i%10);
+ *  pad=21-int(log(n)/log(10));
+ *  printf("#define S_%d%siB %*d\n", n, s, pad, val);
+ *  }
+ *  }
+ */
+
 #define S_1KiB  1024
 #define S_2KiB  2048
 #define S_4KiB  4096
-- 
2.17.1




[Qemu-block] [PATCH v2 0/1] include: Add a comment to explain the origin of

2018-11-04 Thread Leonid Bloch
Please see the commit message for the rationale.

Difference from v1:
* Tabs removed from the code indentation of the sample code in the
  comment, in order to pass checkpatch.

Leonid Bloch (1):
  include: Add a comment to explain the origin of sizes' lookup table

 include/qemu/units.h | 18 ++
 1 file changed, 18 insertions(+)

-- 
2.17.1




Re: [Qemu-block] [Qemu-devel] [PATCH] vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE

2018-11-04 Thread Leonid Bloch
> This changes the data type, so a fix is needed for a format string in
> line 434.

Yes, I saw, thanks! Will do.

> 
> Regards
> Stefan
> 


[Qemu-block] [PATCH] vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE

2018-11-02 Thread Leonid Bloch
If an expression is used to define DEFAULT_CLUSTER_SIZE, when compiled,
it will be embedded as a literal expression in the binary (as the
default value) because it is stringified to mark the size of the default
value. Now this is fixed by using a defined number to define this value.

Signed-off-by: Leonid Bloch 
---
 block/vdi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/vdi.c b/block/vdi.c
index 6555cffb88..25320eff47 100644
--- a/block/vdi.c
+++ b/block/vdi.c
@@ -85,7 +85,7 @@
 #define BLOCK_OPT_STATIC "static"
 
 #define SECTOR_SIZE 512
-#define DEFAULT_CLUSTER_SIZE (1 * MiB)
+#define DEFAULT_CLUSTER_SIZE S_1MiB
 
 #if defined(CONFIG_VDI_DEBUG)
 #define VDI_DEBUG 1
-- 
2.17.1




[Qemu-block] [PATCH] include: Add a comment to explain the origin of sizes' lookup table

2018-11-02 Thread Leonid Bloch
The lookup table for power-of-two sizes was added in commit 540b8492618eb
for the purpose of having convenient shortcuts for these sizes in cases
when the literal number has to be present at compile time, and
expressions as '(1 * KiB)' can not be used. One such case is the
stringification of sizes. Beyond that, it is convenient to use these
shortcuts for all power-of-two sizes, even if they don't have to be
literal numbers.

Despite its convenience, this table introduced 55 lines of "dumb" code,
the purpose and origin of which are obscure without reading the message
of the commit which introduced it. This patch fixes that by adding a
comment to the code itself with a brief explanation for the reasoning
behind this table. This comment includes the short AWK script that
generated the table, so that anyone who's interested could make sure
that the values in it are correct (otherwise these values look as if
they were typed manually).

Signed-off-by: Leonid Bloch 
---
 include/qemu/units.h | 18 ++
 1 file changed, 18 insertions(+)

diff --git a/include/qemu/units.h b/include/qemu/units.h
index 68a7758650..051c274ca2 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,6 +17,24 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
+/*
+ * The following lookup table is intended to be used when a literal string of
+ * the number of bytes is required (for example if it needs to be stringified).
+ * It can also be used for generic shortcuts of power-of-two sizes.
+ * This table is generated using the AWK script below:
+ *
+ *  BEGIN {
+ * suffix="KMGTPE";
+ * for(i=10; i<64; i++) {
+ * val=2**i;
+ * s=substr(suffix, int(i/10), 1);
+ * n=2**(i%10);
+ * pad=21-int(log(n)/log(10));
+ * printf("#define S_%d%siB %*d\n", n, s, pad, val);
+ * }
+ *  }
+ */
+
 #define S_1KiB  1024
 #define S_2KiB  2048
 #define S_4KiB  4096
-- 
2.17.1




Re: [Qemu-block] [Qemu-devel] [PATCH] qemu/units: Move out QCow2 specific definitions

2018-11-02 Thread Leonid Bloch
Hi,

On 11/2/18 5:28 PM, Kevin Wolf wrote:
> Am 02.11.2018 um 15:52 hat Eric Blake geschrieben:
>> On 11/2/18 9:10 AM, Kevin Wolf wrote:
>>> Am 02.11.2018 um 13:37 hat Philippe Mathieu-Daudé geschrieben:
 Hi Kevin,

 On 2/11/18 12:07, Kevin Wolf wrote:
> Am 02.11.2018 um 09:58 hat Philippe Mathieu-Daudé geschrieben:
>> This definitions are QCow2 specific, there is no need to expose them
>> in the global namespace.

These are not QCOW2 specific. I wrote these for convenience in QCOW2, 
but there are many other places where these can be used (many 
pre-defined sizes are powers of two), and there are few places where 
they must replace the current notation, like in block/vdi.c with 
DEFAULT_CLUSTER_SIZE (unless an explicit value in bytes will be defined 
instead).

>>
>> Agreed. I didn't want it in the first place, arguing that if we want
>> stringification of defaults, it would be better to have a runtime function
>> do that, rather than adding a set of near-duplicate macro names.

A runtime function will not help here, as these are used in compile 
time. These result in strings that are actually compiled into the binaries.

>>>
>>> Then there is VDI which uses (1 * MiB), but that is compiled out and if
>>> you enable it, it breaks. So it needs the same fix.

Yeah, I need to fix that as promised. Will do shortly. :)

Leonid.


Re: [Qemu-block] [PATCH] qcow2: Fix cache-clean-interval documentation

2018-10-02 Thread Leonid Bloch
On 10/1/18 7:59 PM, Kevin Wolf wrote:
> Am 01.10.2018 um 16:35 hat Eric Blake geschrieben:
>> On 9/29/18 4:54 AM, Leonid Bloch wrote:
>>> Fixing cache-clean-interval documentation following the recent change to
>>> a default of 600 seconds on supported plarforms (only Linux currently).
>>>
>>> Signed-off-by: Leonid Bloch 
>>> ---
>>>docs/qcow2-cache.txt | 19 +--
>>>qapi/block-core.json |  3 ++-
>>>qemu-options.hx  |  3 ++-
>>>3 files changed, 13 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
>>> index 59358b816f..1778312e09 100644
>>> --- a/docs/qcow2-cache.txt
>>> +++ b/docs/qcow2-cache.txt
>>> @@ -202,18 +202,17 @@ Reducing the memory usage
>>>It is possible to clean unused cache entries in order to reduce the
>>>memory usage during periods of low I/O activity.
>>> -The parameter "cache-clean-interval" defines an interval (in seconds).
>>> -All cache entries that haven't been accessed during that interval are
>>> -removed from memory.
>>> +The parameter "cache-clean-interval" defines an interval (in seconds),
>>> +affer which all the cache entries that haven't been accessed during it
>>
>> s/affer/after/
>>
>> Maybe s/during it/during the interval/
> 
> Thanks, made these fixes and applied it to the block branch.
> 
> Kevin
> 

Thanks Kevin, Eric!


[Qemu-block] [PATCH] qcow2: Fix cache-clean-interval documentation

2018-09-29 Thread Leonid Bloch
Fixing cache-clean-interval documentation following the recent change to
a default of 600 seconds on supported plarforms (only Linux currently).

Signed-off-by: Leonid Bloch 
---
 docs/qcow2-cache.txt | 19 +--
 qapi/block-core.json |  3 ++-
 qemu-options.hx  |  3 ++-
 3 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 59358b816f..1778312e09 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -202,18 +202,17 @@ Reducing the memory usage
 It is possible to clean unused cache entries in order to reduce the
 memory usage during periods of low I/O activity.
 
-The parameter "cache-clean-interval" defines an interval (in seconds).
-All cache entries that haven't been accessed during that interval are
-removed from memory.
+The parameter "cache-clean-interval" defines an interval (in seconds),
+affer which all the cache entries that haven't been accessed during it
+are removed from memory. Setting this parameter to 0 disables this feature.
 
-This example removes all unused cache entries every 15 minutes:
+The following example removes all unused cache entries every 15 minutes:
 
-drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 600. Setting it to 0
-disables this feature.
+If unset, the default value for this parameter is 600 on platforms which
+support this functionality, and is 0 (disabled) on other platforms.
 
-Note that this functionality currently relies on the MADV_DONTNEED
-argument for madvise() to actually free the memory. This is a
-Linux-specific feature, so cache-clean-interval is not supported in
-other systems.
+This functionality currently relies on the MADV_DONTNEED argument for
+madvise() to actually free the memory. This is a Linux-specific feature,
+so cache-clean-interval is not supported on other systems.
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 46dac23d2f..25b8a0e744 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2895,7 +2895,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 # caches. The interval is in seconds. The default value
-# is 600, and 0 disables this feature. (since 2.5)
+# is 600 on supporting platforms, and 0 on other
+# platforms. 0 disables this feature. (since 2.5)
 #
 # @encrypt:   Image decryption options. Mandatory for
 # encrypted images, except when doing a metadata-only
diff --git a/qemu-options.hx b/qemu-options.hx
index 52d9d9f06d..f139459e80 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -747,7 +747,8 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 600. Setting it to 0 disables this feature.
+The default value is 600 on supporting platforms, and 0 on other platforms.
+Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1




Re: [Qemu-block] [PATCH v12 8/9] qcow2: Set the default cache-clean-interval to 10 minutes

2018-09-27 Thread Leonid Bloch



On September 27, 2018 5:53:34 PM CEST, Eric Blake  wrote:
>On 9/26/18 11:04 AM, Leonid Bloch wrote:
>> The default cache-clean-interval is set to 10 minutes, in order to
>lower
>> the overhead of the qcow2 caches (before the default was 0, i.e.
>> disabled).
>> 
>> * For non-Linux platforms the default is kept at 0, because
>>cache-clean-interval is not supported there yet.
>> 
>> Signed-off-by: Leonid Bloch 
>> Reviewed-by: Alberto Garcia 
>> Reviewed-by: Kevin Wolf 
>> ---
>
>> @@ -76,13 +76,15 @@
>>   
>>   #ifdef CONFIG_LINUX
>>   #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
>> +#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
>>   #else
>>   #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
>> +/* Cache clean interval is currently available only on Linux, so
>must be 0 */
>> +#define DEFAULT_CACHE_CLEAN_INTERVAL 0
>>   #endif
>>   
>
>> +++ b/docs/qcow2-cache.txt
>> @@ -210,8 +210,8 @@ This example removes all unused cache entries
>every 15 minutes:
>>   
>>  -drive file=hd.qcow2,cache-clean-interval=900
>>   
>> -If unset, the default value for this parameter is 0 and it disables
>> -this feature.
>> +If unset, the default value for this parameter is 600. Setting it to
>0
>> +disables this feature.
>
>Should this wording mention that the non-zero default is only on Linux 
>(or rather, only on platforms where a non-zero value makes a
>difference)?

Yes, I think it should mention it, thanks. Should I send a separate patch?

Leonid.

>
>>   
>>   Note that this functionality currently relies on the MADV_DONTNEED
>>   argument for madvise() to actually free the memory. This is a
>> diff --git a/qapi/block-core.json b/qapi/block-core.json
>> index ac3b48ee54..46dac23d2f 100644
>> --- a/qapi/block-core.json
>> +++ b/qapi/block-core.json
>> @@ -2895,7 +2895,8 @@
>>   #
>>   # @cache-clean-interval:  clean unused entries in the L2 and
>refcount
>>   # caches. The interval is in seconds. The
>default value
>> -# is 0 and it disables this feature (since
>2.5)
>> +# is 600, and 0 disables this feature.
>(since 2.5)
>
>and again here
>
>> +#
>>   # @encrypt:   Image decryption options. Mandatory for
>>   # encrypted images, except when doing a
>metadata-only
>>   # probe of the image. (since 2.10)
>> diff --git a/qemu-options.hx b/qemu-options.hx
>> index 14aee78c6c..52d9d9f06d 100644
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -747,7 +747,7 @@ it which is not used for the L2 cache)
>>   
>>   @item cache-clean-interval
>>   Clean unused entries in the L2 and refcount caches. The interval is
>in seconds.
>> -The default value is 0 and it disables this feature.
>> +The default value is 600. Setting it to 0 disables this feature.
>
>and here?

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



[Qemu-block] [PATCH v12 8/9] qcow2: Set the default cache-clean-interval to 10 minutes

2018-09-26 Thread Leonid Bloch
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

* For non-Linux platforms the default is kept at 0, because
  cache-clean-interval is not supported there yet.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block/qcow2.c| 2 +-
 block/qcow2.h| 4 +++-
 docs/qcow2-cache.txt | 4 ++--
 qapi/block-core.json | 3 ++-
 qemu-options.hx  | 2 +-
 5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 20b5093269..95e1c98daa 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState 
*bs,
 /* New interval for cache cleanup timer */
 r->cache_clean_interval =
 qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-s->cache_clean_interval);
+DEFAULT_CACHE_CLEAN_INTERVAL);
 #ifndef CONFIG_LINUX
 if (r->cache_clean_interval != 0) {
 error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
diff --git a/block/qcow2.h b/block/qcow2.h
index 0f0e3534bf..ba430316b9 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -76,13 +76,15 @@
 
 #ifdef CONFIG_LINUX
 #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 #else
 #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+/* Cache clean interval is currently available only on Linux, so must be 0 */
+#define DEFAULT_CACHE_CLEAN_INTERVAL 0
 #endif
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
-
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
 #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 1fcc0658b2..59358b816f 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -210,8 +210,8 @@ This example removes all unused cache entries every 15 
minutes:
 
-drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600. Setting it to 0
+disables this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
 argument for madvise() to actually free the memory. This is a
diff --git a/qapi/block-core.json b/qapi/block-core.json
index ac3b48ee54..46dac23d2f 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2895,7 +2895,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 # caches. The interval is in seconds. The default value
-# is 0 and it disables this feature (since 2.5)
+# is 600, and 0 disables this feature. (since 2.5)
+#
 # @encrypt:   Image decryption options. Mandatory for
 # encrypted images, except when doing a metadata-only
 # probe of the image. (since 2.10)
diff --git a/qemu-options.hx b/qemu-options.hx
index 14aee78c6c..52d9d9f06d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -747,7 +747,7 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 0 and it disables this feature.
+The default value is 600. Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1




[Qemu-block] [PATCH v12 4/9] qcow2: Avoid duplication in setting the refcount cache size

2018-09-26 Thread Leonid Bloch
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block/qcow2.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index d2c07ce9fe..cd0053b6ee 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
-if (!refcount_cache_size_set) {
-*refcount_cache_size = min_refcount_cache;
-}
 }
+/* l2_cache_size and refcount_cache_size are ensured to have at least
+ * their minimum values in qcow2_update_options_prepare() */
 
 if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) ||
 *l2_cache_entry_size > s->cluster_size ||
-- 
2.17.1




[Qemu-block] [PATCH v12 7/9] qcow2: Resize the cache upon image resizing

2018-09-26 Thread Leonid Bloch
The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning L2 cache relatively to
the image size, implies that the cache will be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 589f6c1b1c..20b5093269 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3418,6 +3418,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 uint64_t old_length;
 int64_t new_l1_size;
 int ret;
+QDict *options;
 
 if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA &&
 prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL)
@@ -3642,6 +3643,8 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 }
 
+bs->total_sectors = offset / BDRV_SECTOR_SIZE;
+
 /* write updated header.size */
 offset = cpu_to_be64(offset);
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
@@ -3652,6 +3655,14 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 
 s->l1_vm_state_index = new_l1_size;
+
+/* Update cache sizes */
+options = qdict_clone_shallow(bs->options);
+ret = qcow2_update_options(bs, options, s->flags, errp);
+qobject_unref(options);
+if (ret < 0) {
+goto fail;
+}
 ret = 0;
 fail:
 qemu_co_mutex_unlock(>lock);
-- 
2.17.1




[Qemu-block] [PATCH v12 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-09-26 Thread Leonid Bloch
The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).

This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block/qcow2.h|  6 +-
 docs/qcow2-cache.txt | 15 +--
 qemu-options.hx  |  6 +++---
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 2f8c1fd15c..0f0e3534bf 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,7 +74,11 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB
+#ifdef CONFIG_LINUX
+#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#else
+#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+#endif
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 750447ea4f..1fcc0658b2 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -125,12 +125,15 @@ There are a few things that need to be taken into account:
  - Both caches must have a size that is a multiple of the cluster size
(or the cache entry size: see "Using smaller cache sizes" below).
 
- - The maximum L2 cache size is 1 MB by default (enough for full coverage
-   of 8 GB images, with the default cluster size). This value can be
-   modified using the "l2-cache-size" option. QEMU will not use more memory
-   than needed to hold all of the image's L2 tables, regardless of this max.
-   value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see
-   below).
+ - The maximum L2 cache size is 32 MB by default on Linux platforms (enough
+   for full coverage of 256 GB images, with the default cluster size). This
+   value can be modified using the "l2-cache-size" option. QEMU will not use
+   more memory than needed to hold all of the image's L2 tables, regardless
+   of this max. value.
+   On non-Linux platforms the maximal value is smaller by default (8 MB) and
+   this difference stems from the fact that on Linux the cache can be cleared
+   periodically if needed, using the "cache-clean-interval" option (see below).
+   The minimal L2 cache size is 2 clusters (or 2 cache entries, see below).
 
  - The default (and minimum) refcount cache size is 4 clusters.
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 6eef0f5651..14aee78c6c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -736,9 +736,9 @@ The maximum total size of the L2 table and refcount block 
caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not specified - 1M; otherwise, as large as possible
-within the cache-size, while permitting the requested or the minimal refcount
-cache size)
+(default: if cache-size is not specified - 32M on Linux platforms, and 8M on
+non-Linux platforms; otherwise, as large as possible within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-- 
2.17.1




[Qemu-block] [PATCH v12 9/9] qcow2: Explicit number replaced by a constant

2018-09-26 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 95e1c98daa..7277feda13 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 /* 2^(s->refcount_order - 3) is the refcount width in bytes */
 s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
 s->refcount_block_size = 1 << s->refcount_block_bits;
-bs->total_sectors = header.size / 512;
+bs->total_sectors = header.size / BDRV_SECTOR_SIZE;
 s->csize_shift = (62 - (s->cluster_bits - 8));
 s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
 s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
@@ -3450,7 +3450,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 goto fail;
 }
 
-old_length = bs->total_sectors * 512;
+old_length = bs->total_sectors * BDRV_SECTOR_SIZE;
 new_l1_size = size_to_l1(s, offset);
 
 if (offset < old_length) {
-- 
2.17.1




[Qemu-block] [PATCH v12 5/9] qcow2: Assign the L2 cache relatively to the image size

2018-09-26 Thread Leonid Bloch
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.

Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block/qcow2.c  | 21 +
 block/qcow2.h  |  4 +---
 docs/qcow2-cache.txt   | 15 ++-
 qemu-options.hx|  6 +++---
 tests/qemu-iotests/137 |  8 +++-
 tests/qemu-iotests/137.out |  4 +++-
 6 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index cd0053b6ee..589f6c1b1c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  uint64_t *refcount_cache_size, Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t combined_cache_size;
+uint64_t combined_cache_size, l2_cache_max_setting;
 bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
 int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
 refcount_cache_size_set = qemu_opt_get(opts, 
QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
 combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+ DEFAULT_L2_CACHE_MAX_SIZE);
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
 *l2_cache_entry_size = qemu_opt_get_size(
 opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
 
+*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
 error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"at the same time");
 return;
-} else if (*l2_cache_size > combined_cache_size) {
+} else if (l2_cache_size_set &&
+   (l2_cache_max_setting > combined_cache_size)) {
 error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
 return;
@@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
 /* Assign as much memory as possible to the L2 cache, and
  * use the remainder for the refcount cache */
 if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
@@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-} else {
-if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
- (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
- * s->cluster_size);
-}
 }
 /* l2_cache_size and refcount_cache_size are ensured to have at least
  * their minimum values in qcow2_update_options_prepare() */
diff --git a/block/qcow2.h b/block/qcow2.h
index a8d6f757b1..2f8c1fd15c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,9 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-/* Whic

[Qemu-block] [PATCH v12 3/9] qcow2: Make sizes more humanly readable

2018-09-26 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 block/qcow2.c | 2 +-
 block/qcow2.h | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index c13153735a..d2c07ce9fe 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 }
 } else {
 if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
+*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index 81b844e936..a8d6f757b1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
+#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x80
+#define QCOW_MAX_REFTABLE_SIZE S_8MiB
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x200
+#define QCOW_MAX_L1_SIZE S_32MiB
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
 
 /* Whichever is more */
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE S_1MiB
 
-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE S_64KiB
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
-- 
2.17.1




[Qemu-block] [PATCH v12 1/9] qcow2: Options' documentation fixes

2018-09-26 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 docs/qcow2-cache.txt | 21 ++---
 qemu-options.hx  |  9 ++---
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..7e28b41bd3 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -79,14 +79,14 @@ Choosing the right cache sizes
 In order to choose the cache sizes we need to know how they relate to
 the amount of allocated space.
 
-The amount of virtual disk that can be mapped by the L2 and refcount
+The part of the virtual disk that can be mapped by the L2 and refcount
 caches (in bytes) is:
 
disk_size = l2_cache_size * cluster_size / 8
disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits
 
 With the default values for cluster_size (64KB) and refcount_bits
-(16), that is
+(16), this becomes:
 
disk_size = l2_cache_size * 8192
disk_size = refcount_cache_size * 32768
@@ -97,12 +97,16 @@ need:
l2_cache_size = disk_size_GB * 131072
refcount_cache_size = disk_size_GB * 32768
 
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual
+image size (given that the default cluster size is used):
 
-   1048576 / 131072 = 8 GB of virtual disk covered by that cache
-262144 /  32768 = 8 GB
+   8 GB / 8192 = 1 MB
+
+The refcount cache is 4 times the cluster size by default. With the default
+cluster size of 64 KB, it is 256 KB (262144 bytes). This is sufficient for
+8 GB of image size:
+
+   262144 * 32768 = 8 GB
 
 
 How to configure the cache sizes
@@ -130,6 +134,9 @@ There are a few things that need to be taken into account:
memory as possible to the L2 cache before increasing the refcount
cache size.
 
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+   can be set simultaneously.
+
 Unlike L2 tables, refcount blocks are not used during normal I/O but
 only during allocations and internal snapshots. In most cases they are
 accessed sequentially (even during random guest I/O) so increasing the
diff --git a/qemu-options.hx b/qemu-options.hx
index a642ad297f..2db6247eff 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -732,15 +732,18 @@ image file)
 
 @item cache-size
 The maximum total size of the L2 table and refcount block caches in bytes
-(default: 1048576 bytes or 8 clusters, whichever is larger)
+(default: the sum of l2-cache-size and refcount-cache-size)
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: 4/5 of the total cache size)
+(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
+is larger; otherwise, as large as possible or needed within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-(default: 1/5 of the total cache size)
+(default: 4 times the cluster size; or if cache-size is specified, the part of
+it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-- 
2.17.1




[Qemu-block] [PATCH v12 2/9] include: Add a lookup table of sizes

2018-09-26 Thread Leonid Bloch
Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
string. Powers of two are used very often for sizes, so such a table
will also make it easier and more intuitive to write them.

This table is generatred using the following AWK script:

BEGIN {
suffix="KMGTPE";
for(i=10; i<64; i++) {
val=2**i;
s=substr(suffix, int(i/10), 1);
n=2**(i%10);
pad=21-int(log(n)/log(10));
printf("#define S_%d%siB %*d\n", n, s, pad, val);
    }
}

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
Reviewed-by: Kevin Wolf 
---
 include/qemu/units.h | 55 
 1 file changed, 55 insertions(+)

diff --git a/include/qemu/units.h b/include/qemu/units.h
index 692db3fbb2..68a7758650 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,4 +17,59 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
+#define S_1KiB  1024
+#define S_2KiB  2048
+#define S_4KiB  4096
+#define S_8KiB  8192
+#define S_16KiB16384
+#define S_32KiB32768
+#define S_64KiB65536
+#define S_128KiB  131072
+#define S_256KiB  262144
+#define S_512KiB  524288
+#define S_1MiB   1048576
+#define S_2MiB   2097152
+#define S_4MiB   4194304
+#define S_8MiB   8388608
+#define S_16MiB 16777216
+#define S_32MiB 33554432
+#define S_64MiB 67108864
+#define S_128MiB   134217728
+#define S_256MiB   268435456
+#define S_512MiB   536870912
+#define S_1GiB1073741824
+#define S_2GiB2147483648
+#define S_4GiB4294967296
+#define S_8GiB8589934592
+#define S_16GiB  17179869184
+#define S_32GiB  34359738368
+#define S_64GiB  68719476736
+#define S_128GiB137438953472
+#define S_256GiB274877906944
+#define S_512GiB549755813888
+#define S_1TiB 1099511627776
+#define S_2TiB 219902322
+#define S_4TiB 4398046511104
+#define S_8TiB 8796093022208
+#define S_16TiB   17592186044416
+#define S_32TiB   35184372088832
+#define S_64TiB   70368744177664
+#define S_128TiB 140737488355328
+#define S_256TiB 281474976710656
+#define S_512TiB 562949953421312
+#define S_1PiB  1125899906842624
+#define S_2PiB  2251799813685248
+#define S_4PiB  4503599627370496
+#define S_8PiB  9007199254740992
+#define S_16PiB18014398509481984
+#define S_32PiB36028797018963968
+#define S_64PiB72057594037927936
+#define S_128PiB  144115188075855872
+#define S_256PiB  288230376151711744
+#define S_512PiB  576460752303423488
+#define S_1EiB   1152921504606846976
+#define S_2EiB   2305843009213693952
+#define S_4EiB   4611686018427387904
+#define S_8EiB   9223372036854775808
+
 #endif
-- 
2.17.1




[Qemu-block] [PATCH v12 0/9] Take the image size into account when allocating the L2 cache

2018-09-26 Thread Leonid Bloch
This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB for
Linux systems, and 8 MB for non-Linux systems (the difference is caused
by the fact that it is currently impossible to perform scheduled cache
cleaning on non-Linux systems). To modify this upper limit one can use
the already existing 'l2-cache-size' and 'cache-size' options. Moreover,
this fixes the behavior of 'l2-cache-size', as it was documented as the
*maximum* L2 cache size, but in practice behaved as the absolute size.

To compensate the memory overhead which may (but not necesarily will) be
increased following this behavior, the default cache-clean-interval is set
to 10 minutes by default (was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Differences from v4:
* Refactoring.
* Documentation and commit message fixes.

Differences from v5:
* A patch with cosmetic changes is split from the main patch
* A patch for avoiding duplication in refcount cache size assignment is
  split from the main patch.
* In the main patch the cap on the L2 cache size is set to only 1 MB (in
  order to be close to the previous behavior) and a separate patch sets
  it to 32 MB.
* Changes in the documentation fixes patch [1/8].

Differences from v6:
* A patch is added to make the defined sizes more humanly readable.

Differences from v7:
* Documentation and commit message changes.
* Fix: when cache-size is specified and is large enough, with the L2 and
  refcount caches not set, the L2 cache will be as large as needed to
  cover the entire image, and not as the default max. size. (Return to
  the previous behavior in this part, which was correct).
* Cosmetic changes patch not used.

Differences from v8:
* Made a lookup table for power-of-two sizes. This will enable writing
  pretty size values (most of the ones used are powers of two) while
  allowing correct stringification of these values.
* Made the max. L2 cache size 8 MB on non-Linux systems.
* Returned the test for too large L2 cache size.
* Minor documentation fixes.

Differences from v9:
* DEFAULT_CACHE_CLEAN_INTERVAL is set to 0 for non-Linux platforms in
  order to avoid an error, since the cache clean interval feature is
  currently available only on Linux.

Differences from v10:
* Rewording documentation.

Differences from v11:
* Memory leak fixed in "Resize the cache upon image resizing".

Leonid Bloch (9):
  qcow2: Options' documentation fixes
  include: Add a lookup table of sizes
  qcow2: Make sizes more humanly readable
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c  | 43 -
 block/qcow2.h  | 19 -
 docs/qcow2-cache.txt   | 43 +++--
 include/qemu/units.h   | 55 ++
 qapi/block-core.json   |  3 ++-
 qemu-options.hx| 11 +---
 tests/qemu-iotests/137 |  8 +-
 tests/qemu-iotests/137.out |  4 ++-
 8 files changed, 140 insertions(+), 46 deletions(-)

-- 
2.17.1




Re: [Qemu-block] [PATCH v11 7/9] qcow2: Resize the cache upon image resizing

2018-09-26 Thread Leonid Bloch
Oops, sorry. Would something like that be OK:

On 9/26/18 1:55 PM, Alberto Garcia wrote:
> On Wed 26 Sep 2018 12:43:03 PM CEST, Kevin Wolf wrote:
>>> +/* Update cache sizes */
>>> +options = qdict_clone_shallow(bs->options);
>>> +ret = qcow2_update_options(bs, options, s->flags, errp);
+qobject_unref(options);
>>> +if (ret < 0) {
>>> +goto fail;
>>> +}
>>

Leonid.

>> Isn't options leaked, both in success and error cases?
> 
> I'm embarrassed not to have seen that :-!
> 
> Berto
> 


Re: [Qemu-block] [PATCH v10 1/9] qcow2: Options' documentation fixes

2018-09-24 Thread Leonid Bloch

On 9/24/18 10:34 PM, Max Reitz wrote:

On 24.09.18 18:53, Leonid Bloch wrote:

On 9/24/18 6:04 PM, Alberto Garcia wrote:

On Fri 21 Sep 2018 07:23:02 PM CEST, Leonid Bloch wrote:

Signed-off-by: Leonid Bloch 
---
   docs/qcow2-cache.txt | 20 +---
   qemu-options.hx  |  9 ++---
   2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..013991e21c 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured.
   Choosing the right cache sizes
   --
   In order to choose the cache sizes we need to know how they relate to
-the amount of allocated space.
+the amount of the allocated space.


I'm not a native English speaker, but the current version sounds correct
to me. Why do you need to add an article there? :-?


I'm not a native speaker as well, but "the" seems to be needed there:
- Which allocated space?
- __The__ allocated space!


It doesn't seem to be needed to me, as, well, it would be wrong in German.

I tried to come up with explanations, and it's really difficult.  So
there are measures where you'd use an article (e.g. "size of the
allocated space"), but I wouldn't do that with "amount".

It's like with "amount" you have some form of infinite supply of
something ("space" in this case) and then you get some amount of it, and
then you continue to talk about that amount.

It's different with quantities like "size", "volume", etc.  There, you
already have your amount, and then you measure it.

So you can say "Some amount of space", but you cannot say "Some size of
the space".  Or you can say "The amount of space that is X", but not
"The size of space that is X".

So "size", "volume", those quantities are measuring something.  "amount"
just refers to something, not a quantity.  So if you want to use an
article, it'd have to go in front of "amount" -- and in fact, it's there
already.

I think saying "the amount of space is 1 GB" is just a shortcut like
saying "the space is 1 GB".  Using "amount of" is not measuring like
saying "size of" is.  In this case, it's just hinting at the fact that
we're rather interested in the extent rather than the entity itself.

A final note: I don't think you'd ask "Which allocated space?".  I think
it'd be "What allocated space?", which further hints towards not using
an article after "of" here.

What I do take a bit of an issue with is actually the next sentence in
this document:


The amount of virtual disk that can be mapped by the L2 and refcount





caches (in bytes) is:


"Amount of disk" sounds like "amount of car" or "amount of person".
Which is to say, it sounds wrong.  Those are discrete things, but
"amount" sounds like something continuous.  Measuring "virtual disk" in
bytes is like measuring "car" in kilograms.

And in fact, we are not interested in bytes of virtual disk (it's a big
difference whether we're talking about one or four disks) but in bytes
of a single virtual disk.  So the fact that we cannot get an article in
here signifies that something's wrong.

Therefore, this might be better as "The part/portion/fraction of the
virtual disk...".

Max



Max,

Thanks for the detailed explanation! I'm still not really convinced 
about the first point, but grammar corrections are not the aim of this 
series, and it also might be correct - as you've written, so I won't 
touch it.
About the second point - you're absolutely right, I didn't notice this 
part. Including in v11.


Leonid.



[Qemu-block] [PATCH v11 0/9] Take the image size into account when allocating the L2 cache

2018-09-24 Thread Leonid Bloch
This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB for
Linux systems, and 8 MB for non-Linux systems (the difference is caused
by the fact that it is currently impossible to perform scheduled cache
cleaning on non-Linux systems). To modify this upper limit one can use
the already existing 'l2-cache-size' and 'cache-size' options. Moreover,
this fixes the behavior of 'l2-cache-size', as it was documented as the
*maximum* L2 cache size, but in practice behaved as the absolute size.

To compensate the memory overhead which may (but not necesarily will) be
increased following this behavior, the default cache-clean-interval is set
to 10 minutes by default (was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Differences from v4:
* Refactoring.
* Documentation and commit message fixes.

Differences from v5:
* A patch with cosmetic changes is split from the main patch
* A patch for avoiding duplication in refcount cache size assignment is
  split from the main patch.
* In the main patch the cap on the L2 cache size is set to only 1 MB (in
  order to be close to the previous behavior) and a separate patch sets
  it to 32 MB.
* Changes in the documentation fixes patch [1/8].

Differences from v6:
* A patch is added to make the defined sizes more humanly readable.

Differences from v7:
* Documentation and commit message changes.
* Fix: when cache-size is specified and is large enough, with the L2 and
  refcount caches not set, the L2 cache will be as large as needed to
  cover the entire image, and not as the default max. size. (Return to
  the previous behavior in this part, which was correct).
* Cosmetic changes patch not used.

Differences from v8:
* Made a lookup table for power-of-two sizes. This will enable writing
  pretty size values (most of the ones used are powers of two) while
  allowing correct stringification of these values.
* Made the max. L2 cache size 8 MB on non-Linux systems.
* Returned the test for too large L2 cache size.
* Minor documentation fixes.

Differences from v9:
* DEFAULT_CACHE_CLEAN_INTERVAL is set to 0 for non-Linux platforms in
  order to avoid an error, since the cache clean interval feature is
  currently available only on Linux.

Differences from v10:
* Rewording documentation.

Leonid Bloch (9):
  qcow2: Options' documentation fixes
  include: Add a lookup table of sizes
  qcow2: Make sizes more humanly readable
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c  | 42 -
 block/qcow2.h  | 19 -
 docs/qcow2-cache.txt   | 43 +++--
 include/qemu/units.h   | 55 ++
 qapi/block-core.json   |  3 ++-
 qemu-options.hx| 11 +---
 tests/qemu-iotests/137 |  8 +-
 tests/qemu-iotests/137.out |  4 ++-
 8 files changed, 139 insertions(+), 46 deletions(-)

-- 
2.17.1




[Qemu-block] [PATCH v11 3/9] qcow2: Make sizes more humanly readable

2018-09-24 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 2 +-
 block/qcow2.h | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index c13153735a..d2c07ce9fe 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 }
 } else {
 if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
+*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index 81b844e936..a8d6f757b1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
+#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x80
+#define QCOW_MAX_REFTABLE_SIZE S_8MiB
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x200
+#define QCOW_MAX_L1_SIZE S_32MiB
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
 
 /* Whichever is more */
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE S_1MiB
 
-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE S_64KiB
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
-- 
2.17.1




[Qemu-block] [PATCH v11 1/9] qcow2: Options' documentation fixes

2018-09-24 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
---
 docs/qcow2-cache.txt | 21 ++---
 qemu-options.hx  |  9 ++---
 2 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..7e28b41bd3 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -79,14 +79,14 @@ Choosing the right cache sizes
 In order to choose the cache sizes we need to know how they relate to
 the amount of allocated space.
 
-The amount of virtual disk that can be mapped by the L2 and refcount
+The part of the virtual disk that can be mapped by the L2 and refcount
 caches (in bytes) is:
 
disk_size = l2_cache_size * cluster_size / 8
disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits
 
 With the default values for cluster_size (64KB) and refcount_bits
-(16), that is
+(16), this becomes:
 
disk_size = l2_cache_size * 8192
disk_size = refcount_cache_size * 32768
@@ -97,12 +97,16 @@ need:
l2_cache_size = disk_size_GB * 131072
refcount_cache_size = disk_size_GB * 32768
 
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual
+image size (given that the default cluster size is used):
 
-   1048576 / 131072 = 8 GB of virtual disk covered by that cache
-262144 /  32768 = 8 GB
+   8 GB / 8192 = 1 MB
+
+The refcount cache is 4 times the cluster size by default. With the default
+cluster size of 64 KB, it is 256 KB (262144 bytes). This is sufficient for
+8 GB of image size:
+
+   262144 * 32768 = 8 GB
 
 
 How to configure the cache sizes
@@ -130,6 +134,9 @@ There are a few things that need to be taken into account:
memory as possible to the L2 cache before increasing the refcount
cache size.
 
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+   can be set simultaneously.
+
 Unlike L2 tables, refcount blocks are not used during normal I/O but
 only during allocations and internal snapshots. In most cases they are
 accessed sequentially (even during random guest I/O) so increasing the
diff --git a/qemu-options.hx b/qemu-options.hx
index a642ad297f..2db6247eff 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -732,15 +732,18 @@ image file)
 
 @item cache-size
 The maximum total size of the L2 table and refcount block caches in bytes
-(default: 1048576 bytes or 8 clusters, whichever is larger)
+(default: the sum of l2-cache-size and refcount-cache-size)
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: 4/5 of the total cache size)
+(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
+is larger; otherwise, as large as possible or needed within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-(default: 1/5 of the total cache size)
+(default: 4 times the cluster size; or if cache-size is specified, the part of
+it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-- 
2.17.1




[Qemu-block] [PATCH v11 5/9] qcow2: Assign the L2 cache relatively to the image size

2018-09-24 Thread Leonid Bloch
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.

Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c  | 21 +
 block/qcow2.h  |  4 +---
 docs/qcow2-cache.txt   | 15 ++-
 qemu-options.hx|  6 +++---
 tests/qemu-iotests/137 |  8 +++-
 tests/qemu-iotests/137.out |  4 +++-
 6 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index cd0053b6ee..589f6c1b1c 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  uint64_t *refcount_cache_size, Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t combined_cache_size;
+uint64_t combined_cache_size, l2_cache_max_setting;
 bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
 int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
 refcount_cache_size_set = qemu_opt_get(opts, 
QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
 combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+ DEFAULT_L2_CACHE_MAX_SIZE);
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
 *l2_cache_entry_size = qemu_opt_get_size(
 opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
 
+*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
 error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"at the same time");
 return;
-} else if (*l2_cache_size > combined_cache_size) {
+} else if (l2_cache_size_set &&
+   (l2_cache_max_setting > combined_cache_size)) {
 error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
 return;
@@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
 /* Assign as much memory as possible to the L2 cache, and
  * use the remainder for the refcount cache */
 if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
@@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-} else {
-if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
- (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
- * s->cluster_size);
-}
 }
 /* l2_cache_size and refcount_cache_size are ensured to have at least
  * their minimum values in qcow2_update_options_prepare() */
diff --git a/block/qcow2.h b/block/qcow2.h
index a8d6f757b1..2f8c1fd15c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,9 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-/* Whichever is more */
-#define DEFAULT_L2_CACHE_CLUSTE

[Qemu-block] [PATCH v11 4/9] qcow2: Avoid duplication in setting the refcount cache size

2018-09-24 Thread Leonid Bloch
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index d2c07ce9fe..cd0053b6ee 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
-if (!refcount_cache_size_set) {
-*refcount_cache_size = min_refcount_cache;
-}
 }
+/* l2_cache_size and refcount_cache_size are ensured to have at least
+ * their minimum values in qcow2_update_options_prepare() */
 
 if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) ||
 *l2_cache_entry_size > s->cluster_size ||
-- 
2.17.1




[Qemu-block] [PATCH v11 2/9] include: Add a lookup table of sizes

2018-09-24 Thread Leonid Bloch
Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
string. Powers of two are used very often for sizes, so such a table
will also make it easier and more intuitive to write them.

This table is generatred using the following AWK script:

BEGIN {
suffix="KMGTPE";
for(i=10; i<64; i++) {
val=2**i;
s=substr(suffix, int(i/10), 1);
n=2**(i%10);
pad=21-int(log(n)/log(10));
printf("#define S_%d%siB %*d\n", n, s, pad, val);
    }
}

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 include/qemu/units.h | 55 
 1 file changed, 55 insertions(+)

diff --git a/include/qemu/units.h b/include/qemu/units.h
index 692db3fbb2..68a7758650 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,4 +17,59 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
+#define S_1KiB  1024
+#define S_2KiB  2048
+#define S_4KiB  4096
+#define S_8KiB  8192
+#define S_16KiB16384
+#define S_32KiB32768
+#define S_64KiB65536
+#define S_128KiB  131072
+#define S_256KiB  262144
+#define S_512KiB  524288
+#define S_1MiB   1048576
+#define S_2MiB   2097152
+#define S_4MiB   4194304
+#define S_8MiB   8388608
+#define S_16MiB 16777216
+#define S_32MiB 33554432
+#define S_64MiB 67108864
+#define S_128MiB   134217728
+#define S_256MiB   268435456
+#define S_512MiB   536870912
+#define S_1GiB1073741824
+#define S_2GiB2147483648
+#define S_4GiB4294967296
+#define S_8GiB8589934592
+#define S_16GiB  17179869184
+#define S_32GiB  34359738368
+#define S_64GiB  68719476736
+#define S_128GiB137438953472
+#define S_256GiB274877906944
+#define S_512GiB549755813888
+#define S_1TiB 1099511627776
+#define S_2TiB 219902322
+#define S_4TiB 4398046511104
+#define S_8TiB 8796093022208
+#define S_16TiB   17592186044416
+#define S_32TiB   35184372088832
+#define S_64TiB   70368744177664
+#define S_128TiB 140737488355328
+#define S_256TiB 281474976710656
+#define S_512TiB 562949953421312
+#define S_1PiB  1125899906842624
+#define S_2PiB  2251799813685248
+#define S_4PiB  4503599627370496
+#define S_8PiB  9007199254740992
+#define S_16PiB18014398509481984
+#define S_32PiB36028797018963968
+#define S_64PiB72057594037927936
+#define S_128PiB  144115188075855872
+#define S_256PiB  288230376151711744
+#define S_512PiB  576460752303423488
+#define S_1EiB   1152921504606846976
+#define S_2EiB   2305843009213693952
+#define S_4EiB   4611686018427387904
+#define S_8EiB   9223372036854775808
+
 #endif
-- 
2.17.1




[Qemu-block] [PATCH v11 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-09-24 Thread Leonid Bloch
The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).

This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.h|  6 +-
 docs/qcow2-cache.txt | 15 +--
 qemu-options.hx  |  6 +++---
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 2f8c1fd15c..0f0e3534bf 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,7 +74,11 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB
+#ifdef CONFIG_LINUX
+#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#else
+#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+#endif
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 750447ea4f..1fcc0658b2 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -125,12 +125,15 @@ There are a few things that need to be taken into account:
  - Both caches must have a size that is a multiple of the cluster size
(or the cache entry size: see "Using smaller cache sizes" below).
 
- - The maximum L2 cache size is 1 MB by default (enough for full coverage
-   of 8 GB images, with the default cluster size). This value can be
-   modified using the "l2-cache-size" option. QEMU will not use more memory
-   than needed to hold all of the image's L2 tables, regardless of this max.
-   value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see
-   below).
+ - The maximum L2 cache size is 32 MB by default on Linux platforms (enough
+   for full coverage of 256 GB images, with the default cluster size). This
+   value can be modified using the "l2-cache-size" option. QEMU will not use
+   more memory than needed to hold all of the image's L2 tables, regardless
+   of this max. value.
+   On non-Linux platforms the maximal value is smaller by default (8 MB) and
+   this difference stems from the fact that on Linux the cache can be cleared
+   periodically if needed, using the "cache-clean-interval" option (see below).
+   The minimal L2 cache size is 2 clusters (or 2 cache entries, see below).
 
  - The default (and minimum) refcount cache size is 4 clusters.
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 6eef0f5651..14aee78c6c 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -736,9 +736,9 @@ The maximum total size of the L2 table and refcount block 
caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not specified - 1M; otherwise, as large as possible
-within the cache-size, while permitting the requested or the minimal refcount
-cache size)
+(default: if cache-size is not specified - 32M on Linux platforms, and 8M on
+non-Linux platforms; otherwise, as large as possible within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-- 
2.17.1




[Qemu-block] [PATCH v11 9/9] qcow2: Explicit number replaced by a constant

2018-09-24 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3c1859f9cc..983d4361d7 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 /* 2^(s->refcount_order - 3) is the refcount width in bytes */
 s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
 s->refcount_block_size = 1 << s->refcount_block_bits;
-bs->total_sectors = header.size / 512;
+bs->total_sectors = header.size / BDRV_SECTOR_SIZE;
 s->csize_shift = (62 - (s->cluster_bits - 8));
 s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
 s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
@@ -3450,7 +3450,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 goto fail;
 }
 
-old_length = bs->total_sectors * 512;
+old_length = bs->total_sectors * BDRV_SECTOR_SIZE;
 new_l1_size = size_to_l1(s, offset);
 
 if (offset < old_length) {
-- 
2.17.1




[Qemu-block] [PATCH v11 7/9] qcow2: Resize the cache upon image resizing

2018-09-24 Thread Leonid Bloch
The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning L2 cache relatively to
the image size, implies that the cache will be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 589f6c1b1c..c68f896c66 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3418,6 +3418,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 uint64_t old_length;
 int64_t new_l1_size;
 int ret;
+QDict *options;
 
 if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA &&
 prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL)
@@ -3642,6 +3643,8 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 }
 
+bs->total_sectors = offset / BDRV_SECTOR_SIZE;
+
 /* write updated header.size */
 offset = cpu_to_be64(offset);
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
@@ -3652,6 +3655,13 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 
 s->l1_vm_state_index = new_l1_size;
+
+/* Update cache sizes */
+options = qdict_clone_shallow(bs->options);
+ret = qcow2_update_options(bs, options, s->flags, errp);
+if (ret < 0) {
+goto fail;
+}
 ret = 0;
 fail:
 qemu_co_mutex_unlock(>lock);
-- 
2.17.1




[Qemu-block] [PATCH v11 8/9] qcow2: Set the default cache-clean-interval to 10 minutes

2018-09-24 Thread Leonid Bloch
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

* For non-Linux platforms the default is kept at 0, because
  cache-clean-interval is not supported there yet.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c| 2 +-
 block/qcow2.h| 4 +++-
 docs/qcow2-cache.txt | 4 ++--
 qapi/block-core.json | 3 ++-
 qemu-options.hx  | 2 +-
 5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index c68f896c66..3c1859f9cc 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState 
*bs,
 /* New interval for cache cleanup timer */
 r->cache_clean_interval =
 qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-s->cache_clean_interval);
+DEFAULT_CACHE_CLEAN_INTERVAL);
 #ifndef CONFIG_LINUX
 if (r->cache_clean_interval != 0) {
 error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
diff --git a/block/qcow2.h b/block/qcow2.h
index 0f0e3534bf..ba430316b9 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -76,13 +76,15 @@
 
 #ifdef CONFIG_LINUX
 #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 #else
 #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+/* Cache clean interval is currently available only on Linux, so must be 0 */
+#define DEFAULT_CACHE_CLEAN_INTERVAL 0
 #endif
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
-
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
 #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 1fcc0658b2..59358b816f 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -210,8 +210,8 @@ This example removes all unused cache entries every 15 
minutes:
 
-drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600. Setting it to 0
+disables this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
 argument for madvise() to actually free the memory. This is a
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4c7a37afdc..08c27b9af7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2827,7 +2827,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 # caches. The interval is in seconds. The default value
-# is 0 and it disables this feature (since 2.5)
+# is 600, and 0 disables this feature. (since 2.5)
+#
 # @encrypt:   Image decryption options. Mandatory for
 # encrypted images, except when doing a metadata-only
 # probe of the image. (since 2.10)
diff --git a/qemu-options.hx b/qemu-options.hx
index 14aee78c6c..52d9d9f06d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -747,7 +747,7 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 0 and it disables this feature.
+The default value is 600. Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1




Re: [Qemu-block] [PATCH v10 3/9] qcow2: Make sizes more humanly readable

2018-09-24 Thread Leonid Bloch

On 9/24/18 7:59 PM, Eric Blake wrote:

On 9/21/18 12:23 PM, Leonid Bloch wrote:

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
  block/qcow2.c | 2 +-
  block/qcow2.h | 9 +
  2 files changed, 6 insertions(+), 5 deletions(-)




+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
  #include "crypto/block.h"
  #include "qemu/coroutine.h"
+#include "qemu/units.h"
  //#define DEBUG_ALLOC
  //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
  /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
   * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x80
+#define QCOW_MAX_REFTABLE_SIZE S_8MiB


Why not (8 * MiB)?  Where does this get stringified that you have to 
introduce a new macro? Can we instead fix the stringification location 
to call a convenience helper that converts arbitrary values into 
convenient strings at runtime, rather than having to worry about a 
separate macro just for stringification purposes?


- Why not (8 * MiB)?
Because we have the lookup table from the previous patch.
- Why do we need this table?
Because some values (not this one specifically, but for example 
DEFAULT_CLUSTER_SIZE) get stringified at compile time (not during 
runtime), and for these values the only two options are to use such a 
table for convenient writing, or to write the number of bytes explicitly 
in the definition.
Again, while the necessity for such a table comes from stringification, 
it has an added value of short and clear size definitions.





  /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
   * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x200
+#define QCOW_MAX_L1_SIZE S_32MiB


Again, why not (32 * MiB)?


Same as above.



  /* Allow for an average of 1k per snapshot table entry, should be 
plenty of

   * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
  /* Whichever is more */
  #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE S_1MiB


and (1 * MiB)


Same as above.




-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE S_64KiB


and (64 * KiB)


Because:

block/qcow2.c
4726:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)

(and similar in some other files).

Leonid.



Re: [Qemu-block] [PATCH v10 1/9] qcow2: Options' documentation fixes

2018-09-24 Thread Leonid Bloch
On 9/24/18 6:04 PM, Alberto Garcia wrote:
> On Fri 21 Sep 2018 07:23:02 PM CEST, Leonid Bloch wrote:
>> Signed-off-by: Leonid Bloch 
>> ---
>>   docs/qcow2-cache.txt | 20 +---
>>   qemu-options.hx  |  9 ++---
>>   2 files changed, 19 insertions(+), 10 deletions(-)
>>
>> diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
>> index 8a09a5cc5f..013991e21c 100644
>> --- a/docs/qcow2-cache.txt
>> +++ b/docs/qcow2-cache.txt
>> @@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be 
>> configured.
>>   Choosing the right cache sizes
>>   --
>>   In order to choose the cache sizes we need to know how they relate to
>> -the amount of allocated space.
>> +the amount of the allocated space.
> 
> I'm not a native English speaker, but the current version sounds correct
> to me. Why do you need to add an article there? :-?

I'm not a native speaker as well, but "the" seems to be needed there:
- Which allocated space?
- __The__ allocated space!

But I really do not insist on this one - it's clear enough. :)

> 
>> +For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual
>> +image size (given that the default cluster size is used):
>>   
>> -   1048576 / 131072 = 8 GB of virtual disk covered by that cache
>> -262144 /  32768 = 8 GB
>> +   8 GB / 8192 = 1 MB
>> +
>> +A default refcount cache is 4 times the cluster size, which defaults to
>> +256 KB (262144 bytes). This is sufficient for 8 GB of image size:
> 
> "A default ... which defaults to" sounds a bit strange I think.

Indeed. I'll reword. Thanks.

Leonid.

> 
>> --- a/qemu-options.hx
>> +++ b/qemu-options.hx
>> @@ -742,15 +742,18 @@ image file)
>>   
>>   @item cache-size
>>   The maximum total size of the L2 table and refcount block caches in bytes
>> -(default: 1048576 bytes or 8 clusters, whichever is larger)
>> +(default: the sum of l2-cache-size and refcount-cache-size)
> 
> Oh, so this had been wrong all the time? Good that you saw it!
> 
> Berto
> 


Re: [Qemu-block] [PATCH v10 2/9] include: Add a lookup table of sizes

2018-09-24 Thread Leonid Bloch
On 9/24/18 5:09 PM, Eric Blake wrote:
> On 9/21/18 12:23 PM, Leonid Bloch wrote:
>> Adding a lookup table for the powers of two, with the appropriate size
>> prefixes. This is needed when a size has to be stringified, in which
>> case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
>> string. Powers of two are used very often for sizes, so such a table
>> will also make it easier and more intuitive to write them.
> 
> Would it be better to provide a generic util function that takes an 
> arbitrary runtime value and converts it to a human-readable form?
> 

The problem is that the srtingification happens at compile time, and the 
literal string is written to qcow2.o (the default value). From here 
comes the need for this table. And the added benefit of it are more 
concise size notations for power-of-two sizes.

Leonid.


[Qemu-block] [PATCH v10 8/9] qcow2: Set the default cache-clean-interval to 10 minutes

2018-09-21 Thread Leonid Bloch
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

* For non-Linux platforms the default is kept at 0, because
  cache-clean-interval is not supported there yet.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c| 2 +-
 block/qcow2.h| 4 +++-
 docs/qcow2-cache.txt | 4 ++--
 qapi/block-core.json | 3 ++-
 qemu-options.hx  | 2 +-
 5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1445cd5360..f885afa0ed 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState 
*bs,
 /* New interval for cache cleanup timer */
 r->cache_clean_interval =
 qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-s->cache_clean_interval);
+DEFAULT_CACHE_CLEAN_INTERVAL);
 #ifndef CONFIG_LINUX
 if (r->cache_clean_interval != 0) {
 error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
diff --git a/block/qcow2.h b/block/qcow2.h
index 0f0e3534bf..ba430316b9 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -76,13 +76,15 @@
 
 #ifdef CONFIG_LINUX
 #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 #else
 #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+/* Cache clean interval is currently available only on Linux, so must be 0 */
+#define DEFAULT_CACHE_CLEAN_INTERVAL 0
 #endif
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
-
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
 #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot"
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 5965d3d094..15ae797931 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -209,8 +209,8 @@ This example removes all unused cache entries every 15 
minutes:
 
-drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600. Setting it to 0
+disables this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
 argument for madvise() to actually free the memory. This is a
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4c7a37afdc..08c27b9af7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2827,7 +2827,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 # caches. The interval is in seconds. The default value
-# is 0 and it disables this feature (since 2.5)
+# is 600, and 0 disables this feature. (since 2.5)
+#
 # @encrypt:   Image decryption options. Mandatory for
 # encrypted images, except when doing a metadata-only
 # probe of the image. (since 2.10)
diff --git a/qemu-options.hx b/qemu-options.hx
index d5f4bcadd4..2975fdf9f8 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -757,7 +757,7 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 0 and it disables this feature.
+The default value is 600. Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1




[Qemu-block] [PATCH v10 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-09-21 Thread Leonid Bloch
The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).

This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.h|  6 +-
 docs/qcow2-cache.txt | 15 +--
 qemu-options.hx  |  6 +++---
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 2f8c1fd15c..0f0e3534bf 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,7 +74,11 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB
+#ifdef CONFIG_LINUX
+#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#else
+#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+#endif
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index c84cd69cc7..5965d3d094 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -124,12 +124,15 @@ There are a few things that need to be taken into account:
  - Both caches must have a size that is a multiple of the cluster size
(or the cache entry size: see "Using smaller cache sizes" below).
 
- - The maximum L2 cache size is 1 MB by default (enough for full coverage
-   of 8 GB images, with the default cluster size). This value can be
-   modified using the "l2-cache-size" option. QEMU will not use more memory
-   than needed to hold all of the image's L2 tables, regardless of this max.
-   value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see
-   below).
+ - The maximum L2 cache size is 32 MB by default on Linux platforms (enough
+   for full coverage of 256 GB images, with the default cluster size). This
+   value can be modified using the "l2-cache-size" option. QEMU will not use
+   more memory than needed to hold all of the image's L2 tables, regardless
+   of this max. value.
+   On non-Linux platforms the maximal value is smaller by default (8 MB) and
+   this difference stems from the fact that on Linux the cache can be cleared
+   periodically if needed, using the "cache-clean-interval" option (see below).
+   The minimal L2 cache size is 2 clusters (or 2 cache entries, see below).
 
  - The default (and minimum) refcount cache size is 4 clusters.
 
diff --git a/qemu-options.hx b/qemu-options.hx
index f9fe43a4dc..d5f4bcadd4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -746,9 +746,9 @@ The maximum total size of the L2 table and refcount block 
caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not specified - 1M; otherwise, as large as possible
-within the cache-size, while permitting the requested or the minimal refcount
-cache size)
+(default: if cache-size is not specified - 32M on Linux platforms, and 8M on
+non-Linux platforms; otherwise, as large as possible within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-- 
2.17.1




[Qemu-block] [PATCH v10 7/9] qcow2: Resize the cache upon image resizing

2018-09-21 Thread Leonid Bloch
The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning L2 cache relatively to
the image size, implies that the cache will be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 01c39c56c0..1445cd5360 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3418,6 +3418,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 uint64_t old_length;
 int64_t new_l1_size;
 int ret;
+QDict *options;
 
 if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA &&
 prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL)
@@ -3642,6 +3643,8 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 }
 
+bs->total_sectors = offset / BDRV_SECTOR_SIZE;
+
 /* write updated header.size */
 offset = cpu_to_be64(offset);
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
@@ -3652,6 +3655,13 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 
 s->l1_vm_state_index = new_l1_size;
+
+/* Update cache sizes */
+options = qdict_clone_shallow(bs->options);
+ret = qcow2_update_options(bs, options, s->flags, errp);
+if (ret < 0) {
+goto fail;
+}
 ret = 0;
 fail:
 qemu_co_mutex_unlock(>lock);
-- 
2.17.1




[Qemu-block] [PATCH v10 0/9] Take the image size into account when allocating the L2 cache

2018-09-21 Thread Leonid Bloch
This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB for
Linux systems, and 8 MB for non-Linux systems (the difference is caused
by the fact that it is currently impossible to perform scheduled cache
cleaning on non-Linux systems). To modify this upper limit one can use
the already existing 'l2-cache-size' and 'cache-size' options. Moreover,
this fixes the behavior of 'l2-cache-size', as it was documented as the
*maximum* L2 cache size, but in practice behaved as the absolute size.

To compensate the memory overhead which may (but not necesarily will) be
increased following this behavior, the default cache-clean-interval is set
to 10 minutes by default (was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Differences from v4:
* Refactoring.
* Documentation and commit message fixes.

Differences from v5:
* A patch with cosmetic changes is split from the main patch
* A patch for avoiding duplication in refcount cache size assignment is
  split from the main patch.
* In the main patch the cap on the L2 cache size is set to only 1 MB (in
  order to be close to the previous behavior) and a separate patch sets
  it to 32 MB.
* Changes in the documentation fixes patch [1/8].

Differences from v6:
* A patch is added to make the defined sizes more humanly readable.

Differences from v7:
* Documentation and commit message changes.
* Fix: when cache-size is specified and is large enough, with the L2 and
  refcount caches not set, the L2 cache will be as large as needed to
  cover the entire image, and not as the default max. size. (Return to
  the previous behavior in this part, which was correct).
* Cosmetic changes patch not used.

Differences from v8:
* Made a lookup table for power-of-two sizes. This will enable writing
  pretty size values (most of the ones used are powers of two) while
  allowing correct stringification of these values.
* Made the max. L2 cache size 8 MB on non-Linux systems.
* Returned the test for too large L2 cache size.
* Minor documentation fixes.

Differences from v9:
* DEFAULT_CACHE_CLEAN_INTERVAL is set to 0 for non-Linux platforms in
  order to avoid an error, since the cache clean interval feature is
  currently available only on Linux.

Leonid Bloch (9):
  qcow2: Options' documentation fixes
  include: Add a lookup table of sizes
  qcow2: Make sizes more humanly readable
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c  | 42 -
 block/qcow2.h  | 19 -
 docs/qcow2-cache.txt   | 42 +++--
 include/qemu/units.h   | 55 ++
 qapi/block-core.json   |  3 ++-
 qemu-options.hx| 11 +---
 tests/qemu-iotests/137 |  8 +-
 tests/qemu-iotests/137.out |  4 ++-
 8 files changed, 138 insertions(+), 46 deletions(-)

-- 
2.17.1




[Qemu-block] [PATCH v10 4/9] qcow2: Avoid duplication in setting the refcount cache size

2018-09-21 Thread Leonid Bloch
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 67cc82f0b9..7949d15fc6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
-if (!refcount_cache_size_set) {
-*refcount_cache_size = min_refcount_cache;
-}
 }
+/* l2_cache_size and refcount_cache_size are ensured to have at least
+ * their minimum values in qcow2_update_options_prepare() */
 
 if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) ||
 *l2_cache_entry_size > s->cluster_size ||
-- 
2.17.1




[Qemu-block] [PATCH v10 5/9] qcow2: Assign the L2 cache relatively to the image size

2018-09-21 Thread Leonid Bloch
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.

Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c  | 21 +
 block/qcow2.h  |  4 +---
 docs/qcow2-cache.txt   | 15 ++-
 qemu-options.hx|  6 +++---
 tests/qemu-iotests/137 |  8 +++-
 tests/qemu-iotests/137.out |  4 +++-
 6 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7949d15fc6..01c39c56c0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  uint64_t *refcount_cache_size, Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t combined_cache_size;
+uint64_t combined_cache_size, l2_cache_max_setting;
 bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
 int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
 refcount_cache_size_set = qemu_opt_get(opts, 
QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
 combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+ DEFAULT_L2_CACHE_MAX_SIZE);
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
 *l2_cache_entry_size = qemu_opt_get_size(
 opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
 
+*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
 error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"at the same time");
 return;
-} else if (*l2_cache_size > combined_cache_size) {
+} else if (l2_cache_size_set &&
+   (l2_cache_max_setting > combined_cache_size)) {
 error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
 return;
@@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
 /* Assign as much memory as possible to the L2 cache, and
  * use the remainder for the refcount cache */
 if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
@@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-} else {
-if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
- (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
- * s->cluster_size);
-}
 }
 /* l2_cache_size and refcount_cache_size are ensured to have at least
  * their minimum values in qcow2_update_options_prepare() */
diff --git a/block/qcow2.h b/block/qcow2.h
index a8d6f757b1..2f8c1fd15c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,9 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-/* Whichever is more */
-#define DEFAULT_L2_CACHE_CLUSTE

[Qemu-block] [PATCH v10 9/9] qcow2: Explicit number replaced by a constant

2018-09-21 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index f885afa0ed..ffb4a9e4a1 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 /* 2^(s->refcount_order - 3) is the refcount width in bytes */
 s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
 s->refcount_block_size = 1 << s->refcount_block_bits;
-bs->total_sectors = header.size / 512;
+bs->total_sectors = header.size / BDRV_SECTOR_SIZE;
 s->csize_shift = (62 - (s->cluster_bits - 8));
 s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
 s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
@@ -3450,7 +3450,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 goto fail;
 }
 
-old_length = bs->total_sectors * 512;
+old_length = bs->total_sectors * BDRV_SECTOR_SIZE;
 new_l1_size = size_to_l1(s, offset);
 
 if (offset < old_length) {
-- 
2.17.1




[Qemu-block] [PATCH v10 2/9] include: Add a lookup table of sizes

2018-09-21 Thread Leonid Bloch
Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
string. Powers of two are used very often for sizes, so such a table
will also make it easier and more intuitive to write them.

This table is generatred using the following AWK script:

BEGIN {
suffix="KMGTPE";
for(i=10; i<64; i++) {
val=2**i;
s=substr(suffix, int(i/10), 1);
n=2**(i%10);
pad=21-int(log(n)/log(10));
printf("#define S_%d%siB %*d\n", n, s, pad, val);
    }
}

Signed-off-by: Leonid Bloch 
---
 include/qemu/units.h | 55 
 1 file changed, 55 insertions(+)

diff --git a/include/qemu/units.h b/include/qemu/units.h
index 692db3fbb2..68a7758650 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,4 +17,59 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
+#define S_1KiB  1024
+#define S_2KiB  2048
+#define S_4KiB  4096
+#define S_8KiB  8192
+#define S_16KiB16384
+#define S_32KiB32768
+#define S_64KiB65536
+#define S_128KiB  131072
+#define S_256KiB  262144
+#define S_512KiB  524288
+#define S_1MiB   1048576
+#define S_2MiB   2097152
+#define S_4MiB   4194304
+#define S_8MiB   8388608
+#define S_16MiB 16777216
+#define S_32MiB 33554432
+#define S_64MiB 67108864
+#define S_128MiB   134217728
+#define S_256MiB   268435456
+#define S_512MiB   536870912
+#define S_1GiB1073741824
+#define S_2GiB2147483648
+#define S_4GiB4294967296
+#define S_8GiB8589934592
+#define S_16GiB  17179869184
+#define S_32GiB  34359738368
+#define S_64GiB  68719476736
+#define S_128GiB137438953472
+#define S_256GiB274877906944
+#define S_512GiB549755813888
+#define S_1TiB 1099511627776
+#define S_2TiB 219902322
+#define S_4TiB 4398046511104
+#define S_8TiB 8796093022208
+#define S_16TiB   17592186044416
+#define S_32TiB   35184372088832
+#define S_64TiB   70368744177664
+#define S_128TiB 140737488355328
+#define S_256TiB 281474976710656
+#define S_512TiB 562949953421312
+#define S_1PiB  1125899906842624
+#define S_2PiB  2251799813685248
+#define S_4PiB  4503599627370496
+#define S_8PiB  9007199254740992
+#define S_16PiB18014398509481984
+#define S_32PiB36028797018963968
+#define S_64PiB72057594037927936
+#define S_128PiB  144115188075855872
+#define S_256PiB  288230376151711744
+#define S_512PiB  576460752303423488
+#define S_1EiB   1152921504606846976
+#define S_2EiB   2305843009213693952
+#define S_4EiB   4611686018427387904
+#define S_8EiB   9223372036854775808
+
 #endif
-- 
2.17.1




[Qemu-block] [PATCH v10 1/9] qcow2: Options' documentation fixes

2018-09-21 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
---
 docs/qcow2-cache.txt | 20 +---
 qemu-options.hx  |  9 ++---
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..013991e21c 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured.
 Choosing the right cache sizes
 --
 In order to choose the cache sizes we need to know how they relate to
-the amount of allocated space.
+the amount of the allocated space.
 
 The amount of virtual disk that can be mapped by the L2 and refcount
 caches (in bytes) is:
@@ -86,7 +86,7 @@ caches (in bytes) is:
disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits
 
 With the default values for cluster_size (64KB) and refcount_bits
-(16), that is
+(16), this becomes:
 
disk_size = l2_cache_size * 8192
disk_size = refcount_cache_size * 32768
@@ -97,12 +97,15 @@ need:
l2_cache_size = disk_size_GB * 131072
refcount_cache_size = disk_size_GB * 32768
 
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual
+image size (given that the default cluster size is used):
 
-   1048576 / 131072 = 8 GB of virtual disk covered by that cache
-262144 /  32768 = 8 GB
+   8 GB / 8192 = 1 MB
+
+A default refcount cache is 4 times the cluster size, which defaults to
+256 KB (262144 bytes). This is sufficient for 8 GB of image size:
+
+   262144 * 32768 = 8 GB
 
 
 How to configure the cache sizes
@@ -130,6 +133,9 @@ There are a few things that need to be taken into account:
memory as possible to the L2 cache before increasing the refcount
cache size.
 
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+   can be set simultaneously.
+
 Unlike L2 tables, refcount blocks are not used during normal I/O but
 only during allocations and internal snapshots. In most cases they are
 accessed sequentially (even during random guest I/O) so increasing the
diff --git a/qemu-options.hx b/qemu-options.hx
index 654ef484d9..ab1a3b240e 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -742,15 +742,18 @@ image file)
 
 @item cache-size
 The maximum total size of the L2 table and refcount block caches in bytes
-(default: 1048576 bytes or 8 clusters, whichever is larger)
+(default: the sum of l2-cache-size and refcount-cache-size)
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: 4/5 of the total cache size)
+(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
+is larger; otherwise, as large as possible or needed within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-(default: 1/5 of the total cache size)
+(default: 4 times the cluster size; or if cache-size is specified, the part of
+it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-- 
2.17.1




[Qemu-block] [PATCH v10 3/9] qcow2: Make sizes more humanly readable

2018-09-21 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 2 +-
 block/qcow2.h | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ec9e6238a0..67cc82f0b9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 }
 } else {
 if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
+*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index 81b844e936..a8d6f757b1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
+#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x80
+#define QCOW_MAX_REFTABLE_SIZE S_8MiB
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x200
+#define QCOW_MAX_L1_SIZE S_32MiB
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
 
 /* Whichever is more */
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE S_1MiB
 
-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE S_64KiB
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
-- 
2.17.1




Re: [Qemu-block] [PATCH v9 8/9] qcow2: Set the default cache-clean-interval to 10 minutes

2018-09-21 Thread Leonid Bloch

On 9/21/18 3:35 PM, Alberto Garcia wrote:

On Tue 18 Sep 2018 05:29:22 PM CEST, Leonid Bloch wrote:

  /* New interval for cache cleanup timer */
  r->cache_clean_interval =
  qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-s->cache_clean_interval);
+DEFAULT_CACHE_CLEAN_INTERVAL);


I just realized we're ignoring the old value (s->cache_clean_interval)
here. What's the consequence of this? (this was a change made by Kevin
Wolf in 5b0959a7d432062dcd740f8065004285b15695fa).


  #ifndef CONFIG_LINUX
  if (r->cache_clean_interval != 0) {
  error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
 " not supported on this host");


Another thing that I noticed (see below)...


diff --git a/block/qcow2.h b/block/qcow2.h
index 0f0e3534bf..8c863897b0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -82,6 +82,7 @@
  
  #define DEFAULT_CLUSTER_SIZE S_64KiB
  
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */


Shouldn't this be set to 0 in non-Linux platforms? Otherwise it will try
to set it to 600 by default in all platforms and will complain with the
"not supported on this host" error message that I quoted above.


You're right! Will fix, thanks.

Leonid.



Berto





Re: [Qemu-block] [PATCH v9 2/9] include: Add a lookup table of sizes

2018-09-21 Thread Leonid Bloch

On 9/21/18 3:17 PM, Alberto Garcia wrote:

On Tue 18 Sep 2018 05:29:16 PM CEST, Leonid Bloch wrote:

Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L <<
10))' string. Powers of two are used very often for sizes, so such a
table will also make it easier and more intuitive to write them.


I wonder in what cases you want to stringify those literals... if it's
something that you want to show the user you either:

   a) know the value in advance, and then you probably want to show
  "4 GiB" instead of "4294967296 bytes"


Then the value will need to be set in two places at any future change.



   b) don't know the value in advance (it's a variable), but then you
  can't use these macros.

Am I missing anything?


This is needed in some places:

$ ag 'stringify\(DEFAULT_CLUSTER_SIZE\)'
block/parallels.c
99:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE),

block/vdi.c
988:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)

block/qcow2.c
4726:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE)

The latter causes failure in test case 137 if defined as (64 * KiB).

Of course the values in block/parallels.h and block/vdi.c will need to 
be changed as well, I'll do it in a separate patch.


Besides resolving the cases with the necessity to stringify, It also 
creates nice shortcuts for all feasible power-of-two sizes, which are 
very common.


Leonid.



Berto





[Qemu-block] [PATCH v9 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-09-18 Thread Leonid Bloch
The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).

This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.h|  6 +-
 docs/qcow2-cache.txt | 15 +--
 qemu-options.hx  |  6 +++---
 3 files changed, 17 insertions(+), 10 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index 2f8c1fd15c..0f0e3534bf 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,7 +74,11 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB
+#ifdef CONFIG_LINUX
+#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB
+#else
+#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB
+#endif
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index c84cd69cc7..5965d3d094 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -124,12 +124,15 @@ There are a few things that need to be taken into account:
  - Both caches must have a size that is a multiple of the cluster size
(or the cache entry size: see "Using smaller cache sizes" below).
 
- - The maximum L2 cache size is 1 MB by default (enough for full coverage
-   of 8 GB images, with the default cluster size). This value can be
-   modified using the "l2-cache-size" option. QEMU will not use more memory
-   than needed to hold all of the image's L2 tables, regardless of this max.
-   value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see
-   below).
+ - The maximum L2 cache size is 32 MB by default on Linux platforms (enough
+   for full coverage of 256 GB images, with the default cluster size). This
+   value can be modified using the "l2-cache-size" option. QEMU will not use
+   more memory than needed to hold all of the image's L2 tables, regardless
+   of this max. value.
+   On non-Linux platforms the maximal value is smaller by default (8 MB) and
+   this difference stems from the fact that on Linux the cache can be cleared
+   periodically if needed, using the "cache-clean-interval" option (see below).
+   The minimal L2 cache size is 2 clusters (or 2 cache entries, see below).
 
  - The default (and minimum) refcount cache size is 4 clusters.
 
diff --git a/qemu-options.hx b/qemu-options.hx
index f9fe43a4dc..d5f4bcadd4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -746,9 +746,9 @@ The maximum total size of the L2 table and refcount block 
caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not specified - 1M; otherwise, as large as possible
-within the cache-size, while permitting the requested or the minimal refcount
-cache size)
+(default: if cache-size is not specified - 32M on Linux platforms, and 8M on
+non-Linux platforms; otherwise, as large as possible within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-- 
2.17.1




[Qemu-block] [PATCH v9 8/9] qcow2: Set the default cache-clean-interval to 10 minutes

2018-09-18 Thread Leonid Bloch
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c| 2 +-
 block/qcow2.h| 1 +
 docs/qcow2-cache.txt | 4 ++--
 qapi/block-core.json | 3 ++-
 qemu-options.hx  | 2 +-
 5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 1445cd5360..f885afa0ed 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState 
*bs,
 /* New interval for cache cleanup timer */
 r->cache_clean_interval =
 qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-s->cache_clean_interval);
+DEFAULT_CACHE_CLEAN_INTERVAL);
 #ifndef CONFIG_LINUX
 if (r->cache_clean_interval != 0) {
 error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
diff --git a/block/qcow2.h b/block/qcow2.h
index 0f0e3534bf..8c863897b0 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -82,6 +82,7 @@
 
 #define DEFAULT_CLUSTER_SIZE S_64KiB
 
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 5965d3d094..15ae797931 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -209,8 +209,8 @@ This example removes all unused cache entries every 15 
minutes:
 
-drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600. Setting it to 0
+disables this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
 argument for madvise() to actually free the memory. This is a
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 4c7a37afdc..08c27b9af7 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2827,7 +2827,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 # caches. The interval is in seconds. The default value
-# is 0 and it disables this feature (since 2.5)
+# is 600, and 0 disables this feature. (since 2.5)
+#
 # @encrypt:   Image decryption options. Mandatory for
 # encrypted images, except when doing a metadata-only
 # probe of the image. (since 2.10)
diff --git a/qemu-options.hx b/qemu-options.hx
index d5f4bcadd4..2975fdf9f8 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -757,7 +757,7 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 0 and it disables this feature.
+The default value is 600. Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1




[Qemu-block] [PATCH v9 0/9] Take the image size into account when allocating the L2 cache

2018-09-18 Thread Leonid Bloch
Sorry for taking such a long pause after v8. I had several extremely
urgent issues to attend to.

This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB for
Linux systems, and 8 MB for non-Linux systems (the difference is caused
by the fact that it is currently impossible to perform scheduled cache
cleaning on non-Linux systems). To modify this upper limit one can use
the already existing 'l2-cache-size' and 'cache-size' options. Moreover,
this fixes the behavior of 'l2-cache-size', as it was documented as the
*maximum* L2 cache size, but in practice behaved as the absolute size.

To compensate the memory overhead which may (but not necesarily will) be
increased following this behavior, the default cache-clean-interval is set
to 10 minutes by default (was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Differences from v4:
* Refactoring.
* Documentation and commit message fixes.

Differences from v5:
* A patch with cosmetic changes is split from the main patch
* A patch for avoiding duplication in refcount cache size assignment is
  split from the main patch.
* In the main patch the cap on the L2 cache size is set to only 1 MB (in
  order to be close to the previous behavior) and a separate patch sets
  it to 32 MB.
* Changes in the documentation fixes patch [1/8].

Differences from v6:
* A patch is added to make the defined sizes more humanly readable.

Differences from v7:
* Documentation and commit message changes.
* Fix: when cache-size is specified and is large enough, with the L2 and
  refcount caches not set, the L2 cache will be as large as needed to
  cover the entire image, and not as the default max. size. (Return to
  the previous behavior in this part, which was correct).
* Cosmetic changes patch not used.

Differences from v8:
* Made a lookup table for power-of-two sizes. This will enable writing
  pretty size values (most of the ones used are powers of two) while
  allowing correct stringification of these values.
* Made the max. L2 cache size 8 MB on non-Linux systems.
* Returned the test for too large L2 cache size.
* Minor documentation fixes.

Leonid Bloch (9):
  qcow2: Options' documentation fixes
  include: Add a lookup table of sizes
  qcow2: Make sizes more humanly readable
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c  | 42 -
 block/qcow2.h  | 16 ++-
 docs/qcow2-cache.txt   | 42 +++--
 include/qemu/units.h   | 55 ++
 qapi/block-core.json   |  3 ++-
 qemu-options.hx| 11 +---
 tests/qemu-iotests/137 |  8 +-
 tests/qemu-iotests/137.out |  4 ++-
 8 files changed, 136 insertions(+), 45 deletions(-)

-- 
2.17.1




[Qemu-block] [PATCH v9 9/9] qcow2: Explicit number replaced by a constant

2018-09-18 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index f885afa0ed..ffb4a9e4a1 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 /* 2^(s->refcount_order - 3) is the refcount width in bytes */
 s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
 s->refcount_block_size = 1 << s->refcount_block_bits;
-bs->total_sectors = header.size / 512;
+bs->total_sectors = header.size / BDRV_SECTOR_SIZE;
 s->csize_shift = (62 - (s->cluster_bits - 8));
 s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
 s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
@@ -3450,7 +3450,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 goto fail;
 }
 
-old_length = bs->total_sectors * 512;
+old_length = bs->total_sectors * BDRV_SECTOR_SIZE;
 new_l1_size = size_to_l1(s, offset);
 
 if (offset < old_length) {
-- 
2.17.1




[Qemu-block] [PATCH v9 5/9] qcow2: Assign the L2 cache relatively to the image size

2018-09-18 Thread Leonid Bloch
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.

Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c  | 21 +
 block/qcow2.h  |  4 +---
 docs/qcow2-cache.txt   | 15 ++-
 qemu-options.hx|  6 +++---
 tests/qemu-iotests/137 |  8 +++-
 tests/qemu-iotests/137.out |  4 +++-
 6 files changed, 33 insertions(+), 25 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7949d15fc6..01c39c56c0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  uint64_t *refcount_cache_size, Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t combined_cache_size;
+uint64_t combined_cache_size, l2_cache_max_setting;
 bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
 int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
 refcount_cache_size_set = qemu_opt_get(opts, 
QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
 combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+ DEFAULT_L2_CACHE_MAX_SIZE);
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
 *l2_cache_entry_size = qemu_opt_get_size(
 opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
 
+*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
 error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"at the same time");
 return;
-} else if (*l2_cache_size > combined_cache_size) {
+} else if (l2_cache_size_set &&
+   (l2_cache_max_setting > combined_cache_size)) {
 error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
 return;
@@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
 /* Assign as much memory as possible to the L2 cache, and
  * use the remainder for the refcount cache */
 if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
@@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-} else {
-if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
- (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
- * s->cluster_size);
-}
 }
 /* l2_cache_size and refcount_cache_size are ensured to have at least
  * their minimum values in qcow2_update_options_prepare() */
diff --git a/block/qcow2.h b/block/qcow2.h
index a8d6f757b1..2f8c1fd15c 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,9 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-/* Whichever is more */
-#define DEFAULT_L2_CACHE_CLUSTE

[Qemu-block] [PATCH v9 3/9] qcow2: Make sizes more humanly readable

2018-09-18 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 2 +-
 block/qcow2.h | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ec9e6238a0..67cc82f0b9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 }
 } else {
 if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
+*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index 81b844e936..a8d6f757b1 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
+#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x80
+#define QCOW_MAX_REFTABLE_SIZE S_8MiB
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x200
+#define QCOW_MAX_L1_SIZE S_32MiB
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
 
 /* Whichever is more */
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE S_1MiB
 
-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE S_64KiB
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
-- 
2.17.1




[Qemu-block] [PATCH v9 2/9] include: Add a lookup table of sizes

2018-09-18 Thread Leonid Bloch
Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
string. Powers of two are used very often for sizes, so such a table
will also make it easier and more intuitive to write them.

This table is generatred using the following AWK script:

BEGIN {
suffix="KMGTPE";
for(i=10; i<64; i++) {
val=2**i;
s=substr(suffix, int(i/10), 1);
n=2**(i%10);
pad=21-int(log(n)/log(10));
printf("#define S_%d%siB %*d\n", n, s, pad, val);
    }
}

Signed-off-by: Leonid Bloch 
---
 include/qemu/units.h | 55 
 1 file changed, 55 insertions(+)

diff --git a/include/qemu/units.h b/include/qemu/units.h
index 692db3fbb2..68a7758650 100644
--- a/include/qemu/units.h
+++ b/include/qemu/units.h
@@ -17,4 +17,59 @@
 #define PiB (INT64_C(1) << 50)
 #define EiB (INT64_C(1) << 60)
 
+#define S_1KiB  1024
+#define S_2KiB  2048
+#define S_4KiB  4096
+#define S_8KiB  8192
+#define S_16KiB16384
+#define S_32KiB32768
+#define S_64KiB65536
+#define S_128KiB  131072
+#define S_256KiB  262144
+#define S_512KiB  524288
+#define S_1MiB   1048576
+#define S_2MiB   2097152
+#define S_4MiB   4194304
+#define S_8MiB   8388608
+#define S_16MiB 16777216
+#define S_32MiB 33554432
+#define S_64MiB 67108864
+#define S_128MiB   134217728
+#define S_256MiB   268435456
+#define S_512MiB   536870912
+#define S_1GiB1073741824
+#define S_2GiB2147483648
+#define S_4GiB4294967296
+#define S_8GiB8589934592
+#define S_16GiB  17179869184
+#define S_32GiB  34359738368
+#define S_64GiB  68719476736
+#define S_128GiB137438953472
+#define S_256GiB274877906944
+#define S_512GiB549755813888
+#define S_1TiB 1099511627776
+#define S_2TiB 219902322
+#define S_4TiB 4398046511104
+#define S_8TiB 8796093022208
+#define S_16TiB   17592186044416
+#define S_32TiB   35184372088832
+#define S_64TiB   70368744177664
+#define S_128TiB 140737488355328
+#define S_256TiB 281474976710656
+#define S_512TiB 562949953421312
+#define S_1PiB  1125899906842624
+#define S_2PiB  2251799813685248
+#define S_4PiB  4503599627370496
+#define S_8PiB  9007199254740992
+#define S_16PiB18014398509481984
+#define S_32PiB36028797018963968
+#define S_64PiB72057594037927936
+#define S_128PiB  144115188075855872
+#define S_256PiB  288230376151711744
+#define S_512PiB  576460752303423488
+#define S_1EiB   1152921504606846976
+#define S_2EiB   2305843009213693952
+#define S_4EiB   4611686018427387904
+#define S_8EiB   9223372036854775808
+
 #endif
-- 
2.17.1




[Qemu-block] [PATCH v9 4/9] qcow2: Avoid duplication in setting the refcount cache size

2018-09-18 Thread Leonid Bloch
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 67cc82f0b9..7949d15fc6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
-if (!refcount_cache_size_set) {
-*refcount_cache_size = min_refcount_cache;
-}
 }
+/* l2_cache_size and refcount_cache_size are ensured to have at least
+ * their minimum values in qcow2_update_options_prepare() */
 
 if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) ||
 *l2_cache_entry_size > s->cluster_size ||
-- 
2.17.1




[Qemu-block] [PATCH v9 7/9] qcow2: Resize the cache upon image resizing

2018-09-18 Thread Leonid Bloch
The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning L2 cache relatively to
the image size, implies that the cache will be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 01c39c56c0..1445cd5360 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3418,6 +3418,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 uint64_t old_length;
 int64_t new_l1_size;
 int ret;
+QDict *options;
 
 if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA &&
 prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL)
@@ -3642,6 +3643,8 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 }
 
+bs->total_sectors = offset / BDRV_SECTOR_SIZE;
+
 /* write updated header.size */
 offset = cpu_to_be64(offset);
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
@@ -3652,6 +3655,13 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 
 s->l1_vm_state_index = new_l1_size;
+
+/* Update cache sizes */
+options = qdict_clone_shallow(bs->options);
+ret = qcow2_update_options(bs, options, s->flags, errp);
+if (ret < 0) {
+goto fail;
+}
 ret = 0;
 fail:
 qemu_co_mutex_unlock(>lock);
-- 
2.17.1




[Qemu-block] [PATCH v9 1/9] qcow2: Options' documentation fixes

2018-09-18 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
---
 docs/qcow2-cache.txt | 20 +---
 qemu-options.hx  |  9 ++---
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..013991e21c 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured.
 Choosing the right cache sizes
 --
 In order to choose the cache sizes we need to know how they relate to
-the amount of allocated space.
+the amount of the allocated space.
 
 The amount of virtual disk that can be mapped by the L2 and refcount
 caches (in bytes) is:
@@ -86,7 +86,7 @@ caches (in bytes) is:
disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits
 
 With the default values for cluster_size (64KB) and refcount_bits
-(16), that is
+(16), this becomes:
 
disk_size = l2_cache_size * 8192
disk_size = refcount_cache_size * 32768
@@ -97,12 +97,15 @@ need:
l2_cache_size = disk_size_GB * 131072
refcount_cache_size = disk_size_GB * 32768
 
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual
+image size (given that the default cluster size is used):
 
-   1048576 / 131072 = 8 GB of virtual disk covered by that cache
-262144 /  32768 = 8 GB
+   8 GB / 8192 = 1 MB
+
+A default refcount cache is 4 times the cluster size, which defaults to
+256 KB (262144 bytes). This is sufficient for 8 GB of image size:
+
+   262144 * 32768 = 8 GB
 
 
 How to configure the cache sizes
@@ -130,6 +133,9 @@ There are a few things that need to be taken into account:
memory as possible to the L2 cache before increasing the refcount
cache size.
 
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+   can be set simultaneously.
+
 Unlike L2 tables, refcount blocks are not used during normal I/O but
 only during allocations and internal snapshots. In most cases they are
 accessed sequentially (even during random guest I/O) so increasing the
diff --git a/qemu-options.hx b/qemu-options.hx
index 654ef484d9..ab1a3b240e 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -742,15 +742,18 @@ image file)
 
 @item cache-size
 The maximum total size of the L2 table and refcount block caches in bytes
-(default: 1048576 bytes or 8 clusters, whichever is larger)
+(default: the sum of l2-cache-size and refcount-cache-size)
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: 4/5 of the total cache size)
+(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
+is larger; otherwise, as large as possible or needed within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-(default: 1/5 of the total cache size)
+(default: 4 times the cluster size; or if cache-size is specified, the part of
+it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-- 
2.17.1




Re: [Qemu-block] [PATCH v8 0/8] Take the image size into account when allocating the L2 cache

2018-09-12 Thread Leonid Bloch

Hi Kevin & All,

Sorry it took so long to send a new version! I had some very urgent 
things popping up on several fronts. I will send the new version over 
this weekend.


Thanks for the reminder, and sorry again.

Leonid.
___

On 9/10/18 5:33 PM, Kevin Wolf wrote:

Hi Leonid,

Am 13.08.2018 um 03:07 hat Leonid Bloch geschrieben:

This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB. To
modify this limit one can use the already existing 'l2-cache-size' and
'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size',
as it was documented as the *maximum* L2 cache size, but in practice
behaved as the absolute size.

To compensate the memory overhead which may be increased following this
behavior, the default cache-clean-interval is set to 10 minutes by default
(was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.


after digging through the most urgent part of my post-vacation backlog,
I saw this series again. It's been a while, but I don't think much is
missing, so let's get this finished now. :-)

We had some discussion in the v7 thread after you posted v8, and you
mentioned there that you'd do some changes in a v9 (at first sight, just
some test case stuff). Are you still planning to send that v9?

Kevin





Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-08-14 Thread Leonid Bloch

On 8/14/18 2:44 PM, Kevin Wolf wrote:

Am 14.08.2018 um 13:34 hat Leonid Bloch geschrieben:

On 8/14/18 11:18 AM, Kevin Wolf wrote:

Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben:

I don't actually think it's so bad to keep the cache permanently
allocated, but I wouldn't object to a lower default for non-Linux hosts
either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
might be more adequate. My typical desktop VMs are larger than 8 GB, but
smaller than 32 GB.


And for a Windows VM just the OS installation takes above 40 GB. While we
probably are not running Windows VMs for our own needs, it is very common
that a customer of, for example, some cloud service uses QEMU (unknowingly)
for a full-blown Windows. So 100 GB+ images which are quite heavily used is
not a rare scenario. 256 GB - yeah, that would be on the higher end.


The OS installation is mostly sequential access, though. You only need
that much cache when you have completely random I/O across the whole
image. Otherwise the LRU based approach of the cache is good enough to
keep those tables cached that are actually in use.


Sorry, by "OS installation" I meant the installed size of the OS, which
should be available for fast and frequent access, not the installation
process itself. Obviously for one-time tasks like the installation process
it's not worth it, unless one installs all the time, instead of using ready
images, for some reason. :)


But you never use everything that is present in an OS installation of
40 GB (is it really _that_ huge these days?), and you don't read OS
files non-stop. The most frequently used parts of the OS are actually in
the guest RAM.


Yes, Windows 8.1, with all the desktop bloat - just above 40 GB. :]
I did a proper benchmarking indeed only on heavy I/O load, where full 
cache did show above 50% improvement, although just regular usage felt 
faster as well, but maybe it's just psychosomatic. :)


Leonid.



I don't think you'll really notice the difference in qcow2 unless you
have a really I/O intensive workload - and that is not usually for OS
files, but for user data. For only occasional accesses, the additional
64k for the metadata table wouldn't play a big role.

Kevin





Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-08-14 Thread Leonid Bloch

On 8/14/18 11:18 AM, Kevin Wolf wrote:

Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben:

I don't actually think it's so bad to keep the cache permanently
allocated, but I wouldn't object to a lower default for non-Linux hosts
either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
might be more adequate. My typical desktop VMs are larger than 8 GB, but
smaller than 32 GB.


And for a Windows VM just the OS installation takes above 40 GB. While we
probably are not running Windows VMs for our own needs, it is very common
that a customer of, for example, some cloud service uses QEMU (unknowingly)
for a full-blown Windows. So 100 GB+ images which are quite heavily used is
not a rare scenario. 256 GB - yeah, that would be on the higher end.


The OS installation is mostly sequential access, though. You only need
that much cache when you have completely random I/O across the whole
image. Otherwise the LRU based approach of the cache is good enough to
keep those tables cached that are actually in use.


Sorry, by "OS installation" I meant the installed size of the OS, which 
should be available for fast and frequent access, not the installation 
process itself. Obviously for one-time tasks like the installation 
process it's not worth it, unless one installs all the time, instead of 
using ready images, for some reason. :)




The maximum cache size is maybe for huge databases or indeed random I/O
benchmarks, both of which are important to support (on Linux at least),
but probably not the most common use case.


So 16 MB would indeed be a reasonable default for the *max.* L2 cache now,
although below that would be too little, I think. 32 MB - if we want some
future-proofing.


I think we all agree that 32 MB + cache-clean-interval is okay.

It's just that for non-Linux guests, cache-clean-interval doesn't work.
However, we probably care less about those large random I/O cases there,
so a smaller cache size like 1 MB can do on non-Linux.

Kevin





Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-08-13 Thread Leonid Bloch

I don't actually think it's so bad to keep the cache permanently
allocated, but I wouldn't object to a lower default for non-Linux hosts
either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB)
might be more adequate. My typical desktop VMs are larger than 8 GB, but
smaller than 32 GB.

Kevin



And for a Windows VM just the OS installation takes above 40 GB. While 
we probably are not running Windows VMs for our own needs, it is very 
common that a customer of, for example, some cloud service uses QEMU 
(unknowingly) for a full-blown Windows. So 100 GB+ images which are 
quite heavily used is not a rare scenario. 256 GB - yeah, that would be 
on the higher end.


So 16 MB would indeed be a reasonable default for the *max.* L2 cache 
now, although below that would be too little, I think. 32 MB - if we 
want some future-proofing.


Leonid.



Re: [Qemu-block] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size

2018-08-13 Thread Leonid Bloch

On 8/13/18 2:33 PM, Kevin Wolf wrote:

Am 11.08.2018 um 21:19 hat Leonid Bloch geschrieben:

   @item refcount-cache-size
   The maximum size of the refcount block cache in bytes
diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
index 87965625d8..e3fb078588 100755
--- a/tests/qemu-iotests/137
+++ b/tests/qemu-iotests/137
@@ -109,7 +109,6 @@ $QEMU_IO \
   -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \
   -c "reopen -o cache-size=1M,l2-cache-size=2M" \
   -c "reopen -o cache-size=1M,refcount-cache-size=2M" \
--c "reopen -o l2-cache-size=256T" \


The "L2 cache size too big" error can still be tested, but you will need
to create an image large enough to allow such a big cache.

$ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P
$ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T


* 32P qcow2 will take 33M - is it OK to create it just for a test?
* Is it worth to create a special test scenario, with a separate image
creation, just for that case?


We're creating larger images than that during tests, so I think this is
fine. You don't have to create a new separate test file or anything,
just increase the size of the used test image from 64M to 32P or
whatever is necessary.


OK, sure, will do in v9.

Leonid.



Kevin





Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-08-13 Thread Leonid Bloch



On August 13, 2018 4:39:35 AM EEST, Max Reitz  wrote:
>On 2018-08-10 14:00, Alberto Garcia wrote:
>> On Fri 10 Aug 2018 08:26:44 AM CEST, Leonid Bloch wrote:
>>> The upper limit on the L2 cache size is increased from 1 MB to 32
>MB.
>>> This is done in order to allow default full coverage with the L2
>cache
>>> for images of up to 256 GB in size (was 8 GB). Note, that only the
>>> needed amount to cover the full image is allocated. The value which
>is
>>> changed here is just the upper limit on the L2 cache size, beyond
>which
>>> it will not grow, even if the size of the image will require it to.
>>>
>>> Signed-off-by: Leonid Bloch 
>> 
>> Reviewed-by: Alberto Garcia 
>> 
>>> -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
>>> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
>> 
>> The patch looks perfect to me now and I'm fine with this change, but
>> this is quite an increase from the previous default value. If anyone
>> thinks that this is too aggressive (or too little :)) I'm all ears.
>
>This is just noise from the sidelines (so nothing too serious), but
>anyway, I don't like it very much.
>
>My first point is that the old limit doesn't mean you can only use 8 GB
>qcow2 images.  You can use more, you just can't access more than 8 GB
>randomly.  I know I'm naive, but I think that the number of use cases
>where you need random IOPS spread out over more than 8 GB of an image
>are limited.
>
>My second point is that qemu still allocated 128 MB of RAM by default.
>Using 1/4th of that for every qcow2 image you attach to the VM seems a
>bit much.
>
>Now it gets a bit complicated.  This series makes cache-clean-interval
>default to 10 minutes, so it shouldn't be an issue in practice.  But
>one
>thing to note is that this is a Linux-specific feature, so on every
>other system, this really means 32 MB per image.  (Also, 10 minutes
>means that whenever I boot up my VM with a couple of disks with random
>accesses all over the images during boot, I might end up using 32 MB
>per
>image again (for 10 min), even though I don't really need that
>performance.)
>
>Now if we really rely on that cache-clean-interval, why not make it
>always cover the whole image by default?  I don't really see why we
>should now say "256 GB seems reasonable, and 32 MB doesn't sound like
>too much, let's go there".  (Well, OK, I do see how you end up using 32
>MB as basically a safety margin, where you'd say that anything above it
>is just unreasonable.)
>
>Do we update the limit in a couple of years again because people have
>more RAM and larger disks then?  (Maybe we do?)
>
>My personal opinion is this: Most users should be fine with 8 GB of
>randomly accessible image space (this may be wrong).  Whenever a user
>does have an application that uses more than 8 GB, they are probably in
>an area where they want to do some performance tuning anyway. 
>Requiring
>them to set l2-cache-full in that case seems reasonable to me.  Pushing
>the default to 256 GB to me looks a bit like just letting them run into
>the problem later.  It doesn't solve the issue that you need to do some
>performance tuning if you have a bit of a special use case (but maybe
>I'm wrong and accessing more than 8 GB randomly is what everybody does
>with their VMs).
>
>(Maybe it's even a good thing to limit it to a smaller number so users
>run into the issue sooner than later...)
>
>OTOH, this change means that everyone on a non-Linux system will have
>to
>use 32 MB of their RAM per qcow2 image, and everyone on a Linux system
>will potentially use it e.g. during boot when you do access a lot
>randomly (even though the performance usually is not of utmost
>importance then (important, but not extremely so)).  But then again,
>this will probably only affect a single disk (the one with the OS on
>it), so it won't be too bad.
>
>So my stance is:
>
>(1) Is it really worth pushing the default to 256 GB if you probably
>have to do a bit of performance tuning anyway when you get past 8 GB
>random IOPS?  I think it's reasonable to ask users to use l2-cache-full
>or adjust the cache to their needs.
>
>(2) For non-Linux systems, this seems to really mean 32 MB of RAM per
>qcow2 image.  That's 1/4th of default VM RAM.  Is that worth it?
>
>(3) For Linux, I don't like it much either, but that's because I'm
>stupid.  The fact that if you don't need this much random I/O only your
>boot disk may cause a RAM usage spike, and even then it's going to go
>down after 10 minutes, is probably enough to justify this change.
>
>
>I suppose my moaning would subside if we only increased the default on
>systems th

[Qemu-block] [PATCH v8 1/8] qcow2: Options' documentation fixes

2018-08-12 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
---
 docs/qcow2-cache.txt | 20 +---
 qemu-options.hx  |  9 ++---
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..2326db01b9 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured.
 Choosing the right cache sizes
 --
 In order to choose the cache sizes we need to know how they relate to
-the amount of allocated space.
+the amount of the allocated space.
 
 The amount of virtual disk that can be mapped by the L2 and refcount
 caches (in bytes) is:
@@ -86,7 +86,7 @@ caches (in bytes) is:
disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits
 
 With the default values for cluster_size (64KB) and refcount_bits
-(16), that is
+(16), this becomes:
 
disk_size = l2_cache_size * 8192
disk_size = refcount_cache_size * 32768
@@ -97,12 +97,15 @@ need:
l2_cache_size = disk_size_GB * 131072
refcount_cache_size = disk_size_GB * 32768
 
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount
-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual
+image size (given that the default cluster size is used):
 
-   1048576 / 131072 = 8 GB of virtual disk covered by that cache
-262144 /  32768 = 8 GB
+   8 * 131072 = 1 MB
+
+A default refcount cache is 4 times the cluster size, which defaults to
+256 KB (262144 bytes). This is sufficient for 8 GB of image size:
+
+   262144 / 32768 = 8 GB
 
 
 How to configure the cache sizes
@@ -130,6 +133,9 @@ There are a few things that need to be taken into account:
memory as possible to the L2 cache before increasing the refcount
cache size.
 
+ - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size"
+   can be set simultaneously.
+
 Unlike L2 tables, refcount blocks are not used during normal I/O but
 only during allocations and internal snapshots. In most cases they are
 accessed sequentially (even during random guest I/O) so increasing the
diff --git a/qemu-options.hx b/qemu-options.hx
index b1bf0f485f..f6804758d3 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -752,15 +752,18 @@ image file)
 
 @item cache-size
 The maximum total size of the L2 table and refcount block caches in bytes
-(default: 1048576 bytes or 8 clusters, whichever is larger)
+(default: the sum of l2-cache-size and refcount-cache-size)
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: 4/5 of the total cache size)
+(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever
+is larger; otherwise, as large as possible or needed within the cache-size,
+while permitting the requested or the minimal refcount cache size)
 
 @item refcount-cache-size
 The maximum size of the refcount block cache in bytes
-(default: 1/5 of the total cache size)
+(default: 4 times the cluster size; or if cache-size is specified, the part of
+it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-- 
2.17.1




[Qemu-block] [PATCH v8 0/8] Take the image size into account when allocating the L2 cache

2018-08-12 Thread Leonid Bloch
This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB. To
modify this limit one can use the already existing 'l2-cache-size' and
'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size',
as it was documented as the *maximum* L2 cache size, but in practice
behaved as the absolute size.

To compensate the memory overhead which may be increased following this
behavior, the default cache-clean-interval is set to 10 minutes by default
(was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Differences from v4:
* Refactoring.
* Documentation and commit message fixes.

Differences from v5:
* A patch with cosmetic changes is split from the main patch
* A patch for avoiding duplication in refcount cache size assignment is
  split from the main patch.
* In the main patch the cap on the L2 cache size is set to only 1 MB (in
  order to be close to the previous behavior) and a separate patch sets
  it to 32 MB.
* Changes in the documentation fixes patch [1/8].

Differences from v6:
* A patch is added to make the defined sizes more humanly readable.

Differences from v7:
* Documentation and commit message changes.
* Fix: when cache-size is specified and is large enough, with the L2 and
  refcount caches not set, the L2 cache will be as large as needed to
  cover the entire image, and not as the default max. size. (Return to
  the previous behavior in this part, which was correct).
* Cosmetic changes patch not used.


Leonid Bloch (8):
  qcow2: Options' documentation fixes
  qcow2: Make sizes more humanly readable
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c  | 42 ++
 block/qcow2.h  | 12 +--
 docs/qcow2-cache.txt   | 39 ++-
 qapi/block-core.json   |  3 ++-
 qemu-options.hx| 11 ++
 tests/qemu-iotests/137 |  1 -
 tests/qemu-iotests/137.out |  1 -
 7 files changed, 64 insertions(+), 45 deletions(-)

-- 
2.17.1




[Qemu-block] [PATCH v8 8/8] qcow2: Explicit number replaced by a constant

2018-08-12 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index f885afa0ed..ffb4a9e4a1 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 /* 2^(s->refcount_order - 3) is the refcount width in bytes */
 s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3);
 s->refcount_block_size = 1 << s->refcount_block_bits;
-bs->total_sectors = header.size / 512;
+bs->total_sectors = header.size / BDRV_SECTOR_SIZE;
 s->csize_shift = (62 - (s->cluster_bits - 8));
 s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
 s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
@@ -3450,7 +3450,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 goto fail;
 }
 
-old_length = bs->total_sectors * 512;
+old_length = bs->total_sectors * BDRV_SECTOR_SIZE;
 new_l1_size = size_to_l1(s, offset);
 
 if (offset < old_length) {
-- 
2.17.1




[Qemu-block] [PATCH v8 3/8] qcow2: Avoid duplication in setting the refcount cache size

2018-08-12 Thread Leonid Bloch
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 67cc82f0b9..7949d15fc6 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
-if (!refcount_cache_size_set) {
-*refcount_cache_size = min_refcount_cache;
-}
 }
+/* l2_cache_size and refcount_cache_size are ensured to have at least
+ * their minimum values in qcow2_update_options_prepare() */
 
 if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) ||
 *l2_cache_entry_size > s->cluster_size ||
-- 
2.17.1




[Qemu-block] [PATCH v8 6/8] qcow2: Resize the cache upon image resizing

2018-08-12 Thread Leonid Bloch
The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning L2 cache relatively to
the image size, implies that the cache will be adapted accordingly
after an image resize.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/block/qcow2.c b/block/qcow2.c
index 01c39c56c0..1445cd5360 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3418,6 +3418,7 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 uint64_t old_length;
 int64_t new_l1_size;
 int ret;
+QDict *options;
 
 if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA &&
 prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL)
@@ -3642,6 +3643,8 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 }
 
+bs->total_sectors = offset / BDRV_SECTOR_SIZE;
+
 /* write updated header.size */
 offset = cpu_to_be64(offset);
 ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size),
@@ -3652,6 +3655,13 @@ static int coroutine_fn 
qcow2_co_truncate(BlockDriverState *bs, int64_t offset,
 }
 
 s->l1_vm_state_index = new_l1_size;
+
+/* Update cache sizes */
+options = qdict_clone_shallow(bs->options);
+ret = qcow2_update_options(bs, options, s->flags, errp);
+if (ret < 0) {
+goto fail;
+}
 ret = 0;
 fail:
 qemu_co_mutex_unlock(>lock);
-- 
2.17.1




[Qemu-block] [PATCH v8 4/8] qcow2: Assign the L2 cache relatively to the image size

2018-08-12 Thread Leonid Bloch
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.

Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c  | 21 +
 block/qcow2.h  |  4 +---
 docs/qcow2-cache.txt   | 15 ++-
 qemu-options.hx|  6 +++---
 tests/qemu-iotests/137 |  1 -
 tests/qemu-iotests/137.out |  1 -
 6 files changed, 23 insertions(+), 25 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7949d15fc6..01c39c56c0 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  uint64_t *refcount_cache_size, Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t combined_cache_size;
+uint64_t combined_cache_size, l2_cache_max_setting;
 bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
 int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
 refcount_cache_size_set = qemu_opt_get(opts, 
QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
 combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+ DEFAULT_L2_CACHE_MAX_SIZE);
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
 *l2_cache_entry_size = qemu_opt_get_size(
 opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
 
+*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
 error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"at the same time");
 return;
-} else if (*l2_cache_size > combined_cache_size) {
+} else if (l2_cache_size_set &&
+   (l2_cache_max_setting > combined_cache_size)) {
 error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
 return;
@@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
 /* Assign as much memory as possible to the L2 cache, and
  * use the remainder for the refcount cache */
 if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
@@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-} else {
-if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
- (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
- * s->cluster_size);
-}
 }
 /* l2_cache_size and refcount_cache_size are ensured to have at least
  * their minimum values in qcow2_update_options_prepare() */
diff --git a/block/qcow2.h b/block/qcow2.h
index 39e1b279f8..d917b5f577 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,9 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-/* Whichever is more */
-#define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters 

[Qemu-block] [PATCH v8 5/8] qcow2: Increase the default upper limit on the L2 cache size

2018-08-12 Thread Leonid Bloch
The upper limit on the L2 cache size is increased from 1 MB to 32 MB.
This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.h| 2 +-
 docs/qcow2-cache.txt | 4 ++--
 qemu-options.hx  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index d917b5f577..e699a55d02 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,7 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
+#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
 
 #define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index e89e74b372..0fc438f397 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -124,8 +124,8 @@ There are a few things that need to be taken into account:
  - Both caches must have a size that is a multiple of the cluster size
(or the cache entry size: see "Using smaller cache sizes" below).
 
- - The maximum L2 cache size is 1 MB by default (enough for full coverage
-   of 8 GB images, with the default cluster size). This value can be
+ - The maximum L2 cache size is 32 MB by default (enough for full coverage
+   of 256 GB images, with the default cluster size). This value can be
modified using the "l2-cache-size" option. QEMU will not use more memory
than needed to hold all of the image's L2 tables, regardless of this max.
value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see
diff --git a/qemu-options.hx b/qemu-options.hx
index 22e8e2d113..4c44cdbc23 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -756,7 +756,7 @@ The maximum total size of the L2 table and refcount block 
caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not specified - 1M; otherwise, as large as possible
+(default: if cache-size is not specified - 32M; otherwise, as large as possible
 within the cache-size, while permitting the requested or the minimal refcount
 cache size)
 
-- 
2.17.1




[Qemu-block] [PATCH v8 2/8] qcow2: Make sizes more humanly readable

2018-08-12 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c | 2 +-
 block/qcow2.h | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ec9e6238a0..67cc82f0b9 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 }
 } else {
 if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
+*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index 81b844e936..39e1b279f8 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
+#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x80
+#define QCOW_MAX_REFTABLE_SIZE (8 * MiB)
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x200
+#define QCOW_MAX_L1_SIZE (32 * MiB)
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
 
 /* Whichever is more */
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE (1 * MiB)
 
-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
-- 
2.17.1




Re: [Qemu-block] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size

2018-08-11 Thread Leonid Bloch

On 8/10/18 5:39 PM, Alberto Garcia wrote:

On Fri 10 Aug 2018 08:26:43 AM CEST, Leonid Bloch wrote:

Previously, the L2 cache was allocated without considering the image
size, and an option existed to manually determine its size.


This is not quite correct: the L2 cache was set to the maximum needed
for the image when "cache-size" was large enough:

 if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
 *l2_cache_size = max_l2_cache;
 } ...

See below for an example.


You're right. I'll fix that.




  } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
  /* Assign as much memory as possible to the L2 cache, and
   * use the remainder for the refcount cache */
-if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
-*l2_cache_size = max_l2_cache;
+if (combined_cache_size >= *l2_cache_size + min_refcount_cache) {


This has an (unintended?) side effect:

If you have a 1TB qcow2 image and open it with e.g. cache-size=200M,
with the previous code you would get a 128MB L2 cache (max_l2_cache).
With the new code you only get 32MB.


Good catch!! Thanks!



The rest of the function looks good to me now. We're getting close :)


- - The default L2 cache size is 8 clusters or 1MB (whichever is more),
-   and the minimum is 2 clusters (or 2 cache entries, see below).
+ - The default L2 cache size will cover the entire virtual size of an
+   image, up to a certain maximum. This maximum is 1 MB by default
+   (enough for image sizes of up to 8 GB with the default cluster size)
+   and it can be reduced or enlarged using the "l2-cache-size" option.
+   The minimum is 2 clusters (or 2 cache entries, see below).


It's not clear with this wording if this "certain maximum" refers to the
cache size or the image size.


"This maximum is 1 MB by default (enough for image sizes of up to 8 GB 
..." - to me it's quite clear that it speaks of the image size. But I 
guess I can change the wording.




I'm thinking that perhaps it's clearer if you leave that item unchanged
and add a new one below, something like:


I can't leave it unchanged for two reasons: (1) "8 clusters or 1MB 
(whichever is more)" (2) "default" - now it is not default, but maximal.
OK, I'll change it to be more clear, incorporating your suggestion from 
below.




- The L2 cache size setting (regardless of whether it takes the default
   value or it's set by the user) refers to the maximum size of the L2
   cache. If QEMU needs less memory to hold all of the image's L2 tables,
   then that's the actual size of the cache that will be allocated. >

   - If the L2 cache is big enough to hold all of the image's L2 tables
-   (as explained in the "Choosing the right cache sizes" section
-   earlier in this document) then none of this is necessary and you
-   can omit the "l2-cache-entry-size" parameter altogether.
+   (the default behavior for images of up to 8 GB in size) then none
+   of this is necessary and you can omit the "l2-cache-entry-size"
+   parameter altogether.


I think this change is unnecessary :-?


Maybe I'll just mention the default here? Or refer to the section which 
speaks of the defaults ('as explained in the "Choosing the right cache 
sizes" and "How to configure the cache sizes" sections ...")?





  @item refcount-cache-size
  The maximum size of the refcount block cache in bytes
diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137
index 87965625d8..e3fb078588 100755
--- a/tests/qemu-iotests/137
+++ b/tests/qemu-iotests/137
@@ -109,7 +109,6 @@ $QEMU_IO \
  -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \
  -c "reopen -o cache-size=1M,l2-cache-size=2M" \
  -c "reopen -o cache-size=1M,refcount-cache-size=2M" \
--c "reopen -o l2-cache-size=256T" \


The "L2 cache size too big" error can still be tested, but you will need
to create an image large enough to allow such a big cache.

$ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P
$ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T


* 32P qcow2 will take 33M - is it OK to create it just for a test?
* Is it worth to create a special test scenario, with a separate image 
creation, just for that case?


Leonid.



Berto





Re: [Qemu-block] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size

2018-08-11 Thread Leonid Bloch

On 8/10/18 4:14 PM, Alberto Garcia wrote:

On Fri 10 Aug 2018 08:26:42 AM CEST, Leonid Bloch wrote:

The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch 



-} else {
-if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
- (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
- * s->cluster_size);
-}
-if (!refcount_cache_size_set) {
-*refcount_cache_size = min_refcount_cache;
-}


Since you're getting rid of the rest of the code later anyway, I think
it's cleaner to only remove these last three lines here and leave the
rest untouched. It makes the patch shorter and easier to read.


But a correct formatting is important, I think. In every patch individually.




+/* If refcount-cache-size is not specified, it will be set to minimum
+ * in qcow2_update_options_prepare(). No need to set it here. */


This is not quite correct, because apart from the "not specified" case,
refcount_cache_size is also overridden in other circumstances: (a) the
value is specified but is too low, or (b) the value is set indirectly
via "cache-size" but the end result is too low[*].


Yes, I'll fix the comment. But I think that it's important that it 
remains, because someone who looks at the code does not necessarily 
looks at the commit message, and it may be puzzling that a minimal size 
is not enforced there.


How about the following:
"l2_cache_size and refcount_cache_size are ensured to have at least 
their minimum values in qcow2_update_options_prepare()"




Also, the same thing happens to l2-cache-size: if you set it manually
but it's too low then it will be overridden.

I'd personally remove the comment altogether, I think the explanation in
the commit message is enough.

Berto

[*] Now that I think of it: if you set e.g. cache-size = 16M and
 l2-cache-size = 16MB then you'll end up with refcount-cache-size =
 16M - 16M = 0, which will be overridden and set to the minimum.

 But then refcount-cache-size + l2-cache-size > cache-size


Yes, I have noticed this behavior, and I think that it is OK: the errors 
should be a response to a wrong input of the user, who does not 
necessarily understands the internals. So if a user inputs l2-cache-size 
which is larger than cache-size, that's obviously a mistake. But if they 
are equal, it's totally OK if QEMU will use some 256 or 128 KB for the 
minimal caches transparently. This is really negligible relatively to 
the total QEMU overhead.


Now the error message is: "l2-cache-size may not exceed cache-size".
Is it reasonable to make it: "l2-cache-size may not exceed the size of 
cache-size minus the minimal refcount cache size" (or similar)? :)




 So perhaps this should produce an error, and it may make sense to
 take the opportunity to move all the logic that ensures that
 MIN_SIZE <= size <= INT_MAX to read_cache_sizes(). But that's a
 possible task for the future and I wouldn't worry about it in this
 series.



Yeah, but it would be a good idea to move *all* the size-determining 
logic in one function.


Leonid.



Re: [Qemu-block] [PATCH v7 1/9] qcow2: Options' documentation fixes

2018-08-11 Thread Leonid Bloch

On 8/10/18 2:50 PM, Alberto Garcia wrote:

On Fri 10 Aug 2018 08:26:39 AM CEST, Leonid Bloch wrote:

Signed-off-by: Leonid Bloch 
---
  docs/qcow2-cache.txt | 16 +++-
  qemu-options.hx  |  9 ++---
  2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 8a09a5cc5f..0f157d859a 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -97,12 +97,15 @@ need:
 l2_cache_size = disk_size_GB * 131072
 refcount_cache_size = disk_size_GB * 32768
  
-QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount

-cache of 256KB (262144 bytes), so using the formulas we've just seen
-we have
+With the default cluster size, to cover each 8 GB of the virtual image
+size, 1MB of L2 cache is needed:
  
-   1048576 / 131072 = 8 GB of virtual disk covered by that cache

-262144 /  32768 = 8 GB
+   65536 / 8 = 8192 = 8 GB / 1 MB


Where does this 65536 / 8 come from? The line that you changed follows
directly from the formula immediately before that paragraph:

  l2_cache_size = disk_size_GB * 131072

that is

  l2_cache_size / 131072 = disk_size_GB
1048576 / 131072 = 8 GB

Berto



How about the following (all the relevant section reproduced below, for 
a continuous readability):


""""""""""
Choosing the right cache sizes
--
In order to choose the cache sizes we need to know how they relate to
the amount of the allocated space.

The amount of virtual disk that can be mapped by the L2 and refcount
caches (in bytes) is:

   disk_size = l2_cache_size * cluster_size / 8
   disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits

With the default values for cluster_size (64KB) and refcount_bits
(16), this becomes:

   disk_size = l2_cache_size * 8192
   disk_size = refcount_cache_size * 32768

So in order to cover n GB of disk space with the default values we
need:

   l2_cache_size = disk_size_GB * 131072
   refcount_cache_size = disk_size_GB * 32768

For example, 1MB of L2 cache is needed to cover each 8 GB of the virtual
image size (given that the default cluster size is used):

   8 * 131072 = 1 MB

A default refcount cache is 4 times the cluster size, which defaults to
256 KB (262144 bytes). This is sufficient for 8 GB of image size:

   262144 / 32768 = 8 GB

""""""""""



[Qemu-block] [PATCH v7 3/9] qcow2: Make sizes more humanly readable

2018-08-10 Thread Leonid Bloch
Signed-off-by: Leonid Bloch 
---
 block/qcow2.c | 2 +-
 block/qcow2.h | 9 +
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 3f4abc394e..7a2d7a1d48 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -831,7 +831,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 }
 } else {
 if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE,
+*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
  (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
  * s->cluster_size);
 }
diff --git a/block/qcow2.h b/block/qcow2.h
index 81b844e936..39e1b279f8 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -27,6 +27,7 @@
 
 #include "crypto/block.h"
 #include "qemu/coroutine.h"
+#include "qemu/units.h"
 
 //#define DEBUG_ALLOC
 //#define DEBUG_ALLOC2
@@ -43,11 +44,11 @@
 
 /* 8 MB refcount table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_REFTABLE_SIZE 0x80
+#define QCOW_MAX_REFTABLE_SIZE (8 * MiB)
 
 /* 32 MB L1 table is enough for 2 PB images at 64k cluster size
  * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
-#define QCOW_MAX_L1_SIZE 0x200
+#define QCOW_MAX_L1_SIZE (32 * MiB)
 
 /* Allow for an average of 1k per snapshot table entry, should be plenty of
  * space for snapshot names and IDs */
@@ -75,9 +76,9 @@
 
 /* Whichever is more */
 #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */
-#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */
+#define DEFAULT_L2_CACHE_SIZE (1 * MiB)
 
-#define DEFAULT_CLUSTER_SIZE 65536
+#define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
-- 
2.17.1




[Qemu-block] [PATCH v7 8/9] qcow2: Set the default cache-clean-interval to 10 minutes

2018-08-10 Thread Leonid Bloch
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (before the default was 0, i.e.
disabled).

Signed-off-by: Leonid Bloch 
Reviewed-by: Alberto Garcia 
---
 block/qcow2.c| 2 +-
 block/qcow2.h| 1 +
 docs/qcow2-cache.txt | 4 ++--
 qapi/block-core.json | 3 ++-
 qemu-options.hx  | 2 +-
 5 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ba4dfae735..b4f291765b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState 
*bs,
 /* New interval for cache cleanup timer */
 r->cache_clean_interval =
 qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL,
-s->cache_clean_interval);
+DEFAULT_CACHE_CLEAN_INTERVAL);
 #ifndef CONFIG_LINUX
 if (r->cache_clean_interval != 0) {
 error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL
diff --git a/block/qcow2.h b/block/qcow2.h
index e699a55d02..5e94f7ffc4 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -78,6 +78,7 @@
 
 #define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
+#define DEFAULT_CACHE_CLEAN_INTERVAL 600  /* seconds */
 
 #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts"
 #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request"
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 6ad1081d1a..684147ad45 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -204,8 +204,8 @@ This example removes all unused cache entries every 15 
minutes:
 
-drive file=hd.qcow2,cache-clean-interval=900
 
-If unset, the default value for this parameter is 0 and it disables
-this feature.
+If unset, the default value for this parameter is 600. Setting it to 0
+disables this feature.
 
 Note that this functionality currently relies on the MADV_DONTNEED
 argument for madvise() to actually free the memory. This is a
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 5b9084a394..9a6a708a37 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2830,7 +2830,8 @@
 #
 # @cache-clean-interval:  clean unused entries in the L2 and refcount
 # caches. The interval is in seconds. The default value
-# is 0 and it disables this feature (since 2.5)
+# is 600, and 0 disables this feature. (since 2.5)
+#
 # @encrypt:   Image decryption options. Mandatory for
 # encrypted images, except when doing a metadata-only
 # probe of the image. (since 2.10)
diff --git a/qemu-options.hx b/qemu-options.hx
index 4c44cdbc23..6abf3631ec 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -767,7 +767,7 @@ it which is not used for the L2 cache)
 
 @item cache-clean-interval
 Clean unused entries in the L2 and refcount caches. The interval is in seconds.
-The default value is 0 and it disables this feature.
+The default value is 600. Setting it to 0 disables this feature.
 
 @item pass-discard-request
 Whether discard requests to the qcow2 device should be forwarded to the data
-- 
2.17.1




[Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size

2018-08-10 Thread Leonid Bloch
The upper limit on the L2 cache size is increased from 1 MB to 32 MB.
This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note, that only the
needed amount to cover the full image is allocated. The value which is
changed here is just the upper limit on the L2 cache size, beyond which
it will not grow, even if the size of the image will require it to.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.h| 2 +-
 docs/qcow2-cache.txt | 6 +++---
 qemu-options.hx  | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index d917b5f577..e699a55d02 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -74,7 +74,7 @@
 /* Must be at least 4 to cover all cases of refcount table growth */
 #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */
 
-#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB)
+#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB)
 
 #define DEFAULT_CLUSTER_SIZE (64 * KiB)
 
diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt
index 69af306267..6ad1081d1a 100644
--- a/docs/qcow2-cache.txt
+++ b/docs/qcow2-cache.txt
@@ -125,8 +125,8 @@ There are a few things that need to be taken into account:
(or the cache entry size: see "Using smaller cache sizes" below).
 
  - The default L2 cache size will cover the entire virtual size of an
-   image, up to a certain maximum. This maximum is 1 MB by default
-   (enough for image sizes of up to 8 GB with the default cluster size)
+   image, up to a certain maximum. This maximum is 32 MB by default
+   (enough for image sizes of up to 256 GB with the default cluster size)
and it can be reduced or enlarged using the "l2-cache-size" option.
The minimum is 2 clusters (or 2 cache entries, see below).
 
@@ -186,7 +186,7 @@ Some things to take into account:
always uses the cluster size as the entry size.
 
  - If the L2 cache is big enough to hold all of the image's L2 tables
-   (the default behavior for images of up to 8 GB in size) then none
+   (the default behavior for images of up to 256 GB in size) then none
of this is necessary and you can omit the "l2-cache-entry-size"
parameter altogether.
 
diff --git a/qemu-options.hx b/qemu-options.hx
index 22e8e2d113..4c44cdbc23 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -756,7 +756,7 @@ The maximum total size of the L2 table and refcount block 
caches in bytes
 
 @item l2-cache-size
 The maximum size of the L2 table cache in bytes
-(default: if cache-size is not specified - 1M; otherwise, as large as possible
+(default: if cache-size is not specified - 32M; otherwise, as large as possible
 within the cache-size, while permitting the requested or the minimal refcount
 cache size)
 
-- 
2.17.1




[Qemu-block] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size

2018-08-10 Thread Leonid Bloch
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7a2d7a1d48..053242f94e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -829,16 +829,13 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-} else {
-if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
- (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
- * s->cluster_size);
-}
-if (!refcount_cache_size_set) {
-*refcount_cache_size = min_refcount_cache;
-}
+} else if (!l2_cache_size_set) {
+*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,
+ (uint64_t)DEFAULT_L2_CACHE_CLUSTERS
+ * s->cluster_size);
 }
+/* If refcount-cache-size is not specified, it will be set to minimum
+ * in qcow2_update_options_prepare(). No need to set it here. */
 
 if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) ||
 *l2_cache_entry_size > s->cluster_size ||
-- 
2.17.1




[Qemu-block] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache

2018-08-10 Thread Leonid Bloch
This series makes the qcow2 L2 cache assignment aware of the image size,
with the intention for it to cover the entire image. The importance of
this change is in noticeable performance improvement, especially with
heavy random I/O. The memory overhead is not big in most cases, as only
1 MB of cache for every 8 GB of image size is used. For cases with very
large images and/or small cluster sizes, or systems with limited RAM
resources, there is an upper limit on the default L2 cache: 32 MB. To
modify this limit one can use the already existing 'l2-cache-size' and
'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size',
as it was documented as the *maximum* L2 cache size, but in practice
behaved as the absolute size.

To compensate the memory overhead which may be increased following this
behavior, the default cache-clean-interval is set to 10 minutes by default
(was disabled by default before).

The L2 cache is also resized accordingly, by default, if the image is
resized.

Additionally, few minor changes are made (refactoring and documentation
fixes).

Differences from v1:
* .gitignore modification patch removed (unneeded).
* The grammar fix in conflicting cache sizing patch removed (merged).
* The update total_sectors when resizing patch squashed with the
  resizing patch.
* L2 cache is now capped at 32 MB.
* The default cache-clean-interval is set to 30 seconds.

Differences from v2:
* Made it clear in the documentation that setting cache-clean-interval
  to 0 disables this feature.

Differences from v3:
* The default cache-clean-interval is set to 10 minutes instead of 30
  seconds before.
* Commit message changes to better explain the patches.
* Some refactoring.

Differences from v4:
* Refactoring.
* Documentation and commit message fixes.

Differences from v5:
* A patch with cosmetic changes is split from the main patch
* A patch for avoiding duplication in refcount cache size assignment is
  split from the main patch.
* In the main patch the cap on the L2 cache size is set to only 1 MB (in
  order to be close to the previous behavior) and a separate patch sets
  it to 32 MB.
* Changes in the documentation fixes patch [1/8].

Differences from v6:
* A patch is added to make the defined sizes more humanly readable

Leonid Bloch (9):
  qcow2: Options' documentation fixes
  qcow2: Cosmetic changes
  qcow2: Make sizes more humanly readable
  qcow2: Avoid duplication in setting the refcount cache size
  qcow2: Assign the L2 cache relatively to the image size
  qcow2: Increase the default upper limit on the L2 cache size
  qcow2: Resize the cache upon image resizing
  qcow2: Set the default cache-clean-interval to 10 minutes
  qcow2: Explicit number replaced by a constant

 block/qcow2.c  | 54 +-
 block/qcow2.h  | 12 -
 docs/qcow2-cache.txt   | 33 ++-
 qapi/block-core.json   |  3 ++-
 qemu-options.hx| 11 +---
 tests/qemu-iotests/137 |  1 -
 tests/qemu-iotests/137.out |  1 -
 7 files changed, 66 insertions(+), 49 deletions(-)

-- 
2.17.1




[Qemu-block] [PATCH v7 2/9] qcow2: Cosmetic changes

2018-08-10 Thread Leonid Bloch
Some refactoring for better readability is done here.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index ec9e6238a0..3f4abc394e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -790,8 +790,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
-*l2_cache_entry_size = qemu_opt_get_size(
-opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size);
+*l2_cache_entry_size = qemu_opt_get_size(opts,
+ QCOW2_OPT_L2_CACHE_ENTRY_SIZE,
+ s->cluster_size);
 
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
@@ -823,8 +824,8 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts 
*opts,
 *l2_cache_size = max_l2_cache;
 *refcount_cache_size = combined_cache_size - *l2_cache_size;
 } else {
-*refcount_cache_size =
-MIN(combined_cache_size, min_refcount_cache);
+*refcount_cache_size = MIN(combined_cache_size,
+   min_refcount_cache);
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-- 
2.17.1




[Qemu-block] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size

2018-08-10 Thread Leonid Bloch
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.

Previously, the L2 cache was allocated without considering the image
size, and an option existed to manually determine its size. Thus to
achieve a full coverage of an image by the L2 cache (i.e. use more than
the default value of MAX(1 MB, 8 clusters)), a user needed to calculate
the required size manually, or with a script, and pass this value to the
'l2-cache-size' option.

Now, the L2 cache is assigned taking the virtual image size into account,
and will cover the entire image, unless the size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.

Signed-off-by: Leonid Bloch 
---
 block/qcow2.c  | 22 ++
 block/qcow2.h  |  4 +---
 docs/qcow2-cache.txt   | 13 -
 qemu-options.hx|  6 +++---
 tests/qemu-iotests/137 |  1 -
 tests/qemu-iotests/137.out |  1 -
 6 files changed, 22 insertions(+), 25 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 053242f94e..434fb89076 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -777,16 +777,19 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  uint64_t *refcount_cache_size, Error **errp)
 {
 BDRVQcow2State *s = bs->opaque;
-uint64_t combined_cache_size;
+uint64_t combined_cache_size, l2_cache_max_setting;
 bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set;
 int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size;
+uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
+uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
 
 combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE);
 l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE);
 refcount_cache_size_set = qemu_opt_get(opts, 
QCOW2_OPT_REFCOUNT_CACHE_SIZE);
 
 combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0);
-*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0);
+l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE,
+ DEFAULT_L2_CACHE_MAX_SIZE);
 *refcount_cache_size = qemu_opt_get_size(opts,
  QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0);
 
@@ -794,13 +797,16 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
  QCOW2_OPT_L2_CACHE_ENTRY_SIZE,
  s->cluster_size);
 
+*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting);
+
 if (combined_cache_size_set) {
 if (l2_cache_size_set && refcount_cache_size_set) {
 error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE
" and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set "
"at the same time");
 return;
-} else if (*l2_cache_size > combined_cache_size) {
+} else if (l2_cache_size_set &&
+   (l2_cache_max_setting > combined_cache_size)) {
 error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed "
QCOW2_OPT_CACHE_SIZE);
 return;
@@ -815,13 +821,9 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 } else if (refcount_cache_size_set) {
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 } else {
-uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE;
-uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8);
-
 /* Assign as much memory as possible to the L2 cache, and
  * use the remainder for the refcount cache */
-if (combined_cache_size >= max_l2_cache + min_refcount_cache) {
-*l2_cache_size = max_l2_cache;
+if (combined_cache_size >= *l2_cache_size + min_refcount_cache) {
 *refcount_cache_size = combined_cache_size - *l2_cache_size;
 } else {
 *refcount_cache_size = MIN(combined_cache_size,
@@ -829,10 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, 
QemuOpts *opts,
 *l2_cache_size = combined_cache_size - *refcount_cache_size;
 }
 }
-} else if (!l2_cache_size_set) {
-*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,

  1   2   >