Re: [Qemu-block] [PATCH] qcow2: Fix the calculation of the maximum L2 cache size
Thanks, Berto! On 8/16/19 3:17 PM, Alberto Garcia wrote: > The size of the qcow2 L2 cache defaults to 32 MB, which can be easily > larger than the maximum amount of L2 metadata that the image can have. > For example: with 64 KB clusters the user would need a qcow2 image > with a virtual size of 256 GB in order to have 32 MB of L2 metadata. > > Because of that, since commit b749562d9822d14ef69c9eaa5f85903010b86c30 > we forbid the L2 cache to become larger than the maximum amount of L2 > metadata for the image, calculated using this formula: > > uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); > > The problem with this formula is that the result should be rounded up > to the cluster size because an L2 table on disk always takes one full > cluster. > > For example, a 1280 MB qcow2 image with 64 KB clusters needs exactly > 160 KB of L2 metadata, but we need 192 KB on disk (3 clusters) even if > the last 32 KB of those are not going to be used. > > However QEMU rounds the numbers down and only creates 2 cache tables > (128 KB), which is not enough for the image. > > A quick test doing 4KB random writes on a 1280 MB image gives me > around 500 IOPS, while with the correct cache size I get 16K IOPS. > > Signed-off-by: Alberto Garcia > --- > block/qcow2.c | 6 +- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/block/qcow2.c b/block/qcow2.c > index 039bdc2f7e..865839682c 100644 > --- a/block/qcow2.c > +++ b/block/qcow2.c > @@ -826,7 +826,11 @@ static void read_cache_sizes(BlockDriverState *bs, > QemuOpts *opts, > bool l2_cache_entry_size_set; > int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size; > uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; > -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); > +uint64_t max_l2_entries = DIV_ROUND_UP(virtual_disk_size, > s->cluster_size); > +/* An L2 table is always one cluster in size so the max cache size > + * should be a multiple of the cluster size. */ > +uint64_t max_l2_cache = ROUND_UP(max_l2_entries * sizeof(uint64_t), > + s->cluster_size); > > combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE); > l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE); >
Re: [Qemu-block] [PATCH 0/1] block: Eliminate the S_1KiB, S_2KiB, ... macros
On 1/11/19 9:14 PM, Markus Armbruster wrote: > Back in September, Leonid Block added a whole bunch of macros (commit * Bloch. :) > 540b8492618) to improve readability of qcow2.h a bit (commit > b6a95c6d100). He later used them to fix the "vdi" driver's parameter > cluster_size's default value (commit 3dd5b8f4718). He has now > proposed a further patch[1] to auto-generate these macros. That patch > feels overengineered to me. > > On closer examination, I found I dislike the macros before his new > patch. So did Eric Blake. > > The macros exist because the common KiB, MiB, ... macros aren't usable > when you need a literal rather than a constant expression. > stringify() does, and we use it to define the QemuOpts default value. > > Eric proposed to improve QemuOpts to accept integer default values, > too[2]. Before I review that patch series, I want to establish a > "stupidest solution that can possibly work" baseline. And that's what > this patch is. > > [1] [PATCH v2 0/1] include: Auto-generate the sizes lookup table > Message-ID: <20190103213320.2653-1-lbl...@janustech.com> > > [2] [PATCH v3 0/6] include: Auto-generate the sizes lookup table > Message-Id: <20190110191901.5082-1-ebl...@redhat.com> > > Markus Armbruster (1): >block: Eliminate the S_1KiB, S_2KiB, ... macros > > block/qcow2.h| 10 +++--- > block/vdi.c | 3 +- > include/qemu/units.h | 73 > 3 files changed, 7 insertions(+), 79 deletions(-) >
Re: [Qemu-block] [PATCH 1/1] block: Eliminate the S_1KiB, S_2KiB, ... macros
Hi, On 1/11/19 9:14 PM, Markus Armbruster wrote: > We define 54 macros for the powers of two >= 1024. We use six, in six > macro definitions. Four of them could just as well use the common MiB > macro, so do that. The remaining two can't, because they get passed > to stringify. Replace the macro by the literal number there. > Slightly harder to read in one instance (1048576 vs. S_1MiB), so add a > comment there. The other instance is a wash: 65536 vs S_64KiB. 65536 > has been good enough for more than seven years there. > > This effectively reverts commit 540b8492618 and 1240ac558d3. > > Signed-off-by: Markus Armbruster > --- > block/qcow2.h| 10 +++--- > block/vdi.c | 3 +- > include/qemu/units.h | 73 > 3 files changed, 7 insertions(+), 79 deletions(-) > > diff --git a/block/qcow2.h b/block/qcow2.h > index a98d24500b..2380cbfb41 100644 > --- a/block/qcow2.h > +++ b/block/qcow2.h > @@ -50,11 +50,11 @@ > > /* 8 MB refcount table is enough for 2 PB images at 64k cluster size >* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ > -#define QCOW_MAX_REFTABLE_SIZE S_8MiB > +#define QCOW_MAX_REFTABLE_SIZE (8 * MiB) > > /* 32 MB L1 table is enough for 2 PB images at 64k cluster size >* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ > -#define QCOW_MAX_L1_SIZE S_32MiB > +#define QCOW_MAX_L1_SIZE (32 * MiB) > > /* Allow for an average of 1k per snapshot table entry, should be plenty of >* space for snapshot names and IDs */ > @@ -81,15 +81,15 @@ > #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ > > #ifdef CONFIG_LINUX > -#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB > +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB) > #define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ > #else > -#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB > +#define DEFAULT_L2_CACHE_MAX_SIZE (8 * MiB) > /* Cache clean interval is currently available only on Linux, so must be 0 > */ > #define DEFAULT_CACHE_CLEAN_INTERVAL 0 > #endif > > -#define DEFAULT_CLUSTER_SIZE S_64KiB > +#define DEFAULT_CLUSTER_SIZE 65536 /* Note: can't use 64 * KiB here, because it's passed to stringify() */ Otherwise fine with me. The other solutions (including mine) indeed seem overengineered compared to this. Leonid. > > #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" > #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" > diff --git a/block/vdi.c b/block/vdi.c > index 2380daa583..bf1d19dd68 100644 > --- a/block/vdi.c > +++ b/block/vdi.c > @@ -85,7 +85,8 @@ > #define BLOCK_OPT_STATIC "static" > > #define SECTOR_SIZE 512 > -#define DEFAULT_CLUSTER_SIZE S_1MiB > +#define DEFAULT_CLUSTER_SIZE 1048576 > +/* Note: can't use 1 * MiB, because it's passed to stringify() */ > > #if defined(CONFIG_VDI_DEBUG) > #define VDI_DEBUG 1 > diff --git a/include/qemu/units.h b/include/qemu/units.h > index 1c959d182e..692db3fbb2 100644 > --- a/include/qemu/units.h > +++ b/include/qemu/units.h > @@ -17,77 +17,4 @@ > #define PiB (INT64_C(1) << 50) > #define EiB (INT64_C(1) << 60) > > -/* > - * The following lookup table is intended to be used when a literal string of > - * the number of bytes is required (for example if it needs to be > stringified). > - * It can also be used for generic shortcuts of power-of-two sizes. > - * This table is generated using the AWK script below: > - * > - * BEGIN { > - * suffix="KMGTPE"; > - * for(i=10; i<64; i++) { > - * val=2**i; > - * s=substr(suffix, int(i/10), 1); > - * n=2**(i%10); > - * pad=21-int(log(n)/log(10)); > - * printf("#define S_%d%siB %*d\n", n, s, pad, val); > - * } > - * } > - */ > - > -#define S_1KiB 1024 > -#define S_2KiB 2048 > -#define S_4KiB 4096 > -#define S_8KiB 8192 > -#define S_16KiB16384 > -#define S_32KiB32768 > -#define S_64KiB65536 > -#define S_128KiB 131072 > -#define S_256KiB 262144 > -#define S_512KiB 524288 > -#define S_1MiB 1048576 > -#define S_2MiB 2097152 > -#define S_4MiB 4194304 > -#define S_8MiB 8388608 > -#define S_16MiB 16777216 > -#define S_32MiB 33554432 > -#define S_64MiB 67108864 > -#define S_128MiB 134217728 > -#define S_256MiB 268435456 > -#define S_512MiB 536870912 > -#define S_1GiB1073741824 > -#define S_2GiB2147483648 > -#define S_4GiB4294967296 > -#define S_8GiB8589934592 > -#define S_16GiB 17179869184 > -#define S_32GiB 34359738368 > -#define S_64GiB 68719476736 > -#define S_128GiB137438953472 > -#define S_256GiB274877906944 > -#define S_512GiB549755813888 > -#define S_1TiB
Re: [Qemu-block] [PATCH v3 1/6] qemu-option: Allow integer defaults
On 1/11/19 6:23 PM, Eric Blake wrote: > On 1/11/19 8:14 AM, Leonid Bloch wrote: >> On 1/10/19 9:18 PM, Eric Blake wrote: >>> Set the framework up for declaring integer options with an integer >>> default, instead of our current insane approach of requiring the >>> default value to be given as a string (which then has to be reparsed >>> at every use that wants a number). git grep '[^.]def_value_str' says >>> that we have done a good job of NOT abusing the internal field >>> outside of the implementation in qemu-option.c; therefore, it is not >>> too hard to audit that all code can either handle the new integer >>> defaults correctly or abort because a caller violated constraints. >>> Sadly, we DO have a constraint that qemu_opt_get() should not be >>> called on an option that has an integer type, because we have no >>> where to stash a cached const char * result; but callers that want >>> an integer should be using qemu_opt_get_number() and friends >>> anyways. >>> >>> Signed-off-by: Eric Blake >>> --- >>>include/qemu/option.h | 12 >>>util/qemu-option.c| 69 +-- >>>2 files changed, 72 insertions(+), 9 deletions(-) >>> >>> diff --git a/include/qemu/option.h b/include/qemu/option.h >>> index 844587cab39..46b80d5a6e1 100644 >>> --- a/include/qemu/option.h >>> +++ b/include/qemu/option.h >>> @@ -46,6 +46,18 @@ typedef struct QemuOptDesc { >>>const char *name; >>>enum QemuOptType type; >>>const char *help; >>> +/* >>> + * For QEMU_OPT_STRING: Leave def_value_int 0, and set def_value_str >>> + * to a default value or leave NULL for no default. >>> + * >>> + * For other types: Initialize at most non-zero def_value_int or a >>> + * parseable def_value_str for a default (must use a string for an >>> + * explicit default of 0, although an implicit default generally >>> + * works). If setting def_value_int, calling qemu_opt_get() on >>> + * that option will abort(); instead, call qemu_opt_get_del() or a >>> + * typed getter. >>> + */ >>> +uint64_t def_value_int; >>>const char *def_value_str; >>>} QemuOptDesc; >>> > >> >> If I understand correctly, the number will still be compiled into the >> binary as an expression string, but will be printed correctly during >> runtime? > > Anyone that uses .def_value_int will compile into the binary as an > 8-byte integer (regardless of how that number was spelled in C, either > as a single constant, or as a constant expression), and NOT as a decimal > string. String conversions for code that asks for a string will happen > by a runtime printf() (ideally, such code is limited to the case of > printing out information during help output). Code that is smart enough > to request a number now gets the default value directly, rather than the > old way of having to strtoll() decode a decimal string back into a > number. No one should ever be using .def_value_str = stringify(macro), > when they can instead just use .def_value_int = macro (which is what the > next patch fixes). Yes, you're right. Thanks for the explanation!
Re: [Qemu-block] [PATCH v3 1/6] qemu-option: Allow integer defaults
On 1/10/19 9:18 PM, Eric Blake wrote: > Set the framework up for declaring integer options with an integer > default, instead of our current insane approach of requiring the > default value to be given as a string (which then has to be reparsed > at every use that wants a number). git grep '[^.]def_value_str' says > that we have done a good job of NOT abusing the internal field > outside of the implementation in qemu-option.c; therefore, it is not > too hard to audit that all code can either handle the new integer > defaults correctly or abort because a caller violated constraints. > Sadly, we DO have a constraint that qemu_opt_get() should not be > called on an option that has an integer type, because we have no > where to stash a cached const char * result; but callers that want > an integer should be using qemu_opt_get_number() and friends > anyways. > > Signed-off-by: Eric Blake > --- > include/qemu/option.h | 12 > util/qemu-option.c| 69 +-- > 2 files changed, 72 insertions(+), 9 deletions(-) > > diff --git a/include/qemu/option.h b/include/qemu/option.h > index 844587cab39..46b80d5a6e1 100644 > --- a/include/qemu/option.h > +++ b/include/qemu/option.h > @@ -46,6 +46,18 @@ typedef struct QemuOptDesc { > const char *name; > enum QemuOptType type; > const char *help; > +/* > + * For QEMU_OPT_STRING: Leave def_value_int 0, and set def_value_str > + * to a default value or leave NULL for no default. > + * > + * For other types: Initialize at most non-zero def_value_int or a > + * parseable def_value_str for a default (must use a string for an > + * explicit default of 0, although an implicit default generally > + * works). If setting def_value_int, calling qemu_opt_get() on > + * that option will abort(); instead, call qemu_opt_get_del() or a > + * typed getter. > + */ > +uint64_t def_value_int; > const char *def_value_str; > } QemuOptDesc; > > diff --git a/util/qemu-option.c b/util/qemu-option.c > index de42e2a406a..06c4e8102a8 100644 > --- a/util/qemu-option.c > +++ b/util/qemu-option.c > @@ -321,7 +321,8 @@ const char *qemu_opt_get(QemuOpts *opts, const char *name) > opt = qemu_opt_find(opts, name); > if (!opt) { > const QemuOptDesc *desc = find_desc_by_name(opts->list->desc, name); > -if (desc && desc->def_value_str) { > +if (desc) { > +assert(!desc->def_value_int); > return desc->def_value_str; > } > } > @@ -364,8 +365,22 @@ char *qemu_opt_get_del(QemuOpts *opts, const char *name) > opt = qemu_opt_find(opts, name); > if (!opt) { > desc = find_desc_by_name(opts->list->desc, name); > -if (desc && desc->def_value_str) { > -str = g_strdup(desc->def_value_str); > +if (desc) { > +if (desc->def_value_str) { > +str = g_strdup(desc->def_value_str); > +} else if (desc->def_value_int) { > +switch (desc->type) { > +case QEMU_OPT_BOOL: > +str = g_strdup("on"); > +break; > +case QEMU_OPT_NUMBER: > +case QEMU_OPT_SIZE: > +str = g_strdup_printf("%" PRId64, desc->def_value_int); > +break; > +default: > +abort(); > +} > +} > } > return str; > } > @@ -400,8 +415,15 @@ static bool qemu_opt_get_bool_helper(QemuOpts *opts, > const char *name, > opt = qemu_opt_find(opts, name); > if (opt == NULL) { > const QemuOptDesc *desc = find_desc_by_name(opts->list->desc, name); > -if (desc && desc->def_value_str) { > -parse_option_bool(name, desc->def_value_str, , _abort); > +if (desc) { > +if (desc->def_value_int) { > +assert(desc->type != QEMU_OPT_STRING); > +return true; > +} > +if (desc->def_value_str) { > +parse_option_bool(name, desc->def_value_str, , > + _abort); > +} > } > return ret; > } > @@ -436,8 +458,15 @@ static uint64_t qemu_opt_get_number_helper(QemuOpts > *opts, const char *name, > opt = qemu_opt_find(opts, name); > if (opt == NULL) { > const QemuOptDesc *desc = find_desc_by_name(opts->list->desc, name); > -if (desc && desc->def_value_str) { > -parse_option_number(name, desc->def_value_str, , > _abort); > +if (desc) { > +if (desc->def_value_int) { > +assert(desc->type != QEMU_OPT_STRING); > +return desc->def_value_int; > +} > +if (desc->def_value_str) { > +parse_option_number(name, desc->def_value_str, , > +
Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] include: Auto-generate the sizes lookup table
On 1/10/19 2:51 PM, wrote: > Leonid Bloch writes: > >> Hi, >> >> On 1/8/19 2:20 PM, Kevin Wolf wrote: >>> Am 03.01.2019 um 22:33 hat Leonid Bloch geschrieben: >>>> The lookup table for power-of-two sizes is now auto-generated during the >>>> build, and not hard-coded into the units.h file. >>>> >>>> This partially reverts commit 540b8492618eb. >>>> >>>> Signed-off-by: Leonid Bloch >>> >>> During a downstream review, Max found a problem with the table that we >>> could fix while we're touching it: >>> >>> Upstream: All >= S_2GiB are not valid ints. (qemu assumes that >>> sizeof(int) == 4, right?) So S_2GiB should be 2147483648u and all >>> above should be ...ull or better UINT64_C(). > > So their (integer) type can vary. Whether that's a problem depends on > how the macros are used. > >> But the initial reasoning for this table was to have a pure number >> there. If there will be strings like "2147483648u/ull" or >> "UINT64_C(...)" there, they will be stringified, literally, and will >> appear as such inside the binary. > > "Macro that expands to an integer constant that stringify() turns into a > decimal number" is a rather peculiar contract. > >>If specifying the unit64 type is >> really needed, one can always use, e.g., 2 * GiB, from units.h. > > Right now I suspect these S_ macros should generally be avoided. > > We have 54 of them. I count six uses: > > block/qcow2.h:#define QCOW_MAX_REFTABLE_SIZE S_8MiB > block/qcow2.h:#define QCOW_MAX_L1_SIZE S_32MiB > block/qcow2.h:#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB > block/qcow2.h:#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB > block/qcow2.h:#define DEFAULT_CLUSTER_SIZE S_64KiB > block/vdi.c:#define DEFAULT_CLUSTER_SIZE S_1MiB > > Which if them truly need stringify-able integers? > These two: block/qcow2.h:#define DEFAULT_CLUSTER_SIZE block/vdi.c:#define DEFAULT_CLUSTER_SIZE.
Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] include: Auto-generate the sizes lookup table
On 1/10/19 2:40 PM, Markus Armbruster wrote: > Leonid Bloch writes: > >> Hi, >> >> On 1/8/19 11:31 AM, Markus Armbruster wrote: >>> I'd leave it hard-coded. Replacing a few trivial defines by an arguably >>> less trivial script doesn't feel like an improvement. In this case, it >>> doesn't even save lines. >> >> As I've said, I'm fine with that. The autogeneration sounds the right >> thing to do here, as the table is autogenertaed anyway, but indeed it is >> already there, explained, and even the script for generating it appears >> in the comments for reproducibility. >> >>> >>> I'm not sure I'd use shell for this, but since you already wrote it and >>> it works... >>> >> >> If you've noticed, the original script was in AWK. But to be as generic >> as possible, I didn't write the generation script in AWK because even >> AWK is not guaranteed to be installed on the build system. The only >> interpreted language that is guaranteed to be there is shell (most basic >> shell) because .configure itself needs it. > > Well, > > $ grep -w awk configure > maj=`libgcrypt-config --version | awk -F . '{print $1}'` > min=`libgcrypt-config --version | awk -F . '{print $2}'` > > The build also requires Python. Not for critical things as this. I mean it can still build OK without Python. > > [...] >
Re: [Qemu-block] [PATCH v2 1/1] include: Auto-generate the sizes lookup table
Hi, On 1/8/19 2:20 PM, Kevin Wolf wrote: > Am 03.01.2019 um 22:33 hat Leonid Bloch geschrieben: >> The lookup table for power-of-two sizes is now auto-generated during the >> build, and not hard-coded into the units.h file. >> >> This partially reverts commit 540b8492618eb. >> >> Signed-off-by: Leonid Bloch > > During a downstream review, Max found a problem with the table that we > could fix while we're touching it: > > Upstream: All >= S_2GiB are not valid ints. (qemu assumes that > sizeof(int) == 4, right?) So S_2GiB should be 2147483648u and all > above should be ...ull or better UINT64_C(). But the initial reasoning for this table was to have a pure number there. If there will be strings like "2147483648u/ull" or "UINT64_C(...)" there, they will be stringified, literally, and will appear as such inside the binary. If specifying the unit64 type is really needed, one can always use, e.g., 2 * GiB, from units.h. Leonid.
Re: [Qemu-block] [Qemu-devel] [PATCH v2 1/1] include: Auto-generate the sizes lookup table
Hi, On 1/8/19 11:31 AM, Markus Armbruster wrote: > I'd leave it hard-coded. Replacing a few trivial defines by an arguably > less trivial script doesn't feel like an improvement. In this case, it > doesn't even save lines. As I've said, I'm fine with that. The autogeneration sounds the right thing to do here, as the table is autogenertaed anyway, but indeed it is already there, explained, and even the script for generating it appears in the comments for reproducibility. > > I'm not sure I'd use shell for this, but since you already wrote it and > it works... > If you've noticed, the original script was in AWK. But to be as generic as possible, I didn't write the generation script in AWK because even AWK is not guaranteed to be installed on the build system. The only interpreted language that is guaranteed to be there is shell (most basic shell) because .configure itself needs it. Sure, the script could be prettier and shorter, but I wanted to keep it compatible with the most basic shell, not only Bash, and without needing external programs. Eric - thanks for the comment about 'local' - I will get rid of it if we decide to include this patch. > > Unchecked use of command-line arguments is not nice: > > $ scripts/gen-sizes.sh > scripts/gen-sizes.sh: line 64: : No such file or directory > scripts/gen-sizes.sh: line 65: : No such file or directory > scripts/gen-sizes.sh: line 66: : No such file or directory > > You should error out if $# -ne 1. But in such a simple script, I'd > dispense with arguments and print to stdout. Matter of taste. > Rejecting $# -ne 0 is still nice then. > Agree, thanks! Leonid. ___
Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] include: Auto-generate the sizes lookup table
On 1/4/19 12:31 PM, Alberto Garcia wrote: > On Thu 03 Jan 2019 10:42:30 PM CET, Eric Blake wrote: > >> In my view, code generators make sense when used on code that is >> expected to change over time (a good example is QAPI because we add >> new commands every release; other places might be a generator to help >> deal with syscall handlers since newer kernels can add a syscall; or >> even the fact that we have used some powerful GNU make textual >> processing to make it easy to add files to particular subsets of the >> build with as few lines edited as possible), where the goal is that >> the generator gives you both a compact representation that is easier >> to edit, and that the expansion from the generator ensures that >> repetitive boilerplate is formed without typos. In short, if a >> generator results in a net reduction in lines of edited source in >> relation to the lines it produces, AND if the source that gets >> regenerated is likely to change, then it makes total sense to spend >> time on the generator. But when the amount of effort to write a >> generator costs as much as just hard-coding the list outright, >> especially when the list is not going to change (there really aren't >> any other powers of 2 within 64 bits), I'm not sure a generator adds >> any value. > > I agree with Eric. Fine with me. I just thought that in the previous conversation (https://patchwork.kernel.org/patch/10666975/#22302435) we have agreed that I'll send this patch. I've sent v2 already with Phil's suggestions included, please feel free to pull it if desired. Leonid. > > Berto >
[Qemu-block] [PATCH v2 0/1] include: Auto-generate the sizes lookup table
Following the conversations here: https://patchwork.kernel.org/patch/10665157 and here: https://patchwork.kernel.org/patch/10666975 Making the lookup table for power-of-two sizes auto-generated, instead of being hard-coded into the units.h file. I'm not sure if the changes I've made to Makefile here are "standard". Please correct me if that's not the case. Sorry it took so much time - I was busy with something completely different. Regards, Leonid. --- Differences from v1: * The generated header moved from $(SRC_PATH)/include/qemu/sizes.h to pow2_sizes.h in the build directory. * The commit message mentions the commit which is partially reverted by the current commit. * Small changes in the comments of the generated header file. Leonid Bloch (1): include: Auto-generate the sizes lookup table .gitignore | 1 + Makefile | 5 +++ block/qcow2.h| 2 +- block/vdi.c | 1 + include/qemu/units.h | 73 scripts/gen-sizes.sh | 66 +++ 6 files changed, 74 insertions(+), 74 deletions(-) create mode 100755 scripts/gen-sizes.sh -- 2.20.1
[Qemu-block] [PATCH v2 1/1] include: Auto-generate the sizes lookup table
The lookup table for power-of-two sizes is now auto-generated during the build, and not hard-coded into the units.h file. This partially reverts commit 540b8492618eb. Signed-off-by: Leonid Bloch --- .gitignore | 1 + Makefile | 5 +++ block/qcow2.h| 2 +- block/vdi.c | 1 + include/qemu/units.h | 73 scripts/gen-sizes.sh | 66 +++ 6 files changed, 74 insertions(+), 74 deletions(-) create mode 100755 scripts/gen-sizes.sh diff --git a/.gitignore b/.gitignore index 0430257313..721a7f4454 100644 --- a/.gitignore +++ b/.gitignore @@ -59,6 +59,7 @@ /qemu-version.h /qemu-version.h.tmp /module_block.h +/pow2_sizes.h /scsi/qemu-pr-helper /vhost-user-scsi /vhost-user-blk diff --git a/Makefile b/Makefile index dd53965f77..db72786ccb 100644 --- a/Makefile +++ b/Makefile @@ -122,6 +122,8 @@ endif GENERATED_FILES += module_block.h +GENERATED_FILES += pow2_sizes.h + TRACE_HEADERS = trace-root.h $(trace-events-subdirs:%=%/trace.h) TRACE_SOURCES = trace-root.c $(trace-events-subdirs:%=%/trace.c) TRACE_DTRACE = @@ -499,6 +501,9 @@ ifdef CONFIG_MPATH scsi/qemu-pr-helper$(EXESUF): LIBS += -ludev -lmultipath -lmpathpersist endif +pow2_sizes.h: $(SRC_PATH)/scripts/gen-sizes.sh + $(call quiet-command,sh $(SRC_PATH)/scripts/gen-sizes.sh $@,"GEN","$@") + qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool $(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"GEN","$@") diff --git a/block/qcow2.h b/block/qcow2.h index a98d24500b..f5fa419ae7 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,7 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" -#include "qemu/units.h" +#include "pow2_sizes.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 diff --git a/block/vdi.c b/block/vdi.c index 2380daa583..06d7335b3e 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -51,6 +51,7 @@ #include "qemu/osdep.h" #include "qemu/units.h" +#include "pow2_sizes.h" #include "qapi/error.h" #include "qapi/qobject-input-visitor.h" #include "qapi/qapi-visit-block-core.h" diff --git a/include/qemu/units.h b/include/qemu/units.h index 1c959d182e..692db3fbb2 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,77 +17,4 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) -/* - * The following lookup table is intended to be used when a literal string of - * the number of bytes is required (for example if it needs to be stringified). - * It can also be used for generic shortcuts of power-of-two sizes. - * This table is generated using the AWK script below: - * - * BEGIN { - * suffix="KMGTPE"; - * for(i=10; i<64; i++) { - * val=2**i; - * s=substr(suffix, int(i/10), 1); - * n=2**(i%10); - * pad=21-int(log(n)/log(10)); - * printf("#define S_%d%siB %*d\n", n, s, pad, val); - * } - * } - */ - -#define S_1KiB 1024 -#define S_2KiB 2048 -#define S_4KiB 4096 -#define S_8KiB 8192 -#define S_16KiB16384 -#define S_32KiB32768 -#define S_64KiB65536 -#define S_128KiB 131072 -#define S_256KiB 262144 -#define S_512KiB 524288 -#define S_1MiB 1048576 -#define S_2MiB 2097152 -#define S_4MiB 4194304 -#define S_8MiB 8388608 -#define S_16MiB 16777216 -#define S_32MiB 33554432 -#define S_64MiB 67108864 -#define S_128MiB 134217728 -#define S_256MiB 268435456 -#define S_512MiB 536870912 -#define S_1GiB1073741824 -#define S_2GiB2147483648 -#define S_4GiB4294967296 -#define S_8GiB8589934592 -#define S_16GiB 17179869184 -#define S_32GiB 34359738368 -#define S_64GiB 68719476736 -#define S_128GiB137438953472 -#define S_256GiB274877906944 -#define S_512GiB549755813888 -#define S_1TiB 1099511627776 -#define S_2TiB 219902322 -#define S_4TiB 4398046511104 -#define S_8TiB 8796093022208 -#define S_16TiB 17592186044416 -#define S_32TiB 35184372088832 -#define S_64TiB 70368744177664 -#define S_128TiB 140737488355328 -#define S_256TiB 281474976710656 -#define S_512TiB 562949953421312 -#define S_1PiB 1125899906842624 -#define S_2PiB 2251799813685248 -#define S_4PiB 4503599627370496 -#define S_8PiB 9007199254740992 -#define S_16PiB18014398509481984 -#define S_32PiB36028797018963968 -#define S_64PiB72057594037927936 -#define S_128PiB 14
Re: [Qemu-block] [Qemu-devel] [PATCH 1/1] include: Auto-generate the sizes lookup table
Hi, On 1/4/19 12:04 AM, Eric Blake wrote: > On 1/3/19 2:57 PM, Leonid Bloch wrote: > >>> I have to say that I'm not very convinced of the benefits of replacing a >>> set of trivial numeric macros with a longer and harder to read shell >>> script accompanied by changes to the build system. >> >> I think that the benefit is that the script is easily verifiable, >> whereas if someone would like to verify the table, they will need to >> generate it themselves. Also, this table is automatically generated >> anyway, so it only makes sense to generate it during the build. > > But no one is submitting patches to active modify the table. The work > has already been done once to generate it, and the reviewers have > already verified it; at this point, no further changes are likely to > happen (other than my pipe dream of entirely getting rid of the table in > favor of using runtime generation of human-friendly default strings is > added to QemuOpts instead). If the table were not already in git, then > generating it at build time might make sense; but at this stage in the > game, you're slowing down every build to regenerate something that is > already correct. > OK, that's a good point. But still, I think that you agree that it would be more correct to generate it during the build, right? So now it is there already and it works. But isn't it worth it to make it more correct? Leonid.
Re: [Qemu-block] [PATCH 1/1] include: Auto-generate the sizes lookup table
Hi Phil, Thanks for your review! Very valid points. All taken, except the naming (I think "pow2_sizes.h" would be better than "lookup_table_sizes.h", no?) and the "generated" directory - I agree that it's a good idea, but I think it requires a separate patch which moves the generated files there. Regards, Leonid. On 1/3/19 12:33 PM, Philippe Mathieu-Daudé wrote: > > So as 'module_block.h', why not simply name it 'lookup_table_sizes.h'? > > Actually, we could move those in a generated/ subdirectory, this > would make the source file more explicit: > > #include "generated/module_block.h" > > Kevin, what do you think? >
Re: [Qemu-block] [PATCH 1/1] include: Auto-generate the sizes lookup table
Hi Berto, On 1/3/19 12:19 PM, Alberto Garcia wrote: > On Wed 02 Jan 2019 12:09:05 PM CET, Leonid Bloch wrote: >> +print_sizes() { >> +local p=10 >> +while [ ${p} -lt 64 ] >> +do >> +local pad=' ' >> +local n=$((p % 10)) >> +n=$((1 << n)) >> +[ $((n / 100)) -eq 0 ] && pad=' ' >> +[ $((n / 10)) -eq 0 ] && pad=' ' >> +local suff=$((p / 10)) >> +printf "#define S_%u%s%s%20u\n" ${n} "$(size_suffix ${suff})" \ >> +"${pad}" $((1 << p)) >> +p=$((p + 1)) >> +done >> +} > > I have to say that I'm not very convinced of the benefits of replacing a > set of trivial numeric macros with a longer and harder to read shell > script accompanied by changes to the build system. I think that the benefit is that the script is easily verifiable, whereas if someone would like to verify the table, they will need to generate it themselves. Also, this table is automatically generated anyway, so it only makes sense to generate it during the build. Leonid. > > Berto >
[Qemu-block] [PATCH 1/1] include: Auto-generate the sizes lookup table
The lookup table for power-of-two sizes is now auto-generated during the build, and not hard-coded into the units.h file. Signed-off-by: Leonid Bloch --- .gitignore | 1 + Makefile | 5 +++ block/qcow2.h| 2 +- block/vdi.c | 1 + include/qemu/units.h | 73 scripts/gen-sizes.sh | 66 +++ 6 files changed, 74 insertions(+), 74 deletions(-) create mode 100755 scripts/gen-sizes.sh diff --git a/.gitignore b/.gitignore index 0430257313..5945220303 100644 --- a/.gitignore +++ b/.gitignore @@ -59,6 +59,7 @@ /qemu-version.h /qemu-version.h.tmp /module_block.h +/include/qemu/sizes.h /scsi/qemu-pr-helper /vhost-user-scsi /vhost-user-blk diff --git a/Makefile b/Makefile index dd53965f77..435d92821b 100644 --- a/Makefile +++ b/Makefile @@ -122,6 +122,8 @@ endif GENERATED_FILES += module_block.h +GENERATED_FILES += $(SRC_PATH)/include/qemu/sizes.h + TRACE_HEADERS = trace-root.h $(trace-events-subdirs:%=%/trace.h) TRACE_SOURCES = trace-root.c $(trace-events-subdirs:%=%/trace.c) TRACE_DTRACE = @@ -499,6 +501,9 @@ ifdef CONFIG_MPATH scsi/qemu-pr-helper$(EXESUF): LIBS += -ludev -lmultipath -lmpathpersist endif +$(SRC_PATH)/include/qemu/sizes.h: $(SRC_PATH)/scripts/gen-sizes.sh + $(call quiet-command,sh $(SRC_PATH)/scripts/gen-sizes.sh $@,"GEN","$@") + qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool $(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"GEN","$@") diff --git a/block/qcow2.h b/block/qcow2.h index a98d24500b..41d14baa17 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,7 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" -#include "qemu/units.h" +#include "qemu/sizes.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 diff --git a/block/vdi.c b/block/vdi.c index 2380daa583..62dafe0cc6 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -51,6 +51,7 @@ #include "qemu/osdep.h" #include "qemu/units.h" +#include "qemu/sizes.h" #include "qapi/error.h" #include "qapi/qobject-input-visitor.h" #include "qapi/qapi-visit-block-core.h" diff --git a/include/qemu/units.h b/include/qemu/units.h index 1c959d182e..692db3fbb2 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,77 +17,4 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) -/* - * The following lookup table is intended to be used when a literal string of - * the number of bytes is required (for example if it needs to be stringified). - * It can also be used for generic shortcuts of power-of-two sizes. - * This table is generated using the AWK script below: - * - * BEGIN { - * suffix="KMGTPE"; - * for(i=10; i<64; i++) { - * val=2**i; - * s=substr(suffix, int(i/10), 1); - * n=2**(i%10); - * pad=21-int(log(n)/log(10)); - * printf("#define S_%d%siB %*d\n", n, s, pad, val); - * } - * } - */ - -#define S_1KiB 1024 -#define S_2KiB 2048 -#define S_4KiB 4096 -#define S_8KiB 8192 -#define S_16KiB16384 -#define S_32KiB32768 -#define S_64KiB65536 -#define S_128KiB 131072 -#define S_256KiB 262144 -#define S_512KiB 524288 -#define S_1MiB 1048576 -#define S_2MiB 2097152 -#define S_4MiB 4194304 -#define S_8MiB 8388608 -#define S_16MiB 16777216 -#define S_32MiB 33554432 -#define S_64MiB 67108864 -#define S_128MiB 134217728 -#define S_256MiB 268435456 -#define S_512MiB 536870912 -#define S_1GiB1073741824 -#define S_2GiB2147483648 -#define S_4GiB4294967296 -#define S_8GiB8589934592 -#define S_16GiB 17179869184 -#define S_32GiB 34359738368 -#define S_64GiB 68719476736 -#define S_128GiB137438953472 -#define S_256GiB274877906944 -#define S_512GiB549755813888 -#define S_1TiB 1099511627776 -#define S_2TiB 219902322 -#define S_4TiB 4398046511104 -#define S_8TiB 8796093022208 -#define S_16TiB 17592186044416 -#define S_32TiB 35184372088832 -#define S_64TiB 70368744177664 -#define S_128TiB 140737488355328 -#define S_256TiB 281474976710656 -#define S_512TiB 562949953421312 -#define S_1PiB 1125899906842624 -#define S_2PiB 2251799813685248 -#define S_4PiB 4503599627370496 -#define S_8PiB 9007199254740992 -#define S_16PiB18014398509481984 -#define S_32PiB36028797018963968 -#define S_64PiB72057594037927936 -#define S_128PiB 14
[Qemu-block] [PATCH 0/1] include: Auto-generate the sizes lookup table
Following the conversations here: https://patchwork.kernel.org/patch/10665157 and here: https://patchwork.kernel.org/patch/10666975 Making the lookup table for power-of-two sizes auto-generated, instead of being hard-coded into the units.h file. I'm not sure if the changes I've made to Makefile here are "standard". Please correct me if that's not the case. Sorry it took so much time - I was busy with something completely different. Regards, Leonid. Leonid Bloch (1): include: Auto-generate the sizes lookup table .gitignore | 1 + Makefile | 5 +++ block/qcow2.h| 2 +- block/vdi.c | 1 + include/qemu/units.h | 73 scripts/gen-sizes.sh | 66 +++ 6 files changed, 74 insertions(+), 74 deletions(-) create mode 100755 scripts/gen-sizes.sh -- 2.20.1
Re: [Qemu-block] [PATCH v2 1/1] include: Add a comment to explain the origin of sizes' lookup table
Hi Phil, Hi Eric, (Eric, for some reason you weren't CC'd to this thread - sorry.) On 11/5/18 5:58 PM, Philippe Mathieu-Daudé wrote: > Hi Leonid, > > On 4/11/18 19:07, Leonid Bloch wrote: >> The lookup table for power-of-two sizes was added in commit 540b8492618eb >> for the purpose of having convenient shortcuts for these sizes in cases >> when the literal number has to be present at compile time, and >> expressions as '(1 * KiB)' can not be used. One such case is the >> stringification of sizes. Beyond that, it is convenient to use these >> shortcuts for all power-of-two sizes, even if they don't have to be >> literal numbers. >> >> Despite its convenience, this table introduced 55 lines of "dumb" code, >> the purpose and origin of which are obscure without reading the message >> of the commit which introduced it. This patch fixes that by adding a >> comment to the code itself with a brief explanation for the reasoning >> behind this table. This comment includes the short AWK script that >> generated the table, so that anyone who's interested could make sure >> that the values in it are correct (otherwise these values look as if >> they were typed manually). >> >> Signed-off-by: Leonid Bloch >> --- >> include/qemu/units.h | 18 ++ >> 1 file changed, 18 insertions(+) >> >> diff --git a/include/qemu/units.h b/include/qemu/units.h >> index 68a7758650..1c959d182e 100644 >> --- a/include/qemu/units.h >> +++ b/include/qemu/units.h >> @@ -17,6 +17,24 @@ >> #define PiB (INT64_C(1) << 50) >> #define EiB (INT64_C(1) << 60) >> +/* >> + * The following lookup table is intended to be used when a literal >> string of >> + * the number of bytes is required (for example if it needs to be >> stringified). >> + * It can also be used for generic shortcuts of power-of-two sizes. > > ^ ok > > v this part is not useful in the tree IMHO. > >> + * This table is generated using the AWK script below: >> + * >> + * BEGIN { >> + * suffix="KMGTPE"; >> + * for(i=10; i<64; i++) { >> + * val=2**i; >> + * s=substr(suffix, int(i/10), 1); >> + * n=2**(i%10); >> + * pad=21-int(log(n)/log(10)); >> + * printf("#define S_%d%siB %*d\n", n, s, pad, val); >> + * } >> + * } >> + */ > > If it is generated and the stringified are also generated, why not > generate this once via the ./configure, since it is used by machines and > not by humans? You beat me to it! I was just thinking about that last evening. Indeed it's not so elegant to put generated code in a file that is intended for human handling. Generating by ./configure would be prettier. A runtime solution that interprets hard-coded expression strings, like Eric suggested, can also be a possibility, but it seems more complicated than just generating these at the configuration stage. Plus one gets the S_* constants that can be used in many places, not only where an explicit number is needed. What do you think? A question though: if it to be generated by ./configure, where do you suggest to put the generated values? And also: is it OK to assume there's AWK (or another scripting language) on the system for generating these? Thanks, Leonid.
[Qemu-block] [PATCH v2 0/1] vdi: Use a literal number of bytes for
Please see the commit message for the rationale. Difference from v1: * Format string is fixed. Leonid Bloch (1): vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE block/vdi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) -- 2.17.1
[Qemu-block] [PATCH v2 1/1] vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE
If an expression is used to define DEFAULT_CLUSTER_SIZE, when compiled, it will be embedded as a literal expression in the binary (as the default value) because it is stringified to mark the size of the default value. Now this is fixed by using a defined number to define this value. Signed-off-by: Leonid Bloch --- block/vdi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/vdi.c b/block/vdi.c index 6555cffb88..d996793f1c 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -85,7 +85,7 @@ #define BLOCK_OPT_STATIC "static" #define SECTOR_SIZE 512 -#define DEFAULT_CLUSTER_SIZE (1 * MiB) +#define DEFAULT_CLUSTER_SIZE S_1MiB #if defined(CONFIG_VDI_DEBUG) #define VDI_DEBUG 1 @@ -432,7 +432,7 @@ static int vdi_open(BlockDriverState *bs, QDict *options, int flags, goto fail; } else if (header.block_size != DEFAULT_CLUSTER_SIZE) { error_setg(errp, "unsupported VDI image (block size %" PRIu32 - " is not %" PRIu64 ")", + " is not %" PRIu32 ")", header.block_size, DEFAULT_CLUSTER_SIZE); ret = -ENOTSUP; goto fail; -- 2.17.1
[Qemu-block] [PATCH v2 1/1] include: Add a comment to explain the origin of sizes' lookup table
The lookup table for power-of-two sizes was added in commit 540b8492618eb for the purpose of having convenient shortcuts for these sizes in cases when the literal number has to be present at compile time, and expressions as '(1 * KiB)' can not be used. One such case is the stringification of sizes. Beyond that, it is convenient to use these shortcuts for all power-of-two sizes, even if they don't have to be literal numbers. Despite its convenience, this table introduced 55 lines of "dumb" code, the purpose and origin of which are obscure without reading the message of the commit which introduced it. This patch fixes that by adding a comment to the code itself with a brief explanation for the reasoning behind this table. This comment includes the short AWK script that generated the table, so that anyone who's interested could make sure that the values in it are correct (otherwise these values look as if they were typed manually). Signed-off-by: Leonid Bloch --- include/qemu/units.h | 18 ++ 1 file changed, 18 insertions(+) diff --git a/include/qemu/units.h b/include/qemu/units.h index 68a7758650..1c959d182e 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,6 +17,24 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) +/* + * The following lookup table is intended to be used when a literal string of + * the number of bytes is required (for example if it needs to be stringified). + * It can also be used for generic shortcuts of power-of-two sizes. + * This table is generated using the AWK script below: + * + * BEGIN { + * suffix="KMGTPE"; + * for(i=10; i<64; i++) { + * val=2**i; + * s=substr(suffix, int(i/10), 1); + * n=2**(i%10); + * pad=21-int(log(n)/log(10)); + * printf("#define S_%d%siB %*d\n", n, s, pad, val); + * } + * } + */ + #define S_1KiB 1024 #define S_2KiB 2048 #define S_4KiB 4096 -- 2.17.1
[Qemu-block] [PATCH v2 0/1] include: Add a comment to explain the origin of
Please see the commit message for the rationale. Difference from v1: * Tabs removed from the code indentation of the sample code in the comment, in order to pass checkpatch. Leonid Bloch (1): include: Add a comment to explain the origin of sizes' lookup table include/qemu/units.h | 18 ++ 1 file changed, 18 insertions(+) -- 2.17.1
Re: [Qemu-block] [Qemu-devel] [PATCH] vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE
> This changes the data type, so a fix is needed for a format string in > line 434. Yes, I saw, thanks! Will do. > > Regards > Stefan >
[Qemu-block] [PATCH] vdi: Use a literal number of bytes for DEFAULT_CLUSTER_SIZE
If an expression is used to define DEFAULT_CLUSTER_SIZE, when compiled, it will be embedded as a literal expression in the binary (as the default value) because it is stringified to mark the size of the default value. Now this is fixed by using a defined number to define this value. Signed-off-by: Leonid Bloch --- block/vdi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/vdi.c b/block/vdi.c index 6555cffb88..25320eff47 100644 --- a/block/vdi.c +++ b/block/vdi.c @@ -85,7 +85,7 @@ #define BLOCK_OPT_STATIC "static" #define SECTOR_SIZE 512 -#define DEFAULT_CLUSTER_SIZE (1 * MiB) +#define DEFAULT_CLUSTER_SIZE S_1MiB #if defined(CONFIG_VDI_DEBUG) #define VDI_DEBUG 1 -- 2.17.1
[Qemu-block] [PATCH] include: Add a comment to explain the origin of sizes' lookup table
The lookup table for power-of-two sizes was added in commit 540b8492618eb for the purpose of having convenient shortcuts for these sizes in cases when the literal number has to be present at compile time, and expressions as '(1 * KiB)' can not be used. One such case is the stringification of sizes. Beyond that, it is convenient to use these shortcuts for all power-of-two sizes, even if they don't have to be literal numbers. Despite its convenience, this table introduced 55 lines of "dumb" code, the purpose and origin of which are obscure without reading the message of the commit which introduced it. This patch fixes that by adding a comment to the code itself with a brief explanation for the reasoning behind this table. This comment includes the short AWK script that generated the table, so that anyone who's interested could make sure that the values in it are correct (otherwise these values look as if they were typed manually). Signed-off-by: Leonid Bloch --- include/qemu/units.h | 18 ++ 1 file changed, 18 insertions(+) diff --git a/include/qemu/units.h b/include/qemu/units.h index 68a7758650..051c274ca2 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,6 +17,24 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) +/* + * The following lookup table is intended to be used when a literal string of + * the number of bytes is required (for example if it needs to be stringified). + * It can also be used for generic shortcuts of power-of-two sizes. + * This table is generated using the AWK script below: + * + * BEGIN { + * suffix="KMGTPE"; + * for(i=10; i<64; i++) { + * val=2**i; + * s=substr(suffix, int(i/10), 1); + * n=2**(i%10); + * pad=21-int(log(n)/log(10)); + * printf("#define S_%d%siB %*d\n", n, s, pad, val); + * } + * } + */ + #define S_1KiB 1024 #define S_2KiB 2048 #define S_4KiB 4096 -- 2.17.1
Re: [Qemu-block] [Qemu-devel] [PATCH] qemu/units: Move out QCow2 specific definitions
Hi, On 11/2/18 5:28 PM, Kevin Wolf wrote: > Am 02.11.2018 um 15:52 hat Eric Blake geschrieben: >> On 11/2/18 9:10 AM, Kevin Wolf wrote: >>> Am 02.11.2018 um 13:37 hat Philippe Mathieu-Daudé geschrieben: Hi Kevin, On 2/11/18 12:07, Kevin Wolf wrote: > Am 02.11.2018 um 09:58 hat Philippe Mathieu-Daudé geschrieben: >> This definitions are QCow2 specific, there is no need to expose them >> in the global namespace. These are not QCOW2 specific. I wrote these for convenience in QCOW2, but there are many other places where these can be used (many pre-defined sizes are powers of two), and there are few places where they must replace the current notation, like in block/vdi.c with DEFAULT_CLUSTER_SIZE (unless an explicit value in bytes will be defined instead). >> >> Agreed. I didn't want it in the first place, arguing that if we want >> stringification of defaults, it would be better to have a runtime function >> do that, rather than adding a set of near-duplicate macro names. A runtime function will not help here, as these are used in compile time. These result in strings that are actually compiled into the binaries. >>> >>> Then there is VDI which uses (1 * MiB), but that is compiled out and if >>> you enable it, it breaks. So it needs the same fix. Yeah, I need to fix that as promised. Will do shortly. :) Leonid.
Re: [Qemu-block] [PATCH] qcow2: Fix cache-clean-interval documentation
On 10/1/18 7:59 PM, Kevin Wolf wrote: > Am 01.10.2018 um 16:35 hat Eric Blake geschrieben: >> On 9/29/18 4:54 AM, Leonid Bloch wrote: >>> Fixing cache-clean-interval documentation following the recent change to >>> a default of 600 seconds on supported plarforms (only Linux currently). >>> >>> Signed-off-by: Leonid Bloch >>> --- >>>docs/qcow2-cache.txt | 19 +-- >>>qapi/block-core.json | 3 ++- >>>qemu-options.hx | 3 ++- >>>3 files changed, 13 insertions(+), 12 deletions(-) >>> >>> diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt >>> index 59358b816f..1778312e09 100644 >>> --- a/docs/qcow2-cache.txt >>> +++ b/docs/qcow2-cache.txt >>> @@ -202,18 +202,17 @@ Reducing the memory usage >>>It is possible to clean unused cache entries in order to reduce the >>>memory usage during periods of low I/O activity. >>> -The parameter "cache-clean-interval" defines an interval (in seconds). >>> -All cache entries that haven't been accessed during that interval are >>> -removed from memory. >>> +The parameter "cache-clean-interval" defines an interval (in seconds), >>> +affer which all the cache entries that haven't been accessed during it >> >> s/affer/after/ >> >> Maybe s/during it/during the interval/ > > Thanks, made these fixes and applied it to the block branch. > > Kevin > Thanks Kevin, Eric!
[Qemu-block] [PATCH] qcow2: Fix cache-clean-interval documentation
Fixing cache-clean-interval documentation following the recent change to a default of 600 seconds on supported plarforms (only Linux currently). Signed-off-by: Leonid Bloch --- docs/qcow2-cache.txt | 19 +-- qapi/block-core.json | 3 ++- qemu-options.hx | 3 ++- 3 files changed, 13 insertions(+), 12 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 59358b816f..1778312e09 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -202,18 +202,17 @@ Reducing the memory usage It is possible to clean unused cache entries in order to reduce the memory usage during periods of low I/O activity. -The parameter "cache-clean-interval" defines an interval (in seconds). -All cache entries that haven't been accessed during that interval are -removed from memory. +The parameter "cache-clean-interval" defines an interval (in seconds), +affer which all the cache entries that haven't been accessed during it +are removed from memory. Setting this parameter to 0 disables this feature. -This example removes all unused cache entries every 15 minutes: +The following example removes all unused cache entries every 15 minutes: -drive file=hd.qcow2,cache-clean-interval=900 -If unset, the default value for this parameter is 600. Setting it to 0 -disables this feature. +If unset, the default value for this parameter is 600 on platforms which +support this functionality, and is 0 (disabled) on other platforms. -Note that this functionality currently relies on the MADV_DONTNEED -argument for madvise() to actually free the memory. This is a -Linux-specific feature, so cache-clean-interval is not supported in -other systems. +This functionality currently relies on the MADV_DONTNEED argument for +madvise() to actually free the memory. This is a Linux-specific feature, +so cache-clean-interval is not supported on other systems. diff --git a/qapi/block-core.json b/qapi/block-core.json index 46dac23d2f..25b8a0e744 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2895,7 +2895,8 @@ # # @cache-clean-interval: clean unused entries in the L2 and refcount # caches. The interval is in seconds. The default value -# is 600, and 0 disables this feature. (since 2.5) +# is 600 on supporting platforms, and 0 on other +# platforms. 0 disables this feature. (since 2.5) # # @encrypt: Image decryption options. Mandatory for # encrypted images, except when doing a metadata-only diff --git a/qemu-options.hx b/qemu-options.hx index 52d9d9f06d..f139459e80 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -747,7 +747,8 @@ it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -The default value is 600. Setting it to 0 disables this feature. +The default value is 600 on supporting platforms, and 0 on other platforms. +Setting it to 0 disables this feature. @item pass-discard-request Whether discard requests to the qcow2 device should be forwarded to the data -- 2.17.1
Re: [Qemu-block] [PATCH v12 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
On September 27, 2018 5:53:34 PM CEST, Eric Blake wrote: >On 9/26/18 11:04 AM, Leonid Bloch wrote: >> The default cache-clean-interval is set to 10 minutes, in order to >lower >> the overhead of the qcow2 caches (before the default was 0, i.e. >> disabled). >> >> * For non-Linux platforms the default is kept at 0, because >>cache-clean-interval is not supported there yet. >> >> Signed-off-by: Leonid Bloch >> Reviewed-by: Alberto Garcia >> Reviewed-by: Kevin Wolf >> --- > >> @@ -76,13 +76,15 @@ >> >> #ifdef CONFIG_LINUX >> #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB >> +#define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ >> #else >> #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB >> +/* Cache clean interval is currently available only on Linux, so >must be 0 */ >> +#define DEFAULT_CACHE_CLEAN_INTERVAL 0 >> #endif >> > >> +++ b/docs/qcow2-cache.txt >> @@ -210,8 +210,8 @@ This example removes all unused cache entries >every 15 minutes: >> >> -drive file=hd.qcow2,cache-clean-interval=900 >> >> -If unset, the default value for this parameter is 0 and it disables >> -this feature. >> +If unset, the default value for this parameter is 600. Setting it to >0 >> +disables this feature. > >Should this wording mention that the non-zero default is only on Linux >(or rather, only on platforms where a non-zero value makes a >difference)? Yes, I think it should mention it, thanks. Should I send a separate patch? Leonid. > >> >> Note that this functionality currently relies on the MADV_DONTNEED >> argument for madvise() to actually free the memory. This is a >> diff --git a/qapi/block-core.json b/qapi/block-core.json >> index ac3b48ee54..46dac23d2f 100644 >> --- a/qapi/block-core.json >> +++ b/qapi/block-core.json >> @@ -2895,7 +2895,8 @@ >> # >> # @cache-clean-interval: clean unused entries in the L2 and >refcount >> # caches. The interval is in seconds. The >default value >> -# is 0 and it disables this feature (since >2.5) >> +# is 600, and 0 disables this feature. >(since 2.5) > >and again here > >> +# >> # @encrypt: Image decryption options. Mandatory for >> # encrypted images, except when doing a >metadata-only >> # probe of the image. (since 2.10) >> diff --git a/qemu-options.hx b/qemu-options.hx >> index 14aee78c6c..52d9d9f06d 100644 >> --- a/qemu-options.hx >> +++ b/qemu-options.hx >> @@ -747,7 +747,7 @@ it which is not used for the L2 cache) >> >> @item cache-clean-interval >> Clean unused entries in the L2 and refcount caches. The interval is >in seconds. >> -The default value is 0 and it disables this feature. >> +The default value is 600. Setting it to 0 disables this feature. > >and here? -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
[Qemu-block] [PATCH v12 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
The default cache-clean-interval is set to 10 minutes, in order to lower the overhead of the qcow2 caches (before the default was 0, i.e. disabled). * For non-Linux platforms the default is kept at 0, because cache-clean-interval is not supported there yet. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- block/qcow2.c| 2 +- block/qcow2.h| 4 +++- docs/qcow2-cache.txt | 4 ++-- qapi/block-core.json | 3 ++- qemu-options.hx | 2 +- 5 files changed, 9 insertions(+), 6 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 20b5093269..95e1c98daa 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState *bs, /* New interval for cache cleanup timer */ r->cache_clean_interval = qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, -s->cache_clean_interval); +DEFAULT_CACHE_CLEAN_INTERVAL); #ifndef CONFIG_LINUX if (r->cache_clean_interval != 0) { error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL diff --git a/block/qcow2.h b/block/qcow2.h index 0f0e3534bf..ba430316b9 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -76,13 +76,15 @@ #ifdef CONFIG_LINUX #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB +#define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ #else #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB +/* Cache clean interval is currently available only on Linux, so must be 0 */ +#define DEFAULT_CACHE_CLEAN_INTERVAL 0 #endif #define DEFAULT_CLUSTER_SIZE S_64KiB - #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot" diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 1fcc0658b2..59358b816f 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -210,8 +210,8 @@ This example removes all unused cache entries every 15 minutes: -drive file=hd.qcow2,cache-clean-interval=900 -If unset, the default value for this parameter is 0 and it disables -this feature. +If unset, the default value for this parameter is 600. Setting it to 0 +disables this feature. Note that this functionality currently relies on the MADV_DONTNEED argument for madvise() to actually free the memory. This is a diff --git a/qapi/block-core.json b/qapi/block-core.json index ac3b48ee54..46dac23d2f 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2895,7 +2895,8 @@ # # @cache-clean-interval: clean unused entries in the L2 and refcount # caches. The interval is in seconds. The default value -# is 0 and it disables this feature (since 2.5) +# is 600, and 0 disables this feature. (since 2.5) +# # @encrypt: Image decryption options. Mandatory for # encrypted images, except when doing a metadata-only # probe of the image. (since 2.10) diff --git a/qemu-options.hx b/qemu-options.hx index 14aee78c6c..52d9d9f06d 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -747,7 +747,7 @@ it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -The default value is 0 and it disables this feature. +The default value is 600. Setting it to 0 disables this feature. @item pass-discard-request Whether discard requests to the qcow2 device should be forwarded to the data -- 2.17.1
[Qemu-block] [PATCH v12 4/9] qcow2: Avoid duplication in setting the refcount cache size
The refcount cache size does not need to be set to its minimum value in read_cache_sizes(), as it is set to at least its minimum value in qcow2_update_options_prepare(). Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- block/qcow2.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index d2c07ce9fe..cd0053b6ee 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } -if (!refcount_cache_size_set) { -*refcount_cache_size = min_refcount_cache; -} } +/* l2_cache_size and refcount_cache_size are ensured to have at least + * their minimum values in qcow2_update_options_prepare() */ if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) || *l2_cache_entry_size > s->cluster_size || -- 2.17.1
[Qemu-block] [PATCH v12 7/9] qcow2: Resize the cache upon image resizing
The caches are now recalculated upon image resizing. This is done because the new default behavior of assigning L2 cache relatively to the image size, implies that the cache will be adapted accordingly after an image resize. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index 589f6c1b1c..20b5093269 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3418,6 +3418,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, uint64_t old_length; int64_t new_l1_size; int ret; +QDict *options; if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA && prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL) @@ -3642,6 +3643,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } } +bs->total_sectors = offset / BDRV_SECTOR_SIZE; + /* write updated header.size */ offset = cpu_to_be64(offset); ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size), @@ -3652,6 +3655,14 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } s->l1_vm_state_index = new_l1_size; + +/* Update cache sizes */ +options = qdict_clone_shallow(bs->options); +ret = qcow2_update_options(bs, options, s->flags, errp); +qobject_unref(options); +if (ret < 0) { +goto fail; +} ret = 0; fail: qemu_co_mutex_unlock(>lock); -- 2.17.1
[Qemu-block] [PATCH v12 6/9] qcow2: Increase the default upper limit on the L2 cache size
The upper limit on the L2 cache size is increased from 1 MB to 32 MB on Linux platforms, and to 8 MB on other platforms (this difference is caused by the ability to set intervals for cache cleaning on Linux platforms only). This is done in order to allow default full coverage with the L2 cache for images of up to 256 GB in size (was 8 GB). Note, that only the needed amount to cover the full image is allocated. The value which is changed here is just the upper limit on the L2 cache size, beyond which it will not grow, even if the size of the image will require it to. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- block/qcow2.h| 6 +- docs/qcow2-cache.txt | 15 +-- qemu-options.hx | 6 +++--- 3 files changed, 17 insertions(+), 10 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index 2f8c1fd15c..0f0e3534bf 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,7 +74,11 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB +#ifdef CONFIG_LINUX +#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB +#else +#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB +#endif #define DEFAULT_CLUSTER_SIZE S_64KiB diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 750447ea4f..1fcc0658b2 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -125,12 +125,15 @@ There are a few things that need to be taken into account: - Both caches must have a size that is a multiple of the cluster size (or the cache entry size: see "Using smaller cache sizes" below). - - The maximum L2 cache size is 1 MB by default (enough for full coverage - of 8 GB images, with the default cluster size). This value can be - modified using the "l2-cache-size" option. QEMU will not use more memory - than needed to hold all of the image's L2 tables, regardless of this max. - value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see - below). + - The maximum L2 cache size is 32 MB by default on Linux platforms (enough + for full coverage of 256 GB images, with the default cluster size). This + value can be modified using the "l2-cache-size" option. QEMU will not use + more memory than needed to hold all of the image's L2 tables, regardless + of this max. value. + On non-Linux platforms the maximal value is smaller by default (8 MB) and + this difference stems from the fact that on Linux the cache can be cleared + periodically if needed, using the "cache-clean-interval" option (see below). + The minimal L2 cache size is 2 clusters (or 2 cache entries, see below). - The default (and minimum) refcount cache size is 4 clusters. diff --git a/qemu-options.hx b/qemu-options.hx index 6eef0f5651..14aee78c6c 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -736,9 +736,9 @@ The maximum total size of the L2 table and refcount block caches in bytes @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: if cache-size is not specified - 1M; otherwise, as large as possible -within the cache-size, while permitting the requested or the minimal refcount -cache size) +(default: if cache-size is not specified - 32M on Linux platforms, and 8M on +non-Linux platforms; otherwise, as large as possible within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -- 2.17.1
[Qemu-block] [PATCH v12 9/9] qcow2: Explicit number replaced by a constant
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- block/qcow2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 95e1c98daa..7277feda13 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options, /* 2^(s->refcount_order - 3) is the refcount width in bytes */ s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); s->refcount_block_size = 1 << s->refcount_block_bits; -bs->total_sectors = header.size / 512; +bs->total_sectors = header.size / BDRV_SECTOR_SIZE; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; s->cluster_offset_mask = (1LL << s->csize_shift) - 1; @@ -3450,7 +3450,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, goto fail; } -old_length = bs->total_sectors * 512; +old_length = bs->total_sectors * BDRV_SECTOR_SIZE; new_l1_size = size_to_l1(s, offset); if (offset < old_length) { -- 2.17.1
[Qemu-block] [PATCH v12 5/9] qcow2: Assign the L2 cache relatively to the image size
Sufficient L2 cache can noticeably improve the performance when using large images with frequent I/O. Previously, unless 'cache-size' was specified and was large enough, the L2 cache was set to a certain size without taking the virtual image size into account. Now, the L2 cache assignment is aware of the virtual size of the image, and will cover the entire image, unless the cache size needed for that is larger than a certain maximum. This maximum is set to 1 MB by default (enough to cover an 8 GB image with the default cluster size) but can be increased or decreased using the 'l2-cache-size' option. This option was previously documented as the *maximum* L2 cache size, and this patch makes it behave as such, instead of as a constant size. Also, the existing option 'cache-size' can limit the sum of both L2 and refcount caches, as previously. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- block/qcow2.c | 21 + block/qcow2.h | 4 +--- docs/qcow2-cache.txt | 15 ++- qemu-options.hx| 6 +++--- tests/qemu-iotests/137 | 8 +++- tests/qemu-iotests/137.out | 4 +++- 6 files changed, 33 insertions(+), 25 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index cd0053b6ee..589f6c1b1c 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, uint64_t *refcount_cache_size, Error **errp) { BDRVQcow2State *s = bs->opaque; -uint64_t combined_cache_size; +uint64_t combined_cache_size, l2_cache_max_setting; bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set; int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size; +uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; +uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE); l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE); refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE); combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0); -*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0); +l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, + DEFAULT_L2_CACHE_MAX_SIZE); *refcount_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0); *l2_cache_entry_size = qemu_opt_get_size( opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size); +*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting); + if (combined_cache_size_set) { if (l2_cache_size_set && refcount_cache_size_set) { error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE " and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set " "at the same time"); return; -} else if (*l2_cache_size > combined_cache_size) { +} else if (l2_cache_size_set && + (l2_cache_max_setting > combined_cache_size)) { error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed " QCOW2_OPT_CACHE_SIZE); return; @@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } else if (refcount_cache_size_set) { *l2_cache_size = combined_cache_size - *refcount_cache_size; } else { -uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); - /* Assign as much memory as possible to the L2 cache, and * use the remainder for the refcount cache */ if (combined_cache_size >= max_l2_cache + min_refcount_cache) { @@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -} else { -if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, - (uint64_t)DEFAULT_L2_CACHE_CLUSTERS - * s->cluster_size); -} } /* l2_cache_size and refcount_cache_size are ensured to have at least * their minimum values in qcow2_update_options_prepare() */ diff --git a/block/qcow2.h b/block/qcow2.h index a8d6f757b1..2f8c1fd15c 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,9 +74,7 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -/* Whic
[Qemu-block] [PATCH v12 3/9] qcow2: Make sizes more humanly readable
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- block/qcow2.c | 2 +- block/qcow2.h | 9 + 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index c13153735a..d2c07ce9fe 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } } else { if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE, +*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } diff --git a/block/qcow2.h b/block/qcow2.h index 81b844e936..a8d6f757b1 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" +#include "qemu/units.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -43,11 +44,11 @@ /* 8 MB refcount table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_REFTABLE_SIZE 0x80 +#define QCOW_MAX_REFTABLE_SIZE S_8MiB /* 32 MB L1 table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_L1_SIZE 0x200 +#define QCOW_MAX_L1_SIZE S_32MiB /* Allow for an average of 1k per snapshot table entry, should be plenty of * space for snapshot names and IDs */ @@ -75,9 +76,9 @@ /* Whichever is more */ #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */ -#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */ +#define DEFAULT_L2_CACHE_SIZE S_1MiB -#define DEFAULT_CLUSTER_SIZE 65536 +#define DEFAULT_CLUSTER_SIZE S_64KiB #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" -- 2.17.1
[Qemu-block] [PATCH v12 1/9] qcow2: Options' documentation fixes
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- docs/qcow2-cache.txt | 21 ++--- qemu-options.hx | 9 ++--- 2 files changed, 20 insertions(+), 10 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 8a09a5cc5f..7e28b41bd3 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -79,14 +79,14 @@ Choosing the right cache sizes In order to choose the cache sizes we need to know how they relate to the amount of allocated space. -The amount of virtual disk that can be mapped by the L2 and refcount +The part of the virtual disk that can be mapped by the L2 and refcount caches (in bytes) is: disk_size = l2_cache_size * cluster_size / 8 disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits With the default values for cluster_size (64KB) and refcount_bits -(16), that is +(16), this becomes: disk_size = l2_cache_size * 8192 disk_size = refcount_cache_size * 32768 @@ -97,12 +97,16 @@ need: l2_cache_size = disk_size_GB * 131072 refcount_cache_size = disk_size_GB * 32768 -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount -cache of 256KB (262144 bytes), so using the formulas we've just seen -we have +For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual +image size (given that the default cluster size is used): - 1048576 / 131072 = 8 GB of virtual disk covered by that cache -262144 / 32768 = 8 GB + 8 GB / 8192 = 1 MB + +The refcount cache is 4 times the cluster size by default. With the default +cluster size of 64 KB, it is 256 KB (262144 bytes). This is sufficient for +8 GB of image size: + + 262144 * 32768 = 8 GB How to configure the cache sizes @@ -130,6 +134,9 @@ There are a few things that need to be taken into account: memory as possible to the L2 cache before increasing the refcount cache size. + - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size" + can be set simultaneously. + Unlike L2 tables, refcount blocks are not used during normal I/O but only during allocations and internal snapshots. In most cases they are accessed sequentially (even during random guest I/O) so increasing the diff --git a/qemu-options.hx b/qemu-options.hx index a642ad297f..2db6247eff 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -732,15 +732,18 @@ image file) @item cache-size The maximum total size of the L2 table and refcount block caches in bytes -(default: 1048576 bytes or 8 clusters, whichever is larger) +(default: the sum of l2-cache-size and refcount-cache-size) @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: 4/5 of the total cache size) +(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever +is larger; otherwise, as large as possible or needed within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -(default: 1/5 of the total cache size) +(default: 4 times the cluster size; or if cache-size is specified, the part of +it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -- 2.17.1
[Qemu-block] [PATCH v12 2/9] include: Add a lookup table of sizes
Adding a lookup table for the powers of two, with the appropriate size prefixes. This is needed when a size has to be stringified, in which case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))' string. Powers of two are used very often for sizes, so such a table will also make it easier and more intuitive to write them. This table is generatred using the following AWK script: BEGIN { suffix="KMGTPE"; for(i=10; i<64; i++) { val=2**i; s=substr(suffix, int(i/10), 1); n=2**(i%10); pad=21-int(log(n)/log(10)); printf("#define S_%d%siB %*d\n", n, s, pad, val); } } Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia Reviewed-by: Kevin Wolf --- include/qemu/units.h | 55 1 file changed, 55 insertions(+) diff --git a/include/qemu/units.h b/include/qemu/units.h index 692db3fbb2..68a7758650 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,4 +17,59 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) +#define S_1KiB 1024 +#define S_2KiB 2048 +#define S_4KiB 4096 +#define S_8KiB 8192 +#define S_16KiB16384 +#define S_32KiB32768 +#define S_64KiB65536 +#define S_128KiB 131072 +#define S_256KiB 262144 +#define S_512KiB 524288 +#define S_1MiB 1048576 +#define S_2MiB 2097152 +#define S_4MiB 4194304 +#define S_8MiB 8388608 +#define S_16MiB 16777216 +#define S_32MiB 33554432 +#define S_64MiB 67108864 +#define S_128MiB 134217728 +#define S_256MiB 268435456 +#define S_512MiB 536870912 +#define S_1GiB1073741824 +#define S_2GiB2147483648 +#define S_4GiB4294967296 +#define S_8GiB8589934592 +#define S_16GiB 17179869184 +#define S_32GiB 34359738368 +#define S_64GiB 68719476736 +#define S_128GiB137438953472 +#define S_256GiB274877906944 +#define S_512GiB549755813888 +#define S_1TiB 1099511627776 +#define S_2TiB 219902322 +#define S_4TiB 4398046511104 +#define S_8TiB 8796093022208 +#define S_16TiB 17592186044416 +#define S_32TiB 35184372088832 +#define S_64TiB 70368744177664 +#define S_128TiB 140737488355328 +#define S_256TiB 281474976710656 +#define S_512TiB 562949953421312 +#define S_1PiB 1125899906842624 +#define S_2PiB 2251799813685248 +#define S_4PiB 4503599627370496 +#define S_8PiB 9007199254740992 +#define S_16PiB18014398509481984 +#define S_32PiB36028797018963968 +#define S_64PiB72057594037927936 +#define S_128PiB 144115188075855872 +#define S_256PiB 288230376151711744 +#define S_512PiB 576460752303423488 +#define S_1EiB 1152921504606846976 +#define S_2EiB 2305843009213693952 +#define S_4EiB 4611686018427387904 +#define S_8EiB 9223372036854775808 + #endif -- 2.17.1
[Qemu-block] [PATCH v12 0/9] Take the image size into account when allocating the L2 cache
This series makes the qcow2 L2 cache assignment aware of the image size, with the intention for it to cover the entire image. The importance of this change is in noticeable performance improvement, especially with heavy random I/O. The memory overhead is not big in most cases, as only 1 MB of cache for every 8 GB of image size is used. For cases with very large images and/or small cluster sizes, or systems with limited RAM resources, there is an upper limit on the default L2 cache: 32 MB for Linux systems, and 8 MB for non-Linux systems (the difference is caused by the fact that it is currently impossible to perform scheduled cache cleaning on non-Linux systems). To modify this upper limit one can use the already existing 'l2-cache-size' and 'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size', as it was documented as the *maximum* L2 cache size, but in practice behaved as the absolute size. To compensate the memory overhead which may (but not necesarily will) be increased following this behavior, the default cache-clean-interval is set to 10 minutes by default (was disabled by default before). The L2 cache is also resized accordingly, by default, if the image is resized. Additionally, few minor changes are made (refactoring and documentation fixes). Differences from v1: * .gitignore modification patch removed (unneeded). * The grammar fix in conflicting cache sizing patch removed (merged). * The update total_sectors when resizing patch squashed with the resizing patch. * L2 cache is now capped at 32 MB. * The default cache-clean-interval is set to 30 seconds. Differences from v2: * Made it clear in the documentation that setting cache-clean-interval to 0 disables this feature. Differences from v3: * The default cache-clean-interval is set to 10 minutes instead of 30 seconds before. * Commit message changes to better explain the patches. * Some refactoring. Differences from v4: * Refactoring. * Documentation and commit message fixes. Differences from v5: * A patch with cosmetic changes is split from the main patch * A patch for avoiding duplication in refcount cache size assignment is split from the main patch. * In the main patch the cap on the L2 cache size is set to only 1 MB (in order to be close to the previous behavior) and a separate patch sets it to 32 MB. * Changes in the documentation fixes patch [1/8]. Differences from v6: * A patch is added to make the defined sizes more humanly readable. Differences from v7: * Documentation and commit message changes. * Fix: when cache-size is specified and is large enough, with the L2 and refcount caches not set, the L2 cache will be as large as needed to cover the entire image, and not as the default max. size. (Return to the previous behavior in this part, which was correct). * Cosmetic changes patch not used. Differences from v8: * Made a lookup table for power-of-two sizes. This will enable writing pretty size values (most of the ones used are powers of two) while allowing correct stringification of these values. * Made the max. L2 cache size 8 MB on non-Linux systems. * Returned the test for too large L2 cache size. * Minor documentation fixes. Differences from v9: * DEFAULT_CACHE_CLEAN_INTERVAL is set to 0 for non-Linux platforms in order to avoid an error, since the cache clean interval feature is currently available only on Linux. Differences from v10: * Rewording documentation. Differences from v11: * Memory leak fixed in "Resize the cache upon image resizing". Leonid Bloch (9): qcow2: Options' documentation fixes include: Add a lookup table of sizes qcow2: Make sizes more humanly readable qcow2: Avoid duplication in setting the refcount cache size qcow2: Assign the L2 cache relatively to the image size qcow2: Increase the default upper limit on the L2 cache size qcow2: Resize the cache upon image resizing qcow2: Set the default cache-clean-interval to 10 minutes qcow2: Explicit number replaced by a constant block/qcow2.c | 43 - block/qcow2.h | 19 - docs/qcow2-cache.txt | 43 +++-- include/qemu/units.h | 55 ++ qapi/block-core.json | 3 ++- qemu-options.hx| 11 +--- tests/qemu-iotests/137 | 8 +- tests/qemu-iotests/137.out | 4 ++- 8 files changed, 140 insertions(+), 46 deletions(-) -- 2.17.1
Re: [Qemu-block] [PATCH v11 7/9] qcow2: Resize the cache upon image resizing
Oops, sorry. Would something like that be OK: On 9/26/18 1:55 PM, Alberto Garcia wrote: > On Wed 26 Sep 2018 12:43:03 PM CEST, Kevin Wolf wrote: >>> +/* Update cache sizes */ >>> +options = qdict_clone_shallow(bs->options); >>> +ret = qcow2_update_options(bs, options, s->flags, errp); +qobject_unref(options); >>> +if (ret < 0) { >>> +goto fail; >>> +} >> Leonid. >> Isn't options leaked, both in success and error cases? > > I'm embarrassed not to have seen that :-! > > Berto >
Re: [Qemu-block] [PATCH v10 1/9] qcow2: Options' documentation fixes
On 9/24/18 10:34 PM, Max Reitz wrote: On 24.09.18 18:53, Leonid Bloch wrote: On 9/24/18 6:04 PM, Alberto Garcia wrote: On Fri 21 Sep 2018 07:23:02 PM CEST, Leonid Bloch wrote: Signed-off-by: Leonid Bloch --- docs/qcow2-cache.txt | 20 +--- qemu-options.hx | 9 ++--- 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 8a09a5cc5f..013991e21c 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured. Choosing the right cache sizes -- In order to choose the cache sizes we need to know how they relate to -the amount of allocated space. +the amount of the allocated space. I'm not a native English speaker, but the current version sounds correct to me. Why do you need to add an article there? :-? I'm not a native speaker as well, but "the" seems to be needed there: - Which allocated space? - __The__ allocated space! It doesn't seem to be needed to me, as, well, it would be wrong in German. I tried to come up with explanations, and it's really difficult. So there are measures where you'd use an article (e.g. "size of the allocated space"), but I wouldn't do that with "amount". It's like with "amount" you have some form of infinite supply of something ("space" in this case) and then you get some amount of it, and then you continue to talk about that amount. It's different with quantities like "size", "volume", etc. There, you already have your amount, and then you measure it. So you can say "Some amount of space", but you cannot say "Some size of the space". Or you can say "The amount of space that is X", but not "The size of space that is X". So "size", "volume", those quantities are measuring something. "amount" just refers to something, not a quantity. So if you want to use an article, it'd have to go in front of "amount" -- and in fact, it's there already. I think saying "the amount of space is 1 GB" is just a shortcut like saying "the space is 1 GB". Using "amount of" is not measuring like saying "size of" is. In this case, it's just hinting at the fact that we're rather interested in the extent rather than the entity itself. A final note: I don't think you'd ask "Which allocated space?". I think it'd be "What allocated space?", which further hints towards not using an article after "of" here. What I do take a bit of an issue with is actually the next sentence in this document: The amount of virtual disk that can be mapped by the L2 and refcount caches (in bytes) is: "Amount of disk" sounds like "amount of car" or "amount of person". Which is to say, it sounds wrong. Those are discrete things, but "amount" sounds like something continuous. Measuring "virtual disk" in bytes is like measuring "car" in kilograms. And in fact, we are not interested in bytes of virtual disk (it's a big difference whether we're talking about one or four disks) but in bytes of a single virtual disk. So the fact that we cannot get an article in here signifies that something's wrong. Therefore, this might be better as "The part/portion/fraction of the virtual disk...". Max Max, Thanks for the detailed explanation! I'm still not really convinced about the first point, but grammar corrections are not the aim of this series, and it also might be correct - as you've written, so I won't touch it. About the second point - you're absolutely right, I didn't notice this part. Including in v11. Leonid.
[Qemu-block] [PATCH v11 0/9] Take the image size into account when allocating the L2 cache
This series makes the qcow2 L2 cache assignment aware of the image size, with the intention for it to cover the entire image. The importance of this change is in noticeable performance improvement, especially with heavy random I/O. The memory overhead is not big in most cases, as only 1 MB of cache for every 8 GB of image size is used. For cases with very large images and/or small cluster sizes, or systems with limited RAM resources, there is an upper limit on the default L2 cache: 32 MB for Linux systems, and 8 MB for non-Linux systems (the difference is caused by the fact that it is currently impossible to perform scheduled cache cleaning on non-Linux systems). To modify this upper limit one can use the already existing 'l2-cache-size' and 'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size', as it was documented as the *maximum* L2 cache size, but in practice behaved as the absolute size. To compensate the memory overhead which may (but not necesarily will) be increased following this behavior, the default cache-clean-interval is set to 10 minutes by default (was disabled by default before). The L2 cache is also resized accordingly, by default, if the image is resized. Additionally, few minor changes are made (refactoring and documentation fixes). Differences from v1: * .gitignore modification patch removed (unneeded). * The grammar fix in conflicting cache sizing patch removed (merged). * The update total_sectors when resizing patch squashed with the resizing patch. * L2 cache is now capped at 32 MB. * The default cache-clean-interval is set to 30 seconds. Differences from v2: * Made it clear in the documentation that setting cache-clean-interval to 0 disables this feature. Differences from v3: * The default cache-clean-interval is set to 10 minutes instead of 30 seconds before. * Commit message changes to better explain the patches. * Some refactoring. Differences from v4: * Refactoring. * Documentation and commit message fixes. Differences from v5: * A patch with cosmetic changes is split from the main patch * A patch for avoiding duplication in refcount cache size assignment is split from the main patch. * In the main patch the cap on the L2 cache size is set to only 1 MB (in order to be close to the previous behavior) and a separate patch sets it to 32 MB. * Changes in the documentation fixes patch [1/8]. Differences from v6: * A patch is added to make the defined sizes more humanly readable. Differences from v7: * Documentation and commit message changes. * Fix: when cache-size is specified and is large enough, with the L2 and refcount caches not set, the L2 cache will be as large as needed to cover the entire image, and not as the default max. size. (Return to the previous behavior in this part, which was correct). * Cosmetic changes patch not used. Differences from v8: * Made a lookup table for power-of-two sizes. This will enable writing pretty size values (most of the ones used are powers of two) while allowing correct stringification of these values. * Made the max. L2 cache size 8 MB on non-Linux systems. * Returned the test for too large L2 cache size. * Minor documentation fixes. Differences from v9: * DEFAULT_CACHE_CLEAN_INTERVAL is set to 0 for non-Linux platforms in order to avoid an error, since the cache clean interval feature is currently available only on Linux. Differences from v10: * Rewording documentation. Leonid Bloch (9): qcow2: Options' documentation fixes include: Add a lookup table of sizes qcow2: Make sizes more humanly readable qcow2: Avoid duplication in setting the refcount cache size qcow2: Assign the L2 cache relatively to the image size qcow2: Increase the default upper limit on the L2 cache size qcow2: Resize the cache upon image resizing qcow2: Set the default cache-clean-interval to 10 minutes qcow2: Explicit number replaced by a constant block/qcow2.c | 42 - block/qcow2.h | 19 - docs/qcow2-cache.txt | 43 +++-- include/qemu/units.h | 55 ++ qapi/block-core.json | 3 ++- qemu-options.hx| 11 +--- tests/qemu-iotests/137 | 8 +- tests/qemu-iotests/137.out | 4 ++- 8 files changed, 139 insertions(+), 46 deletions(-) -- 2.17.1
[Qemu-block] [PATCH v11 3/9] qcow2: Make sizes more humanly readable
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 2 +- block/qcow2.h | 9 + 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index c13153735a..d2c07ce9fe 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } } else { if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE, +*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } diff --git a/block/qcow2.h b/block/qcow2.h index 81b844e936..a8d6f757b1 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" +#include "qemu/units.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -43,11 +44,11 @@ /* 8 MB refcount table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_REFTABLE_SIZE 0x80 +#define QCOW_MAX_REFTABLE_SIZE S_8MiB /* 32 MB L1 table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_L1_SIZE 0x200 +#define QCOW_MAX_L1_SIZE S_32MiB /* Allow for an average of 1k per snapshot table entry, should be plenty of * space for snapshot names and IDs */ @@ -75,9 +76,9 @@ /* Whichever is more */ #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */ -#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */ +#define DEFAULT_L2_CACHE_SIZE S_1MiB -#define DEFAULT_CLUSTER_SIZE 65536 +#define DEFAULT_CLUSTER_SIZE S_64KiB #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" -- 2.17.1
[Qemu-block] [PATCH v11 1/9] qcow2: Options' documentation fixes
Signed-off-by: Leonid Bloch --- docs/qcow2-cache.txt | 21 ++--- qemu-options.hx | 9 ++--- 2 files changed, 20 insertions(+), 10 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 8a09a5cc5f..7e28b41bd3 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -79,14 +79,14 @@ Choosing the right cache sizes In order to choose the cache sizes we need to know how they relate to the amount of allocated space. -The amount of virtual disk that can be mapped by the L2 and refcount +The part of the virtual disk that can be mapped by the L2 and refcount caches (in bytes) is: disk_size = l2_cache_size * cluster_size / 8 disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits With the default values for cluster_size (64KB) and refcount_bits -(16), that is +(16), this becomes: disk_size = l2_cache_size * 8192 disk_size = refcount_cache_size * 32768 @@ -97,12 +97,16 @@ need: l2_cache_size = disk_size_GB * 131072 refcount_cache_size = disk_size_GB * 32768 -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount -cache of 256KB (262144 bytes), so using the formulas we've just seen -we have +For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual +image size (given that the default cluster size is used): - 1048576 / 131072 = 8 GB of virtual disk covered by that cache -262144 / 32768 = 8 GB + 8 GB / 8192 = 1 MB + +The refcount cache is 4 times the cluster size by default. With the default +cluster size of 64 KB, it is 256 KB (262144 bytes). This is sufficient for +8 GB of image size: + + 262144 * 32768 = 8 GB How to configure the cache sizes @@ -130,6 +134,9 @@ There are a few things that need to be taken into account: memory as possible to the L2 cache before increasing the refcount cache size. + - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size" + can be set simultaneously. + Unlike L2 tables, refcount blocks are not used during normal I/O but only during allocations and internal snapshots. In most cases they are accessed sequentially (even during random guest I/O) so increasing the diff --git a/qemu-options.hx b/qemu-options.hx index a642ad297f..2db6247eff 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -732,15 +732,18 @@ image file) @item cache-size The maximum total size of the L2 table and refcount block caches in bytes -(default: 1048576 bytes or 8 clusters, whichever is larger) +(default: the sum of l2-cache-size and refcount-cache-size) @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: 4/5 of the total cache size) +(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever +is larger; otherwise, as large as possible or needed within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -(default: 1/5 of the total cache size) +(default: 4 times the cluster size; or if cache-size is specified, the part of +it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -- 2.17.1
[Qemu-block] [PATCH v11 5/9] qcow2: Assign the L2 cache relatively to the image size
Sufficient L2 cache can noticeably improve the performance when using large images with frequent I/O. Previously, unless 'cache-size' was specified and was large enough, the L2 cache was set to a certain size without taking the virtual image size into account. Now, the L2 cache assignment is aware of the virtual size of the image, and will cover the entire image, unless the cache size needed for that is larger than a certain maximum. This maximum is set to 1 MB by default (enough to cover an 8 GB image with the default cluster size) but can be increased or decreased using the 'l2-cache-size' option. This option was previously documented as the *maximum* L2 cache size, and this patch makes it behave as such, instead of as a constant size. Also, the existing option 'cache-size' can limit the sum of both L2 and refcount caches, as previously. Signed-off-by: Leonid Bloch --- block/qcow2.c | 21 + block/qcow2.h | 4 +--- docs/qcow2-cache.txt | 15 ++- qemu-options.hx| 6 +++--- tests/qemu-iotests/137 | 8 +++- tests/qemu-iotests/137.out | 4 +++- 6 files changed, 33 insertions(+), 25 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index cd0053b6ee..589f6c1b1c 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, uint64_t *refcount_cache_size, Error **errp) { BDRVQcow2State *s = bs->opaque; -uint64_t combined_cache_size; +uint64_t combined_cache_size, l2_cache_max_setting; bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set; int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size; +uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; +uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE); l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE); refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE); combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0); -*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0); +l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, + DEFAULT_L2_CACHE_MAX_SIZE); *refcount_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0); *l2_cache_entry_size = qemu_opt_get_size( opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size); +*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting); + if (combined_cache_size_set) { if (l2_cache_size_set && refcount_cache_size_set) { error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE " and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set " "at the same time"); return; -} else if (*l2_cache_size > combined_cache_size) { +} else if (l2_cache_size_set && + (l2_cache_max_setting > combined_cache_size)) { error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed " QCOW2_OPT_CACHE_SIZE); return; @@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } else if (refcount_cache_size_set) { *l2_cache_size = combined_cache_size - *refcount_cache_size; } else { -uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); - /* Assign as much memory as possible to the L2 cache, and * use the remainder for the refcount cache */ if (combined_cache_size >= max_l2_cache + min_refcount_cache) { @@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -} else { -if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, - (uint64_t)DEFAULT_L2_CACHE_CLUSTERS - * s->cluster_size); -} } /* l2_cache_size and refcount_cache_size are ensured to have at least * their minimum values in qcow2_update_options_prepare() */ diff --git a/block/qcow2.h b/block/qcow2.h index a8d6f757b1..2f8c1fd15c 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,9 +74,7 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -/* Whichever is more */ -#define DEFAULT_L2_CACHE_CLUSTE
[Qemu-block] [PATCH v11 4/9] qcow2: Avoid duplication in setting the refcount cache size
The refcount cache size does not need to be set to its minimum value in read_cache_sizes(), as it is set to at least its minimum value in qcow2_update_options_prepare(). Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index d2c07ce9fe..cd0053b6ee 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } -if (!refcount_cache_size_set) { -*refcount_cache_size = min_refcount_cache; -} } +/* l2_cache_size and refcount_cache_size are ensured to have at least + * their minimum values in qcow2_update_options_prepare() */ if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) || *l2_cache_entry_size > s->cluster_size || -- 2.17.1
[Qemu-block] [PATCH v11 2/9] include: Add a lookup table of sizes
Adding a lookup table for the powers of two, with the appropriate size prefixes. This is needed when a size has to be stringified, in which case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))' string. Powers of two are used very often for sizes, so such a table will also make it easier and more intuitive to write them. This table is generatred using the following AWK script: BEGIN { suffix="KMGTPE"; for(i=10; i<64; i++) { val=2**i; s=substr(suffix, int(i/10), 1); n=2**(i%10); pad=21-int(log(n)/log(10)); printf("#define S_%d%siB %*d\n", n, s, pad, val); } } Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- include/qemu/units.h | 55 1 file changed, 55 insertions(+) diff --git a/include/qemu/units.h b/include/qemu/units.h index 692db3fbb2..68a7758650 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,4 +17,59 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) +#define S_1KiB 1024 +#define S_2KiB 2048 +#define S_4KiB 4096 +#define S_8KiB 8192 +#define S_16KiB16384 +#define S_32KiB32768 +#define S_64KiB65536 +#define S_128KiB 131072 +#define S_256KiB 262144 +#define S_512KiB 524288 +#define S_1MiB 1048576 +#define S_2MiB 2097152 +#define S_4MiB 4194304 +#define S_8MiB 8388608 +#define S_16MiB 16777216 +#define S_32MiB 33554432 +#define S_64MiB 67108864 +#define S_128MiB 134217728 +#define S_256MiB 268435456 +#define S_512MiB 536870912 +#define S_1GiB1073741824 +#define S_2GiB2147483648 +#define S_4GiB4294967296 +#define S_8GiB8589934592 +#define S_16GiB 17179869184 +#define S_32GiB 34359738368 +#define S_64GiB 68719476736 +#define S_128GiB137438953472 +#define S_256GiB274877906944 +#define S_512GiB549755813888 +#define S_1TiB 1099511627776 +#define S_2TiB 219902322 +#define S_4TiB 4398046511104 +#define S_8TiB 8796093022208 +#define S_16TiB 17592186044416 +#define S_32TiB 35184372088832 +#define S_64TiB 70368744177664 +#define S_128TiB 140737488355328 +#define S_256TiB 281474976710656 +#define S_512TiB 562949953421312 +#define S_1PiB 1125899906842624 +#define S_2PiB 2251799813685248 +#define S_4PiB 4503599627370496 +#define S_8PiB 9007199254740992 +#define S_16PiB18014398509481984 +#define S_32PiB36028797018963968 +#define S_64PiB72057594037927936 +#define S_128PiB 144115188075855872 +#define S_256PiB 288230376151711744 +#define S_512PiB 576460752303423488 +#define S_1EiB 1152921504606846976 +#define S_2EiB 2305843009213693952 +#define S_4EiB 4611686018427387904 +#define S_8EiB 9223372036854775808 + #endif -- 2.17.1
[Qemu-block] [PATCH v11 6/9] qcow2: Increase the default upper limit on the L2 cache size
The upper limit on the L2 cache size is increased from 1 MB to 32 MB on Linux platforms, and to 8 MB on other platforms (this difference is caused by the ability to set intervals for cache cleaning on Linux platforms only). This is done in order to allow default full coverage with the L2 cache for images of up to 256 GB in size (was 8 GB). Note, that only the needed amount to cover the full image is allocated. The value which is changed here is just the upper limit on the L2 cache size, beyond which it will not grow, even if the size of the image will require it to. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.h| 6 +- docs/qcow2-cache.txt | 15 +-- qemu-options.hx | 6 +++--- 3 files changed, 17 insertions(+), 10 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index 2f8c1fd15c..0f0e3534bf 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,7 +74,11 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB +#ifdef CONFIG_LINUX +#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB +#else +#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB +#endif #define DEFAULT_CLUSTER_SIZE S_64KiB diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 750447ea4f..1fcc0658b2 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -125,12 +125,15 @@ There are a few things that need to be taken into account: - Both caches must have a size that is a multiple of the cluster size (or the cache entry size: see "Using smaller cache sizes" below). - - The maximum L2 cache size is 1 MB by default (enough for full coverage - of 8 GB images, with the default cluster size). This value can be - modified using the "l2-cache-size" option. QEMU will not use more memory - than needed to hold all of the image's L2 tables, regardless of this max. - value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see - below). + - The maximum L2 cache size is 32 MB by default on Linux platforms (enough + for full coverage of 256 GB images, with the default cluster size). This + value can be modified using the "l2-cache-size" option. QEMU will not use + more memory than needed to hold all of the image's L2 tables, regardless + of this max. value. + On non-Linux platforms the maximal value is smaller by default (8 MB) and + this difference stems from the fact that on Linux the cache can be cleared + periodically if needed, using the "cache-clean-interval" option (see below). + The minimal L2 cache size is 2 clusters (or 2 cache entries, see below). - The default (and minimum) refcount cache size is 4 clusters. diff --git a/qemu-options.hx b/qemu-options.hx index 6eef0f5651..14aee78c6c 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -736,9 +736,9 @@ The maximum total size of the L2 table and refcount block caches in bytes @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: if cache-size is not specified - 1M; otherwise, as large as possible -within the cache-size, while permitting the requested or the minimal refcount -cache size) +(default: if cache-size is not specified - 32M on Linux platforms, and 8M on +non-Linux platforms; otherwise, as large as possible within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -- 2.17.1
[Qemu-block] [PATCH v11 9/9] qcow2: Explicit number replaced by a constant
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 3c1859f9cc..983d4361d7 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options, /* 2^(s->refcount_order - 3) is the refcount width in bytes */ s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); s->refcount_block_size = 1 << s->refcount_block_bits; -bs->total_sectors = header.size / 512; +bs->total_sectors = header.size / BDRV_SECTOR_SIZE; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; s->cluster_offset_mask = (1LL << s->csize_shift) - 1; @@ -3450,7 +3450,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, goto fail; } -old_length = bs->total_sectors * 512; +old_length = bs->total_sectors * BDRV_SECTOR_SIZE; new_l1_size = size_to_l1(s, offset); if (offset < old_length) { -- 2.17.1
[Qemu-block] [PATCH v11 7/9] qcow2: Resize the cache upon image resizing
The caches are now recalculated upon image resizing. This is done because the new default behavior of assigning L2 cache relatively to the image size, implies that the cache will be adapted accordingly after an image resize. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index 589f6c1b1c..c68f896c66 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3418,6 +3418,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, uint64_t old_length; int64_t new_l1_size; int ret; +QDict *options; if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA && prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL) @@ -3642,6 +3643,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } } +bs->total_sectors = offset / BDRV_SECTOR_SIZE; + /* write updated header.size */ offset = cpu_to_be64(offset); ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size), @@ -3652,6 +3655,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } s->l1_vm_state_index = new_l1_size; + +/* Update cache sizes */ +options = qdict_clone_shallow(bs->options); +ret = qcow2_update_options(bs, options, s->flags, errp); +if (ret < 0) { +goto fail; +} ret = 0; fail: qemu_co_mutex_unlock(>lock); -- 2.17.1
[Qemu-block] [PATCH v11 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
The default cache-clean-interval is set to 10 minutes, in order to lower the overhead of the qcow2 caches (before the default was 0, i.e. disabled). * For non-Linux platforms the default is kept at 0, because cache-clean-interval is not supported there yet. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c| 2 +- block/qcow2.h| 4 +++- docs/qcow2-cache.txt | 4 ++-- qapi/block-core.json | 3 ++- qemu-options.hx | 2 +- 5 files changed, 9 insertions(+), 6 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index c68f896c66..3c1859f9cc 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState *bs, /* New interval for cache cleanup timer */ r->cache_clean_interval = qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, -s->cache_clean_interval); +DEFAULT_CACHE_CLEAN_INTERVAL); #ifndef CONFIG_LINUX if (r->cache_clean_interval != 0) { error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL diff --git a/block/qcow2.h b/block/qcow2.h index 0f0e3534bf..ba430316b9 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -76,13 +76,15 @@ #ifdef CONFIG_LINUX #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB +#define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ #else #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB +/* Cache clean interval is currently available only on Linux, so must be 0 */ +#define DEFAULT_CACHE_CLEAN_INTERVAL 0 #endif #define DEFAULT_CLUSTER_SIZE S_64KiB - #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot" diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 1fcc0658b2..59358b816f 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -210,8 +210,8 @@ This example removes all unused cache entries every 15 minutes: -drive file=hd.qcow2,cache-clean-interval=900 -If unset, the default value for this parameter is 0 and it disables -this feature. +If unset, the default value for this parameter is 600. Setting it to 0 +disables this feature. Note that this functionality currently relies on the MADV_DONTNEED argument for madvise() to actually free the memory. This is a diff --git a/qapi/block-core.json b/qapi/block-core.json index 4c7a37afdc..08c27b9af7 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2827,7 +2827,8 @@ # # @cache-clean-interval: clean unused entries in the L2 and refcount # caches. The interval is in seconds. The default value -# is 0 and it disables this feature (since 2.5) +# is 600, and 0 disables this feature. (since 2.5) +# # @encrypt: Image decryption options. Mandatory for # encrypted images, except when doing a metadata-only # probe of the image. (since 2.10) diff --git a/qemu-options.hx b/qemu-options.hx index 14aee78c6c..52d9d9f06d 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -747,7 +747,7 @@ it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -The default value is 0 and it disables this feature. +The default value is 600. Setting it to 0 disables this feature. @item pass-discard-request Whether discard requests to the qcow2 device should be forwarded to the data -- 2.17.1
Re: [Qemu-block] [PATCH v10 3/9] qcow2: Make sizes more humanly readable
On 9/24/18 7:59 PM, Eric Blake wrote: On 9/21/18 12:23 PM, Leonid Bloch wrote: Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 2 +- block/qcow2.h | 9 + 2 files changed, 6 insertions(+), 5 deletions(-) +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" +#include "qemu/units.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -43,11 +44,11 @@ /* 8 MB refcount table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_REFTABLE_SIZE 0x80 +#define QCOW_MAX_REFTABLE_SIZE S_8MiB Why not (8 * MiB)? Where does this get stringified that you have to introduce a new macro? Can we instead fix the stringification location to call a convenience helper that converts arbitrary values into convenient strings at runtime, rather than having to worry about a separate macro just for stringification purposes? - Why not (8 * MiB)? Because we have the lookup table from the previous patch. - Why do we need this table? Because some values (not this one specifically, but for example DEFAULT_CLUSTER_SIZE) get stringified at compile time (not during runtime), and for these values the only two options are to use such a table for convenient writing, or to write the number of bytes explicitly in the definition. Again, while the necessity for such a table comes from stringification, it has an added value of short and clear size definitions. /* 32 MB L1 table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_L1_SIZE 0x200 +#define QCOW_MAX_L1_SIZE S_32MiB Again, why not (32 * MiB)? Same as above. /* Allow for an average of 1k per snapshot table entry, should be plenty of * space for snapshot names and IDs */ @@ -75,9 +76,9 @@ /* Whichever is more */ #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */ -#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */ +#define DEFAULT_L2_CACHE_SIZE S_1MiB and (1 * MiB) Same as above. -#define DEFAULT_CLUSTER_SIZE 65536 +#define DEFAULT_CLUSTER_SIZE S_64KiB and (64 * KiB) Because: block/qcow2.c 4726:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE) (and similar in some other files). Leonid.
Re: [Qemu-block] [PATCH v10 1/9] qcow2: Options' documentation fixes
On 9/24/18 6:04 PM, Alberto Garcia wrote: > On Fri 21 Sep 2018 07:23:02 PM CEST, Leonid Bloch wrote: >> Signed-off-by: Leonid Bloch >> --- >> docs/qcow2-cache.txt | 20 +--- >> qemu-options.hx | 9 ++--- >> 2 files changed, 19 insertions(+), 10 deletions(-) >> >> diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt >> index 8a09a5cc5f..013991e21c 100644 >> --- a/docs/qcow2-cache.txt >> +++ b/docs/qcow2-cache.txt >> @@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be >> configured. >> Choosing the right cache sizes >> -- >> In order to choose the cache sizes we need to know how they relate to >> -the amount of allocated space. >> +the amount of the allocated space. > > I'm not a native English speaker, but the current version sounds correct > to me. Why do you need to add an article there? :-? I'm not a native speaker as well, but "the" seems to be needed there: - Which allocated space? - __The__ allocated space! But I really do not insist on this one - it's clear enough. :) > >> +For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual >> +image size (given that the default cluster size is used): >> >> - 1048576 / 131072 = 8 GB of virtual disk covered by that cache >> -262144 / 32768 = 8 GB >> + 8 GB / 8192 = 1 MB >> + >> +A default refcount cache is 4 times the cluster size, which defaults to >> +256 KB (262144 bytes). This is sufficient for 8 GB of image size: > > "A default ... which defaults to" sounds a bit strange I think. Indeed. I'll reword. Thanks. Leonid. > >> --- a/qemu-options.hx >> +++ b/qemu-options.hx >> @@ -742,15 +742,18 @@ image file) >> >> @item cache-size >> The maximum total size of the L2 table and refcount block caches in bytes >> -(default: 1048576 bytes or 8 clusters, whichever is larger) >> +(default: the sum of l2-cache-size and refcount-cache-size) > > Oh, so this had been wrong all the time? Good that you saw it! > > Berto >
Re: [Qemu-block] [PATCH v10 2/9] include: Add a lookup table of sizes
On 9/24/18 5:09 PM, Eric Blake wrote: > On 9/21/18 12:23 PM, Leonid Bloch wrote: >> Adding a lookup table for the powers of two, with the appropriate size >> prefixes. This is needed when a size has to be stringified, in which >> case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))' >> string. Powers of two are used very often for sizes, so such a table >> will also make it easier and more intuitive to write them. > > Would it be better to provide a generic util function that takes an > arbitrary runtime value and converts it to a human-readable form? > The problem is that the srtingification happens at compile time, and the literal string is written to qcow2.o (the default value). From here comes the need for this table. And the added benefit of it are more concise size notations for power-of-two sizes. Leonid.
[Qemu-block] [PATCH v10 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
The default cache-clean-interval is set to 10 minutes, in order to lower the overhead of the qcow2 caches (before the default was 0, i.e. disabled). * For non-Linux platforms the default is kept at 0, because cache-clean-interval is not supported there yet. Signed-off-by: Leonid Bloch --- block/qcow2.c| 2 +- block/qcow2.h| 4 +++- docs/qcow2-cache.txt | 4 ++-- qapi/block-core.json | 3 ++- qemu-options.hx | 2 +- 5 files changed, 9 insertions(+), 6 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 1445cd5360..f885afa0ed 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState *bs, /* New interval for cache cleanup timer */ r->cache_clean_interval = qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, -s->cache_clean_interval); +DEFAULT_CACHE_CLEAN_INTERVAL); #ifndef CONFIG_LINUX if (r->cache_clean_interval != 0) { error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL diff --git a/block/qcow2.h b/block/qcow2.h index 0f0e3534bf..ba430316b9 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -76,13 +76,15 @@ #ifdef CONFIG_LINUX #define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB +#define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ #else #define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB +/* Cache clean interval is currently available only on Linux, so must be 0 */ +#define DEFAULT_CACHE_CLEAN_INTERVAL 0 #endif #define DEFAULT_CLUSTER_SIZE S_64KiB - #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" #define QCOW2_OPT_DISCARD_SNAPSHOT "pass-discard-snapshot" diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 5965d3d094..15ae797931 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -209,8 +209,8 @@ This example removes all unused cache entries every 15 minutes: -drive file=hd.qcow2,cache-clean-interval=900 -If unset, the default value for this parameter is 0 and it disables -this feature. +If unset, the default value for this parameter is 600. Setting it to 0 +disables this feature. Note that this functionality currently relies on the MADV_DONTNEED argument for madvise() to actually free the memory. This is a diff --git a/qapi/block-core.json b/qapi/block-core.json index 4c7a37afdc..08c27b9af7 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2827,7 +2827,8 @@ # # @cache-clean-interval: clean unused entries in the L2 and refcount # caches. The interval is in seconds. The default value -# is 0 and it disables this feature (since 2.5) +# is 600, and 0 disables this feature. (since 2.5) +# # @encrypt: Image decryption options. Mandatory for # encrypted images, except when doing a metadata-only # probe of the image. (since 2.10) diff --git a/qemu-options.hx b/qemu-options.hx index d5f4bcadd4..2975fdf9f8 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -757,7 +757,7 @@ it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -The default value is 0 and it disables this feature. +The default value is 600. Setting it to 0 disables this feature. @item pass-discard-request Whether discard requests to the qcow2 device should be forwarded to the data -- 2.17.1
[Qemu-block] [PATCH v10 6/9] qcow2: Increase the default upper limit on the L2 cache size
The upper limit on the L2 cache size is increased from 1 MB to 32 MB on Linux platforms, and to 8 MB on other platforms (this difference is caused by the ability to set intervals for cache cleaning on Linux platforms only). This is done in order to allow default full coverage with the L2 cache for images of up to 256 GB in size (was 8 GB). Note, that only the needed amount to cover the full image is allocated. The value which is changed here is just the upper limit on the L2 cache size, beyond which it will not grow, even if the size of the image will require it to. Signed-off-by: Leonid Bloch --- block/qcow2.h| 6 +- docs/qcow2-cache.txt | 15 +-- qemu-options.hx | 6 +++--- 3 files changed, 17 insertions(+), 10 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index 2f8c1fd15c..0f0e3534bf 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,7 +74,11 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB +#ifdef CONFIG_LINUX +#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB +#else +#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB +#endif #define DEFAULT_CLUSTER_SIZE S_64KiB diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index c84cd69cc7..5965d3d094 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -124,12 +124,15 @@ There are a few things that need to be taken into account: - Both caches must have a size that is a multiple of the cluster size (or the cache entry size: see "Using smaller cache sizes" below). - - The maximum L2 cache size is 1 MB by default (enough for full coverage - of 8 GB images, with the default cluster size). This value can be - modified using the "l2-cache-size" option. QEMU will not use more memory - than needed to hold all of the image's L2 tables, regardless of this max. - value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see - below). + - The maximum L2 cache size is 32 MB by default on Linux platforms (enough + for full coverage of 256 GB images, with the default cluster size). This + value can be modified using the "l2-cache-size" option. QEMU will not use + more memory than needed to hold all of the image's L2 tables, regardless + of this max. value. + On non-Linux platforms the maximal value is smaller by default (8 MB) and + this difference stems from the fact that on Linux the cache can be cleared + periodically if needed, using the "cache-clean-interval" option (see below). + The minimal L2 cache size is 2 clusters (or 2 cache entries, see below). - The default (and minimum) refcount cache size is 4 clusters. diff --git a/qemu-options.hx b/qemu-options.hx index f9fe43a4dc..d5f4bcadd4 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -746,9 +746,9 @@ The maximum total size of the L2 table and refcount block caches in bytes @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: if cache-size is not specified - 1M; otherwise, as large as possible -within the cache-size, while permitting the requested or the minimal refcount -cache size) +(default: if cache-size is not specified - 32M on Linux platforms, and 8M on +non-Linux platforms; otherwise, as large as possible within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -- 2.17.1
[Qemu-block] [PATCH v10 7/9] qcow2: Resize the cache upon image resizing
The caches are now recalculated upon image resizing. This is done because the new default behavior of assigning L2 cache relatively to the image size, implies that the cache will be adapted accordingly after an image resize. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index 01c39c56c0..1445cd5360 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3418,6 +3418,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, uint64_t old_length; int64_t new_l1_size; int ret; +QDict *options; if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA && prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL) @@ -3642,6 +3643,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } } +bs->total_sectors = offset / BDRV_SECTOR_SIZE; + /* write updated header.size */ offset = cpu_to_be64(offset); ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size), @@ -3652,6 +3655,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } s->l1_vm_state_index = new_l1_size; + +/* Update cache sizes */ +options = qdict_clone_shallow(bs->options); +ret = qcow2_update_options(bs, options, s->flags, errp); +if (ret < 0) { +goto fail; +} ret = 0; fail: qemu_co_mutex_unlock(>lock); -- 2.17.1
[Qemu-block] [PATCH v10 0/9] Take the image size into account when allocating the L2 cache
This series makes the qcow2 L2 cache assignment aware of the image size, with the intention for it to cover the entire image. The importance of this change is in noticeable performance improvement, especially with heavy random I/O. The memory overhead is not big in most cases, as only 1 MB of cache for every 8 GB of image size is used. For cases with very large images and/or small cluster sizes, or systems with limited RAM resources, there is an upper limit on the default L2 cache: 32 MB for Linux systems, and 8 MB for non-Linux systems (the difference is caused by the fact that it is currently impossible to perform scheduled cache cleaning on non-Linux systems). To modify this upper limit one can use the already existing 'l2-cache-size' and 'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size', as it was documented as the *maximum* L2 cache size, but in practice behaved as the absolute size. To compensate the memory overhead which may (but not necesarily will) be increased following this behavior, the default cache-clean-interval is set to 10 minutes by default (was disabled by default before). The L2 cache is also resized accordingly, by default, if the image is resized. Additionally, few minor changes are made (refactoring and documentation fixes). Differences from v1: * .gitignore modification patch removed (unneeded). * The grammar fix in conflicting cache sizing patch removed (merged). * The update total_sectors when resizing patch squashed with the resizing patch. * L2 cache is now capped at 32 MB. * The default cache-clean-interval is set to 30 seconds. Differences from v2: * Made it clear in the documentation that setting cache-clean-interval to 0 disables this feature. Differences from v3: * The default cache-clean-interval is set to 10 minutes instead of 30 seconds before. * Commit message changes to better explain the patches. * Some refactoring. Differences from v4: * Refactoring. * Documentation and commit message fixes. Differences from v5: * A patch with cosmetic changes is split from the main patch * A patch for avoiding duplication in refcount cache size assignment is split from the main patch. * In the main patch the cap on the L2 cache size is set to only 1 MB (in order to be close to the previous behavior) and a separate patch sets it to 32 MB. * Changes in the documentation fixes patch [1/8]. Differences from v6: * A patch is added to make the defined sizes more humanly readable. Differences from v7: * Documentation and commit message changes. * Fix: when cache-size is specified and is large enough, with the L2 and refcount caches not set, the L2 cache will be as large as needed to cover the entire image, and not as the default max. size. (Return to the previous behavior in this part, which was correct). * Cosmetic changes patch not used. Differences from v8: * Made a lookup table for power-of-two sizes. This will enable writing pretty size values (most of the ones used are powers of two) while allowing correct stringification of these values. * Made the max. L2 cache size 8 MB on non-Linux systems. * Returned the test for too large L2 cache size. * Minor documentation fixes. Differences from v9: * DEFAULT_CACHE_CLEAN_INTERVAL is set to 0 for non-Linux platforms in order to avoid an error, since the cache clean interval feature is currently available only on Linux. Leonid Bloch (9): qcow2: Options' documentation fixes include: Add a lookup table of sizes qcow2: Make sizes more humanly readable qcow2: Avoid duplication in setting the refcount cache size qcow2: Assign the L2 cache relatively to the image size qcow2: Increase the default upper limit on the L2 cache size qcow2: Resize the cache upon image resizing qcow2: Set the default cache-clean-interval to 10 minutes qcow2: Explicit number replaced by a constant block/qcow2.c | 42 - block/qcow2.h | 19 - docs/qcow2-cache.txt | 42 +++-- include/qemu/units.h | 55 ++ qapi/block-core.json | 3 ++- qemu-options.hx| 11 +--- tests/qemu-iotests/137 | 8 +- tests/qemu-iotests/137.out | 4 ++- 8 files changed, 138 insertions(+), 46 deletions(-) -- 2.17.1
[Qemu-block] [PATCH v10 4/9] qcow2: Avoid duplication in setting the refcount cache size
The refcount cache size does not need to be set to its minimum value in read_cache_sizes(), as it is set to at least its minimum value in qcow2_update_options_prepare(). Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 67cc82f0b9..7949d15fc6 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } -if (!refcount_cache_size_set) { -*refcount_cache_size = min_refcount_cache; -} } +/* l2_cache_size and refcount_cache_size are ensured to have at least + * their minimum values in qcow2_update_options_prepare() */ if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) || *l2_cache_entry_size > s->cluster_size || -- 2.17.1
[Qemu-block] [PATCH v10 5/9] qcow2: Assign the L2 cache relatively to the image size
Sufficient L2 cache can noticeably improve the performance when using large images with frequent I/O. Previously, unless 'cache-size' was specified and was large enough, the L2 cache was set to a certain size without taking the virtual image size into account. Now, the L2 cache assignment is aware of the virtual size of the image, and will cover the entire image, unless the cache size needed for that is larger than a certain maximum. This maximum is set to 1 MB by default (enough to cover an 8 GB image with the default cluster size) but can be increased or decreased using the 'l2-cache-size' option. This option was previously documented as the *maximum* L2 cache size, and this patch makes it behave as such, instead of as a constant size. Also, the existing option 'cache-size' can limit the sum of both L2 and refcount caches, as previously. Signed-off-by: Leonid Bloch --- block/qcow2.c | 21 + block/qcow2.h | 4 +--- docs/qcow2-cache.txt | 15 ++- qemu-options.hx| 6 +++--- tests/qemu-iotests/137 | 8 +++- tests/qemu-iotests/137.out | 4 +++- 6 files changed, 33 insertions(+), 25 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 7949d15fc6..01c39c56c0 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, uint64_t *refcount_cache_size, Error **errp) { BDRVQcow2State *s = bs->opaque; -uint64_t combined_cache_size; +uint64_t combined_cache_size, l2_cache_max_setting; bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set; int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size; +uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; +uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE); l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE); refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE); combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0); -*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0); +l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, + DEFAULT_L2_CACHE_MAX_SIZE); *refcount_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0); *l2_cache_entry_size = qemu_opt_get_size( opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size); +*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting); + if (combined_cache_size_set) { if (l2_cache_size_set && refcount_cache_size_set) { error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE " and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set " "at the same time"); return; -} else if (*l2_cache_size > combined_cache_size) { +} else if (l2_cache_size_set && + (l2_cache_max_setting > combined_cache_size)) { error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed " QCOW2_OPT_CACHE_SIZE); return; @@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } else if (refcount_cache_size_set) { *l2_cache_size = combined_cache_size - *refcount_cache_size; } else { -uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); - /* Assign as much memory as possible to the L2 cache, and * use the remainder for the refcount cache */ if (combined_cache_size >= max_l2_cache + min_refcount_cache) { @@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -} else { -if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, - (uint64_t)DEFAULT_L2_CACHE_CLUSTERS - * s->cluster_size); -} } /* l2_cache_size and refcount_cache_size are ensured to have at least * their minimum values in qcow2_update_options_prepare() */ diff --git a/block/qcow2.h b/block/qcow2.h index a8d6f757b1..2f8c1fd15c 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,9 +74,7 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -/* Whichever is more */ -#define DEFAULT_L2_CACHE_CLUSTE
[Qemu-block] [PATCH v10 9/9] qcow2: Explicit number replaced by a constant
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index f885afa0ed..ffb4a9e4a1 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options, /* 2^(s->refcount_order - 3) is the refcount width in bytes */ s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); s->refcount_block_size = 1 << s->refcount_block_bits; -bs->total_sectors = header.size / 512; +bs->total_sectors = header.size / BDRV_SECTOR_SIZE; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; s->cluster_offset_mask = (1LL << s->csize_shift) - 1; @@ -3450,7 +3450,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, goto fail; } -old_length = bs->total_sectors * 512; +old_length = bs->total_sectors * BDRV_SECTOR_SIZE; new_l1_size = size_to_l1(s, offset); if (offset < old_length) { -- 2.17.1
[Qemu-block] [PATCH v10 2/9] include: Add a lookup table of sizes
Adding a lookup table for the powers of two, with the appropriate size prefixes. This is needed when a size has to be stringified, in which case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))' string. Powers of two are used very often for sizes, so such a table will also make it easier and more intuitive to write them. This table is generatred using the following AWK script: BEGIN { suffix="KMGTPE"; for(i=10; i<64; i++) { val=2**i; s=substr(suffix, int(i/10), 1); n=2**(i%10); pad=21-int(log(n)/log(10)); printf("#define S_%d%siB %*d\n", n, s, pad, val); } } Signed-off-by: Leonid Bloch --- include/qemu/units.h | 55 1 file changed, 55 insertions(+) diff --git a/include/qemu/units.h b/include/qemu/units.h index 692db3fbb2..68a7758650 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,4 +17,59 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) +#define S_1KiB 1024 +#define S_2KiB 2048 +#define S_4KiB 4096 +#define S_8KiB 8192 +#define S_16KiB16384 +#define S_32KiB32768 +#define S_64KiB65536 +#define S_128KiB 131072 +#define S_256KiB 262144 +#define S_512KiB 524288 +#define S_1MiB 1048576 +#define S_2MiB 2097152 +#define S_4MiB 4194304 +#define S_8MiB 8388608 +#define S_16MiB 16777216 +#define S_32MiB 33554432 +#define S_64MiB 67108864 +#define S_128MiB 134217728 +#define S_256MiB 268435456 +#define S_512MiB 536870912 +#define S_1GiB1073741824 +#define S_2GiB2147483648 +#define S_4GiB4294967296 +#define S_8GiB8589934592 +#define S_16GiB 17179869184 +#define S_32GiB 34359738368 +#define S_64GiB 68719476736 +#define S_128GiB137438953472 +#define S_256GiB274877906944 +#define S_512GiB549755813888 +#define S_1TiB 1099511627776 +#define S_2TiB 219902322 +#define S_4TiB 4398046511104 +#define S_8TiB 8796093022208 +#define S_16TiB 17592186044416 +#define S_32TiB 35184372088832 +#define S_64TiB 70368744177664 +#define S_128TiB 140737488355328 +#define S_256TiB 281474976710656 +#define S_512TiB 562949953421312 +#define S_1PiB 1125899906842624 +#define S_2PiB 2251799813685248 +#define S_4PiB 4503599627370496 +#define S_8PiB 9007199254740992 +#define S_16PiB18014398509481984 +#define S_32PiB36028797018963968 +#define S_64PiB72057594037927936 +#define S_128PiB 144115188075855872 +#define S_256PiB 288230376151711744 +#define S_512PiB 576460752303423488 +#define S_1EiB 1152921504606846976 +#define S_2EiB 2305843009213693952 +#define S_4EiB 4611686018427387904 +#define S_8EiB 9223372036854775808 + #endif -- 2.17.1
[Qemu-block] [PATCH v10 1/9] qcow2: Options' documentation fixes
Signed-off-by: Leonid Bloch --- docs/qcow2-cache.txt | 20 +--- qemu-options.hx | 9 ++--- 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 8a09a5cc5f..013991e21c 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured. Choosing the right cache sizes -- In order to choose the cache sizes we need to know how they relate to -the amount of allocated space. +the amount of the allocated space. The amount of virtual disk that can be mapped by the L2 and refcount caches (in bytes) is: @@ -86,7 +86,7 @@ caches (in bytes) is: disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits With the default values for cluster_size (64KB) and refcount_bits -(16), that is +(16), this becomes: disk_size = l2_cache_size * 8192 disk_size = refcount_cache_size * 32768 @@ -97,12 +97,15 @@ need: l2_cache_size = disk_size_GB * 131072 refcount_cache_size = disk_size_GB * 32768 -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount -cache of 256KB (262144 bytes), so using the formulas we've just seen -we have +For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual +image size (given that the default cluster size is used): - 1048576 / 131072 = 8 GB of virtual disk covered by that cache -262144 / 32768 = 8 GB + 8 GB / 8192 = 1 MB + +A default refcount cache is 4 times the cluster size, which defaults to +256 KB (262144 bytes). This is sufficient for 8 GB of image size: + + 262144 * 32768 = 8 GB How to configure the cache sizes @@ -130,6 +133,9 @@ There are a few things that need to be taken into account: memory as possible to the L2 cache before increasing the refcount cache size. + - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size" + can be set simultaneously. + Unlike L2 tables, refcount blocks are not used during normal I/O but only during allocations and internal snapshots. In most cases they are accessed sequentially (even during random guest I/O) so increasing the diff --git a/qemu-options.hx b/qemu-options.hx index 654ef484d9..ab1a3b240e 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -742,15 +742,18 @@ image file) @item cache-size The maximum total size of the L2 table and refcount block caches in bytes -(default: 1048576 bytes or 8 clusters, whichever is larger) +(default: the sum of l2-cache-size and refcount-cache-size) @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: 4/5 of the total cache size) +(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever +is larger; otherwise, as large as possible or needed within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -(default: 1/5 of the total cache size) +(default: 4 times the cluster size; or if cache-size is specified, the part of +it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -- 2.17.1
[Qemu-block] [PATCH v10 3/9] qcow2: Make sizes more humanly readable
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 2 +- block/qcow2.h | 9 + 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index ec9e6238a0..67cc82f0b9 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } } else { if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE, +*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } diff --git a/block/qcow2.h b/block/qcow2.h index 81b844e936..a8d6f757b1 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" +#include "qemu/units.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -43,11 +44,11 @@ /* 8 MB refcount table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_REFTABLE_SIZE 0x80 +#define QCOW_MAX_REFTABLE_SIZE S_8MiB /* 32 MB L1 table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_L1_SIZE 0x200 +#define QCOW_MAX_L1_SIZE S_32MiB /* Allow for an average of 1k per snapshot table entry, should be plenty of * space for snapshot names and IDs */ @@ -75,9 +76,9 @@ /* Whichever is more */ #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */ -#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */ +#define DEFAULT_L2_CACHE_SIZE S_1MiB -#define DEFAULT_CLUSTER_SIZE 65536 +#define DEFAULT_CLUSTER_SIZE S_64KiB #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" -- 2.17.1
Re: [Qemu-block] [PATCH v9 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
On 9/21/18 3:35 PM, Alberto Garcia wrote: On Tue 18 Sep 2018 05:29:22 PM CEST, Leonid Bloch wrote: /* New interval for cache cleanup timer */ r->cache_clean_interval = qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, -s->cache_clean_interval); +DEFAULT_CACHE_CLEAN_INTERVAL); I just realized we're ignoring the old value (s->cache_clean_interval) here. What's the consequence of this? (this was a change made by Kevin Wolf in 5b0959a7d432062dcd740f8065004285b15695fa). #ifndef CONFIG_LINUX if (r->cache_clean_interval != 0) { error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL " not supported on this host"); Another thing that I noticed (see below)... diff --git a/block/qcow2.h b/block/qcow2.h index 0f0e3534bf..8c863897b0 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -82,6 +82,7 @@ #define DEFAULT_CLUSTER_SIZE S_64KiB +#define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ Shouldn't this be set to 0 in non-Linux platforms? Otherwise it will try to set it to 600 by default in all platforms and will complain with the "not supported on this host" error message that I quoted above. You're right! Will fix, thanks. Leonid. Berto
Re: [Qemu-block] [PATCH v9 2/9] include: Add a lookup table of sizes
On 9/21/18 3:17 PM, Alberto Garcia wrote: On Tue 18 Sep 2018 05:29:16 PM CEST, Leonid Bloch wrote: Adding a lookup table for the powers of two, with the appropriate size prefixes. This is needed when a size has to be stringified, in which case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))' string. Powers of two are used very often for sizes, so such a table will also make it easier and more intuitive to write them. I wonder in what cases you want to stringify those literals... if it's something that you want to show the user you either: a) know the value in advance, and then you probably want to show "4 GiB" instead of "4294967296 bytes" Then the value will need to be set in two places at any future change. b) don't know the value in advance (it's a variable), but then you can't use these macros. Am I missing anything? This is needed in some places: $ ag 'stringify\(DEFAULT_CLUSTER_SIZE\)' block/parallels.c 99:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE), block/vdi.c 988:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE) block/qcow2.c 4726:.def_value_str = stringify(DEFAULT_CLUSTER_SIZE) The latter causes failure in test case 137 if defined as (64 * KiB). Of course the values in block/parallels.h and block/vdi.c will need to be changed as well, I'll do it in a separate patch. Besides resolving the cases with the necessity to stringify, It also creates nice shortcuts for all feasible power-of-two sizes, which are very common. Leonid. Berto
[Qemu-block] [PATCH v9 6/9] qcow2: Increase the default upper limit on the L2 cache size
The upper limit on the L2 cache size is increased from 1 MB to 32 MB on Linux platforms, and to 8 MB on other platforms (this difference is caused by the ability to set intervals for cache cleaning on Linux platforms only). This is done in order to allow default full coverage with the L2 cache for images of up to 256 GB in size (was 8 GB). Note, that only the needed amount to cover the full image is allocated. The value which is changed here is just the upper limit on the L2 cache size, beyond which it will not grow, even if the size of the image will require it to. Signed-off-by: Leonid Bloch --- block/qcow2.h| 6 +- docs/qcow2-cache.txt | 15 +-- qemu-options.hx | 6 +++--- 3 files changed, 17 insertions(+), 10 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index 2f8c1fd15c..0f0e3534bf 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,7 +74,11 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -#define DEFAULT_L2_CACHE_MAX_SIZE S_1MiB +#ifdef CONFIG_LINUX +#define DEFAULT_L2_CACHE_MAX_SIZE S_32MiB +#else +#define DEFAULT_L2_CACHE_MAX_SIZE S_8MiB +#endif #define DEFAULT_CLUSTER_SIZE S_64KiB diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index c84cd69cc7..5965d3d094 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -124,12 +124,15 @@ There are a few things that need to be taken into account: - Both caches must have a size that is a multiple of the cluster size (or the cache entry size: see "Using smaller cache sizes" below). - - The maximum L2 cache size is 1 MB by default (enough for full coverage - of 8 GB images, with the default cluster size). This value can be - modified using the "l2-cache-size" option. QEMU will not use more memory - than needed to hold all of the image's L2 tables, regardless of this max. - value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see - below). + - The maximum L2 cache size is 32 MB by default on Linux platforms (enough + for full coverage of 256 GB images, with the default cluster size). This + value can be modified using the "l2-cache-size" option. QEMU will not use + more memory than needed to hold all of the image's L2 tables, regardless + of this max. value. + On non-Linux platforms the maximal value is smaller by default (8 MB) and + this difference stems from the fact that on Linux the cache can be cleared + periodically if needed, using the "cache-clean-interval" option (see below). + The minimal L2 cache size is 2 clusters (or 2 cache entries, see below). - The default (and minimum) refcount cache size is 4 clusters. diff --git a/qemu-options.hx b/qemu-options.hx index f9fe43a4dc..d5f4bcadd4 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -746,9 +746,9 @@ The maximum total size of the L2 table and refcount block caches in bytes @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: if cache-size is not specified - 1M; otherwise, as large as possible -within the cache-size, while permitting the requested or the minimal refcount -cache size) +(default: if cache-size is not specified - 32M on Linux platforms, and 8M on +non-Linux platforms; otherwise, as large as possible within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -- 2.17.1
[Qemu-block] [PATCH v9 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
The default cache-clean-interval is set to 10 minutes, in order to lower the overhead of the qcow2 caches (before the default was 0, i.e. disabled). Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c| 2 +- block/qcow2.h| 1 + docs/qcow2-cache.txt | 4 ++-- qapi/block-core.json | 3 ++- qemu-options.hx | 2 +- 5 files changed, 7 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 1445cd5360..f885afa0ed 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState *bs, /* New interval for cache cleanup timer */ r->cache_clean_interval = qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, -s->cache_clean_interval); +DEFAULT_CACHE_CLEAN_INTERVAL); #ifndef CONFIG_LINUX if (r->cache_clean_interval != 0) { error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL diff --git a/block/qcow2.h b/block/qcow2.h index 0f0e3534bf..8c863897b0 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -82,6 +82,7 @@ #define DEFAULT_CLUSTER_SIZE S_64KiB +#define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 5965d3d094..15ae797931 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -209,8 +209,8 @@ This example removes all unused cache entries every 15 minutes: -drive file=hd.qcow2,cache-clean-interval=900 -If unset, the default value for this parameter is 0 and it disables -this feature. +If unset, the default value for this parameter is 600. Setting it to 0 +disables this feature. Note that this functionality currently relies on the MADV_DONTNEED argument for madvise() to actually free the memory. This is a diff --git a/qapi/block-core.json b/qapi/block-core.json index 4c7a37afdc..08c27b9af7 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2827,7 +2827,8 @@ # # @cache-clean-interval: clean unused entries in the L2 and refcount # caches. The interval is in seconds. The default value -# is 0 and it disables this feature (since 2.5) +# is 600, and 0 disables this feature. (since 2.5) +# # @encrypt: Image decryption options. Mandatory for # encrypted images, except when doing a metadata-only # probe of the image. (since 2.10) diff --git a/qemu-options.hx b/qemu-options.hx index d5f4bcadd4..2975fdf9f8 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -757,7 +757,7 @@ it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -The default value is 0 and it disables this feature. +The default value is 600. Setting it to 0 disables this feature. @item pass-discard-request Whether discard requests to the qcow2 device should be forwarded to the data -- 2.17.1
[Qemu-block] [PATCH v9 0/9] Take the image size into account when allocating the L2 cache
Sorry for taking such a long pause after v8. I had several extremely urgent issues to attend to. This series makes the qcow2 L2 cache assignment aware of the image size, with the intention for it to cover the entire image. The importance of this change is in noticeable performance improvement, especially with heavy random I/O. The memory overhead is not big in most cases, as only 1 MB of cache for every 8 GB of image size is used. For cases with very large images and/or small cluster sizes, or systems with limited RAM resources, there is an upper limit on the default L2 cache: 32 MB for Linux systems, and 8 MB for non-Linux systems (the difference is caused by the fact that it is currently impossible to perform scheduled cache cleaning on non-Linux systems). To modify this upper limit one can use the already existing 'l2-cache-size' and 'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size', as it was documented as the *maximum* L2 cache size, but in practice behaved as the absolute size. To compensate the memory overhead which may (but not necesarily will) be increased following this behavior, the default cache-clean-interval is set to 10 minutes by default (was disabled by default before). The L2 cache is also resized accordingly, by default, if the image is resized. Additionally, few minor changes are made (refactoring and documentation fixes). Differences from v1: * .gitignore modification patch removed (unneeded). * The grammar fix in conflicting cache sizing patch removed (merged). * The update total_sectors when resizing patch squashed with the resizing patch. * L2 cache is now capped at 32 MB. * The default cache-clean-interval is set to 30 seconds. Differences from v2: * Made it clear in the documentation that setting cache-clean-interval to 0 disables this feature. Differences from v3: * The default cache-clean-interval is set to 10 minutes instead of 30 seconds before. * Commit message changes to better explain the patches. * Some refactoring. Differences from v4: * Refactoring. * Documentation and commit message fixes. Differences from v5: * A patch with cosmetic changes is split from the main patch * A patch for avoiding duplication in refcount cache size assignment is split from the main patch. * In the main patch the cap on the L2 cache size is set to only 1 MB (in order to be close to the previous behavior) and a separate patch sets it to 32 MB. * Changes in the documentation fixes patch [1/8]. Differences from v6: * A patch is added to make the defined sizes more humanly readable. Differences from v7: * Documentation and commit message changes. * Fix: when cache-size is specified and is large enough, with the L2 and refcount caches not set, the L2 cache will be as large as needed to cover the entire image, and not as the default max. size. (Return to the previous behavior in this part, which was correct). * Cosmetic changes patch not used. Differences from v8: * Made a lookup table for power-of-two sizes. This will enable writing pretty size values (most of the ones used are powers of two) while allowing correct stringification of these values. * Made the max. L2 cache size 8 MB on non-Linux systems. * Returned the test for too large L2 cache size. * Minor documentation fixes. Leonid Bloch (9): qcow2: Options' documentation fixes include: Add a lookup table of sizes qcow2: Make sizes more humanly readable qcow2: Avoid duplication in setting the refcount cache size qcow2: Assign the L2 cache relatively to the image size qcow2: Increase the default upper limit on the L2 cache size qcow2: Resize the cache upon image resizing qcow2: Set the default cache-clean-interval to 10 minutes qcow2: Explicit number replaced by a constant block/qcow2.c | 42 - block/qcow2.h | 16 ++- docs/qcow2-cache.txt | 42 +++-- include/qemu/units.h | 55 ++ qapi/block-core.json | 3 ++- qemu-options.hx| 11 +--- tests/qemu-iotests/137 | 8 +- tests/qemu-iotests/137.out | 4 ++- 8 files changed, 136 insertions(+), 45 deletions(-) -- 2.17.1
[Qemu-block] [PATCH v9 9/9] qcow2: Explicit number replaced by a constant
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index f885afa0ed..ffb4a9e4a1 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options, /* 2^(s->refcount_order - 3) is the refcount width in bytes */ s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); s->refcount_block_size = 1 << s->refcount_block_bits; -bs->total_sectors = header.size / 512; +bs->total_sectors = header.size / BDRV_SECTOR_SIZE; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; s->cluster_offset_mask = (1LL << s->csize_shift) - 1; @@ -3450,7 +3450,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, goto fail; } -old_length = bs->total_sectors * 512; +old_length = bs->total_sectors * BDRV_SECTOR_SIZE; new_l1_size = size_to_l1(s, offset); if (offset < old_length) { -- 2.17.1
[Qemu-block] [PATCH v9 5/9] qcow2: Assign the L2 cache relatively to the image size
Sufficient L2 cache can noticeably improve the performance when using large images with frequent I/O. Previously, unless 'cache-size' was specified and was large enough, the L2 cache was set to a certain size without taking the virtual image size into account. Now, the L2 cache assignment is aware of the virtual size of the image, and will cover the entire image, unless the cache size needed for that is larger than a certain maximum. This maximum is set to 1 MB by default (enough to cover an 8 GB image with the default cluster size) but can be increased or decreased using the 'l2-cache-size' option. This option was previously documented as the *maximum* L2 cache size, and this patch makes it behave as such, instead of as a constant size. Also, the existing option 'cache-size' can limit the sum of both L2 and refcount caches, as previously. Signed-off-by: Leonid Bloch --- block/qcow2.c | 21 + block/qcow2.h | 4 +--- docs/qcow2-cache.txt | 15 ++- qemu-options.hx| 6 +++--- tests/qemu-iotests/137 | 8 +++- tests/qemu-iotests/137.out | 4 +++- 6 files changed, 33 insertions(+), 25 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 7949d15fc6..01c39c56c0 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, uint64_t *refcount_cache_size, Error **errp) { BDRVQcow2State *s = bs->opaque; -uint64_t combined_cache_size; +uint64_t combined_cache_size, l2_cache_max_setting; bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set; int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size; +uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; +uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE); l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE); refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE); combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0); -*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0); +l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, + DEFAULT_L2_CACHE_MAX_SIZE); *refcount_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0); *l2_cache_entry_size = qemu_opt_get_size( opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size); +*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting); + if (combined_cache_size_set) { if (l2_cache_size_set && refcount_cache_size_set) { error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE " and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set " "at the same time"); return; -} else if (*l2_cache_size > combined_cache_size) { +} else if (l2_cache_size_set && + (l2_cache_max_setting > combined_cache_size)) { error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed " QCOW2_OPT_CACHE_SIZE); return; @@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } else if (refcount_cache_size_set) { *l2_cache_size = combined_cache_size - *refcount_cache_size; } else { -uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); - /* Assign as much memory as possible to the L2 cache, and * use the remainder for the refcount cache */ if (combined_cache_size >= max_l2_cache + min_refcount_cache) { @@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -} else { -if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, - (uint64_t)DEFAULT_L2_CACHE_CLUSTERS - * s->cluster_size); -} } /* l2_cache_size and refcount_cache_size are ensured to have at least * their minimum values in qcow2_update_options_prepare() */ diff --git a/block/qcow2.h b/block/qcow2.h index a8d6f757b1..2f8c1fd15c 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,9 +74,7 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -/* Whichever is more */ -#define DEFAULT_L2_CACHE_CLUSTE
[Qemu-block] [PATCH v9 3/9] qcow2: Make sizes more humanly readable
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 2 +- block/qcow2.h | 9 + 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index ec9e6238a0..67cc82f0b9 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } } else { if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE, +*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } diff --git a/block/qcow2.h b/block/qcow2.h index 81b844e936..a8d6f757b1 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" +#include "qemu/units.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -43,11 +44,11 @@ /* 8 MB refcount table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_REFTABLE_SIZE 0x80 +#define QCOW_MAX_REFTABLE_SIZE S_8MiB /* 32 MB L1 table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_L1_SIZE 0x200 +#define QCOW_MAX_L1_SIZE S_32MiB /* Allow for an average of 1k per snapshot table entry, should be plenty of * space for snapshot names and IDs */ @@ -75,9 +76,9 @@ /* Whichever is more */ #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */ -#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */ +#define DEFAULT_L2_CACHE_SIZE S_1MiB -#define DEFAULT_CLUSTER_SIZE 65536 +#define DEFAULT_CLUSTER_SIZE S_64KiB #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" -- 2.17.1
[Qemu-block] [PATCH v9 2/9] include: Add a lookup table of sizes
Adding a lookup table for the powers of two, with the appropriate size prefixes. This is needed when a size has to be stringified, in which case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))' string. Powers of two are used very often for sizes, so such a table will also make it easier and more intuitive to write them. This table is generatred using the following AWK script: BEGIN { suffix="KMGTPE"; for(i=10; i<64; i++) { val=2**i; s=substr(suffix, int(i/10), 1); n=2**(i%10); pad=21-int(log(n)/log(10)); printf("#define S_%d%siB %*d\n", n, s, pad, val); } } Signed-off-by: Leonid Bloch --- include/qemu/units.h | 55 1 file changed, 55 insertions(+) diff --git a/include/qemu/units.h b/include/qemu/units.h index 692db3fbb2..68a7758650 100644 --- a/include/qemu/units.h +++ b/include/qemu/units.h @@ -17,4 +17,59 @@ #define PiB (INT64_C(1) << 50) #define EiB (INT64_C(1) << 60) +#define S_1KiB 1024 +#define S_2KiB 2048 +#define S_4KiB 4096 +#define S_8KiB 8192 +#define S_16KiB16384 +#define S_32KiB32768 +#define S_64KiB65536 +#define S_128KiB 131072 +#define S_256KiB 262144 +#define S_512KiB 524288 +#define S_1MiB 1048576 +#define S_2MiB 2097152 +#define S_4MiB 4194304 +#define S_8MiB 8388608 +#define S_16MiB 16777216 +#define S_32MiB 33554432 +#define S_64MiB 67108864 +#define S_128MiB 134217728 +#define S_256MiB 268435456 +#define S_512MiB 536870912 +#define S_1GiB1073741824 +#define S_2GiB2147483648 +#define S_4GiB4294967296 +#define S_8GiB8589934592 +#define S_16GiB 17179869184 +#define S_32GiB 34359738368 +#define S_64GiB 68719476736 +#define S_128GiB137438953472 +#define S_256GiB274877906944 +#define S_512GiB549755813888 +#define S_1TiB 1099511627776 +#define S_2TiB 219902322 +#define S_4TiB 4398046511104 +#define S_8TiB 8796093022208 +#define S_16TiB 17592186044416 +#define S_32TiB 35184372088832 +#define S_64TiB 70368744177664 +#define S_128TiB 140737488355328 +#define S_256TiB 281474976710656 +#define S_512TiB 562949953421312 +#define S_1PiB 1125899906842624 +#define S_2PiB 2251799813685248 +#define S_4PiB 4503599627370496 +#define S_8PiB 9007199254740992 +#define S_16PiB18014398509481984 +#define S_32PiB36028797018963968 +#define S_64PiB72057594037927936 +#define S_128PiB 144115188075855872 +#define S_256PiB 288230376151711744 +#define S_512PiB 576460752303423488 +#define S_1EiB 1152921504606846976 +#define S_2EiB 2305843009213693952 +#define S_4EiB 4611686018427387904 +#define S_8EiB 9223372036854775808 + #endif -- 2.17.1
[Qemu-block] [PATCH v9 4/9] qcow2: Avoid duplication in setting the refcount cache size
The refcount cache size does not need to be set to its minimum value in read_cache_sizes(), as it is set to at least its minimum value in qcow2_update_options_prepare(). Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 67cc82f0b9..7949d15fc6 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } -if (!refcount_cache_size_set) { -*refcount_cache_size = min_refcount_cache; -} } +/* l2_cache_size and refcount_cache_size are ensured to have at least + * their minimum values in qcow2_update_options_prepare() */ if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) || *l2_cache_entry_size > s->cluster_size || -- 2.17.1
[Qemu-block] [PATCH v9 7/9] qcow2: Resize the cache upon image resizing
The caches are now recalculated upon image resizing. This is done because the new default behavior of assigning L2 cache relatively to the image size, implies that the cache will be adapted accordingly after an image resize. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index 01c39c56c0..1445cd5360 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3418,6 +3418,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, uint64_t old_length; int64_t new_l1_size; int ret; +QDict *options; if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA && prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL) @@ -3642,6 +3643,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } } +bs->total_sectors = offset / BDRV_SECTOR_SIZE; + /* write updated header.size */ offset = cpu_to_be64(offset); ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size), @@ -3652,6 +3655,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } s->l1_vm_state_index = new_l1_size; + +/* Update cache sizes */ +options = qdict_clone_shallow(bs->options); +ret = qcow2_update_options(bs, options, s->flags, errp); +if (ret < 0) { +goto fail; +} ret = 0; fail: qemu_co_mutex_unlock(>lock); -- 2.17.1
[Qemu-block] [PATCH v9 1/9] qcow2: Options' documentation fixes
Signed-off-by: Leonid Bloch --- docs/qcow2-cache.txt | 20 +--- qemu-options.hx | 9 ++--- 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 8a09a5cc5f..013991e21c 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured. Choosing the right cache sizes -- In order to choose the cache sizes we need to know how they relate to -the amount of allocated space. +the amount of the allocated space. The amount of virtual disk that can be mapped by the L2 and refcount caches (in bytes) is: @@ -86,7 +86,7 @@ caches (in bytes) is: disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits With the default values for cluster_size (64KB) and refcount_bits -(16), that is +(16), this becomes: disk_size = l2_cache_size * 8192 disk_size = refcount_cache_size * 32768 @@ -97,12 +97,15 @@ need: l2_cache_size = disk_size_GB * 131072 refcount_cache_size = disk_size_GB * 32768 -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount -cache of 256KB (262144 bytes), so using the formulas we've just seen -we have +For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual +image size (given that the default cluster size is used): - 1048576 / 131072 = 8 GB of virtual disk covered by that cache -262144 / 32768 = 8 GB + 8 GB / 8192 = 1 MB + +A default refcount cache is 4 times the cluster size, which defaults to +256 KB (262144 bytes). This is sufficient for 8 GB of image size: + + 262144 * 32768 = 8 GB How to configure the cache sizes @@ -130,6 +133,9 @@ There are a few things that need to be taken into account: memory as possible to the L2 cache before increasing the refcount cache size. + - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size" + can be set simultaneously. + Unlike L2 tables, refcount blocks are not used during normal I/O but only during allocations and internal snapshots. In most cases they are accessed sequentially (even during random guest I/O) so increasing the diff --git a/qemu-options.hx b/qemu-options.hx index 654ef484d9..ab1a3b240e 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -742,15 +742,18 @@ image file) @item cache-size The maximum total size of the L2 table and refcount block caches in bytes -(default: 1048576 bytes or 8 clusters, whichever is larger) +(default: the sum of l2-cache-size and refcount-cache-size) @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: 4/5 of the total cache size) +(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever +is larger; otherwise, as large as possible or needed within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -(default: 1/5 of the total cache size) +(default: 4 times the cluster size; or if cache-size is specified, the part of +it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -- 2.17.1
Re: [Qemu-block] [PATCH v8 0/8] Take the image size into account when allocating the L2 cache
Hi Kevin & All, Sorry it took so long to send a new version! I had some very urgent things popping up on several fronts. I will send the new version over this weekend. Thanks for the reminder, and sorry again. Leonid. ___ On 9/10/18 5:33 PM, Kevin Wolf wrote: Hi Leonid, Am 13.08.2018 um 03:07 hat Leonid Bloch geschrieben: This series makes the qcow2 L2 cache assignment aware of the image size, with the intention for it to cover the entire image. The importance of this change is in noticeable performance improvement, especially with heavy random I/O. The memory overhead is not big in most cases, as only 1 MB of cache for every 8 GB of image size is used. For cases with very large images and/or small cluster sizes, or systems with limited RAM resources, there is an upper limit on the default L2 cache: 32 MB. To modify this limit one can use the already existing 'l2-cache-size' and 'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size', as it was documented as the *maximum* L2 cache size, but in practice behaved as the absolute size. To compensate the memory overhead which may be increased following this behavior, the default cache-clean-interval is set to 10 minutes by default (was disabled by default before). The L2 cache is also resized accordingly, by default, if the image is resized. after digging through the most urgent part of my post-vacation backlog, I saw this series again. It's been a while, but I don't think much is missing, so let's get this finished now. :-) We had some discussion in the v7 thread after you posted v8, and you mentioned there that you'd do some changes in a v9 (at first sight, just some test case stuff). Are you still planning to send that v9? Kevin
Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
On 8/14/18 2:44 PM, Kevin Wolf wrote: Am 14.08.2018 um 13:34 hat Leonid Bloch geschrieben: On 8/14/18 11:18 AM, Kevin Wolf wrote: Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben: I don't actually think it's so bad to keep the cache permanently allocated, but I wouldn't object to a lower default for non-Linux hosts either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB) might be more adequate. My typical desktop VMs are larger than 8 GB, but smaller than 32 GB. And for a Windows VM just the OS installation takes above 40 GB. While we probably are not running Windows VMs for our own needs, it is very common that a customer of, for example, some cloud service uses QEMU (unknowingly) for a full-blown Windows. So 100 GB+ images which are quite heavily used is not a rare scenario. 256 GB - yeah, that would be on the higher end. The OS installation is mostly sequential access, though. You only need that much cache when you have completely random I/O across the whole image. Otherwise the LRU based approach of the cache is good enough to keep those tables cached that are actually in use. Sorry, by "OS installation" I meant the installed size of the OS, which should be available for fast and frequent access, not the installation process itself. Obviously for one-time tasks like the installation process it's not worth it, unless one installs all the time, instead of using ready images, for some reason. :) But you never use everything that is present in an OS installation of 40 GB (is it really _that_ huge these days?), and you don't read OS files non-stop. The most frequently used parts of the OS are actually in the guest RAM. Yes, Windows 8.1, with all the desktop bloat - just above 40 GB. :] I did a proper benchmarking indeed only on heavy I/O load, where full cache did show above 50% improvement, although just regular usage felt faster as well, but maybe it's just psychosomatic. :) Leonid. I don't think you'll really notice the difference in qcow2 unless you have a really I/O intensive workload - and that is not usually for OS files, but for user data. For only occasional accesses, the additional 64k for the metadata table wouldn't play a big role. Kevin
Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
On 8/14/18 11:18 AM, Kevin Wolf wrote: Am 13.08.2018 um 18:42 hat Leonid Bloch geschrieben: I don't actually think it's so bad to keep the cache permanently allocated, but I wouldn't object to a lower default for non-Linux hosts either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB) might be more adequate. My typical desktop VMs are larger than 8 GB, but smaller than 32 GB. And for a Windows VM just the OS installation takes above 40 GB. While we probably are not running Windows VMs for our own needs, it is very common that a customer of, for example, some cloud service uses QEMU (unknowingly) for a full-blown Windows. So 100 GB+ images which are quite heavily used is not a rare scenario. 256 GB - yeah, that would be on the higher end. The OS installation is mostly sequential access, though. You only need that much cache when you have completely random I/O across the whole image. Otherwise the LRU based approach of the cache is good enough to keep those tables cached that are actually in use. Sorry, by "OS installation" I meant the installed size of the OS, which should be available for fast and frequent access, not the installation process itself. Obviously for one-time tasks like the installation process it's not worth it, unless one installs all the time, instead of using ready images, for some reason. :) The maximum cache size is maybe for huge databases or indeed random I/O benchmarks, both of which are important to support (on Linux at least), but probably not the most common use case. So 16 MB would indeed be a reasonable default for the *max.* L2 cache now, although below that would be too little, I think. 32 MB - if we want some future-proofing. I think we all agree that 32 MB + cache-clean-interval is okay. It's just that for non-Linux guests, cache-clean-interval doesn't work. However, we probably care less about those large random I/O cases there, so a smaller cache size like 1 MB can do on non-Linux. Kevin
Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
I don't actually think it's so bad to keep the cache permanently allocated, but I wouldn't object to a lower default for non-Linux hosts either. 1 MB may still be a little too low, 4 MB (covers up to 32 GB) might be more adequate. My typical desktop VMs are larger than 8 GB, but smaller than 32 GB. Kevin And for a Windows VM just the OS installation takes above 40 GB. While we probably are not running Windows VMs for our own needs, it is very common that a customer of, for example, some cloud service uses QEMU (unknowingly) for a full-blown Windows. So 100 GB+ images which are quite heavily used is not a rare scenario. 256 GB - yeah, that would be on the higher end. So 16 MB would indeed be a reasonable default for the *max.* L2 cache now, although below that would be too little, I think. 32 MB - if we want some future-proofing. Leonid.
Re: [Qemu-block] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
On 8/13/18 2:33 PM, Kevin Wolf wrote: Am 11.08.2018 um 21:19 hat Leonid Bloch geschrieben: @item refcount-cache-size The maximum size of the refcount block cache in bytes diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137 index 87965625d8..e3fb078588 100755 --- a/tests/qemu-iotests/137 +++ b/tests/qemu-iotests/137 @@ -109,7 +109,6 @@ $QEMU_IO \ -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \ -c "reopen -o cache-size=1M,l2-cache-size=2M" \ -c "reopen -o cache-size=1M,refcount-cache-size=2M" \ --c "reopen -o l2-cache-size=256T" \ The "L2 cache size too big" error can still be tested, but you will need to create an image large enough to allow such a big cache. $ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P $ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T * 32P qcow2 will take 33M - is it OK to create it just for a test? * Is it worth to create a special test scenario, with a separate image creation, just for that case? We're creating larger images than that during tests, so I think this is fine. You don't have to create a new separate test file or anything, just increase the size of the used test image from 64M to 32P or whatever is necessary. OK, sure, will do in v9. Leonid. Kevin
Re: [Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
On August 13, 2018 4:39:35 AM EEST, Max Reitz wrote: >On 2018-08-10 14:00, Alberto Garcia wrote: >> On Fri 10 Aug 2018 08:26:44 AM CEST, Leonid Bloch wrote: >>> The upper limit on the L2 cache size is increased from 1 MB to 32 >MB. >>> This is done in order to allow default full coverage with the L2 >cache >>> for images of up to 256 GB in size (was 8 GB). Note, that only the >>> needed amount to cover the full image is allocated. The value which >is >>> changed here is just the upper limit on the L2 cache size, beyond >which >>> it will not grow, even if the size of the image will require it to. >>> >>> Signed-off-by: Leonid Bloch >> >> Reviewed-by: Alberto Garcia >> >>> -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB) >>> +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB) >> >> The patch looks perfect to me now and I'm fine with this change, but >> this is quite an increase from the previous default value. If anyone >> thinks that this is too aggressive (or too little :)) I'm all ears. > >This is just noise from the sidelines (so nothing too serious), but >anyway, I don't like it very much. > >My first point is that the old limit doesn't mean you can only use 8 GB >qcow2 images. You can use more, you just can't access more than 8 GB >randomly. I know I'm naive, but I think that the number of use cases >where you need random IOPS spread out over more than 8 GB of an image >are limited. > >My second point is that qemu still allocated 128 MB of RAM by default. >Using 1/4th of that for every qcow2 image you attach to the VM seems a >bit much. > >Now it gets a bit complicated. This series makes cache-clean-interval >default to 10 minutes, so it shouldn't be an issue in practice. But >one >thing to note is that this is a Linux-specific feature, so on every >other system, this really means 32 MB per image. (Also, 10 minutes >means that whenever I boot up my VM with a couple of disks with random >accesses all over the images during boot, I might end up using 32 MB >per >image again (for 10 min), even though I don't really need that >performance.) > >Now if we really rely on that cache-clean-interval, why not make it >always cover the whole image by default? I don't really see why we >should now say "256 GB seems reasonable, and 32 MB doesn't sound like >too much, let's go there". (Well, OK, I do see how you end up using 32 >MB as basically a safety margin, where you'd say that anything above it >is just unreasonable.) > >Do we update the limit in a couple of years again because people have >more RAM and larger disks then? (Maybe we do?) > >My personal opinion is this: Most users should be fine with 8 GB of >randomly accessible image space (this may be wrong). Whenever a user >does have an application that uses more than 8 GB, they are probably in >an area where they want to do some performance tuning anyway. >Requiring >them to set l2-cache-full in that case seems reasonable to me. Pushing >the default to 256 GB to me looks a bit like just letting them run into >the problem later. It doesn't solve the issue that you need to do some >performance tuning if you have a bit of a special use case (but maybe >I'm wrong and accessing more than 8 GB randomly is what everybody does >with their VMs). > >(Maybe it's even a good thing to limit it to a smaller number so users >run into the issue sooner than later...) > >OTOH, this change means that everyone on a non-Linux system will have >to >use 32 MB of their RAM per qcow2 image, and everyone on a Linux system >will potentially use it e.g. during boot when you do access a lot >randomly (even though the performance usually is not of utmost >importance then (important, but not extremely so)). But then again, >this will probably only affect a single disk (the one with the OS on >it), so it won't be too bad. > >So my stance is: > >(1) Is it really worth pushing the default to 256 GB if you probably >have to do a bit of performance tuning anyway when you get past 8 GB >random IOPS? I think it's reasonable to ask users to use l2-cache-full >or adjust the cache to their needs. > >(2) For non-Linux systems, this seems to really mean 32 MB of RAM per >qcow2 image. That's 1/4th of default VM RAM. Is that worth it? > >(3) For Linux, I don't like it much either, but that's because I'm >stupid. The fact that if you don't need this much random I/O only your >boot disk may cause a RAM usage spike, and even then it's going to go >down after 10 minutes, is probably enough to justify this change. > > >I suppose my moaning would subside if we only increased the default on >systems th
[Qemu-block] [PATCH v8 1/8] qcow2: Options' documentation fixes
Signed-off-by: Leonid Bloch --- docs/qcow2-cache.txt | 20 +--- qemu-options.hx | 9 ++--- 2 files changed, 19 insertions(+), 10 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 8a09a5cc5f..2326db01b9 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -77,7 +77,7 @@ aforementioned L2 cache, and its size can also be configured. Choosing the right cache sizes -- In order to choose the cache sizes we need to know how they relate to -the amount of allocated space. +the amount of the allocated space. The amount of virtual disk that can be mapped by the L2 and refcount caches (in bytes) is: @@ -86,7 +86,7 @@ caches (in bytes) is: disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits With the default values for cluster_size (64KB) and refcount_bits -(16), that is +(16), this becomes: disk_size = l2_cache_size * 8192 disk_size = refcount_cache_size * 32768 @@ -97,12 +97,15 @@ need: l2_cache_size = disk_size_GB * 131072 refcount_cache_size = disk_size_GB * 32768 -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount -cache of 256KB (262144 bytes), so using the formulas we've just seen -we have +For example, 1MB of L2 cache is needed to cover every 8 GB of the virtual +image size (given that the default cluster size is used): - 1048576 / 131072 = 8 GB of virtual disk covered by that cache -262144 / 32768 = 8 GB + 8 * 131072 = 1 MB + +A default refcount cache is 4 times the cluster size, which defaults to +256 KB (262144 bytes). This is sufficient for 8 GB of image size: + + 262144 / 32768 = 8 GB How to configure the cache sizes @@ -130,6 +133,9 @@ There are a few things that need to be taken into account: memory as possible to the L2 cache before increasing the refcount cache size. + - At most two of "l2-cache-size", "refcount-cache-size", and "cache-size" + can be set simultaneously. + Unlike L2 tables, refcount blocks are not used during normal I/O but only during allocations and internal snapshots. In most cases they are accessed sequentially (even during random guest I/O) so increasing the diff --git a/qemu-options.hx b/qemu-options.hx index b1bf0f485f..f6804758d3 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -752,15 +752,18 @@ image file) @item cache-size The maximum total size of the L2 table and refcount block caches in bytes -(default: 1048576 bytes or 8 clusters, whichever is larger) +(default: the sum of l2-cache-size and refcount-cache-size) @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: 4/5 of the total cache size) +(default: if cache-size is not defined - 1048576 bytes or 8 clusters, whichever +is larger; otherwise, as large as possible or needed within the cache-size, +while permitting the requested or the minimal refcount cache size) @item refcount-cache-size The maximum size of the refcount block cache in bytes -(default: 1/5 of the total cache size) +(default: 4 times the cluster size; or if cache-size is specified, the part of +it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -- 2.17.1
[Qemu-block] [PATCH v8 0/8] Take the image size into account when allocating the L2 cache
This series makes the qcow2 L2 cache assignment aware of the image size, with the intention for it to cover the entire image. The importance of this change is in noticeable performance improvement, especially with heavy random I/O. The memory overhead is not big in most cases, as only 1 MB of cache for every 8 GB of image size is used. For cases with very large images and/or small cluster sizes, or systems with limited RAM resources, there is an upper limit on the default L2 cache: 32 MB. To modify this limit one can use the already existing 'l2-cache-size' and 'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size', as it was documented as the *maximum* L2 cache size, but in practice behaved as the absolute size. To compensate the memory overhead which may be increased following this behavior, the default cache-clean-interval is set to 10 minutes by default (was disabled by default before). The L2 cache is also resized accordingly, by default, if the image is resized. Additionally, few minor changes are made (refactoring and documentation fixes). Differences from v1: * .gitignore modification patch removed (unneeded). * The grammar fix in conflicting cache sizing patch removed (merged). * The update total_sectors when resizing patch squashed with the resizing patch. * L2 cache is now capped at 32 MB. * The default cache-clean-interval is set to 30 seconds. Differences from v2: * Made it clear in the documentation that setting cache-clean-interval to 0 disables this feature. Differences from v3: * The default cache-clean-interval is set to 10 minutes instead of 30 seconds before. * Commit message changes to better explain the patches. * Some refactoring. Differences from v4: * Refactoring. * Documentation and commit message fixes. Differences from v5: * A patch with cosmetic changes is split from the main patch * A patch for avoiding duplication in refcount cache size assignment is split from the main patch. * In the main patch the cap on the L2 cache size is set to only 1 MB (in order to be close to the previous behavior) and a separate patch sets it to 32 MB. * Changes in the documentation fixes patch [1/8]. Differences from v6: * A patch is added to make the defined sizes more humanly readable. Differences from v7: * Documentation and commit message changes. * Fix: when cache-size is specified and is large enough, with the L2 and refcount caches not set, the L2 cache will be as large as needed to cover the entire image, and not as the default max. size. (Return to the previous behavior in this part, which was correct). * Cosmetic changes patch not used. Leonid Bloch (8): qcow2: Options' documentation fixes qcow2: Make sizes more humanly readable qcow2: Avoid duplication in setting the refcount cache size qcow2: Assign the L2 cache relatively to the image size qcow2: Increase the default upper limit on the L2 cache size qcow2: Resize the cache upon image resizing qcow2: Set the default cache-clean-interval to 10 minutes qcow2: Explicit number replaced by a constant block/qcow2.c | 42 ++ block/qcow2.h | 12 +-- docs/qcow2-cache.txt | 39 ++- qapi/block-core.json | 3 ++- qemu-options.hx| 11 ++ tests/qemu-iotests/137 | 1 - tests/qemu-iotests/137.out | 1 - 7 files changed, 64 insertions(+), 45 deletions(-) -- 2.17.1
[Qemu-block] [PATCH v8 8/8] qcow2: Explicit number replaced by a constant
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index f885afa0ed..ffb4a9e4a1 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -1324,7 +1324,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState *bs, QDict *options, /* 2^(s->refcount_order - 3) is the refcount width in bytes */ s->refcount_block_bits = s->cluster_bits - (s->refcount_order - 3); s->refcount_block_size = 1 << s->refcount_block_bits; -bs->total_sectors = header.size / 512; +bs->total_sectors = header.size / BDRV_SECTOR_SIZE; s->csize_shift = (62 - (s->cluster_bits - 8)); s->csize_mask = (1 << (s->cluster_bits - 8)) - 1; s->cluster_offset_mask = (1LL << s->csize_shift) - 1; @@ -3450,7 +3450,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, goto fail; } -old_length = bs->total_sectors * 512; +old_length = bs->total_sectors * BDRV_SECTOR_SIZE; new_l1_size = size_to_l1(s, offset); if (offset < old_length) { -- 2.17.1
[Qemu-block] [PATCH v8 3/8] qcow2: Avoid duplication in setting the refcount cache size
The refcount cache size does not need to be set to its minimum value in read_cache_sizes(), as it is set to at least its minimum value in qcow2_update_options_prepare(). Signed-off-by: Leonid Bloch --- block/qcow2.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 67cc82f0b9..7949d15fc6 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -834,10 +834,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } -if (!refcount_cache_size_set) { -*refcount_cache_size = min_refcount_cache; -} } +/* l2_cache_size and refcount_cache_size are ensured to have at least + * their minimum values in qcow2_update_options_prepare() */ if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) || *l2_cache_entry_size > s->cluster_size || -- 2.17.1
[Qemu-block] [PATCH v8 6/8] qcow2: Resize the cache upon image resizing
The caches are now recalculated upon image resizing. This is done because the new default behavior of assigning L2 cache relatively to the image size, implies that the cache will be adapted accordingly after an image resize. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/block/qcow2.c b/block/qcow2.c index 01c39c56c0..1445cd5360 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -3418,6 +3418,7 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, uint64_t old_length; int64_t new_l1_size; int ret; +QDict *options; if (prealloc != PREALLOC_MODE_OFF && prealloc != PREALLOC_MODE_METADATA && prealloc != PREALLOC_MODE_FALLOC && prealloc != PREALLOC_MODE_FULL) @@ -3642,6 +3643,8 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } } +bs->total_sectors = offset / BDRV_SECTOR_SIZE; + /* write updated header.size */ offset = cpu_to_be64(offset); ret = bdrv_pwrite_sync(bs->file, offsetof(QCowHeader, size), @@ -3652,6 +3655,13 @@ static int coroutine_fn qcow2_co_truncate(BlockDriverState *bs, int64_t offset, } s->l1_vm_state_index = new_l1_size; + +/* Update cache sizes */ +options = qdict_clone_shallow(bs->options); +ret = qcow2_update_options(bs, options, s->flags, errp); +if (ret < 0) { +goto fail; +} ret = 0; fail: qemu_co_mutex_unlock(>lock); -- 2.17.1
[Qemu-block] [PATCH v8 4/8] qcow2: Assign the L2 cache relatively to the image size
Sufficient L2 cache can noticeably improve the performance when using large images with frequent I/O. Previously, unless 'cache-size' was specified and was large enough, the L2 cache was set to a certain size without taking the virtual image size into account. Now, the L2 cache assignment is aware of the virtual size of the image, and will cover the entire image, unless the cache size needed for that is larger than a certain maximum. This maximum is set to 1 MB by default (enough to cover an 8 GB image with the default cluster size) but can be increased or decreased using the 'l2-cache-size' option. This option was previously documented as the *maximum* L2 cache size, and this patch makes it behave as such, instead of as a constant size. Also, the existing option 'cache-size' can limit the sum of both L2 and refcount caches, as previously. Signed-off-by: Leonid Bloch --- block/qcow2.c | 21 + block/qcow2.h | 4 +--- docs/qcow2-cache.txt | 15 ++- qemu-options.hx| 6 +++--- tests/qemu-iotests/137 | 1 - tests/qemu-iotests/137.out | 1 - 6 files changed, 23 insertions(+), 25 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 7949d15fc6..01c39c56c0 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -777,29 +777,35 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, uint64_t *refcount_cache_size, Error **errp) { BDRVQcow2State *s = bs->opaque; -uint64_t combined_cache_size; +uint64_t combined_cache_size, l2_cache_max_setting; bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set; int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size; +uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; +uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE); l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE); refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE); combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0); -*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0); +l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, + DEFAULT_L2_CACHE_MAX_SIZE); *refcount_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0); *l2_cache_entry_size = qemu_opt_get_size( opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size); +*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting); + if (combined_cache_size_set) { if (l2_cache_size_set && refcount_cache_size_set) { error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE " and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set " "at the same time"); return; -} else if (*l2_cache_size > combined_cache_size) { +} else if (l2_cache_size_set && + (l2_cache_max_setting > combined_cache_size)) { error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed " QCOW2_OPT_CACHE_SIZE); return; @@ -814,9 +820,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } else if (refcount_cache_size_set) { *l2_cache_size = combined_cache_size - *refcount_cache_size; } else { -uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); - /* Assign as much memory as possible to the L2 cache, and * use the remainder for the refcount cache */ if (combined_cache_size >= max_l2_cache + min_refcount_cache) { @@ -828,12 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -} else { -if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, - (uint64_t)DEFAULT_L2_CACHE_CLUSTERS - * s->cluster_size); -} } /* l2_cache_size and refcount_cache_size are ensured to have at least * their minimum values in qcow2_update_options_prepare() */ diff --git a/block/qcow2.h b/block/qcow2.h index 39e1b279f8..d917b5f577 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,9 +74,7 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -/* Whichever is more */ -#define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters
[Qemu-block] [PATCH v8 5/8] qcow2: Increase the default upper limit on the L2 cache size
The upper limit on the L2 cache size is increased from 1 MB to 32 MB. This is done in order to allow default full coverage with the L2 cache for images of up to 256 GB in size (was 8 GB). Note, that only the needed amount to cover the full image is allocated. The value which is changed here is just the upper limit on the L2 cache size, beyond which it will not grow, even if the size of the image will require it to. Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.h| 2 +- docs/qcow2-cache.txt | 4 ++-- qemu-options.hx | 2 +- 3 files changed, 4 insertions(+), 4 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index d917b5f577..e699a55d02 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,7 +74,7 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB) +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB) #define DEFAULT_CLUSTER_SIZE (64 * KiB) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index e89e74b372..0fc438f397 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -124,8 +124,8 @@ There are a few things that need to be taken into account: - Both caches must have a size that is a multiple of the cluster size (or the cache entry size: see "Using smaller cache sizes" below). - - The maximum L2 cache size is 1 MB by default (enough for full coverage - of 8 GB images, with the default cluster size). This value can be + - The maximum L2 cache size is 32 MB by default (enough for full coverage + of 256 GB images, with the default cluster size). This value can be modified using the "l2-cache-size" option. QEMU will not use more memory than needed to hold all of the image's L2 tables, regardless of this max. value. The minimal L2 cache size is 2 clusters (or 2 cache entries, see diff --git a/qemu-options.hx b/qemu-options.hx index 22e8e2d113..4c44cdbc23 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -756,7 +756,7 @@ The maximum total size of the L2 table and refcount block caches in bytes @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: if cache-size is not specified - 1M; otherwise, as large as possible +(default: if cache-size is not specified - 32M; otherwise, as large as possible within the cache-size, while permitting the requested or the minimal refcount cache size) -- 2.17.1
[Qemu-block] [PATCH v8 2/8] qcow2: Make sizes more humanly readable
Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c | 2 +- block/qcow2.h | 9 + 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index ec9e6238a0..67cc82f0b9 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -830,7 +830,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } } else { if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE, +*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } diff --git a/block/qcow2.h b/block/qcow2.h index 81b844e936..39e1b279f8 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" +#include "qemu/units.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -43,11 +44,11 @@ /* 8 MB refcount table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_REFTABLE_SIZE 0x80 +#define QCOW_MAX_REFTABLE_SIZE (8 * MiB) /* 32 MB L1 table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_L1_SIZE 0x200 +#define QCOW_MAX_L1_SIZE (32 * MiB) /* Allow for an average of 1k per snapshot table entry, should be plenty of * space for snapshot names and IDs */ @@ -75,9 +76,9 @@ /* Whichever is more */ #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */ -#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */ +#define DEFAULT_L2_CACHE_SIZE (1 * MiB) -#define DEFAULT_CLUSTER_SIZE 65536 +#define DEFAULT_CLUSTER_SIZE (64 * KiB) #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" -- 2.17.1
Re: [Qemu-block] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
On 8/10/18 5:39 PM, Alberto Garcia wrote: On Fri 10 Aug 2018 08:26:43 AM CEST, Leonid Bloch wrote: Previously, the L2 cache was allocated without considering the image size, and an option existed to manually determine its size. This is not quite correct: the L2 cache was set to the maximum needed for the image when "cache-size" was large enough: if (combined_cache_size >= max_l2_cache + min_refcount_cache) { *l2_cache_size = max_l2_cache; } ... See below for an example. You're right. I'll fix that. } else { -uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); - /* Assign as much memory as possible to the L2 cache, and * use the remainder for the refcount cache */ -if (combined_cache_size >= max_l2_cache + min_refcount_cache) { -*l2_cache_size = max_l2_cache; +if (combined_cache_size >= *l2_cache_size + min_refcount_cache) { This has an (unintended?) side effect: If you have a 1TB qcow2 image and open it with e.g. cache-size=200M, with the previous code you would get a 128MB L2 cache (max_l2_cache). With the new code you only get 32MB. Good catch!! Thanks! The rest of the function looks good to me now. We're getting close :) - - The default L2 cache size is 8 clusters or 1MB (whichever is more), - and the minimum is 2 clusters (or 2 cache entries, see below). + - The default L2 cache size will cover the entire virtual size of an + image, up to a certain maximum. This maximum is 1 MB by default + (enough for image sizes of up to 8 GB with the default cluster size) + and it can be reduced or enlarged using the "l2-cache-size" option. + The minimum is 2 clusters (or 2 cache entries, see below). It's not clear with this wording if this "certain maximum" refers to the cache size or the image size. "This maximum is 1 MB by default (enough for image sizes of up to 8 GB ..." - to me it's quite clear that it speaks of the image size. But I guess I can change the wording. I'm thinking that perhaps it's clearer if you leave that item unchanged and add a new one below, something like: I can't leave it unchanged for two reasons: (1) "8 clusters or 1MB (whichever is more)" (2) "default" - now it is not default, but maximal. OK, I'll change it to be more clear, incorporating your suggestion from below. - The L2 cache size setting (regardless of whether it takes the default value or it's set by the user) refers to the maximum size of the L2 cache. If QEMU needs less memory to hold all of the image's L2 tables, then that's the actual size of the cache that will be allocated. > - If the L2 cache is big enough to hold all of the image's L2 tables - (as explained in the "Choosing the right cache sizes" section - earlier in this document) then none of this is necessary and you - can omit the "l2-cache-entry-size" parameter altogether. + (the default behavior for images of up to 8 GB in size) then none + of this is necessary and you can omit the "l2-cache-entry-size" + parameter altogether. I think this change is unnecessary :-? Maybe I'll just mention the default here? Or refer to the section which speaks of the defaults ('as explained in the "Choosing the right cache sizes" and "How to configure the cache sizes" sections ...")? @item refcount-cache-size The maximum size of the refcount block cache in bytes diff --git a/tests/qemu-iotests/137 b/tests/qemu-iotests/137 index 87965625d8..e3fb078588 100755 --- a/tests/qemu-iotests/137 +++ b/tests/qemu-iotests/137 @@ -109,7 +109,6 @@ $QEMU_IO \ -c "reopen -o cache-size=1M,l2-cache-size=64k,refcount-cache-size=64k" \ -c "reopen -o cache-size=1M,l2-cache-size=2M" \ -c "reopen -o cache-size=1M,refcount-cache-size=2M" \ --c "reopen -o l2-cache-size=256T" \ The "L2 cache size too big" error can still be tested, but you will need to create an image large enough to allow such a big cache. $ qemu-img create -f qcow2 -o cluster_size=256k hd.qcow2 32P $ $QEMU -drive file=hd.qcow2,l2-cache-entry-size=512,l2-cache-size=1T * 32P qcow2 will take 33M - is it OK to create it just for a test? * Is it worth to create a special test scenario, with a separate image creation, just for that case? Leonid. Berto
Re: [Qemu-block] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size
On 8/10/18 4:14 PM, Alberto Garcia wrote: On Fri 10 Aug 2018 08:26:42 AM CEST, Leonid Bloch wrote: The refcount cache size does not need to be set to its minimum value in read_cache_sizes(), as it is set to at least its minimum value in qcow2_update_options_prepare(). Signed-off-by: Leonid Bloch -} else { -if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, - (uint64_t)DEFAULT_L2_CACHE_CLUSTERS - * s->cluster_size); -} -if (!refcount_cache_size_set) { -*refcount_cache_size = min_refcount_cache; -} Since you're getting rid of the rest of the code later anyway, I think it's cleaner to only remove these last three lines here and leave the rest untouched. It makes the patch shorter and easier to read. But a correct formatting is important, I think. In every patch individually. +/* If refcount-cache-size is not specified, it will be set to minimum + * in qcow2_update_options_prepare(). No need to set it here. */ This is not quite correct, because apart from the "not specified" case, refcount_cache_size is also overridden in other circumstances: (a) the value is specified but is too low, or (b) the value is set indirectly via "cache-size" but the end result is too low[*]. Yes, I'll fix the comment. But I think that it's important that it remains, because someone who looks at the code does not necessarily looks at the commit message, and it may be puzzling that a minimal size is not enforced there. How about the following: "l2_cache_size and refcount_cache_size are ensured to have at least their minimum values in qcow2_update_options_prepare()" Also, the same thing happens to l2-cache-size: if you set it manually but it's too low then it will be overridden. I'd personally remove the comment altogether, I think the explanation in the commit message is enough. Berto [*] Now that I think of it: if you set e.g. cache-size = 16M and l2-cache-size = 16MB then you'll end up with refcount-cache-size = 16M - 16M = 0, which will be overridden and set to the minimum. But then refcount-cache-size + l2-cache-size > cache-size Yes, I have noticed this behavior, and I think that it is OK: the errors should be a response to a wrong input of the user, who does not necessarily understands the internals. So if a user inputs l2-cache-size which is larger than cache-size, that's obviously a mistake. But if they are equal, it's totally OK if QEMU will use some 256 or 128 KB for the minimal caches transparently. This is really negligible relatively to the total QEMU overhead. Now the error message is: "l2-cache-size may not exceed cache-size". Is it reasonable to make it: "l2-cache-size may not exceed the size of cache-size minus the minimal refcount cache size" (or similar)? :) So perhaps this should produce an error, and it may make sense to take the opportunity to move all the logic that ensures that MIN_SIZE <= size <= INT_MAX to read_cache_sizes(). But that's a possible task for the future and I wouldn't worry about it in this series. Yeah, but it would be a good idea to move *all* the size-determining logic in one function. Leonid.
Re: [Qemu-block] [PATCH v7 1/9] qcow2: Options' documentation fixes
On 8/10/18 2:50 PM, Alberto Garcia wrote: On Fri 10 Aug 2018 08:26:39 AM CEST, Leonid Bloch wrote: Signed-off-by: Leonid Bloch --- docs/qcow2-cache.txt | 16 +++- qemu-options.hx | 9 ++--- 2 files changed, 17 insertions(+), 8 deletions(-) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 8a09a5cc5f..0f157d859a 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -97,12 +97,15 @@ need: l2_cache_size = disk_size_GB * 131072 refcount_cache_size = disk_size_GB * 32768 -QEMU has a default L2 cache of 1MB (1048576 bytes) and a refcount -cache of 256KB (262144 bytes), so using the formulas we've just seen -we have +With the default cluster size, to cover each 8 GB of the virtual image +size, 1MB of L2 cache is needed: - 1048576 / 131072 = 8 GB of virtual disk covered by that cache -262144 / 32768 = 8 GB + 65536 / 8 = 8192 = 8 GB / 1 MB Where does this 65536 / 8 come from? The line that you changed follows directly from the formula immediately before that paragraph: l2_cache_size = disk_size_GB * 131072 that is l2_cache_size / 131072 = disk_size_GB 1048576 / 131072 = 8 GB Berto How about the following (all the relevant section reproduced below, for a continuous readability): """""""""" Choosing the right cache sizes -- In order to choose the cache sizes we need to know how they relate to the amount of the allocated space. The amount of virtual disk that can be mapped by the L2 and refcount caches (in bytes) is: disk_size = l2_cache_size * cluster_size / 8 disk_size = refcount_cache_size * cluster_size * 8 / refcount_bits With the default values for cluster_size (64KB) and refcount_bits (16), this becomes: disk_size = l2_cache_size * 8192 disk_size = refcount_cache_size * 32768 So in order to cover n GB of disk space with the default values we need: l2_cache_size = disk_size_GB * 131072 refcount_cache_size = disk_size_GB * 32768 For example, 1MB of L2 cache is needed to cover each 8 GB of the virtual image size (given that the default cluster size is used): 8 * 131072 = 1 MB A default refcount cache is 4 times the cluster size, which defaults to 256 KB (262144 bytes). This is sufficient for 8 GB of image size: 262144 / 32768 = 8 GB """"""""""
[Qemu-block] [PATCH v7 3/9] qcow2: Make sizes more humanly readable
Signed-off-by: Leonid Bloch --- block/qcow2.c | 2 +- block/qcow2.h | 9 + 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 3f4abc394e..7a2d7a1d48 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -831,7 +831,7 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } } else { if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_BYTE_SIZE, +*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, (uint64_t)DEFAULT_L2_CACHE_CLUSTERS * s->cluster_size); } diff --git a/block/qcow2.h b/block/qcow2.h index 81b844e936..39e1b279f8 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -27,6 +27,7 @@ #include "crypto/block.h" #include "qemu/coroutine.h" +#include "qemu/units.h" //#define DEBUG_ALLOC //#define DEBUG_ALLOC2 @@ -43,11 +44,11 @@ /* 8 MB refcount table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_REFTABLE_SIZE 0x80 +#define QCOW_MAX_REFTABLE_SIZE (8 * MiB) /* 32 MB L1 table is enough for 2 PB images at 64k cluster size * (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */ -#define QCOW_MAX_L1_SIZE 0x200 +#define QCOW_MAX_L1_SIZE (32 * MiB) /* Allow for an average of 1k per snapshot table entry, should be plenty of * space for snapshot names and IDs */ @@ -75,9 +76,9 @@ /* Whichever is more */ #define DEFAULT_L2_CACHE_CLUSTERS 8 /* clusters */ -#define DEFAULT_L2_CACHE_BYTE_SIZE 1048576 /* bytes */ +#define DEFAULT_L2_CACHE_SIZE (1 * MiB) -#define DEFAULT_CLUSTER_SIZE 65536 +#define DEFAULT_CLUSTER_SIZE (64 * KiB) #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" -- 2.17.1
[Qemu-block] [PATCH v7 8/9] qcow2: Set the default cache-clean-interval to 10 minutes
The default cache-clean-interval is set to 10 minutes, in order to lower the overhead of the qcow2 caches (before the default was 0, i.e. disabled). Signed-off-by: Leonid Bloch Reviewed-by: Alberto Garcia --- block/qcow2.c| 2 +- block/qcow2.h| 1 + docs/qcow2-cache.txt | 4 ++-- qapi/block-core.json | 3 ++- qemu-options.hx | 2 +- 5 files changed, 7 insertions(+), 5 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index ba4dfae735..b4f291765b 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -944,7 +944,7 @@ static int qcow2_update_options_prepare(BlockDriverState *bs, /* New interval for cache cleanup timer */ r->cache_clean_interval = qemu_opt_get_number(opts, QCOW2_OPT_CACHE_CLEAN_INTERVAL, -s->cache_clean_interval); +DEFAULT_CACHE_CLEAN_INTERVAL); #ifndef CONFIG_LINUX if (r->cache_clean_interval != 0) { error_setg(errp, QCOW2_OPT_CACHE_CLEAN_INTERVAL diff --git a/block/qcow2.h b/block/qcow2.h index e699a55d02..5e94f7ffc4 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -78,6 +78,7 @@ #define DEFAULT_CLUSTER_SIZE (64 * KiB) +#define DEFAULT_CACHE_CLEAN_INTERVAL 600 /* seconds */ #define QCOW2_OPT_LAZY_REFCOUNTS "lazy-refcounts" #define QCOW2_OPT_DISCARD_REQUEST "pass-discard-request" diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 6ad1081d1a..684147ad45 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -204,8 +204,8 @@ This example removes all unused cache entries every 15 minutes: -drive file=hd.qcow2,cache-clean-interval=900 -If unset, the default value for this parameter is 0 and it disables -this feature. +If unset, the default value for this parameter is 600. Setting it to 0 +disables this feature. Note that this functionality currently relies on the MADV_DONTNEED argument for madvise() to actually free the memory. This is a diff --git a/qapi/block-core.json b/qapi/block-core.json index 5b9084a394..9a6a708a37 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -2830,7 +2830,8 @@ # # @cache-clean-interval: clean unused entries in the L2 and refcount # caches. The interval is in seconds. The default value -# is 0 and it disables this feature (since 2.5) +# is 600, and 0 disables this feature. (since 2.5) +# # @encrypt: Image decryption options. Mandatory for # encrypted images, except when doing a metadata-only # probe of the image. (since 2.10) diff --git a/qemu-options.hx b/qemu-options.hx index 4c44cdbc23..6abf3631ec 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -767,7 +767,7 @@ it which is not used for the L2 cache) @item cache-clean-interval Clean unused entries in the L2 and refcount caches. The interval is in seconds. -The default value is 0 and it disables this feature. +The default value is 600. Setting it to 0 disables this feature. @item pass-discard-request Whether discard requests to the qcow2 device should be forwarded to the data -- 2.17.1
[Qemu-block] [PATCH v7 6/9] qcow2: Increase the default upper limit on the L2 cache size
The upper limit on the L2 cache size is increased from 1 MB to 32 MB. This is done in order to allow default full coverage with the L2 cache for images of up to 256 GB in size (was 8 GB). Note, that only the needed amount to cover the full image is allocated. The value which is changed here is just the upper limit on the L2 cache size, beyond which it will not grow, even if the size of the image will require it to. Signed-off-by: Leonid Bloch --- block/qcow2.h| 2 +- docs/qcow2-cache.txt | 6 +++--- qemu-options.hx | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/block/qcow2.h b/block/qcow2.h index d917b5f577..e699a55d02 100644 --- a/block/qcow2.h +++ b/block/qcow2.h @@ -74,7 +74,7 @@ /* Must be at least 4 to cover all cases of refcount table growth */ #define MIN_REFCOUNT_CACHE_SIZE 4 /* clusters */ -#define DEFAULT_L2_CACHE_MAX_SIZE (1 * MiB) +#define DEFAULT_L2_CACHE_MAX_SIZE (32 * MiB) #define DEFAULT_CLUSTER_SIZE (64 * KiB) diff --git a/docs/qcow2-cache.txt b/docs/qcow2-cache.txt index 69af306267..6ad1081d1a 100644 --- a/docs/qcow2-cache.txt +++ b/docs/qcow2-cache.txt @@ -125,8 +125,8 @@ There are a few things that need to be taken into account: (or the cache entry size: see "Using smaller cache sizes" below). - The default L2 cache size will cover the entire virtual size of an - image, up to a certain maximum. This maximum is 1 MB by default - (enough for image sizes of up to 8 GB with the default cluster size) + image, up to a certain maximum. This maximum is 32 MB by default + (enough for image sizes of up to 256 GB with the default cluster size) and it can be reduced or enlarged using the "l2-cache-size" option. The minimum is 2 clusters (or 2 cache entries, see below). @@ -186,7 +186,7 @@ Some things to take into account: always uses the cluster size as the entry size. - If the L2 cache is big enough to hold all of the image's L2 tables - (the default behavior for images of up to 8 GB in size) then none + (the default behavior for images of up to 256 GB in size) then none of this is necessary and you can omit the "l2-cache-entry-size" parameter altogether. diff --git a/qemu-options.hx b/qemu-options.hx index 22e8e2d113..4c44cdbc23 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -756,7 +756,7 @@ The maximum total size of the L2 table and refcount block caches in bytes @item l2-cache-size The maximum size of the L2 table cache in bytes -(default: if cache-size is not specified - 1M; otherwise, as large as possible +(default: if cache-size is not specified - 32M; otherwise, as large as possible within the cache-size, while permitting the requested or the minimal refcount cache size) -- 2.17.1
[Qemu-block] [PATCH v7 4/9] qcow2: Avoid duplication in setting the refcount cache size
The refcount cache size does not need to be set to its minimum value in read_cache_sizes(), as it is set to at least its minimum value in qcow2_update_options_prepare(). Signed-off-by: Leonid Bloch --- block/qcow2.c | 15 ++- 1 file changed, 6 insertions(+), 9 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 7a2d7a1d48..053242f94e 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -829,16 +829,13 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -} else { -if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, - (uint64_t)DEFAULT_L2_CACHE_CLUSTERS - * s->cluster_size); -} -if (!refcount_cache_size_set) { -*refcount_cache_size = min_refcount_cache; -} +} else if (!l2_cache_size_set) { +*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE, + (uint64_t)DEFAULT_L2_CACHE_CLUSTERS + * s->cluster_size); } +/* If refcount-cache-size is not specified, it will be set to minimum + * in qcow2_update_options_prepare(). No need to set it here. */ if (*l2_cache_entry_size < (1 << MIN_CLUSTER_BITS) || *l2_cache_entry_size > s->cluster_size || -- 2.17.1
[Qemu-block] [PATCH v7 0/9] Take the image size into account when allocating the L2 cache
This series makes the qcow2 L2 cache assignment aware of the image size, with the intention for it to cover the entire image. The importance of this change is in noticeable performance improvement, especially with heavy random I/O. The memory overhead is not big in most cases, as only 1 MB of cache for every 8 GB of image size is used. For cases with very large images and/or small cluster sizes, or systems with limited RAM resources, there is an upper limit on the default L2 cache: 32 MB. To modify this limit one can use the already existing 'l2-cache-size' and 'cache-size' options. Moreover, this fixes the behavior of 'l2-cache-size', as it was documented as the *maximum* L2 cache size, but in practice behaved as the absolute size. To compensate the memory overhead which may be increased following this behavior, the default cache-clean-interval is set to 10 minutes by default (was disabled by default before). The L2 cache is also resized accordingly, by default, if the image is resized. Additionally, few minor changes are made (refactoring and documentation fixes). Differences from v1: * .gitignore modification patch removed (unneeded). * The grammar fix in conflicting cache sizing patch removed (merged). * The update total_sectors when resizing patch squashed with the resizing patch. * L2 cache is now capped at 32 MB. * The default cache-clean-interval is set to 30 seconds. Differences from v2: * Made it clear in the documentation that setting cache-clean-interval to 0 disables this feature. Differences from v3: * The default cache-clean-interval is set to 10 minutes instead of 30 seconds before. * Commit message changes to better explain the patches. * Some refactoring. Differences from v4: * Refactoring. * Documentation and commit message fixes. Differences from v5: * A patch with cosmetic changes is split from the main patch * A patch for avoiding duplication in refcount cache size assignment is split from the main patch. * In the main patch the cap on the L2 cache size is set to only 1 MB (in order to be close to the previous behavior) and a separate patch sets it to 32 MB. * Changes in the documentation fixes patch [1/8]. Differences from v6: * A patch is added to make the defined sizes more humanly readable Leonid Bloch (9): qcow2: Options' documentation fixes qcow2: Cosmetic changes qcow2: Make sizes more humanly readable qcow2: Avoid duplication in setting the refcount cache size qcow2: Assign the L2 cache relatively to the image size qcow2: Increase the default upper limit on the L2 cache size qcow2: Resize the cache upon image resizing qcow2: Set the default cache-clean-interval to 10 minutes qcow2: Explicit number replaced by a constant block/qcow2.c | 54 +- block/qcow2.h | 12 - docs/qcow2-cache.txt | 33 ++- qapi/block-core.json | 3 ++- qemu-options.hx| 11 +--- tests/qemu-iotests/137 | 1 - tests/qemu-iotests/137.out | 1 - 7 files changed, 66 insertions(+), 49 deletions(-) -- 2.17.1
[Qemu-block] [PATCH v7 2/9] qcow2: Cosmetic changes
Some refactoring for better readability is done here. Signed-off-by: Leonid Bloch --- block/qcow2.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index ec9e6238a0..3f4abc394e 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -790,8 +790,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *refcount_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0); -*l2_cache_entry_size = qemu_opt_get_size( -opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size); +*l2_cache_entry_size = qemu_opt_get_size(opts, + QCOW2_OPT_L2_CACHE_ENTRY_SIZE, + s->cluster_size); if (combined_cache_size_set) { if (l2_cache_size_set && refcount_cache_size_set) { @@ -823,8 +824,8 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = max_l2_cache; *refcount_cache_size = combined_cache_size - *l2_cache_size; } else { -*refcount_cache_size = -MIN(combined_cache_size, min_refcount_cache); +*refcount_cache_size = MIN(combined_cache_size, + min_refcount_cache); *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -- 2.17.1
[Qemu-block] [PATCH v7 5/9] qcow2: Assign the L2 cache relatively to the image size
Sufficient L2 cache can noticeably improve the performance when using large images with frequent I/O. Previously, the L2 cache was allocated without considering the image size, and an option existed to manually determine its size. Thus to achieve a full coverage of an image by the L2 cache (i.e. use more than the default value of MAX(1 MB, 8 clusters)), a user needed to calculate the required size manually, or with a script, and pass this value to the 'l2-cache-size' option. Now, the L2 cache is assigned taking the virtual image size into account, and will cover the entire image, unless the size needed for that is larger than a certain maximum. This maximum is set to 1 MB by default (enough to cover an 8 GB image with the default cluster size) but can be increased or decreased using the 'l2-cache-size' option. This option was previously documented as the *maximum* L2 cache size, and this patch makes it behave as such, instead of as a constant size. Also, the existing option 'cache-size' can limit the sum of both L2 and refcount caches, as previously. Signed-off-by: Leonid Bloch --- block/qcow2.c | 22 ++ block/qcow2.h | 4 +--- docs/qcow2-cache.txt | 13 - qemu-options.hx| 6 +++--- tests/qemu-iotests/137 | 1 - tests/qemu-iotests/137.out | 1 - 6 files changed, 22 insertions(+), 25 deletions(-) diff --git a/block/qcow2.c b/block/qcow2.c index 053242f94e..434fb89076 100644 --- a/block/qcow2.c +++ b/block/qcow2.c @@ -777,16 +777,19 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, uint64_t *refcount_cache_size, Error **errp) { BDRVQcow2State *s = bs->opaque; -uint64_t combined_cache_size; +uint64_t combined_cache_size, l2_cache_max_setting; bool l2_cache_size_set, refcount_cache_size_set, combined_cache_size_set; int min_refcount_cache = MIN_REFCOUNT_CACHE_SIZE * s->cluster_size; +uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; +uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); combined_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_CACHE_SIZE); l2_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_L2_CACHE_SIZE); refcount_cache_size_set = qemu_opt_get(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE); combined_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_CACHE_SIZE, 0); -*l2_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, 0); +l2_cache_max_setting = qemu_opt_get_size(opts, QCOW2_OPT_L2_CACHE_SIZE, + DEFAULT_L2_CACHE_MAX_SIZE); *refcount_cache_size = qemu_opt_get_size(opts, QCOW2_OPT_REFCOUNT_CACHE_SIZE, 0); @@ -794,13 +797,16 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, QCOW2_OPT_L2_CACHE_ENTRY_SIZE, s->cluster_size); +*l2_cache_size = MIN(max_l2_cache, l2_cache_max_setting); + if (combined_cache_size_set) { if (l2_cache_size_set && refcount_cache_size_set) { error_setg(errp, QCOW2_OPT_CACHE_SIZE ", " QCOW2_OPT_L2_CACHE_SIZE " and " QCOW2_OPT_REFCOUNT_CACHE_SIZE " may not be set " "at the same time"); return; -} else if (*l2_cache_size > combined_cache_size) { +} else if (l2_cache_size_set && + (l2_cache_max_setting > combined_cache_size)) { error_setg(errp, QCOW2_OPT_L2_CACHE_SIZE " may not exceed " QCOW2_OPT_CACHE_SIZE); return; @@ -815,13 +821,9 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, } else if (refcount_cache_size_set) { *l2_cache_size = combined_cache_size - *refcount_cache_size; } else { -uint64_t virtual_disk_size = bs->total_sectors * BDRV_SECTOR_SIZE; -uint64_t max_l2_cache = virtual_disk_size / (s->cluster_size / 8); - /* Assign as much memory as possible to the L2 cache, and * use the remainder for the refcount cache */ -if (combined_cache_size >= max_l2_cache + min_refcount_cache) { -*l2_cache_size = max_l2_cache; +if (combined_cache_size >= *l2_cache_size + min_refcount_cache) { *refcount_cache_size = combined_cache_size - *l2_cache_size; } else { *refcount_cache_size = MIN(combined_cache_size, @@ -829,10 +831,6 @@ static void read_cache_sizes(BlockDriverState *bs, QemuOpts *opts, *l2_cache_size = combined_cache_size - *refcount_cache_size; } } -} else if (!l2_cache_size_set) { -*l2_cache_size = MAX(DEFAULT_L2_CACHE_SIZE,