Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-19 Thread Frank Myhr
On 10/18/2016 11:25 AM, Alberto Garcia wrote:
> On Fri 07 Oct 2016 03:58:29 PM CEST, Ed Swierk wrote:
>> Same here, using libvirt. l2-cache-size=max would be ideal. Or if
>> there were a cache-coverage-size option that takes an absolute number,
>> libvirt could set it to the image size.
> 
> I can see the benefit of both approaches: setting the disk size covered
> by the cache or setting a percentage, I personally like a bit more the
> former but it wouldn't provide a way to say "create the largest cache
> needed for this disk".

I prefer percentage (or just 'max') because it eliminates a source of
performance problems when increasing the size of a VM's disk(s). Take the case
of a VM doing heavy random I/O on a 10GB disk. Admin1 initially sets
l2-cache-size=1.25M or l2-cache-coverage-size=10G. At some point the disk fills
up as they always do, and admin2 (or admin1 on a bad day) increases its size to
20GB: problem solved. Except admin2 forgot to also double l2-cache-size or
l2-cache-coverage-size. So performance with larger drive takes a nosedive (~*6X*
worse in your worst-case benchmark) until some admin figures it out.

https://blogs.igalia.com/berto/2015/12/17/improving-disk-io-performance-in-qemu-2-5-with-the-qcow2-l2-cache/

Sure, admins shouldn't make this mistake. But guaranteed they (cough, I) will.
Using percentage makes this mistake impossible.


> Would 'l2-cache-coverage-size' work for everyone? It would simply take
> the disk size (16G, 1T, etc) and it would conflict with l2-cache-size.

The above said, l2-cache-coverage-size would work for me if libvirt could set
it. Or if it defaulted to entire image size, eliminating the need for libvirt to
set it, and making impossible the admin mistake above.


> Do we need something similar for the refcount cache?

I don't know. AFAICT the current default refcount behavior is fine. But might be
good to keep l2 and refcount options symmetrical.

Frank




Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-18 Thread Alberto Garcia
On Fri 07 Oct 2016 03:58:29 PM CEST, Ed Swierk wrote:

>> I know the disk image size, and can set cache size in
>> bytes. l2-cache-size=max would be a convenience feature, *especially
>> if it could become the default*. Then I could forget thinking about
>> whether the image is larger than the current 8GB fully-cached
>> threshold, and only have to calculate cache size in bytes in case of
>> very large images, multiple backing files, or very tight memory.
>
> Same here, using libvirt. l2-cache-size=max would be ideal. Or if
> there were a cache-coverage-size option that takes an absolute number,
> libvirt could set it to the image size.

I can see the benefit of both approaches: setting the disk size covered
by the cache or setting a percentage, I personally like a bit more the
former but it wouldn't provide a way to say "create the largest cache
needed for this disk".

Would 'l2-cache-coverage-size' work for everyone? It would simply take
the disk size (16G, 1T, etc) and it would conflict with l2-cache-size.

Do we need something similar for the refcount cache?

Berto



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-07 Thread Frank Myhr

On 10/07/2016 06:56 AM, Alberto Garcia wrote:

I would like to know what's the use case you (Frank, Ed)
are thinking about:



- Are we talking about command-line options, QMP, or both?


Command-line options alone are sufficient for my use case, which is configuring 
and starting VMs using libvirt.




- Do you know the size of the disk image or you want to be able to set
  the cache size without having access to that information? In other
  words, what's preventing you from calculating the cache size that you
  need?


I know the disk image size, and can set cache size in bytes. l2-cache-size=max 
would be a convenience feature, *especially if it could become the default*. 
Then I could forget thinking about whether the image is larger than the current 
8GB fully-cached threshold, and only have to calculate cache size in bytes in 
case of very large images, multiple backing files, or very tight memory.


Frank



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-07 Thread Ed Swierk
On Fri, Oct 7, 2016 at 6:46 AM, Frank Myhr  wrote:
> On 10/07/2016 06:56 AM, Alberto Garcia wrote:
>>
>> I would like to know what's the use case you (Frank, Ed)
>> are thinking about:
>
>
>> - Are we talking about command-line options, QMP, or both?
>
>
> Command-line options alone are sufficient for my use case, which is 
> configuring and starting VMs using libvirt.
>
>
>> - Do you know the size of the disk image or you want to be able to set
>>   the cache size without having access to that information? In other
>>   words, what's preventing you from calculating the cache size that you
>>   need?
>
>
> I know the disk image size, and can set cache size in bytes. 
> l2-cache-size=max would be a convenience feature, *especially if it could 
> become the default*. Then I could forget thinking about whether the image is 
> larger than the current 8GB fully-cached threshold, and only have to 
> calculate cache size in bytes in case of very large images, multiple backing 
> files, or very tight memory.

Same here, using libvirt. l2-cache-size=max would be ideal. Or if
there were a cache-coverage-size option that takes an absolute number,
libvirt could set it to the image size.

--Ed



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-07 Thread Alberto Garcia
On Fri 07 Oct 2016 10:14:32 AM CEST, Kevin Wolf wrote:

>> Maybe pass a negative integer to indicate percentage, so string is not 
>> required?:
>> l2-cache-size=-100(l2-cache-size=max, covers entire drive)
>> l2-cache-size=-50 (l2-cache-size=50% of max, covers half of drive)
>> l2-cache-size=2097152 (l2-cache-size=2M)
>
> No, this is the kind of magic that is best to avoid.
>
> If you want to have two different ways to configure the cache, make it
> two different options.

I agree. Still, I would like to know what's the use case you (Frank, Ed)
are thinking about:

- Are we talking about command-line options, QMP, or both?
- Do you know the size of the disk image or you want to be able to set
  the cache size without having access to that information? In other
  words, what's preventing you from calculating the cache size that you
  need?

Berto



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-07 Thread Kevin Wolf
[ Restoring the CC list, adding Eric for QAPI ]

Am 06.10.2016 um 20:36 hat Frank Myhr geschrieben:
> Thanks Alberto for pointing me to Ed Swierk's patch and this discussion.
> 
> On Wed 05 Oct 2016 16:57:57 +0200, Alberto Garcia wrote:
> 
> > Frank Myhr's suggestion (in bugzilla) is that we allow specifying a % of
> > the disk size, so
> >
> > l2-cache-size=100%  (that would be cache-size=max)
> > l2-cache-size=80%
> > ...
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c3
> >
> > [This] would change however the data type of
> > 'l2-cache-size' from int to string in BlockdevOptionsQcow2, can we
> > actually do that?
> 
> Maybe pass a negative integer to indicate percentage, so string is not 
> required?:
> l2-cache-size=-100(l2-cache-size=max, covers entire drive)
> l2-cache-size=-50 (l2-cache-size=50% of max, covers half of drive)
> l2-cache-size=2097152 (l2-cache-size=2M)

No, this is the kind of magic that is best to avoid.

If you want to have two different ways to configure the cache, make it
two different options.

> On Wed 5 Oct 2016 17:13:39 +0200, Max Reitz wrote;
> 
> > maybe it would make more sense to introduce a whole new set of
> > options called {,l2-,refcount-}cache-coverage or something. These
> > options would then accept both an absolute and a relative number.
> 
> On Wed 5 Oct 2016 17:24:08 +0200, Kevin Wolf wrote:
> 
> > If we want easy, make it easy: cache-coverage and apply it to both.
> 
> refcount-cache-size if unspecified is now automatically set to
> l2-cache-size / 4, correct? So if we could specify l2-cache-size as
> "max" or in percentage, it seems that refcount-cache-size is already
> taken care if we if just leave it unspecified.

Yes.

> On Thu 06 Oct 2016 16:34:16 +0200, Alberto Garcia wrote:
> > I don't think it's a good
> > idea to have two ways to specify exactly the same thing, when one can be
> > converted to the other with a very simple operation.
> 
> I agree. For better or worse we already have l2-cache-size. If its
> accepted values can be slightly expanded, using either a "max" enum
> as Max Reitz suggested, or somehow indicating a relative
> (percentage) value rather than an absolute number of bytes, this
> seems to offer sufficient flexibility for a broad range of users.
> 
> Although the percentage of max l2 cache size isn't exactly an
> intuitive number to specify, it's also (as Alberto notes) equal to
> the percentage of drive size covered by the cache. That's only
> marginally harder to think about than the absolute drive size
> covered by cache that a cache-coverage parameter would specify.

I think we can add "max" as an additional accepted value to
l2-cache-size, but if we want to allow percentages, that should be a
separate option (because we can't easily parse "50%" with QAPI except if
treating it as a string, but that wouldn't be quite right).

Kevin



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-06 Thread Alberto Garcia
On Wed 05 Oct 2016 05:35:20 PM CEST, Max Reitz  wrote:

>>> Apart from that, I have to say I think it would be a bit more useful
>>> if one would specify the area covered by the metadata caches as an
>>> absolute number instead of a relative one (I guess it's generally
>>> easier to know what area your applications will perform random
>>> accesses on than the relative size, but maybe that's just me).
>> 
>> I'm not sure if I'm following you, can you give an example of what
>> the area covered by the cache exactly means?
>
> Let's take the L2 table cache. Every eight bytes of that cache point
> to one cluster in the image, so if it's 8 MB in size, for instance,
> then you can cover 1M clusters. With the default 64 kB clusters, you
> would cover 64 GB of any image (although you have some constraints on
> that range, of course; an 8 MB cache would contain not just 1M cluster
> pointers but actually 128 L2 tables (8 MB / 64 kB), so it would
> actually cover 128 continuous 512 MB ranges).

Ah! You mean that instead of saying l2-cache-size=1M, you would say
l2-cache-coverage=8G, so the user doesn't need to do the numbers. I
thought you were talking about something different, that's why it didn't
make any sense to me :-)

Well, I don't know, if we had to design the interface again from the
beginning I guess it could make sense, but I don't think it's a good
idea to have two ways to specify exactly the same thing, when one can be
converted to the other with a very simple operation.

I admit the formula is not directly obvious (that's why I documented it
in docs/qcow2-cache.txt), but it's very simple.

I guess it wouldn't be so bad if we restricted it only to the command
line (for humans), since any program that communicates using QMP can do
the numbers itself. I don't know... how about a helper script that
checks the cluster size and tells you how much cache you need?

Berto



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-05 Thread Max Reitz
On 05.10.2016 17:19, Alberto Garcia wrote:
> On Wed 05 Oct 2016 05:13:39 PM CEST, Max Reitz wrote:
> 
> At least giving users a way to skip the math would be an
> improvement.  Would you be okay with an explicitly-set option like
> l2_cache_size=auto or =max that optimizes for performance at the
> expense of memory?

>>> Frank Myhr's suggestion (in bugzilla) is that we allow specifying a %
>>> of the disk size, so
>>>
>>> l2-cache-size=100%  (that would be cache-size=max)
>>> l2-cache-size=80%
> 
>> For me, either works fine.
>>
>> Apart from that, I have to say I think it would be a bit more useful
>> if one would specify the area covered by the metadata caches as an
>> absolute number instead of a relative one (I guess it's generally
>> easier to know what area your applications will perform random
>> accesses on than the relative size, but maybe that's just me).
> 
> I'm not sure if I'm following you, can you give an example of what the
> area covered by the cache exactly means?

Let's take the L2 table cache. Every eight bytes of that cache point to
one cluster in the image, so if it's 8 MB in size, for instance, then
you can cover 1M clusters. With the default 64 kB clusters, you would
cover 64 GB of any image (although you have some constraints on that
range, of course; an 8 MB cache would contain not just 1M cluster
pointers but actually 128 L2 tables (8 MB / 64 kB), so it would actually
cover 128 continuous 512 MB ranges).

So as long as you perform any kind of access pattern within these 64 GB,
the cache will cover that. But when you're doing random accesses over
more than 64 GB, the cache cannot cover it.

So that's what I mean with coverage: The area you can access without
having to suffer cache misses.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-05 Thread Kevin Wolf
Am 05.10.2016 um 17:13 hat Max Reitz geschrieben:
> On 05.10.2016 16:57, Alberto Garcia wrote:
> > On Tue 04 Oct 2016 05:51:26 PM CEST, Max Reitz wrote:
> > 
> >>> At least giving users a way to skip the math would be an improvement.
> >>> Would you be okay with an explicitly-set option like
> >>> l2_cache_size=auto or =max that optimizes for performance at the
> >>> expense of memory?
> >>
> >> That (cache-size=max) is actually something Stefan Hajnoczi has
> >> proposed at KVM Forum 2015.
> >>
> >> I agree that implementing the formula in qemu's qcow2 driver itself
> >> makes sense and will help users; however, I do think it is appropriate
> >> to expect the user to pass cache-size=max if they need it.
> > 
> > Frank Myhr's suggestion (in bugzilla) is that we allow specifying a % of
> > the disk size, so
> > 
> > l2-cache-size=100%  (that would be cache-size=max)
> > l2-cache-size=80%
> > ...
> > 
> > https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c3
> > 
> > Either one looks good to me. They would change however the data type of
> > 'l2-cache-size' from int to string in BlockdevOptionsQcow2, can we
> > actually do that?
> 
> I think we can do that with an alternate data type which accepts both an
> integer and a string. If we only had "max", we'd probably want to make
> it an enumeration instead of a free-form string, though. But with a %
> suffix, we'd probably need a string.
> 
> For me, either works fine.

I'm not sure if we want to support such syntax in QMP. It sounds like a
typical convenience option for human users.

> Apart from that, I have to say I think it would be a bit more useful if
> one would specify the area covered by the metadata caches as an absolute
> number instead of a relative one (I guess it's generally easier to know
> what area your applications will perform random accesses on than the
> relative size, but maybe that's just me).
> 
> Supporting that with cache-size is difficult, though, because how are
> you going to decide between whether the user is specifying the actual
> size of the cache or the area it covers?
> 
> So maybe it would make more sense to introduce a whole new set of
> options called {,l2-,refcount-}cache-coverage or something. These
> options would then accept both an absolute and a relative number.

If we want easy, make it easy: cache-coverage and apply it to both.

Kevin


pgpvbiNn_Ig2T.pgp
Description: PGP signature


Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-05 Thread Alberto Garcia
On Wed 05 Oct 2016 05:13:39 PM CEST, Max Reitz wrote:

 At least giving users a way to skip the math would be an
 improvement.  Would you be okay with an explicitly-set option like
 l2_cache_size=auto or =max that optimizes for performance at the
 expense of memory?
>>>
>> Frank Myhr's suggestion (in bugzilla) is that we allow specifying a %
>> of the disk size, so
>> 
>> l2-cache-size=100%  (that would be cache-size=max)
>> l2-cache-size=80%

> For me, either works fine.
>
> Apart from that, I have to say I think it would be a bit more useful
> if one would specify the area covered by the metadata caches as an
> absolute number instead of a relative one (I guess it's generally
> easier to know what area your applications will perform random
> accesses on than the relative size, but maybe that's just me).

I'm not sure if I'm following you, can you give an example of what the
area covered by the cache exactly means?

Berto



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-05 Thread Max Reitz
On 05.10.2016 16:57, Alberto Garcia wrote:
> On Tue 04 Oct 2016 05:51:26 PM CEST, Max Reitz wrote:
> 
>>> At least giving users a way to skip the math would be an improvement.
>>> Would you be okay with an explicitly-set option like
>>> l2_cache_size=auto or =max that optimizes for performance at the
>>> expense of memory?
>>
>> That (cache-size=max) is actually something Stefan Hajnoczi has
>> proposed at KVM Forum 2015.
>>
>> I agree that implementing the formula in qemu's qcow2 driver itself
>> makes sense and will help users; however, I do think it is appropriate
>> to expect the user to pass cache-size=max if they need it.
> 
> Frank Myhr's suggestion (in bugzilla) is that we allow specifying a % of
> the disk size, so
> 
> l2-cache-size=100%  (that would be cache-size=max)
> l2-cache-size=80%
> ...
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c3
> 
> Either one looks good to me. They would change however the data type of
> 'l2-cache-size' from int to string in BlockdevOptionsQcow2, can we
> actually do that?

I think we can do that with an alternate data type which accepts both an
integer and a string. If we only had "max", we'd probably want to make
it an enumeration instead of a free-form string, though. But with a %
suffix, we'd probably need a string.

For me, either works fine.

Apart from that, I have to say I think it would be a bit more useful if
one would specify the area covered by the metadata caches as an absolute
number instead of a relative one (I guess it's generally easier to know
what area your applications will perform random accesses on than the
relative size, but maybe that's just me).

Supporting that with cache-size is difficult, though, because how are
you going to decide between whether the user is specifying the actual
size of the cache or the area it covers?

So maybe it would make more sense to introduce a whole new set of
options called {,l2-,refcount-}cache-coverage or something. These
options would then accept both an absolute and a relative number.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-05 Thread Alberto Garcia
On Tue 04 Oct 2016 05:51:26 PM CEST, Max Reitz wrote:

>> At least giving users a way to skip the math would be an improvement.
>> Would you be okay with an explicitly-set option like
>> l2_cache_size=auto or =max that optimizes for performance at the
>> expense of memory?
>
> That (cache-size=max) is actually something Stefan Hajnoczi has
> proposed at KVM Forum 2015.
>
> I agree that implementing the formula in qemu's qcow2 driver itself
> makes sense and will help users; however, I do think it is appropriate
> to expect the user to pass cache-size=max if they need it.

Frank Myhr's suggestion (in bugzilla) is that we allow specifying a % of
the disk size, so

l2-cache-size=100%  (that would be cache-size=max)
l2-cache-size=80%
...

https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c3

Either one looks good to me. They would change however the data type of
'l2-cache-size' from int to string in BlockdevOptionsQcow2, can we
actually do that?

Berto



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-04 Thread Max Reitz
On 04.10.2016 17:36, Ed Swierk wrote:
> On Tue, Oct 4, 2016 at 7:31 AM, Alberto Garcia  wrote:
>> That might be a lot of memory if the image is big. 1 TB qcow2 image ==
>> 128 MB L2 cache.
>>
>> See https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c2
>>
>> If we want to add this feature, I think I'd rather make it explicit.
> 
> I agree explicit is generally better than automagical, but unlike say
> the VM RAM size, the qcow L2 table cache is an implementation detail
> no user should be expected to know exists, let alone needs tweaking
> according to a specific formula to avoid an arbitrary performance
> cliff.

I think a user can be expected to know it when they really want optimal
performance for huge workloads.

> At least giving users a way to skip the math would be an improvement.
> Would you be okay with an explicitly-set option like
> l2_cache_size=auto or =max that optimizes for performance at the
> expense of memory?

That (cache-size=max) is actually something Stefan Hajnoczi has proposed
at KVM Forum 2015.

I agree that implementing the formula in qemu's qcow2 driver itself
makes sense and will help users; however, I do think it is appropriate
to expect the user to pass cache-size=max if they need it.

We shouldn't fix this lack of knowledge by making cache-size=max the
default but by documenting it properly (although I'm yet unsure where to
do that, other than in some blog...). Also, we have the
cache-clean-interval option which makes the qcow2 driver automatically
clean up unused entries in the metadata caches, which would be very
useful for cache-size=max; we thus shouldn't forget to mention that in
the documentation as well.

Max



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-04 Thread Ed Swierk
On Tue, Oct 4, 2016 at 7:31 AM, Alberto Garcia  wrote:
> That might be a lot of memory if the image is big. 1 TB qcow2 image ==
> 128 MB L2 cache.
>
> See https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c2
>
> If we want to add this feature, I think I'd rather make it explicit.

I agree explicit is generally better than automagical, but unlike say
the VM RAM size, the qcow L2 table cache is an implementation detail
no user should be expected to know exists, let alone needs tweaking
according to a specific formula to avoid an arbitrary performance
cliff.

At least giving users a way to skip the math would be an improvement.
Would you be okay with an explicitly-set option like
l2_cache_size=auto or =max that optimizes for performance at the
expense of memory?

--Ed



Re: [Qemu-devel] [Qemu-block] [PATCH] qcow2: Optimize L2 table cache size based on image and cluster sizes

2016-10-04 Thread Alberto Garcia
On Tue 04 Oct 2016 03:39:52 PM CEST, Ed Swierk wrote:

> This change replaces the hardcoded default L2 table cache size with a
> value computed from the image size and cluster size. The resulting
> default cache size is just large enough to service random accesses in
> the entire image.

That might be a lot of memory if the image is big. 1 TB qcow2 image ==
128 MB L2 cache.

See https://bugzilla.redhat.com/show_bug.cgi?id=1377735#c2

If we want to add this feature, I think I'd rather make it explicit.

Berto