On Wed 21 Feb 2018 07:33:55 PM CET, Kevin Wolf wrote: [docs/qcow2-cache.txt] > While reviewing this, I read the whole document and stumbled across > these paragraphs: > >> The reason for this 1/4 ratio is to ensure that both caches cover the >> same amount of disk space. Note however that this is only valid with >> the default value of refcount_bits (16). If you are using a different >> value you might want to calculate both cache sizes yourself since >> QEMU will always use the same 1/4 ratio. > > Sounds like we should fix our defaults?
We could do that, yes, we would be "breaking" compatibility with previous versions, but we're not talking about semantic changes here, so it's not a big deal in and on itself. Of course this would be a problem if the new defaults make things work slower, but I don't think that's the case here. Just to confirm: do you suggest reducing the refcount cache size (because it's not so necessary) or changing the formula so the number of refcount_bits is taken into account so that both caches cover the same amount of disk space in all cases? I suppose it's the former based on what you write below. > While we're at it, would l2-cache-entry-size = MIN(cluster_size, 64k) > make sense as a default? Any reason why you choose 64k, or is it because it's the default cluster size? In general I'd be cautious to reduce the default entry size unconditionally because with rotating HDDs it may even have a negative (although slight) effect on performance. But the larger the cluster, the larger the area mapped by L2 entries, so we need less metadata and it makes more sense to read less in general. In summary: while I think it's probably a good idea, it would be good to make some tests before changing the default. >> It's also worth mentioning that there's no strict need for both >> caches to cover the same amount of disk space. The refcount cache is >> used much less often than the L2 cache, so it's perfectly reasonable >> to keep it small. > > More precisely, it is only used for cluster allocation, not for read > or for rewrites. Usually this means that it's indeed accessed a lot > less, though especially in benchmarks, this isn't necessarily less > often. > > However, the more important part is that even for allocating writes > with random I/O, the refcount cache is still accessed sequentially and > we don't really take advantage of having more than a single refcount > block in memory. This only stops being true as soon as you add > something that can free clusters (discards, overwriting compressed > cluster, deleting internal snapshots). > > We have a minimum refcount block cache size of 4 clusters because of > the possible recursion during refcount table growth, which leaves some > room to hold the refcount block for an occasional discard (and > subsequent reallocation). > > So should we default to this minimum on the grounds that for most > people, refcounts blocks are probably only accessed sequentially in > practice? The remaining memory of the total cache size seems to help > the average case more if it's added to the L2 cache instead. That sounds like a good idea. We should double check that the minimum is indeed the required minimum, and run some tests, but otherwise I'm all for it. I think I can take care of this if you want. Berto