Hey, I was benchmarking the qcow2 file format and I noticed that if the storage backend is fast enough then the overlap checks can have a very significant impact on performance.
One important bottleneck that applies to all images is in the refcount-block check, which scans the refcount table to see if a write request overlaps with an existing refcount block. The problem is that, unlike the L1 table, the refcount table occupies entire clusters. With the default cluster size of 64 KB, a normal refcount table has 8192 entries, all of which have to be checked for each write request. This is an expensive operation.

Refcounts are used for host clusters, and we need one refcount block (and therefore one entry in the refcount table) per 2 GB of qcow2 file. This means that the default refcount table can address up to 16 TB (we're talking about actual image size here, not virtual size). In other words: the vast majority of the entries in the refcount table will probably never be used, but we're still checking all of them on every write request.

One user reported a >200% performance increase on a fast SSD when using overlap-check=constant.

I think this is at least worth documenting a bit better (unless there's existing documentation that I have missed), but my main question is: does it make sense to try to optimize these checks, or is it better to simply tell the user to disable them in these scenarios?

Berto
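P.S. For reference, here's where those numbers come from with the default settings (refcount table entries are 8 bytes each, and I'm assuming the default refcount_bits=16):

   refcount table size:     1 cluster = 64 KB / 8 B per entry = 8192 entries
   refcounts per block:     64 KB / 2 B per refcount          = 32768 refcounts
   data per refcount block: 32768 host clusters * 64 KB       = 2 GB
   maximum addressable:     8192 entries * 2 GB               = 16 TB

And this is roughly how the check mode was being set for the tests (from memory, so double-check the exact syntax):

   qemu-system-x86_64 -drive file=hd.qcow2,format=qcow2,overlap-check=constant ...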