Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-05-04 Thread Max Reitz
On 30.04.20 15:56, Denis Plotnikov wrote:
> 
> 
> On 30.04.2020 14:47, Max Reitz wrote:
>> On 30.04.20 11:48, Denis Plotnikov wrote:
>>>
>>> On 30.04.2020 11:26, Max Reitz wrote:
 On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote:
> 29.04.2020 15:17, Max Reitz wrote:
>> On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.04.2020 13:24, Max Reitz wrote:
 On 28.04.20 22:00, Denis Plotnikov wrote:
> zstd significantly reduces cluster compression time.
> It provides better compression performance maintaining
> the same level of the compression ratio in comparison with
> zlib, which, at the moment, is the only compression
> method available.
>
> The performance test results:
> Test compresses and decompresses qemu qcow2 image with just
> installed rhel-7.6 guest.
> Image cluster size: 64K. Image on disk size: 2.2G
>
> The test was conducted with brd disk to reduce the influence
> of disk subsystem to the test results.
> The results is given in seconds.
>
> compress cmd:
>   time ./qemu-img convert -O qcow2 -c -o
> compression_type=[zlib|zstd]
>   src.img [zlib|zstd]_compressed.img
> decompress cmd
>   time ./qemu-img convert -O qcow2
>   [zlib|zstd]_compressed.img uncompressed.img
>
>    compression   decompression
>  zlib   zstd   zlib zstd
> 
> real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
> user 65.0   15.8    5.3  2.5
> sys   3.3    0.2    2.0  2.0
>
> Both ZLIB and ZSTD gave the same compression ratio: 1.57
> compressed image size in both cases: 1.4G
>
> Signed-off-by: Denis Plotnikov 
> QAPI part:
> Acked-by: Markus Armbruster 
> ---
>  docs/interop/qcow2.txt |   1 +
>  configure  |   2 +-
>  qapi/block-core.json   |   3 +-
>  block/qcow2-threads.c  | 169
> +
>  block/qcow2.c  |   7 ++
>  slirp  |   2 +-
>  6 files changed, 181 insertions(+), 3 deletions(-)
 [...]

> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
> index 7dbaf53489..a0b12e1b15 100644
> --- a/block/qcow2-threads.c
> +++ b/block/qcow2-threads.c
 [...]

> +static ssize_t qcow2_zstd_decompress(void *dest, size_t
> dest_size,
> + const void *src, size_t
> src_size)
> +{
 [...]

> +    /*
> + * The compressed stream from the input buffer may consist of
> more
> + * than one zstd frame.
 Can it?
>>> If not, we must require it in the specification.
>> Actually, now that you mention it, it would make sense anyway to add
>> some note to the specification on what exactly compressed with zstd
>> means.
>>
>>> Hmm. If at some point
>>> we'll want multi-threaded compression of one big (2M) cluster..
>>> Could
>>> this be implemented with zstd lib, if multiple frames are allowed,
>>> will
>>> allowing multiple frames help? I don't know actually, but I think
>>> better
>>> not to forbid it. On the other hand, I don't see any benefit in
>>> large
>>> compressed clusters. At least, in our scenarios (for compressed
>>> backups)
>>> we use 64k compressed clusters, for good granularity of incremental
>>> backups (when for running vm we use 1M clusters).
>> Is it really that important?  Naïvely, it sounds rather
>> complicated to
>> introduce multithreading into block drivers.
> It is already here: compression and encryption already multithreaded.
> But of course, one cluster is handled in one thread.
 Ah, good.  I forgot.

>> (Also, as for compression, it can only be used in backup scenarios
>> anyway, where you write many clusters at once.  So parallelism on the
>> cluster level should sufficient to get high usage, and it would
>> benefit
>> all compression types and cluster sizes.)
>>
> Yes it works in this way already :)
 Well, OK then.

> So, we don't know do we want one frame restriction or not. Do you
> have a
> preference?
 *shrug*

 Seems like it would be preferential to allow multiple frames still.  A
 note in the spec would be nice (i.e., streaming format, multiple frames
 per cluster possible).
>>> We don't mention anything about zlib compressing details 

Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-30 Thread Denis Plotnikov




On 30.04.2020 14:47, Max Reitz wrote:

On 30.04.20 11:48, Denis Plotnikov wrote:


On 30.04.2020 11:26, Max Reitz wrote:

On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote:

29.04.2020 15:17, Max Reitz wrote:

On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:

29.04.2020 13:24, Max Reitz wrote:

On 28.04.20 22:00, Denis Plotnikov wrote:

zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.

The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
  time ./qemu-img convert -O qcow2 -c -o
compression_type=[zlib|zstd]
  src.img [zlib|zstd]_compressed.img
decompress cmd
  time ./qemu-img convert -O qcow2
  [zlib|zstd]_compressed.img uncompressed.img

   compression   decompression
     zlib   zstd   zlib zstd

real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
user 65.0   15.8    5.3  2.5
sys   3.3    0.2    2.0  2.0

Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G

Signed-off-by: Denis Plotnikov 
QAPI part:
Acked-by: Markus Armbruster 
---
     docs/interop/qcow2.txt |   1 +
     configure  |   2 +-
     qapi/block-core.json   |   3 +-
     block/qcow2-threads.c  | 169
+
     block/qcow2.c  |   7 ++
     slirp  |   2 +-
     6 files changed, 181 insertions(+), 3 deletions(-)

[...]


diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 7dbaf53489..a0b12e1b15 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c

[...]


+static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
+ const void *src, size_t
src_size)
+{

[...]


+    /*
+ * The compressed stream from the input buffer may consist of
more
+ * than one zstd frame.

Can it?

If not, we must require it in the specification.

Actually, now that you mention it, it would make sense anyway to add
some note to the specification on what exactly compressed with zstd
means.


Hmm. If at some point
we'll want multi-threaded compression of one big (2M) cluster.. Could
this be implemented with zstd lib, if multiple frames are allowed,
will
allowing multiple frames help? I don't know actually, but I think
better
not to forbid it. On the other hand, I don't see any benefit in large
compressed clusters. At least, in our scenarios (for compressed
backups)
we use 64k compressed clusters, for good granularity of incremental
backups (when for running vm we use 1M clusters).

Is it really that important?  Naïvely, it sounds rather complicated to
introduce multithreading into block drivers.

It is already here: compression and encryption already multithreaded.
But of course, one cluster is handled in one thread.

Ah, good.  I forgot.


(Also, as for compression, it can only be used in backup scenarios
anyway, where you write many clusters at once.  So parallelism on the
cluster level should sufficient to get high usage, and it would benefit
all compression types and cluster sizes.)


Yes it works in this way already :)

Well, OK then.


So, we don't know do we want one frame restriction or not. Do you have a
preference?

*shrug*

Seems like it would be preferential to allow multiple frames still.  A
note in the spec would be nice (i.e., streaming format, multiple frames
per cluster possible).

We don't mention anything about zlib compressing details in the spec.

Yep.  Which is bad enough.


If we mention anything about ZSTD compressing details we'll have to do
it for
zlib as well.

I personally don’t like “If you can’t make it perfect, you shouldn’t do
it at all”.  Mentioning it for zstd would still be an improvement.


Good approach. I like it. But I'm trying to be cautious.


Also, I believe that “zlib compression” is pretty much unambiguous,
considering all the code does is call deflate()/inflate().

I’m not saying we need to amend the spec in this series, but I don’t see
a good argument against doing so at some point.


So, I think we have two possibilities for the spec:
1. mention for both
2. don't mention at all

I think the 2nd is better. It gives more degree of freedom for the
future improvements.

No, it absolutely doesn’t.  There is a de-facto format, namely what qemu
accepts.  Changing that would be an incompatible change.  Just because
we don’t write what’s allowed into the spec doesn’t change that 

Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-30 Thread Max Reitz
On 30.04.20 11:48, Denis Plotnikov wrote:
> 
> 
> On 30.04.2020 11:26, Max Reitz wrote:
>> On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.04.2020 15:17, Max Reitz wrote:
 On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:
> 29.04.2020 13:24, Max Reitz wrote:
>> On 28.04.20 22:00, Denis Plotnikov wrote:
>>> zstd significantly reduces cluster compression time.
>>> It provides better compression performance maintaining
>>> the same level of the compression ratio in comparison with
>>> zlib, which, at the moment, is the only compression
>>> method available.
>>>
>>> The performance test results:
>>> Test compresses and decompresses qemu qcow2 image with just
>>> installed rhel-7.6 guest.
>>> Image cluster size: 64K. Image on disk size: 2.2G
>>>
>>> The test was conducted with brd disk to reduce the influence
>>> of disk subsystem to the test results.
>>> The results is given in seconds.
>>>
>>> compress cmd:
>>>  time ./qemu-img convert -O qcow2 -c -o
>>> compression_type=[zlib|zstd]
>>>  src.img [zlib|zstd]_compressed.img
>>> decompress cmd
>>>  time ./qemu-img convert -O qcow2
>>>  [zlib|zstd]_compressed.img uncompressed.img
>>>
>>>   compression   decompression
>>>     zlib   zstd   zlib zstd
>>> 
>>> real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
>>> user 65.0   15.8    5.3  2.5
>>> sys   3.3    0.2    2.0  2.0
>>>
>>> Both ZLIB and ZSTD gave the same compression ratio: 1.57
>>> compressed image size in both cases: 1.4G
>>>
>>> Signed-off-by: Denis Plotnikov 
>>> QAPI part:
>>> Acked-by: Markus Armbruster 
>>> ---
>>>     docs/interop/qcow2.txt |   1 +
>>>     configure  |   2 +-
>>>     qapi/block-core.json   |   3 +-
>>>     block/qcow2-threads.c  | 169
>>> +
>>>     block/qcow2.c  |   7 ++
>>>     slirp  |   2 +-
>>>     6 files changed, 181 insertions(+), 3 deletions(-)
>> [...]
>>
>>> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
>>> index 7dbaf53489..a0b12e1b15 100644
>>> --- a/block/qcow2-threads.c
>>> +++ b/block/qcow2-threads.c
>> [...]
>>
>>> +static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
>>> + const void *src, size_t
>>> src_size)
>>> +{
>> [...]
>>
>>> +    /*
>>> + * The compressed stream from the input buffer may consist of
>>> more
>>> + * than one zstd frame.
>> Can it?
> If not, we must require it in the specification.
 Actually, now that you mention it, it would make sense anyway to add
 some note to the specification on what exactly compressed with zstd
 means.

> Hmm. If at some point
> we'll want multi-threaded compression of one big (2M) cluster.. Could
> this be implemented with zstd lib, if multiple frames are allowed,
> will
> allowing multiple frames help? I don't know actually, but I think
> better
> not to forbid it. On the other hand, I don't see any benefit in large
> compressed clusters. At least, in our scenarios (for compressed
> backups)
> we use 64k compressed clusters, for good granularity of incremental
> backups (when for running vm we use 1M clusters).
 Is it really that important?  Naïvely, it sounds rather complicated to
 introduce multithreading into block drivers.
>>> It is already here: compression and encryption already multithreaded.
>>> But of course, one cluster is handled in one thread.
>> Ah, good.  I forgot.
>>
 (Also, as for compression, it can only be used in backup scenarios
 anyway, where you write many clusters at once.  So parallelism on the
 cluster level should sufficient to get high usage, and it would benefit
 all compression types and cluster sizes.)

>>> Yes it works in this way already :)
>> Well, OK then.
>>
>>> So, we don't know do we want one frame restriction or not. Do you have a
>>> preference?
>> *shrug*
>>
>> Seems like it would be preferential to allow multiple frames still.  A
>> note in the spec would be nice (i.e., streaming format, multiple frames
>> per cluster possible).
> 
> We don't mention anything about zlib compressing details in the spec.

Yep.  Which is bad enough.

> If we mention anything about ZSTD compressing details we'll have to do
> it for
> zlib as well.

I personally don’t like “If you can’t make it perfect, you shouldn’t do
it at all”.  Mentioning it for zstd would still be an improvement.

Also, I believe that “zlib compression” is pretty much unambiguous,
considering 

Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-30 Thread Denis Plotnikov




On 30.04.2020 11:26, Max Reitz wrote:

On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote:

29.04.2020 15:17, Max Reitz wrote:

On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:

29.04.2020 13:24, Max Reitz wrote:

On 28.04.20 22:00, Denis Plotnikov wrote:

zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.

The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
     time ./qemu-img convert -O qcow2 -c -o
compression_type=[zlib|zstd]
     src.img [zlib|zstd]_compressed.img
decompress cmd
     time ./qemu-img convert -O qcow2
     [zlib|zstd]_compressed.img uncompressed.img

  compression   decompression
    zlib   zstd   zlib zstd

real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
user 65.0   15.8    5.3  2.5
sys   3.3    0.2    2.0  2.0

Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G

Signed-off-by: Denis Plotnikov 
QAPI part:
Acked-by: Markus Armbruster 
---
    docs/interop/qcow2.txt |   1 +
    configure  |   2 +-
    qapi/block-core.json   |   3 +-
    block/qcow2-threads.c  | 169
+
    block/qcow2.c  |   7 ++
    slirp  |   2 +-
    6 files changed, 181 insertions(+), 3 deletions(-)

[...]


diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 7dbaf53489..a0b12e1b15 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c

[...]


+static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
+ const void *src, size_t
src_size)
+{

[...]


+    /*
+ * The compressed stream from the input buffer may consist of
more
+ * than one zstd frame.

Can it?

If not, we must require it in the specification.

Actually, now that you mention it, it would make sense anyway to add
some note to the specification on what exactly compressed with zstd
means.


Hmm. If at some point
we'll want multi-threaded compression of one big (2M) cluster.. Could
this be implemented with zstd lib, if multiple frames are allowed, will
allowing multiple frames help? I don't know actually, but I think better
not to forbid it. On the other hand, I don't see any benefit in large
compressed clusters. At least, in our scenarios (for compressed backups)
we use 64k compressed clusters, for good granularity of incremental
backups (when for running vm we use 1M clusters).

Is it really that important?  Naïvely, it sounds rather complicated to
introduce multithreading into block drivers.

It is already here: compression and encryption already multithreaded.
But of course, one cluster is handled in one thread.

Ah, good.  I forgot.


(Also, as for compression, it can only be used in backup scenarios
anyway, where you write many clusters at once.  So parallelism on the
cluster level should sufficient to get high usage, and it would benefit
all compression types and cluster sizes.)


Yes it works in this way already :)

Well, OK then.


So, we don't know do we want one frame restriction or not. Do you have a
preference?

*shrug*

Seems like it would be preferential to allow multiple frames still.  A
note in the spec would be nice (i.e., streaming format, multiple frames
per cluster possible).


We don't mention anything about zlib compressing details in the spec.

If we mention anything about ZSTD compressing details we'll have to do 
it for

zlib as well. So, I think we have two possibilities for the spec:
1. mention for both
2. don't mention at all

I think the 2nd is better. It gives more degree of freedom for the 
future improvements.

and we've already covered both cases (one frame, may frames) in the code.
I'm note sure I'm right. Any other opinions?

Denis


Max






Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-30 Thread Max Reitz
On 29.04.20 15:02, Vladimir Sementsov-Ogievskiy wrote:
> 29.04.2020 15:17, Max Reitz wrote:
>> On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:
>>> 29.04.2020 13:24, Max Reitz wrote:
 On 28.04.20 22:00, Denis Plotnikov wrote:
> zstd significantly reduces cluster compression time.
> It provides better compression performance maintaining
> the same level of the compression ratio in comparison with
> zlib, which, at the moment, is the only compression
> method available.
>
> The performance test results:
> Test compresses and decompresses qemu qcow2 image with just
> installed rhel-7.6 guest.
> Image cluster size: 64K. Image on disk size: 2.2G
>
> The test was conducted with brd disk to reduce the influence
> of disk subsystem to the test results.
> The results is given in seconds.
>
> compress cmd:
>     time ./qemu-img convert -O qcow2 -c -o
> compression_type=[zlib|zstd]
>     src.img [zlib|zstd]_compressed.img
> decompress cmd
>     time ./qemu-img convert -O qcow2
>     [zlib|zstd]_compressed.img uncompressed.img
>
>  compression   decompression
>    zlib   zstd   zlib zstd
> 
> real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
> user 65.0   15.8    5.3  2.5
> sys   3.3    0.2    2.0  2.0
>
> Both ZLIB and ZSTD gave the same compression ratio: 1.57
> compressed image size in both cases: 1.4G
>
> Signed-off-by: Denis Plotnikov 
> QAPI part:
> Acked-by: Markus Armbruster 
> ---
>    docs/interop/qcow2.txt |   1 +
>    configure  |   2 +-
>    qapi/block-core.json   |   3 +-
>    block/qcow2-threads.c  | 169
> +
>    block/qcow2.c  |   7 ++
>    slirp  |   2 +-
>    6 files changed, 181 insertions(+), 3 deletions(-)

 [...]

> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
> index 7dbaf53489..a0b12e1b15 100644
> --- a/block/qcow2-threads.c
> +++ b/block/qcow2-threads.c

 [...]

> +static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
> + const void *src, size_t
> src_size)
> +{

 [...]

> +    /*
> + * The compressed stream from the input buffer may consist of
> more
> + * than one zstd frame.

 Can it?
>>>
>>> If not, we must require it in the specification.
>>
>> Actually, now that you mention it, it would make sense anyway to add
>> some note to the specification on what exactly compressed with zstd
>> means.
>>
>>> Hmm. If at some point
>>> we'll want multi-threaded compression of one big (2M) cluster.. Could
>>> this be implemented with zstd lib, if multiple frames are allowed, will
>>> allowing multiple frames help? I don't know actually, but I think better
>>> not to forbid it. On the other hand, I don't see any benefit in large
>>> compressed clusters. At least, in our scenarios (for compressed backups)
>>> we use 64k compressed clusters, for good granularity of incremental
>>> backups (when for running vm we use 1M clusters).
>>
>> Is it really that important?  Naïvely, it sounds rather complicated to
>> introduce multithreading into block drivers.
> 
> It is already here: compression and encryption already multithreaded.
> But of course, one cluster is handled in one thread.

Ah, good.  I forgot.

>> (Also, as for compression, it can only be used in backup scenarios
>> anyway, where you write many clusters at once.  So parallelism on the
>> cluster level should sufficient to get high usage, and it would benefit
>> all compression types and cluster sizes.)
>>
> 
> Yes it works in this way already :)

Well, OK then.

> So, we don't know do we want one frame restriction or not. Do you have a
> preference?

*shrug*

Seems like it would be preferential to allow multiple frames still.  A
note in the spec would be nice (i.e., streaming format, multiple frames
per cluster possible).

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-29 Thread Eric Blake

On 4/29/20 8:02 AM, Vladimir Sementsov-Ogievskiy wrote:


+    /*
+ * The compressed stream from the input buffer may consist of 
more

+ * than one zstd frame.


Can it?


If not, we must require it in the specification.


Actually, now that you mention it, it would make sense anyway to add
some note to the specification on what exactly compressed with zstd 
means.


So, we don't know do we want one frame restriction or not. Do you have a 
preference?




I'm a fan of 'be strict in what you produce, liberal in what you 
accept'. While our implementation always produces only a single frame of 
compressed data, I think our decoder should be prepared to see more than 
one frame from other implementations, as that is more liberal than 
tightening the specification to require that encoding must produce 
exactly one frame.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-29 Thread Vladimir Sementsov-Ogievskiy

29.04.2020 15:17, Max Reitz wrote:

On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:

29.04.2020 13:24, Max Reitz wrote:

On 28.04.20 22:00, Denis Plotnikov wrote:

zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.

The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
    time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
    src.img [zlib|zstd]_compressed.img
decompress cmd
    time ./qemu-img convert -O qcow2
    [zlib|zstd]_compressed.img uncompressed.img

     compression   decompression
   zlib   zstd   zlib zstd

real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
user 65.0   15.8    5.3  2.5
sys   3.3    0.2    2.0  2.0

Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G

Signed-off-by: Denis Plotnikov 
QAPI part:
Acked-by: Markus Armbruster 
---
   docs/interop/qcow2.txt |   1 +
   configure  |   2 +-
   qapi/block-core.json   |   3 +-
   block/qcow2-threads.c  | 169 +
   block/qcow2.c  |   7 ++
   slirp  |   2 +-
   6 files changed, 181 insertions(+), 3 deletions(-)


[...]


diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 7dbaf53489..a0b12e1b15 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c


[...]


+static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
+ const void *src, size_t src_size)
+{


[...]


+    /*
+ * The compressed stream from the input buffer may consist of more
+ * than one zstd frame.


Can it?


If not, we must require it in the specification.


Actually, now that you mention it, it would make sense anyway to add
some note to the specification on what exactly compressed with zstd means.


Hmm. If at some point
we'll want multi-threaded compression of one big (2M) cluster.. Could
this be implemented with zstd lib, if multiple frames are allowed, will
allowing multiple frames help? I don't know actually, but I think better
not to forbid it. On the other hand, I don't see any benefit in large
compressed clusters. At least, in our scenarios (for compressed backups)
we use 64k compressed clusters, for good granularity of incremental
backups (when for running vm we use 1M clusters).


Is it really that important?  Naïvely, it sounds rather complicated to
introduce multithreading into block drivers.


It is already here: compression and encryption already multithreaded.
But of course, one cluster is handled in one thread.



(Also, as for compression, it can only be used in backup scenarios
anyway, where you write many clusters at once.  So parallelism on the
cluster level should sufficient to get high usage, and it would benefit
all compression types and cluster sizes.)



Yes it works in this way already :)

===

So, we don't know do we want one frame restriction or not. Do you have a 
preference?

--
Best regards,
Vladimir



Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-29 Thread Max Reitz
On 29.04.20 12:38, Denis Plotnikov wrote:
> 
> 
> On 29.04.2020 13:24, Max Reitz wrote:
>> On 28.04.20 22:00, Denis Plotnikov wrote:
>>> zstd significantly reduces cluster compression time.
>>> It provides better compression performance maintaining
>>> the same level of the compression ratio in comparison with
>>> zlib, which, at the moment, is the only compression
>>> method available.
>>>
>>> The performance test results:
>>> Test compresses and decompresses qemu qcow2 image with just
>>> installed rhel-7.6 guest.
>>> Image cluster size: 64K. Image on disk size: 2.2G
>>>
>>> The test was conducted with brd disk to reduce the influence
>>> of disk subsystem to the test results.
>>> The results is given in seconds.
>>>
>>> compress cmd:
>>>    time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
>>>    src.img [zlib|zstd]_compressed.img
>>> decompress cmd
>>>    time ./qemu-img convert -O qcow2
>>>    [zlib|zstd]_compressed.img uncompressed.img
>>>
>>>     compression   decompression
>>>   zlib   zstd   zlib zstd
>>> 
>>> real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
>>> user 65.0   15.8    5.3  2.5
>>> sys   3.3    0.2    2.0  2.0
>>>
>>> Both ZLIB and ZSTD gave the same compression ratio: 1.57
>>> compressed image size in both cases: 1.4G
>>>
>>> Signed-off-by: Denis Plotnikov 
>>> QAPI part:
>>> Acked-by: Markus Armbruster 
>>> ---
>>>   docs/interop/qcow2.txt |   1 +
>>>   configure  |   2 +-
>>>   qapi/block-core.json   |   3 +-
>>>   block/qcow2-threads.c  | 169 +
>>>   block/qcow2.c  |   7 ++
>>>   slirp  |   2 +-
>>>   6 files changed, 181 insertions(+), 3 deletions(-)
>> [...]
>>
>>> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
>>> index 7dbaf53489..a0b12e1b15 100644
>>> --- a/block/qcow2-threads.c
>>> +++ b/block/qcow2-threads.c
>> [...]
>>
>>> +static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
>>> + const void *src, size_t src_size)
>>> +{
>> [...]
>>
>>> +    /*
>>> + * The compressed stream from the input buffer may consist of more
>>> + * than one zstd frame.
>> Can it?
> 
> Potentially, it can, if another implemention of qcow2 saves a couple of
> frames for some reason.

Well, it can’t do that if qemu cannot decode it.  Maybe there should be
a note in the specification on the precise format to expect.

If we decide that it might make sense for someone to encode a cluster
with multiple frames, then OK.

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-29 Thread Max Reitz
On 29.04.20 12:37, Vladimir Sementsov-Ogievskiy wrote:
> 29.04.2020 13:24, Max Reitz wrote:
>> On 28.04.20 22:00, Denis Plotnikov wrote:
>>> zstd significantly reduces cluster compression time.
>>> It provides better compression performance maintaining
>>> the same level of the compression ratio in comparison with
>>> zlib, which, at the moment, is the only compression
>>> method available.
>>>
>>> The performance test results:
>>> Test compresses and decompresses qemu qcow2 image with just
>>> installed rhel-7.6 guest.
>>> Image cluster size: 64K. Image on disk size: 2.2G
>>>
>>> The test was conducted with brd disk to reduce the influence
>>> of disk subsystem to the test results.
>>> The results is given in seconds.
>>>
>>> compress cmd:
>>>    time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
>>>    src.img [zlib|zstd]_compressed.img
>>> decompress cmd
>>>    time ./qemu-img convert -O qcow2
>>>    [zlib|zstd]_compressed.img uncompressed.img
>>>
>>>     compression   decompression
>>>   zlib   zstd   zlib zstd
>>> 
>>> real 65.5   16.3 (-75 %)    1.9  1.6 (-16 %)
>>> user 65.0   15.8    5.3  2.5
>>> sys   3.3    0.2    2.0  2.0
>>>
>>> Both ZLIB and ZSTD gave the same compression ratio: 1.57
>>> compressed image size in both cases: 1.4G
>>>
>>> Signed-off-by: Denis Plotnikov 
>>> QAPI part:
>>> Acked-by: Markus Armbruster 
>>> ---
>>>   docs/interop/qcow2.txt |   1 +
>>>   configure  |   2 +-
>>>   qapi/block-core.json   |   3 +-
>>>   block/qcow2-threads.c  | 169 +
>>>   block/qcow2.c  |   7 ++
>>>   slirp  |   2 +-
>>>   6 files changed, 181 insertions(+), 3 deletions(-)
>>
>> [...]
>>
>>> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
>>> index 7dbaf53489..a0b12e1b15 100644
>>> --- a/block/qcow2-threads.c
>>> +++ b/block/qcow2-threads.c
>>
>> [...]
>>
>>> +static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
>>> + const void *src, size_t src_size)
>>> +{
>>
>> [...]
>>
>>> +    /*
>>> + * The compressed stream from the input buffer may consist of more
>>> + * than one zstd frame.
>>
>> Can it?
> 
> If not, we must require it in the specification.

Actually, now that you mention it, it would make sense anyway to add
some note to the specification on what exactly compressed with zstd means.

> Hmm. If at some point
> we'll want multi-threaded compression of one big (2M) cluster.. Could
> this be implemented with zstd lib, if multiple frames are allowed, will
> allowing multiple frames help? I don't know actually, but I think better
> not to forbid it. On the other hand, I don't see any benefit in large
> compressed clusters. At least, in our scenarios (for compressed backups)
> we use 64k compressed clusters, for good granularity of incremental
> backups (when for running vm we use 1M clusters).

Is it really that important?  Naïvely, it sounds rather complicated to
introduce multithreading into block drivers.

(Also, as for compression, it can only be used in backup scenarios
anyway, where you write many clusters at once.  So parallelism on the
cluster level should sufficient to get high usage, and it would benefit
all compression types and cluster sizes.)

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-29 Thread Denis Plotnikov




On 29.04.2020 13:24, Max Reitz wrote:

On 28.04.20 22:00, Denis Plotnikov wrote:

zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.

The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
   time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
   src.img [zlib|zstd]_compressed.img
decompress cmd
   time ./qemu-img convert -O qcow2
   [zlib|zstd]_compressed.img uncompressed.img

compression   decompression
  zlib   zstd   zlib zstd

real 65.5   16.3 (-75 %)1.9  1.6 (-16 %)
user 65.0   15.85.3  2.5
sys   3.30.22.0  2.0

Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G

Signed-off-by: Denis Plotnikov 
QAPI part:
Acked-by: Markus Armbruster 
---
  docs/interop/qcow2.txt |   1 +
  configure  |   2 +-
  qapi/block-core.json   |   3 +-
  block/qcow2-threads.c  | 169 +
  block/qcow2.c  |   7 ++
  slirp  |   2 +-
  6 files changed, 181 insertions(+), 3 deletions(-)

[...]


diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 7dbaf53489..a0b12e1b15 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c

[...]


+static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
+ const void *src, size_t src_size)
+{

[...]


+/*
+ * The compressed stream from the input buffer may consist of more
+ * than one zstd frame.

Can it?


Potentially, it can, if another implemention of qcow2 saves a couple of 
frames for some reason.


Denis


Max






Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-29 Thread Vladimir Sementsov-Ogievskiy

29.04.2020 13:24, Max Reitz wrote:

On 28.04.20 22:00, Denis Plotnikov wrote:

zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.

The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
   time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
   src.img [zlib|zstd]_compressed.img
decompress cmd
   time ./qemu-img convert -O qcow2
   [zlib|zstd]_compressed.img uncompressed.img

compression   decompression
  zlib   zstd   zlib zstd

real 65.5   16.3 (-75 %)1.9  1.6 (-16 %)
user 65.0   15.85.3  2.5
sys   3.30.22.0  2.0

Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G

Signed-off-by: Denis Plotnikov 
QAPI part:
Acked-by: Markus Armbruster 
---
  docs/interop/qcow2.txt |   1 +
  configure  |   2 +-
  qapi/block-core.json   |   3 +-
  block/qcow2-threads.c  | 169 +
  block/qcow2.c  |   7 ++
  slirp  |   2 +-
  6 files changed, 181 insertions(+), 3 deletions(-)


[...]


diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
index 7dbaf53489..a0b12e1b15 100644
--- a/block/qcow2-threads.c
+++ b/block/qcow2-threads.c


[...]


+static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
+ const void *src, size_t src_size)
+{


[...]


+/*
+ * The compressed stream from the input buffer may consist of more
+ * than one zstd frame.


Can it?


If not, we must require it in the specification. Hmm. If at some point we'll 
want multi-threaded compression of one big (2M) cluster.. Could this be 
implemented with zstd lib, if multiple frames are allowed, will allowing 
multiple frames help? I don't know actually, but I think better not to forbid 
it. On the other hand, I don't see any benefit in large compressed clusters. At 
least, in our scenarios (for compressed backups) we use 64k compressed 
clusters, for good granularity of incremental backups (when for running vm we 
use 1M clusters).



--
Best regards,
Vladimir



Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-29 Thread Max Reitz
On 28.04.20 22:00, Denis Plotnikov wrote:
> zstd significantly reduces cluster compression time.
> It provides better compression performance maintaining
> the same level of the compression ratio in comparison with
> zlib, which, at the moment, is the only compression
> method available.
> 
> The performance test results:
> Test compresses and decompresses qemu qcow2 image with just
> installed rhel-7.6 guest.
> Image cluster size: 64K. Image on disk size: 2.2G
> 
> The test was conducted with brd disk to reduce the influence
> of disk subsystem to the test results.
> The results is given in seconds.
> 
> compress cmd:
>   time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
>   src.img [zlib|zstd]_compressed.img
> decompress cmd
>   time ./qemu-img convert -O qcow2
>   [zlib|zstd]_compressed.img uncompressed.img
> 
>compression   decompression
>  zlib   zstd   zlib zstd
> 
> real 65.5   16.3 (-75 %)1.9  1.6 (-16 %)
> user 65.0   15.85.3  2.5
> sys   3.30.22.0  2.0
> 
> Both ZLIB and ZSTD gave the same compression ratio: 1.57
> compressed image size in both cases: 1.4G
> 
> Signed-off-by: Denis Plotnikov 
> QAPI part:
> Acked-by: Markus Armbruster 
> ---
>  docs/interop/qcow2.txt |   1 +
>  configure  |   2 +-
>  qapi/block-core.json   |   3 +-
>  block/qcow2-threads.c  | 169 +
>  block/qcow2.c  |   7 ++
>  slirp  |   2 +-
>  6 files changed, 181 insertions(+), 3 deletions(-)

[...]

> diff --git a/block/qcow2-threads.c b/block/qcow2-threads.c
> index 7dbaf53489..a0b12e1b15 100644
> --- a/block/qcow2-threads.c
> +++ b/block/qcow2-threads.c

[...]

> +static ssize_t qcow2_zstd_decompress(void *dest, size_t dest_size,
> + const void *src, size_t src_size)
> +{

[...]

> +/*
> + * The compressed stream from the input buffer may consist of more
> + * than one zstd frame.

Can it?

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v22 3/4] qcow2: add zstd cluster compression

2020-04-28 Thread Eric Blake

On 4/28/20 3:00 PM, Denis Plotnikov wrote:

zstd significantly reduces cluster compression time.
It provides better compression performance maintaining
the same level of the compression ratio in comparison with
zlib, which, at the moment, is the only compression
method available.

The performance test results:
Test compresses and decompresses qemu qcow2 image with just
installed rhel-7.6 guest.
Image cluster size: 64K. Image on disk size: 2.2G

The test was conducted with brd disk to reduce the influence
of disk subsystem to the test results.
The results is given in seconds.

compress cmd:
   time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib|zstd]
   src.img [zlib|zstd]_compressed.img
decompress cmd
   time ./qemu-img convert -O qcow2
   [zlib|zstd]_compressed.img uncompressed.img

compression   decompression
  zlib   zstd   zlib zstd

real 65.5   16.3 (-75 %)1.9  1.6 (-16 %)
user 65.0   15.85.3  2.5
sys   3.30.22.0  2.0

Both ZLIB and ZSTD gave the same compression ratio: 1.57
compressed image size in both cases: 1.4G

Signed-off-by: Denis Plotnikov 
QAPI part:
Acked-by: Markus Armbruster 
---



+static ssize_t qcow2_zstd_compress(void *dest, size_t dest_size,
+   const void *src, size_t src_size)
+{
+ssize_t ret;
+size_t zstd_ret;
+ZSTD_outBuffer output = {
+.dst = dest,
+.size = dest_size,
+.pos = 0
+};



+
+/* make sure we can safely return compressed buffer size with ssize_t */
+assert(output.pos <= SSIZE_MAX);


Seems rather vague, since we know that .pos won't exceed .size which was 
initialized by dest_size which is <= 2M.  A tighter assertion:


assert(output.pos <= dest_size)

seems like it is more realistic to your real constraint (namely, that 
zstd did not overflow dest).



+ret = output.pos;
+out:
+ZSTD_freeCCtx(cctx);
+return ret;
+}
+



+++ b/slirp
@@ -1 +1 @@
-Subproject commit 2faae0f778f818fadc873308f983289df697eb93
+Subproject commit 55ab21c9a36852915b81f1b41ebaf3b6509dd8ba


Umm, you definitely don't want that.

A maintainer could touch up both of those, but I'm not sure which block 
maintainer will be accepting this series.  Maybe wait for that question 
to be answered before trying to post a v23.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org