Re: ImageInfo oddities regarding compression

2020-11-30 Thread Daniel P . Berrangé
On Fri, Nov 27, 2020 at 05:52:09PM +0100, Markus Armbruster wrote:
> Kevin Wolf  writes:
> 
> > Am 27.11.2020 um 13:21 hat Markus Armbruster geschrieben:
> >> >> I fell down this (thankfully shallow) rabbit hole because we also have
> >> >> 
> >> >> { 'enum': 'MultiFDCompression',
> >> >>   'data': [ 'none', 'zlib',
> >> >> { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
> >> >> 
> >> >> I wonder whether we could merge them into a common type.
> >> 
> >> Looks like we could: current code would never report the additional
> >> value 'none'.  Introspection would show it, though.  Seems unlikely to
> >> cause trouble.  Observation, not demand.
> >
> > Forgot to comment on this one...
> >
> > Technically we could probably, but would it make sense? Support for
> > compression formats has to be implemented separately for both cases, so
> > that they currently happen to support the same list is more of a
> > coincidence.
> >
> > If we ever add a third compression format to qcow2, would we add the
> > same format to migration, too, or would we split the schema into two
> > types again?
> 
> I figure if a compression method is worth adding to one, it's probably
> worth adding to the other.

I'm not entirely sure about that, as compression algorithms have different
tradeoffs.

For migration compression we're streaming large volumes of data in a live
session, both compression and decompression speed is very important.

For qcow2 we're handling relative small discrete blocks (a qcow2
cluster at a time) and decompression speed is much more important that
compression speed, since we generally compress once out of band, and
decompress many times while live.

So we could see new compression methods desired for migration that are
not relevant for qcow2, or vica-verca.

> Having two separate enums isn't too bad.  A third has been proposed[*],
> but I hope we can reuse migration's existing enum there.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




Re: ImageInfo oddities regarding compression

2020-11-30 Thread Eric Blake
On 11/27/20 4:32 AM, Kevin Wolf wrote:

>>>##
>>># @Qcow2CompressionType:
>>>#
>>># Compression type used in qcow2 image file
>>>#
>>># @zlib: zlib compression, see 
>>># @zstd: zstd compression, see 
>>>#
>>># Since: 5.1
>>>##
>>>{ 'enum': 'Qcow2CompressionType',
>>>  'data': [ 'zlib', { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
>>>
>>> Apparently, you can't have a qcow2 image without compression.  Correct?
>>>
>>> Can you imagine a use case for "without compression"?
>>
>> Yes & no. An image always has a compression type, either implicitly
>> zlib as the historical default, or explicitly as a type recorded in
>> the header.  This doesn't mean compression is used.
>>
>> There may be zero or more clusters actually using compression.
>> Typically compression will never be used, but there's still an
>> associated compression type regardless.
> 
> Right, so the correct answer to "is this image compressed?" is "unknown"
> for qcow2. Providing an answer would require checking all clusters in
> the image for the compression flag, which is not something we want to do
> for query-block. And even if we did that, it would still be unclear what
> to do with a partially compressed image.

If we truly need it, we could define three new autoclear bits in the
qcow2 spec:

bit 2: set if bits 3-4 are known to describe entire image
bit 3: set any time a compressed cluster is written to the image
bit 4: set any time an uncompressed cluster is written to the image

Any edit to the image by an older qemu will clear all three bits,
whereas new qemu would set bit 2 on image creation, and set the
appropriate bit 3 or 4 on any cluster write, so that we then have the
following knowledge:

234
===
000 - image was created or modified by older qemu; we have no idea if
clusters are written, or if compression was used, without expensive scan
001 - image contains at least one normal cluster, but no idea if it also
contains compressed clusters without expensive scan
010 - image contains at least one compressed cluster, but no idea if it
is fully compressed without expensive scan
011 - image contains mix of normal and compressed clusters
100 - image is newly created with no written clusters
101 - image contains only normal clusters; compression type could be
changed without risk
110 - image contains only compressed clusters
111 - image contains mix of normal and compressed clusters

But I'm not sure we need it.

> 
> The @compression-type only tells you what format a compressed cluster
> uses if any compressed cluster exists in the image (or if a new
> compressed cluster is written to it). It doesn't mean that such clusters
> currently exist.
> 
> Kevin
> 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: ImageInfo oddities regarding compression

2020-11-27 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 27.11.2020 um 13:21 hat Markus Armbruster geschrieben:
>> >> I fell down this (thankfully shallow) rabbit hole because we also have
>> >> 
>> >> { 'enum': 'MultiFDCompression',
>> >>   'data': [ 'none', 'zlib',
>> >> { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
>> >> 
>> >> I wonder whether we could merge them into a common type.
>> 
>> Looks like we could: current code would never report the additional
>> value 'none'.  Introspection would show it, though.  Seems unlikely to
>> cause trouble.  Observation, not demand.
>
> Forgot to comment on this one...
>
> Technically we could probably, but would it make sense? Support for
> compression formats has to be implemented separately for both cases, so
> that they currently happen to support the same list is more of a
> coincidence.
>
> If we ever add a third compression format to qcow2, would we add the
> same format to migration, too, or would we split the schema into two
> types again?

I figure if a compression method is worth adding to one, it's probably
worth adding to the other.

Having two separate enums isn't too bad.  A third has been proposed[*],
but I hope we can reuse migration's existing enum there.


[*] [PATCH 1/6] migration: Add multi-thread compress method




Re: ImageInfo oddities regarding compression

2020-11-27 Thread Kevin Wolf
Am 27.11.2020 um 13:21 hat Markus Armbruster geschrieben:
> >> I fell down this (thankfully shallow) rabbit hole because we also have
> >> 
> >> { 'enum': 'MultiFDCompression',
> >>   'data': [ 'none', 'zlib',
> >> { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
> >> 
> >> I wonder whether we could merge them into a common type.
> 
> Looks like we could: current code would never report the additional
> value 'none'.  Introspection would show it, though.  Seems unlikely to
> cause trouble.  Observation, not demand.

Forgot to comment on this one...

Technically we could probably, but would it make sense? Support for
compression formats has to be implemented separately for both cases, so
that they currently happen to support the same list is more of a
coincidence.

If we ever add a third compression format to qcow2, would we add the
same format to migration, too, or would we split the schema into two
types again?

Kevin




Re: ImageInfo oddities regarding compression

2020-11-27 Thread Markus Armbruster
Kevin Wolf  writes:

> Am 27.11.2020 um 11:14 hat Daniel P. Berrangé geschrieben:
>> On Fri, Nov 27, 2020 at 11:06:36AM +0100, Markus Armbruster wrote:
>> > ImageInfo has an optional member @compressed:
>> > 
>> > ##
>> > # @ImageInfo:
>> > [...]
>> > # @compressed: true if the image is compressed (Since 1.7)
>> > 
>> > Doc bug: neglects to specify the default.  I guess it's false.
>
> I think rather than false (definitely not compressed) it might just mean
> unknown, see below.
>
>> > The only user appears to be vmdk_get_extent_info().  Goes back to
>> > v1.7.0's commits
>> > 
>> > f4c129a38a vmdk: Implment bdrv_get_specific_info
>> > cbe82d7fb3 qapi: Add optional field 'compressed' to ImageInfo

Should it be in ImageInfoSpecificVmdk instead?

I know we can't just move it, and copy, deprecate, wait, delete may not
be worth the bother.  Sill, I'm curious :)

>> > ImageInfo also has an optional member @format-specific.
>> > 
>> > Doc bug: neglects to specify when it's present.  I assume it's always
>> > present when member @format has a value that has a non-empty variant in
>> > @format-specific's type ImageInfoSpecific.
>
> This seems to be the intention, yes.
>
>> > Format qcow2's variant is ImageInfoSpecificQCow2.  It has a mandatory
>> > member @compression-type.
>> > 
>> >##
>> ># @Qcow2CompressionType:
>> >#
>> ># Compression type used in qcow2 image file
>> >#
>> ># @zlib: zlib compression, see 
>> ># @zstd: zstd compression, see 
>> >#
>> ># Since: 5.1
>> >##
>> >{ 'enum': 'Qcow2CompressionType',
>> >  'data': [ 'zlib', { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
>> > 
>> > Apparently, you can't have a qcow2 image without compression.  Correct?
>> > 
>> > Can you imagine a use case for "without compression"?
>> 
>> Yes & no. An image always has a compression type, either implicitly
>> zlib as the historical default, or explicitly as a type recorded in
>> the header.  This doesn't mean compression is used.
>> 
>> There may be zero or more clusters actually using compression.
>> Typically compression will never be used, but there's still an
>> associated compression type regardless.
>
> Right, so the correct answer to "is this image compressed?" is "unknown"
> for qcow2. Providing an answer would require checking all clusters in
> the image for the compression flag, which is not something we want to do
> for query-block. And even if we did that, it would still be unclear what
> to do with a partially compressed image.
>
> The @compression-type only tells you what format a compressed cluster
> uses if any compressed cluster exists in the image (or if a new
> compressed cluster is written to it). It doesn't mean that such clusters
> currently exist.

Makes sense.

If we had a compression type 'none', it would effectively tell us that
the image is definitely not compressed, because any "compressed"
clusters would use method none, i.e. be not compressed.  Observation,
not feature request.

>> I fell down this (thankfully shallow) rabbit hole because we also have
>> 
>> { 'enum': 'MultiFDCompression',
>>   'data': [ 'none', 'zlib',
>> { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
>> 
>> I wonder whether we could merge them into a common type.

Looks like we could: current code would never report the additional
value 'none'.  Introspection would show it, though.  Seems unlikely to
cause trouble.  Observation, not demand.




Re: ImageInfo oddities regarding compression

2020-11-27 Thread Kevin Wolf
Am 27.11.2020 um 11:14 hat Daniel P. Berrangé geschrieben:
> On Fri, Nov 27, 2020 at 11:06:36AM +0100, Markus Armbruster wrote:
> > ImageInfo has an optional member @compressed:
> > 
> > ##
> > # @ImageInfo:
> > [...]
> > # @compressed: true if the image is compressed (Since 1.7)
> > 
> > Doc bug: neglects to specify the default.  I guess it's false.

I think rather than false (definitely not compressed) it might just mean
unknown, see below.

> > The only user appears to be vmdk_get_extent_info().  Goes back to
> > v1.7.0's commits
> > 
> > f4c129a38a vmdk: Implment bdrv_get_specific_info
> > cbe82d7fb3 qapi: Add optional field 'compressed' to ImageInfo
> > 
> > ImageInfo also has an optional member @format-specific.
> > 
> > Doc bug: neglects to specify when it's present.  I assume it's always
> > present when member @format has a value that has a non-empty variant in
> > @format-specific's type ImageInfoSpecific.

This seems to be the intention, yes.

> > Format qcow2's variant is ImageInfoSpecificQCow2.  It has a mandatory
> > member @compression-type.
> > 
> >##
> ># @Qcow2CompressionType:
> >#
> ># Compression type used in qcow2 image file
> >#
> ># @zlib: zlib compression, see 
> ># @zstd: zstd compression, see 
> >#
> ># Since: 5.1
> >##
> >{ 'enum': 'Qcow2CompressionType',
> >  'data': [ 'zlib', { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
> > 
> > Apparently, you can't have a qcow2 image without compression.  Correct?
> > 
> > Can you imagine a use case for "without compression"?
> 
> Yes & no. An image always has a compression type, either implicitly
> zlib as the historical default, or explicitly as a type recorded in
> the header.  This doesn't mean compression is used.
> 
> There may be zero or more clusters actually using compression.
> Typically compression will never be used, but there's still an
> associated compression type regardless.

Right, so the correct answer to "is this image compressed?" is "unknown"
for qcow2. Providing an answer would require checking all clusters in
the image for the compression flag, which is not something we want to do
for query-block. And even if we did that, it would still be unclear what
to do with a partially compressed image.

The @compression-type only tells you what format a compressed cluster
uses if any compressed cluster exists in the image (or if a new
compressed cluster is written to it). It doesn't mean that such clusters
currently exist.

Kevin




Re: ImageInfo oddities regarding compression

2020-11-27 Thread Daniel P . Berrangé
On Fri, Nov 27, 2020 at 11:06:36AM +0100, Markus Armbruster wrote:
> ImageInfo has an optional member @compressed:
> 
> ##
> # @ImageInfo:
> [...]
> # @compressed: true if the image is compressed (Since 1.7)
> 
> Doc bug: neglects to specify the default.  I guess it's false.
> 
> The only user appears to be vmdk_get_extent_info().  Goes back to
> v1.7.0's commits
> 
> f4c129a38a vmdk: Implment bdrv_get_specific_info
> cbe82d7fb3 qapi: Add optional field 'compressed' to ImageInfo
> 
> ImageInfo also has an optional member @format-specific.
> 
> Doc bug: neglects to specify when it's present.  I assume it's always
> present when member @format has a value that has a non-empty variant in
> @format-specific's type ImageInfoSpecific.
> 
> Format qcow2's variant is ImageInfoSpecificQCow2.  It has a mandatory
> member @compression-type.
> 
>##
># @Qcow2CompressionType:
>#
># Compression type used in qcow2 image file
>#
># @zlib: zlib compression, see 
># @zstd: zstd compression, see 
>#
># Since: 5.1
>##
>{ 'enum': 'Qcow2CompressionType',
>  'data': [ 'zlib', { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }
> 
> Apparently, you can't have a qcow2 image without compression.  Correct?
> 
> Can you imagine a use case for "without compression"?

Yes & no. An image always has a compression type, either implicitly
zlib as the historical default, or explicitly as a type recorded in
the header.  This doesn't mean compression is used.

There may be zero or more clusters actually using compression.
Typically compression will never be used, but there's still an
associated compression type regardless.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|




ImageInfo oddities regarding compression

2020-11-27 Thread Markus Armbruster
ImageInfo has an optional member @compressed:

##
# @ImageInfo:
[...]
# @compressed: true if the image is compressed (Since 1.7)

Doc bug: neglects to specify the default.  I guess it's false.

The only user appears to be vmdk_get_extent_info().  Goes back to
v1.7.0's commits

f4c129a38a vmdk: Implment bdrv_get_specific_info
cbe82d7fb3 qapi: Add optional field 'compressed' to ImageInfo

ImageInfo also has an optional member @format-specific.

Doc bug: neglects to specify when it's present.  I assume it's always
present when member @format has a value that has a non-empty variant in
@format-specific's type ImageInfoSpecific.

Format qcow2's variant is ImageInfoSpecificQCow2.  It has a mandatory
member @compression-type.

   ##
   # @Qcow2CompressionType:
   #
   # Compression type used in qcow2 image file
   #
   # @zlib: zlib compression, see 
   # @zstd: zstd compression, see 
   #
   # Since: 5.1
   ##
   { 'enum': 'Qcow2CompressionType',
 'data': [ 'zlib', { 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }

Apparently, you can't have a qcow2 image without compression.  Correct?

Can you imagine a use case for "without compression"?

I fell down this (thankfully shallow) rabbit hole because we also have

{ 'enum': 'MultiFDCompression',
  'data': [ 'none', 'zlib',
{ 'name': 'zstd', 'if': 'defined(CONFIG_ZSTD)' } ] }

I wonder whether we could merge them into a common type.