Hi Guillem,

Would you maybe reconsider adding zstd decompression support at this
time?

On Sun, Mar 18, 2018 at 04:38:15AM +0100, Guillem Jover wrote:
> So, the items that come to mind (most from the dpkg FAQ [F]:
> 
> * Availability in general Unix systems would be one. I think the code
>   should be portable, but I've not checked properly.

Given the number of places where it has been vendored and used by now, I
suppose we'd have seen issues. While there is optimized assembly for
x86_64 CPUs and the code uses e.g. __builtin_ctzll, all those uses are
carefully guarded and have portable alternative implementations. Do you
see any particular Unixes to watch out for here? From a processor
architecture pov, I've never seen issues with zstd in e.g. rebootstrap.
(The present failure for riscv64 likely isn't caused by zstd itself.)
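
To illustrate the kind of guarding meant here, a minimal sketch follows.
It is not zstd's actual code; the function names ctz64 and
ctz64_portable are made up for this example, and the preprocessor check
is only an assumption about how such a guard could look.

```c
#include <stdint.h>

/* Portable fallback: count trailing zero bits one at a time.
 * Precondition: v != 0 (the builtin is undefined for 0 as well). */
static unsigned ctz64_portable(uint64_t v)
{
    unsigned n = 0;
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Use the compiler builtin where it is known to exist,
 * otherwise fall back to the portable loop above. */
static unsigned ctz64(uint64_t v)
{
#if defined(__GNUC__) || defined(__clang__)
    return (unsigned)__builtin_ctzll(v);
#else
    return ctz64_portable(v);
#endif
}
```

Both paths return the same result for any non-zero input, so a compiler
without the builtin only loses speed, not correctness.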

> * Size of the shared library another, it would be by far the fattest
>   compression lib used by dpkg. It's not entirely clear whether the
>   shlib embeds a zlib library?

What made you think so? Is it the zlibWrapper directory in the source?
That's an API adapter of the gzip interface to the zstd compressor.
The size remains a possible issue otherwise.

> * Increase in the (build-)essential set (directly and transitively).

We're now in a place where libzstd1 is transitively essential.

> * It also seems the format has changed quite some times already, and
>   it's probably the reason for the fat shlib. Not sure if the format
>   has stabilized enough to use this as good long-term storage format,
>   and what's the policy regarding supporting old formats for example,
>   given that this is intended mainly to be used for real-time and
>   streaming content and similar. For example the Makefile for libzstd
>   defaults to supporting v0.4+ only, which does not look great.

Given the state of development and the wide adoption, it seems unlikely
to me that it will break compatibility again. Also note that there is a
trade-off here between size and compatibility. You cannot have both a
small size and support for all ancient formats.

Beyond these cases, I think compatibility also goes the other way round.
If a significant portion of .debs in the wild are compressed using zstd
(and that's what we're seeing), dpkg should be able to decompress them
even if it wasn't the one that introduced them. You care very much about
being able to decompress each and every ancient .deb, but in practice we
also care about decompressing those .debs that currently reside in
Ubuntu's PPAs.

In my personal workflow, I decompress very many packages into tmpfs (or
RAM). This is bottlenecked on CPU. In my experience, zstd decompression
is almost 100 times faster than xz decompression. That's a fairly big
improvement. At this time, I'm convinced that zstd is better for the
"compress once, decompress often" use case than xz, which still excels
at "compress once, decompress rarely". I admit that I only get the
benefits if dpkg also supports zstd as a compressor and many relevant
packages switch to it.

So at this point, I think that supporting zstd decompression is
something reasonable to add to dpkg. Please reconsider your decision.

Helmut
