The FreeBSD Foundation has graciously sponsored me to complete the work
I started a few years ago to integrate ZStandard compression into OpenZFS.

After getting caught up on the changes that have occurred while I was
away from the project, I started working on adding additional testing to
catch possible errors.

https://github.com/openzfs/zfs/pull/10278/commits/7ab55cbafe2aa126914e9d57550ad39c538967f9
- Added kstat counters to track allocation failures. ZSTD can require a
lot of memory to compress at the higher levels, and we do these
allocations as non-blocking (if there is not enough memory, we just
store the block uncompressed rather than waiting for memory). However,
we want to count how often this happens, because a user will see a much
lower than expected compression ratio when it does. In previous
benchmarks, this condition has also led to incorrect results.
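To illustrate the fallback described above, here is a minimal sketch of
the pattern in Python. It is not the code from the pull request: the
names (CompressStats, try_alloc, compress_block) are hypothetical, zlib
stands in for zstd, and the "available memory" argument is a toy
substitute for a real non-blocking allocator.

```python
import zlib


class CompressStats:
    """Hypothetical stand-in for a kstat counter set."""

    def __init__(self):
        self.alloc_fail = 0


def try_alloc(size, available):
    """Non-blocking allocation stand-in: return None instead of waiting."""
    return bytearray(size) if size <= available else None


def compress_block(data, workspace_size, available, stats):
    """Compress data, or count the failure and return it uncompressed."""
    workspace = try_alloc(workspace_size, available)
    if workspace is None:
        # Not enough memory right now: record it and store the block as-is.
        stats.alloc_fail += 1
        return data, False
    # zlib is only a placeholder for the real compressor in this sketch.
    return zlib.compress(data), True
```

A non-zero alloc_fail count after a benchmark run would then explain a
compression ratio that is lower than expected.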

My original design has been modified slightly to store additional
information in a header before the compressed block contents. For LZ4 we
used the first 32 bits to store the compressed size of the block (big
endian encoded), so that we could avoid feeding the slack between that
size and the end of the sector to the decompression function. For ZSTD
we extended this to also store the zstd compression level, since we do
not store this level of detail in the block pointer (the patch supports
40 levels: zstd-1 through zstd-19, and zstd-fast-1 through
zstd-fast-1000). We have since further extended this to store the
version of zstd that was used to do the compression. The idea is that
this will allow us to safely upgrade to a newer version of zstd in the
future, possibly by keeping both versions. There are some cases
(nop-write, L2ARC) where we might need to be able to recompress a block
with the same settings (same compression level, same version of zstd)
to get the same checksum.

https://github.com/openzfs/zfs/pull/10278/commits/d707accd14f289010118973fb5acee9a917d5d83
- To facilitate testing that the ZFS zstd header (32-bit size, 24-bit
version, 8-bit level) is properly saved to and restored from disk, I
have extended zdb with a new -Z flag to decode this information for a
block.
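As a rough illustration of the header layout described above, the
following Python sketch packs and unpacks the three fields. Only the
field widths and the big-endian size come from this report. The rest is
assumed for the example: that the version occupies the upper 24 bits of
the second word with the level in the low 8 bits, that the second word
is also big-endian, that the size counts only the compressed payload,
and the function names themselves.

```python
import struct

HEADER_LEN = 8  # 32-bit size + 24-bit version + 8-bit level


def pack_header(c_len, version, level):
    """Build the 8-byte header: big-endian size, then version and level."""
    if not 0 <= version < (1 << 24):
        raise ValueError("version must fit in 24 bits")
    if not 0 <= level < (1 << 8):
        raise ValueError("level must fit in 8 bits")
    # Assumed packing: version in the high 24 bits, level in the low 8.
    return struct.pack(">II", c_len, (version << 8) | level)


def unpack_header(buf):
    """Return (compressed size, version, level) from the start of buf."""
    c_len, version_level = struct.unpack_from(">II", buf, 0)
    return c_len, version_level >> 8, version_level & 0xFF


def payload(buf):
    """Return only the compressed bytes, skipping the sector slack."""
    c_len, _, _ = unpack_header(buf)
    return buf[HEADER_LEN:HEADER_LEN + c_len]
```

The payload() helper shows why the size is stored at all: the bytes
between the end of the compressed data and the end of the sector are
never handed to the decompressor.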


Over the next week I plan to work on adding tests around the inheritance
of the zstd compression property. To avoid changing the user interface,
and to keep the atomicity of setting the compression type (zstd) and
level, the user runs `zfs set compress=zstd-12 dataset`, but this
actually sets the properties compress=zstd and compress_level=12. We
want to ensure that this won't cause invalid configurations when the
compress property is inherited.
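The split of the user-visible value into two properties can be sketched
as follows. This is a hypothetical Python illustration, not the parsing
code in the patch: the function name, the use of a negative number for
the zstd-fast levels, and the simple range checks are all assumptions.
In particular, the patch accepts only some of the values between
zstd-fast-1 and zstd-fast-1000, which this sketch does not enumerate.

```python
import re

# Matches "zstd", "zstd-N" and "zstd-fast-N".
_ZSTD_RE = re.compile(r"^zstd(?:-(fast-)?(\d+))?$")


def split_compress_value(value):
    """Split e.g. 'zstd-12' into (compress, compress_level)."""
    if not value.startswith("zstd"):
        # Other algorithms carry no separate level in this sketch.
        return value, None
    m = _ZSTD_RE.match(value)
    if m is None:
        raise ValueError("malformed zstd value: %r" % value)
    if m.group(2) is None:
        # Plain "zstd": leave the level unset so a default applies.
        return "zstd", None
    level = int(m.group(2))
    if m.group(1):
        if not 1 <= level <= 1000:
            raise ValueError("zstd-fast level out of range")
        # Assumed encoding: fast levels stored as negative numbers.
        return "zstd", -level
    if not 1 <= level <= 19:
        raise ValueError("zstd level out of range")
    return "zstd", level
```

Because both values come from one string, a child dataset that inherits
compress needs to inherit the matching level along with it, which is
what the planned tests are meant to check.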

I also plan to dig into the issues around using zstd with compressed_arc
disabled, and how that interacts with the L2ARC. The subject of the
ability to disable the compressed_arc is on the agenda for the OpenZFS
Leadership call tomorrow, so this will somewhat depend on the outcome of
that discussion.

There are also a number of compatibility issues around 'zfs send' when
using ZSTD compression.


I would also like to thank the various members of the ZFS-on-Linux
community who have helped with the integration and testing on Linux, a
platform I am not very familiar with.

-- 
Allan Jude
