On Wed, 25 Oct 2017, Stefan Priebe - Profihost AG wrote:
> Hello,
> 
> in the luminous release notes it is stated that zstd is not supported by
> bluestore for performance reasons. I'm wondering why btrfs instead
> states that zstd is as fast as lz4 but compresses as well as zlib.
> 
> Why is zlib then supported by bluestore? And why do btrfs / facebook
> behave differently?
> 
> "BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
> also supports zstd for RGW compression but zstd is not recommended for
> BlueStore for performance reasons.)"

zstd will work but in our testing the performance wasn't great for 
bluestore in particular.  The problem was that for each compression run 
there is a relatively high start-up cost initializing the zstd 
context/state (IIRC a memset of a huge memory buffer) that dominated the 
execution time... primarily because bluestore is generally compressing 
pretty small chunks of data at a time, not big buffers or streams.
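
To make the small-chunk effect concrete, here is a minimal sketch, not the
Ceph plugin code. It uses zlib from the Python standard library as a
stand-in (it does not exercise zstd, so it will not reproduce zstd's numbers)
and creates a fresh compressor for every chunk, which is roughly how
bluestore ends up calling a compressor. The function name, data pattern and
chunk sizes are made up for illustration.

```python
import os
import time
import zlib


def time_per_chunk(chunk_size, total_bytes=1 << 20, level=6):
    """Compress total_bytes of data in chunk_size pieces, creating a fresh
    compressor for every piece. Returns (seconds, compressed_bytes)."""
    # Test data: one random 4 KB block repeated to fill total_bytes.
    data = (os.urandom(4096) * (total_bytes // 4096 + 1))[:total_bytes]
    compressed = 0
    start = time.perf_counter()
    for off in range(0, total_bytes, chunk_size):
        # A new compressor per chunk: any per-context set-up cost is paid
        # once per chunk, not once per stream.
        comp = zlib.compressobj(level)
        out = comp.compress(data[off:off + chunk_size]) + comp.flush()
        compressed += len(out)
    return time.perf_counter() - start, compressed


if __name__ == "__main__":
    for size in (4096, 16384, 65536, 1 << 20):
        secs, nbytes = time_per_chunk(size)
        print(f"chunk={size:>8}  time={secs * 1000:8.2f} ms  out={nbytes} bytes")
```

The same amount of data is compressed in every row; only the number of
compressor set-ups changes. A compressor with an expensive set-up would show
the time growing as the chunk size shrinks, and the output also gets larger
because each small chunk is compressed without any history from its
neighbours.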

Take a look at unittest_compression timings on compressing 16KB buffers
(smaller than bluestore needs usually, but illustrative of the problem):

[ RUN      ] Compressor/CompressorTest.compress_16384/0
[plugin zlib (zlib/isal)]
[       OK ] Compressor/CompressorTest.compress_16384/0 (294 ms)
[ RUN      ] Compressor/CompressorTest.compress_16384/1
[plugin zlib (zlib/noisal)]
[       OK ] Compressor/CompressorTest.compress_16384/1 (1755 ms)
[ RUN      ] Compressor/CompressorTest.compress_16384/2
[plugin snappy (snappy)]
[       OK ] Compressor/CompressorTest.compress_16384/2 (169 ms)
[ RUN      ] Compressor/CompressorTest.compress_16384/3
[plugin zstd (zstd)]
[       OK ] Compressor/CompressorTest.compress_16384/3 (4528 ms)

zstd is more than an order of magnitude slower than snappy or zlib with
isal (and about 2.6x slower than zlib without isal), which probably isn't
acceptable, even if the output is a bit smaller.

We just updated to a newer zstd the other day but I haven't been paying
attention to the zstd code changes.  When I was working on this the plugin
was initially also misusing the zstd API, and it was also pointed out
that the size of the memset depends on the compression level.
Maybe a different (default) level there would help.

https://github.com/facebook/zstd/issues/408#issuecomment-252163241

sage
