On Thu, 26 Oct 2017, Stefan Priebe - Profihost AG wrote:
> Hi Sage,
>
> On 25.10.2017 at 21:54, Sage Weil wrote:
> > On Wed, 25 Oct 2017, Stefan Priebe - Profihost AG wrote:
> >> Hello,
> >>
> >> in the Luminous release notes it is stated that zstd is not supported by
> >> BlueStore for performance reasons. I'm wondering why btrfs instead
> >> states that zstd is as fast as lz4 but compresses as well as zlib.
> >>
> >> Why is zlib then supported by BlueStore? And why do btrfs / Facebook
> >> behave differently?
> >>
> >> "BlueStore supports inline compression using zlib, snappy, or LZ4. (Ceph
> >> also supports zstd for RGW compression but zstd is not recommended for
> >> BlueStore for performance reasons.)"
> >
> > zstd will work but in our testing the performance wasn't great for
> > bluestore in particular. The problem was that for each compression run
> > there is a relatively high start-up cost initializing the zstd
> > context/state (IIRC a memset of a huge memory buffer) that dominated the
> > execution time... primarily because bluestore is generally compressing
> > pretty small chunks of data at a time, not big buffers or streams.
> >
> > Take a look at unittest_compression timings on compressing 16KB buffers
> > (smaller than bluestore needs usually, but illustrative of the problem):
> >
> > [ RUN ] Compressor/CompressorTest.compress_16384/0
> > [plugin zlib (zlib/isal)]
> > [ OK ] Compressor/CompressorTest.compress_16384/0 (294 ms)
> > [ RUN ] Compressor/CompressorTest.compress_16384/1
> > [plugin zlib (zlib/noisal)]
> > [ OK ] Compressor/CompressorTest.compress_16384/1 (1755 ms)
> > [ RUN ] Compressor/CompressorTest.compress_16384/2
> > [plugin snappy (snappy)]
> > [ OK ] Compressor/CompressorTest.compress_16384/2 (169 ms)
> > [ RUN ] Compressor/CompressorTest.compress_16384/3
> > [plugin zstd (zstd)]
> > [ OK ] Compressor/CompressorTest.compress_16384/3 (4528 ms)
> >
> > It's an order of magnitude slower than zlib or snappy, which probably
> > isn't acceptable--even if it is a bit smaller.
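The slowdown factors implied by the timings above can be worked out directly. This is only arithmetic on the four figures from the unittest_compression output; the dictionary keys and the `slowdown` helper are names chosen here for illustration.

```python
# Wall-clock times in ms, copied from the compress_16384 test output above.
timings_ms = {
    "zlib/isal": 294,
    "zlib/noisal": 1755,
    "snappy": 169,
    "zstd": 4528,
}


def slowdown(timings, slow, baseline):
    """Return how many times slower `slow` is than `baseline`."""
    return timings[slow] / timings[baseline]


for name in ("zlib/isal", "zlib/noisal", "snappy"):
    print(f"zstd vs {name}: {slowdown(timings_ms, 'zstd', name):.1f}x")
```

On these numbers zstd is roughly 15x slower than ISA-L zlib and 27x slower than snappy, but only about 2.6x slower than zlib without ISA-L.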
> >
> > We just updated to a newer zstd the other day but I haven't been paying
> > attention to the zstd code changes. When I was working on this the plugin
> > was initially also misusing the zstd API, but it was also pointed out
> > that the size of the memset is dependent on the compression level.
> > Maybe a different (default) choice there would help.
> >
> > https://github.com/facebook/zstd/issues/408#issuecomment-252163241
>
> Thanks for the fast reply. Btrfs uses a default compression level of 3,
> but I think this is the default anyway.
>
> Does Ceph's zstd plugin already use the mentioned
> ZSTD_resetCStream instead of creating and initializing a new one every time?

Hmm, it doesn't:
https://github.com/ceph/ceph/blob/master/src/compressor/zstd/ZstdCompressor.h#L29
but perhaps that was because it didn't make a difference? Might be worth
revisiting.
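
To make the per-call versus reused context distinction concrete, here is a toy sketch of the amortization pattern. It does not use the zstd API or reflect the Ceph plugin: `ToyContext`, `WORKSPACE_BYTES`, and the helper functions are hypothetical, the workspace size is an arbitrary stand-in for the memset mentioned earlier, and Python's zlib does the actual compression only so that the example runs.

```python
import zlib

WORKSPACE_BYTES = 1 << 20  # hypothetical workspace size, not zstd's real figure


class ToyContext:
    """Stand-in for a compression context with an expensive setup step."""

    bytes_zeroed = 0  # class-wide counter of setup work performed

    def __init__(self):
        # Models the start-up cost: allocate and zero a large work buffer.
        self.workspace = bytearray(WORKSPACE_BYTES)
        ToyContext.bytes_zeroed += WORKSPACE_BYTES

    def compress(self, chunk):
        # zlib is used only so the sketch runs; it is not what zstd does.
        return zlib.compress(chunk)


def compress_fresh_context(chunks):
    """Create a new context for every chunk."""
    return [ToyContext().compress(c) for c in chunks]


def compress_shared_context(chunks):
    """Create one context and reuse it for every chunk."""
    ctx = ToyContext()
    return [ctx.compress(c) for c in chunks]


def setup_bytes(fn, chunks):
    """Return how many workspace bytes fn zeroed while compressing chunks."""
    before = ToyContext.bytes_zeroed
    fn(chunks)
    return ToyContext.bytes_zeroed - before


chunks = [bytes(16384) for _ in range(50)]  # 50 chunks of 16 KB each
fresh = setup_bytes(compress_fresh_context, chunks)
shared = setup_bytes(compress_shared_context, chunks)
print(f"fresh contexts zeroed {fresh} bytes, shared context zeroed {shared}")
```

With small chunks the setup work in the fresh-context variant grows with the number of chunks, while the shared-context variant pays it once. Whether the real plugin would see a comparable gain is exactly the open question above.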

> So if performance matters, Ceph would recommend snappy?

Yep!

sage