Folks,

I've implemented page compression for the persistent store and am going to
merge it to master.

https://github.com/apache/ignite/pull/5200

Some design notes:

It employs a "hole punching" approach: pages are kept uncompressed in
memory, but when they are written to disk, they are compressed and all the
extra file system blocks for the page are released. Thus the storage
files become sparse.
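
To illustrate the write path, here is a minimal sketch (the helper names
are hypothetical, not the actual code in the PR), assuming a 16k page and
4k FS blocks:

    import java.io.IOException;
    import java.nio.ByteBuffer;
    import java.nio.channels.FileChannel;

    /** Sketch of the hole-punching write path; names are hypothetical. */
    abstract class SparsePageWriter {
        static final int PAGE_SIZE = 16 * 1024; // configured page size
        static final int FS_BLOCK = 4 * 1024;   // typical FS block size

        /** Compress a page with the configured method (ZSTD, LZ4, ...). */
        abstract ByteBuffer compress(ByteBuffer page);

        /** Thin wrapper over Linux fallocate(FALLOC_FL_PUNCH_HOLE). */
        abstract void punchHole(FileChannel ch, long off, long len) throws IOException;

        void writePage(FileChannel ch, long pageOff, ByteBuffer page) throws IOException {
            ByteBuffer compacted = compress(page);

            // Only whole FS blocks can be returned to the file system,
            // so round the compressed size up to a block boundary.
            int used = (compacted.remaining() + FS_BLOCK - 1) / FS_BLOCK * FS_BLOCK;

            ch.write(compacted, pageOff);

            if (used < PAGE_SIZE)
                // Release the rest of the page slot; the file becomes sparse.
                punchHole(ch, pageOff + used, PAGE_SIZE - used);
        }
    }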

Right now we will support 4 compression methods: ZSTD, LZ4, SNAPPY and
SKIP_GARBAGE. All of them are self-explanatory except SKIP_GARBAGE, which
just keeps the meaningful data from half-filled pages but does not apply
any compression. It is easy to add more methods if needed.
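
The SKIP_GARBAGE idea can be pictured like this (a toy sketch that assumes
the used portion of a page is a contiguous prefix, which is not necessarily
the real page layout):

    import java.nio.ByteBuffer;

    class SkipGarbageSketch {
        /** Keep only the meaningful prefix of a page; the trailing free
         *  space ("garbage") is dropped, so its blocks can be punched out. */
        static ByteBuffer skipGarbage(ByteBuffer page, int usedBytes) {
            ByteBuffer src = page.duplicate();
            src.position(0).limit(usedBytes);

            ByteBuffer out = ByteBuffer.allocate(usedBytes);
            out.put(src).flip();
            return out;
        }
    }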

Since we can release only full file system blocks, which are typically 4k
in size, the user must configure the page size to span multiple FS blocks,
e.g. 8k or 16k. It also means the best possible compression ratio is
fsBlockSize / pageSize, e.g. 4k / 16k = 0.25 for a 16k page.
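
A back-of-the-envelope helper for the resulting on-disk size under this
rounding (my own illustration, not code from the PR):

    class OnDiskSize {
        /** On-disk size of one compressed page: rounded up to whole FS
         *  blocks, and never below one block. */
        static long onDiskSize(int compressedBytes, int fsBlock) {
            long blocks = Math.max(1, (compressedBytes + fsBlock - 1L) / fsBlock);
            return blocks * fsBlock;
        }
        // onDiskSize(100, 4096)  -> 4096  (best case: 4k of a 16k page, 0.25)
        // onDiskSize(9000, 4096) -> 12288 (3 of 4 blocks kept)
    }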

It is possible to enable compression for existing databases if they were
configured with a large enough page size. In this case pages will be
written to disk in compressed form when updated, and the database will
become compressed gradually.

There will be 2 new properties on CacheConfiguration
(setDiskPageCompression and setDiskPageCompressionLevel) to set up disk
page compression.
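
Assuming the API lands as in the PR, enabling it would look roughly like
this:

    import org.apache.ignite.configuration.CacheConfiguration;
    import org.apache.ignite.configuration.DataStorageConfiguration;
    import org.apache.ignite.configuration.DiskPageCompression;
    import org.apache.ignite.configuration.IgniteConfiguration;

    class CompressionConfigExample {
        static IgniteConfiguration config() {
            DataStorageConfiguration storageCfg = new DataStorageConfiguration()
                .setPageSize(16 * 1024); // must span multiple 4k FS blocks

            // Page compression applies to the persistent store only.
            storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

            CacheConfiguration<Integer, String> cacheCfg =
                new CacheConfiguration<Integer, String>("myCache")
                    .setDiskPageCompression(DiskPageCompression.ZSTD)
                    .setDiskPageCompressionLevel(3); // meaning depends on the method

            return new IgniteConfiguration()
                .setDataStorageConfiguration(storageCfg)
                .setCacheConfiguration(cacheCfg);
        }
    }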

Compression dictionaries are not supported at the moment but may be added
in the future. IMO they should be added as a separate feature if needed.

The only supported platform for now is Linux. Since all popular file
systems support sparse files, it should be relatively easy to support more
platforms.

Please take a look and provide your thoughts and suggestions.

Thanks!

Sergi
