Hi Ilya,

This is very useful. Is the compression going to be per-page, in which case the dictionary is going to be kept inside of a page? Or do you have some other design in mind?

D.

On Mon, Sep 3, 2018 at 10:36 AM, Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:

> Hello again!
>
> I've been running various compression parameters through the cod dataset.
>
> It looks like the best compression level in terms of speed is either 1 or 2.
> The default for Zstd seems to be 3, which would almost always perform worse.
> For best performance a dictionary of 1024 bytes is optimal; for better
> compression one might choose larger dictionaries. 6k looks good, but I will
> also run a few benchmarks on larger dicts. Unfortunately, Zstd crashes if
> the sample size is set to more than 16k entries (I guess I should probe the
> max buffer size where problems begin).
>
> I'm attaching two charts which show what we've got. Compression rate is a
> fraction of the original record size. Time to run is the wall-clock time of
> the test run. Reasonable compression will increase the run time twofold (of
> a program that only does text record parsing -> creates objects ->
> binarylizes them -> compresses -> decompresses). Notation:
> s{number of bin objects used to train}-d{dictionary length in bytes}-l{compression level}.
> <http://apache-ignite-developers.2346864.n4.nabble.com/file/t374/chart1.png>
> The second one is basically a zoom in on the first.
> <http://apache-ignite-developers.2346864.n4.nabble.com/file/t374/chart2.png>
>
> I think that in addition to dictionary compression we should have
> dictionary-less compression. On typical data of small records it shows a
> compression rate of 0.8 to 0.65, but I can imagine that with larger
> unstructured records it can be as good as dict-based and much less of a
> hassle dictionary-processing-wise. WDYT?
>
> Sorry for the fine print. I hope my charts will be visible.
>
> You can see the updated code as a pull request:
> https://github.com/apache/ignite/pull/4673
>
> Regards,
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
>
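
For anyone who wants to get a feel for the dictionary vs. dictionary-less trade-off on small records without building the PR, here is a minimal sketch. It is not the code from the pull request and it does not use Zstd: it swaps in the preset-dictionary feature (`zdict`) of Python's standard `zlib` module as a rough stand-in. The record shape, the naive "keep the tail of the concatenated samples" dictionary builder, and all function names are assumptions made for illustration, so the rates it prints are not comparable to the charts.

```python
import zlib


def make_record(i):
    # Hypothetical small record; the real benchmark used binarylized objects.
    text = '{"id": %d, "name": "user%d", "city": "Springfield", "status": "active"}' % (i, i)
    return text.encode()


def build_dictionary(samples, dict_len):
    # Naive stand-in for dictionary training (assumption, not Zstd's trainer).
    # zlib favours content near the end of the dictionary, so keep the tail.
    return b"".join(samples)[-dict_len:]


def compress(record, level, zdict=None):
    # Each record is compressed on its own, with or without a preset dictionary.
    if zdict is None:
        comp = zlib.compressobj(level)
    else:
        comp = zlib.compressobj(level, zdict=zdict)
    return comp.compress(record) + comp.flush()


def decompress(blob, zdict=None):
    # The same dictionary must be supplied to decompress dictionary-based output.
    decomp = zlib.decompressobj() if zdict is None else zlib.decompressobj(zdict=zdict)
    return decomp.decompress(blob) + decomp.flush()


def compression_rate(records, level, zdict=None):
    # Compressed size as a fraction of the original size (lower is better).
    original = sum(len(r) for r in records)
    packed = sum(len(compress(r, level, zdict)) for r in records)
    return packed / original


def label(samples, dict_len, level):
    # Same s{samples}-d{dictionary bytes}-l{level} notation as in the charts.
    return "s%d-d%d-l%d" % (samples, dict_len, level)


train = [make_record(i) for i in range(100)]
held_out = [make_record(i) for i in range(1000, 1100)]
zdict = build_dictionary(train, 1024)

for level in (1, 2, 3):
    with_dict = compression_rate(held_out, level, zdict)
    without_dict = compression_rate(held_out, level)
    print(label(len(train), len(zdict), level),
          "dict: %.3f" % with_dict, "no-dict: %.3f" % without_dict)
```

The point of the sketch is only the shape of the comparison: with per-record compression the dictionary has to be available at decompression time, which is why where it lives (per page or elsewhere) matters for the design.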