Jonathan Wonders wrote:
I have been digging into some details of Accumulo to model the disk and
network costs associated with various types of scan patterns and I have
a few questions regarding compression.
Assuming an inverted index table with rows following the pattern of
<key><value><id>
and a scan that specifies an exact key and value so as to constrain the
range, it seems that the dominant factor in network utilization would
be sending key-value pairs from the tablet server to the client and a
secondary factor would be transmitting data from non-local RFiles
(assuming no caching).
Sounds about right to me.
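Just to make sure we're picturing the same access pattern, here's roughly what I assume that scan looks like on the client side. This is only a sketch: the table name, the Connector, and the null-byte delimiter in the row are assumptions on my part, not anything from your schema.

import java.util.Map.Entry;

import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.Scanner;
import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Range;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.security.Authorizations;

public class InvertedIndexScan {
  // Rows are assumed to be built as <key>\0<value>\0<id>; substitute
  // whatever delimiter your index actually uses.
  public static void scanExact(Connector conn, String table,
      String indexedKey, String indexedValue) throws Exception {
    Scanner scanner = conn.createScanner(table, Authorizations.EMPTY);
    // Constrain the scan to every row sharing the <key>\0<value>\0 prefix,
    // i.e. all ids indexed under that exact key/value.
    scanner.setRange(Range.prefix(indexedKey + "\0" + indexedValue + "\0"));
    for (Entry<Key,Value> entry : scanner) {
      // Only the trailing <id> portion of the row matters here.
      System.out.println(entry.getKey().getRow());
    }
  }
}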
Is my understanding correct that the on-disk compression of this type of
table is predominantly a function of the average number of differing
bits between adjacent ids? Or, has anyone observed a significant
improvement with gz or lzo vs no additional compression? I'm
considering running some experiments to measure the difference for a few
types of ids (uuid, snowflake-like, content based hashes), but I'm
curious if anyone else has done similar experiments.
There is definitely a noticeable difference between gzip, lzo/snappy
(they're about on par in speed and size), and no compression. I'm not
sure what the deltas are off the top of my head, but I would expect
that you would see noticeable differences.
You can also notice differences in "density" depending on the
lexicographical delta between adjacent keys. Sequential keys that differ
very little will result in very dense RFiles (lots of keys), whereas keys
that vary greatly will result in fewer keys per file. We've had some
discussions about the run-length encoding in RFile lately -- have you
stumbled on that?
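If you do run those codec experiments, switching between runs is cheap since the codec is just the table.file.compress.type property. A rough sketch (the table name and Connector are placeholders; the compaction is only there so existing data gets rewritten with the new codec before you measure on-disk size):

import org.apache.accumulo.core.client.Connector;

public class SetCompression {
  // Switch the RFile compression codec for a table between experiment runs.
  // Valid values include "gz", "lzo", "snappy", and "none"; only files
  // written after the change (flushes, compactions) pick up the new codec.
  public static void setCodec(Connector conn, String table, String codec)
      throws Exception {
    conn.tableOperations().setProperty(table, "table.file.compress.type", codec);
    // Force a full compaction (null start/end row = whole table) so existing
    // data is rewritten with the new codec before measuring on-disk size.
    conn.tableOperations().compact(table, null, null, true, true);
  }
}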
Given a scan that specifies a range for an exact key and value, is there
any transport compression performed for tablet server to client
communication beyond the Key.compress method, which appears to only
compress equivalent rows, columns, etc., as opposed to those that share
a common prefix?
No, there are no compression algorithms applied to the data before
sending it. By default we use the TCompactProtocol from Thrift. If
you're interested in the specifics, Thrift has some documentation on the
subject.
I do recall some experiments I tried previously in which a logical
"object" in Accumulo was comprised of multiple key-value pairs which
were returned in one key-value pair (very similar to how the
WholeRowIterator serializes things). In this case, I experimented with
compressing the serialized Key-Values before returning from the Iterator
stack. Sadly, I don't recall what the takeaway was, but I don't believe
it was outright "bad" :)
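I don't have that code handy anymore, but the core of it looked roughly like the sketch below. The GZIP choice and the class/method names are just illustrative, and in the actual experiment the encoding happened inside a custom iterator rather than a standalone helper:

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.List;
import java.util.SortedMap;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

import org.apache.accumulo.core.data.Key;
import org.apache.accumulo.core.data.Value;
import org.apache.accumulo.core.iterators.user.WholeRowIterator;

public class CompressedRowCodec {

  // Server side (in the iterator stack): serialize a row's Key-Values the
  // same way WholeRowIterator does, then gzip the result so fewer bytes
  // cross the wire.
  public static Value encode(List<Key> keys, List<Value> values) throws IOException {
    byte[] encodedRow = WholeRowIterator.encodeRow(keys, values);
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
      gz.write(encodedRow);
    }
    return new Value(bos.toByteArray());
  }

  // Client side: undo the gzip wrapper, then let WholeRowIterator's decoder
  // rebuild the original Key-Value pairs for the row.
  public static SortedMap<Key,Value> decode(Key rowKey, Value compressed) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (GZIPInputStream in =
        new GZIPInputStream(new ByteArrayInputStream(compressed.get()))) {
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
    }
    return WholeRowIterator.decodeRow(rowKey, new Value(out.toByteArray()));
  }
}

The client then calls the decode method on each returned pair instead of using WholeRowIterator.decodeRow directly.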
It seems possible to implement a more specialized compression scheme
with the iterator framework, performing the decompression on the client
side, but I'm curious whether general scan performance would improve if
the default compression also involved run-length encoding.
This would be really neat to test.
Some rigor in designing an experiment would be the first step, IMO. If
you want to spearhead this, I'm sure there are many who would be happy
to give some feedback. It would be prudent to figure out what the
variables in the "equation" would be, specifically the distribution of
Key-Value pairs -- amount stored, compression on disk, cache sizes,
query workload. First figure out what you want to test and what you
will tweak, and define a hypothesis you want to test.
You could consider using the continuous ingest framework for
data-generation and query workloads. You could also look at YCSB [1]
for some inspiration. Either of these would make your life easier by
saving you from having to generate the read/write workloads yourself.
[1] https://github.com/brianfrankcooper/YCSB
Any insight on this subject is much appreciated.
V/R
Jonathan