Josh,

Thanks for the information. I did read through the discussion about compression of visibility expressions and columns within RFiles a while back, which got me thinking about some of this. It makes sense that gzip or lzo/snappy compression would have a very noticeable impact when there are columns or visibility expressions that are not run-length encoded, even if neighboring rows have very small lexicographical deltas.
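As a rough first-pass sanity check on that intuition, I was picturing something like the snippet below (untested; the row layout, column names, visibility string, and key count are made-up placeholders rather than anything pulled from a real table):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.UUID;
    import java.util.zip.GZIPOutputStream;

    // Placeholder inverted-index style keys: the <key><value> prefix, column, and
    // visibility repeat on every entry while the <id> suffix is a random UUID.
    // Compares the raw size of the serialized key stream to its gzip size.
    public class KeyCompressionCheck {
      public static void main(String[] args) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        for (int i = 0; i < 100_000; i++) {
          // ';' is just a stand-in field separator for this sketch
          String entry = "color;red;" + UUID.randomUUID()
              + " index:fwd [PUBLIC|INTERNAL]\n";
          raw.write(entry.getBytes(StandardCharsets.UTF_8));
        }
        ByteArrayOutputStream zipped = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(zipped)) {
          gz.write(raw.toByteArray());
        }
        System.out.printf("raw=%d bytes, gzip=%d bytes, ratio=%.2fx%n",
            raw.size(), zipped.size(), (double) raw.size() / zipped.size());
      }
    }

Swapping the repeated column/visibility for empty strings, or the random UUIDs for sequential ids, should give a feel for how much of the gzip win comes from the parts that RLE would otherwise have handled.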
I will put some thought into designing an experiment to evaluate whether or not there is any benefit to applying RLE during key-value transport from server to client. Even if it proves to be only situationally beneficial, I think it could be implemented as a common iterator similar to the WholeRowIterator (see the rough sketch at the end of this mail). Given the current compression strategy, I would expect better server-client transport compression when retrieving a single row with many columns (<key><value> : <id> []) than when retrieving many lexicographically close rows (<key><value><id> : []), with the understanding that very large rows can lead to poor load balancing.

V/R
Jonathan

On Thu, Oct 22, 2015 at 11:54 AM, Josh Elser <[email protected]> wrote:

> Jonathan Wonders wrote:
>> I have been digging into some details of Accumulo to model the disk and network costs associated with various types of scan patterns, and I have a few questions regarding compression.
>>
>> Assuming an inverted index table with rows following the pattern of
>>
>> <key><value><id>
>>
>> and a scan that specifies an exact key and value so as to constrain the range, it seems that the dominant factor in network utilization would be sending key-value pairs from the tablet server to the client, and a secondary factor would be transmitting data from non-local RFiles (assuming no caching).
>
> Sounds about right to me.
>
>> Is my understanding correct that the on-disk compression of this type of table is predominantly a function of the average number of differing bits between adjacent ids? Or, has anyone observed a significant improvement with gz or lzo vs no additional compression? I'm considering running some experiments to measure the difference for a few types of ids (uuid, snowflake-like, content-based hashes), but I'm curious if anyone else has done similar experiments.
>
> There is definitely a noticeable difference between gzip, lzo/snappy (they're about on par with speed and size), and no compression. I'm not sure what the deltas are off the top of my head, but I would expect that you would see noticeable differences.
>
> You can also notice differences in "densities" depending on the lexicographical delta between two keys. Sequential keys that differ very little will result in very dense RFiles (lots of keys), whereas keys that vary greatly will result in fewer keys per file. We've had some discussions about the run-length encoding in RFile lately -- have you stumbled on that?
>
>> Given a scan that specifies a range for an exact key and value, is there any transport compression performed for tablet server to client communication beyond the Key.compress method, which appears to only compress equivalent rows, columns, etc., as opposed to those that share a common prefix?
>
> No, there are no compression algorithms applied to the data before sending it. By default we use the TCompactProtocol from Thrift. If you're interested in the specifics, Thrift has some documentation on the subject.
>
> I do recall some experiments I tried previously in which a logical "object" in Accumulo was comprised of multiple key-value pairs which were returned in one key-value pair (very similar to how the WholeRowIterator serializes things). In this case, I experimented with compressing the serialized Key-Values before returning from the Iterator stack.
> Sadly, I don't recall what the takeaway was, but I don't believe it was outright "bad" :)
>
>> It seems possible to implement a more specialized compression algorithm with the iterator framework, performing the decompression on the client side, but I'm curious if it could lead to general scan performance improvements if the default compression also involved run-length encoding.
>
> This would be really neat to test.
>
> Some rigor in designing an experiment would be the first step, IMO. If you want to spear-head this, I'm sure there are many who would be happy to give some feedback. It would be prudent to figure out what the variables in the "equation" would be, specifically the distribution of Key-Value pairs -- amount stored, compression on disk, cache sizes, query workload. First, figure out what you want to test, what you will tweak, and define a hypothesis you want to test.
>
> You could consider using the continuous ingest framework for data-generation and query workloads. You could also look at YCSB [1] for some inspiration. Either of these would make your life easier in not having to generate the read/write workloads.
>
> [1] https://github.com/brianfrankcooper/YCSB
>
>> Any insight on this subject is much appreciated.
>>
>> V/R
>> Jonathan
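P.S. To make the iterator idea above a bit more concrete, here is a rough, untested sketch of the encode/decode steps I am picturing. The class name is a placeholder and gzip is an arbitrary choice; a real version would do the packing inside a SortedKeyValueIterator on the server and the unpacking on the client, the same way the WholeRowIterator is paired with its decodeRow helper.

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.List;
    import java.util.SortedMap;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.GZIPOutputStream;

    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.iterators.user.WholeRowIterator;

    // Placeholder name. Packs one row's key-values the way WholeRowIterator does,
    // then gzips the blob so it would cross the wire as a single compressed Value.
    public class CompressedRowCodec {

      // Server side: serialize a row's keys/values into one Value, then compress it.
      public static Value encode(List<Key> keys, List<Value> values) throws IOException {
        Value packed = WholeRowIterator.encodeRow(keys, values);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
          gz.write(packed.get());
        }
        return new Value(bytes.toByteArray());
      }

      // Client side: decompress, then let WholeRowIterator.decodeRow rebuild the row.
      public static SortedMap<Key,Value> decode(Key rowKey, Value compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed.get()))) {
          byte[] buf = new byte[4096];
          int n;
          while ((n = gz.read(buf)) != -1) {
            out.write(buf, 0, n);
          }
        }
        return WholeRowIterator.decodeRow(rowKey, new Value(out.toByteArray()));
      }
    }

Whether a gzip step like this pays for itself over TCompactProtocol, versus a relative/RLE encoding of the keys themselves, is exactly the kind of question the experiment would need to tease apart.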
