Josh,

Thanks for the information. I did read through the discussion about compression of visibility expressions and columns within RFiles a while back, which got me thinking about some of this. It makes sense that gzip or lzo/snappy compression would have a very noticeable impact when there are columns or visibility expressions that are not run-length encoded, even if neighboring rows have very small lexicographical deltas.
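As a rough first-pass sanity check on that intuition, I was picturing something like the snippet below (untested; the row layout, column names, visibility string, and key count are made-up placeholders rather than anything pulled from a real table):

    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.util.UUID;
    import java.util.zip.GZIPOutputStream;

    // Placeholder inverted-index style keys: the <key><value> prefix, column, and
    // visibility repeat on every entry while the <id> suffix is a random UUID.
    // Compares the raw size of the serialized key stream to its gzip size.
    public class KeyCompressionCheck {
      public static void main(String[] args) throws IOException {
        ByteArrayOutputStream raw = new ByteArrayOutputStream();
        for (int i = 0; i < 100_000; i++) {
          // ';' is just a stand-in field separator for this sketch
          String entry = "color;red;" + UUID.randomUUID()
              + " index:fwd [PUBLIC|INTERNAL]\n";
          raw.write(entry.getBytes(StandardCharsets.UTF_8));
        }
        ByteArrayOutputStream zipped = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(zipped)) {
          gz.write(raw.toByteArray());
        }
        System.out.printf("raw=%d bytes, gzip=%d bytes, ratio=%.2fx%n",
            raw.size(), zipped.size(), (double) raw.size() / zipped.size());
      }
    }

Swapping the repeated column/visibility for empty strings, or the random UUIDs for sequential ids, should give a feel for how much of the gzip win comes from the parts that RLE would otherwise have handled.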
I will put some thought into designing an experiment to evaluate whether or not there is any benefit to applying RLE during key-value transport from server to client. Even if it proves to be only situationally beneficial, I think it could be implemented as a common iterator similar to the WholeRowIterator (see the rough sketch at the end of this mail). Given the current compression strategy, I would expect better server-client transport compression when retrieving a single row with many columns (<key><value> : <id> []) than when retrieving many lexicographically close rows (<key><value><id> : []), with the understanding that very large rows can lead to poor load balancing.

V/R
Jonathan

On Thu, Oct 22, 2015 at 11:54 AM, Josh Elser <[email protected]> wrote:

> Jonathan Wonders wrote:
>> I have been digging into some details of Accumulo to model the disk and network costs associated with various types of scan patterns, and I have a few questions regarding compression.
>>
>> Assuming an inverted index table with rows following the pattern of
>>
>> <key><value><id>
>>
>> and a scan that specifies an exact key and value so as to constrain the range, it seems that the dominant factor in network utilization would be sending key-value pairs from the tablet server to the client, and a secondary factor would be transmitting data from non-local RFiles (assuming no caching).
>
> Sounds about right to me.
>
>> Is my understanding correct that the on-disk compression of this type of table is predominantly a function of the average number of differing bits between adjacent ids? Or, has anyone observed a significant improvement with gz or lzo vs no additional compression? I'm considering running some experiments to measure the difference for a few types of ids (uuid, snowflake-like, content-based hashes), but I'm curious if anyone else has done similar experiments.
>
> There is definitely a noticeable difference between gzip, lzo/snappy (they're about on par with speed and size), and no compression. I'm not sure what the deltas are off the top of my head, but I would expect that you would see noticeable differences.
>
> You can also notice differences in "densities" depending on the lexicographical delta between two keys. Sequential keys that differ very little will result in very dense RFiles (lots of keys), whereas keys that vary greatly will result in fewer keys per file. We've had some discussions about the run-length encoding in RFile lately -- have you stumbled on that?
>
>> Given a scan that specifies a range for an exact key and value, is there any transport compression performed for tablet server to client communication beyond the Key.compress method, which appears to only compress equivalent rows, columns, etc., as opposed to those that share a common prefix?
>
> No, there are no compression algorithms applied to the data before sending it. By default we use the TCompactProtocol from Thrift. If you're interested in the specifics, Thrift has some documentation on the subject.
>
> I do recall some experiments I tried previously in which a logical "object" in Accumulo was comprised of multiple key-value pairs which were returned in one key-value pair (very similar to how the WholeRowIterator serializes things). In this case, I experimented with compressing the serialized Key-Values before returning from the Iterator stack.
> Sadly, I don't recall what the takeaway was, but I don't believe it was outright "bad" :)
>
>> It seems possible to implement a more specialized compression algorithm with the iterator framework, performing the decompression on the client side, but I'm curious if it could lead to general scan performance improvements if the default compression also involved run-length encoding.
>
> This would be really neat to test.
>
> Some rigor in designing an experiment would be the first step, IMO. If you want to spear-head this, I'm sure there are many who would be happy to give some feedback. It would be prudent to figure out what the variables in the "equation" would be, specifically the distribution of Key-Value pairs -- amount stored, compression on disk, cache sizes, query workload. First, figure out what you want to test, what you will tweak, and define a hypothesis you want to test.
>
> You could consider using the continuous ingest framework for data-generation and query workloads. You could also look at YCSB [1] for some inspiration. Either of these would make your life easier in not having to generate the read/write workloads.
>
> [1] https://github.com/brianfrankcooper/YCSB
>
>> Any insight on this subject is much appreciated.
>>
>> V/R
>> Jonathan
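P.S. To make the iterator idea above a bit more concrete, here is a rough, untested sketch of the encode/decode steps I am picturing. The class name is a placeholder and gzip is an arbitrary choice; a real version would do the packing inside a SortedKeyValueIterator on the server and the unpacking on the client, the same way the WholeRowIterator is paired with its decodeRow helper.

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.List;
    import java.util.SortedMap;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.GZIPOutputStream;

    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.iterators.user.WholeRowIterator;

    // Placeholder name. Packs one row's key-values the way WholeRowIterator does,
    // then gzips the blob so it would cross the wire as a single compressed Value.
    public class CompressedRowCodec {

      // Server side: serialize a row's keys/values into one Value, then compress it.
      public static Value encode(List<Key> keys, List<Value> values) throws IOException {
        Value packed = WholeRowIterator.encodeRow(keys, values);
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
          gz.write(packed.get());
        }
        return new Value(bytes.toByteArray());
      }

      // Client side: decompress, then let WholeRowIterator.decodeRow rebuild the row.
      public static SortedMap<Key,Value> decode(Key rowKey, Value compressed) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed.get()))) {
          byte[] buf = new byte[4096];
          int n;
          while ((n = gz.read(buf)) != -1) {
            out.write(buf, 0, n);
          }
        }
        return WholeRowIterator.decodeRow(rowKey, new Value(out.toByteArray()));
      }
    }

Whether a gzip step like this pays for itself over TCompactProtocol, versus a relative/RLE encoding of the keys themselves, is exactly the kind of question the experiment would need to tease apart.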
