Hi James,

Yes, that's correct. Unfortunately, right now the RLE encoding code doesn't support int64.
I'd suggest trying the "BIT_SHUFFLE" encoding, which can get you a lot of the same benefits as RLE and does work on int64.

Since it seems you're good at checking out the code, I'd also be happy to review a patch to fix this if you want to give it a try :)

-Todd

On Tue, Aug 9, 2016 at 2:19 PM, James Pirz <[email protected]> wrote:

> Hi,
>
> I am trying to use RLE-encoding in Kudu with fixed bit-width values. I am
> specifically dealing with uint64 values which are supposed to be 64-bit
> long. I realized that the code under BitWriter (which is used in flushing
> RLE runs) only handles values up to 32-bit values:
>
> inline void BitWriter::PutValue(uint64_t v, int num_bits) {
>   // TODO: revisit this limit if necessary (can be raised to 64 by fixing some edge cases)
>   DCHECK_LE(num_bits, 32);
>   ...
>
>
> Can you please verify if indeed such a limitation exists in Kudu for RLE
> encoding (i.e. it can not be applied to fixed size values longer than 32
> bits) and if yes is there a work around for that ?
> (I also checked RLE code in Impala and parquet-cpp which share the RLE
> code and they seem to have the same limitation).
>
> Thanks

-- 
Todd Lipcon
Software Engineer, Cloudera
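
To illustrate the limitation discussed in this thread, here is a minimal sketch of one possible approach: writing a value wider than 32 bits as two calls to a writer that accepts at most 32 bits per call. This is not Kudu's code. `SketchBitWriter`, `PutValue64`, `GetValue`, and `size_in_bits` are hypothetical names invented for this example, and the bit-per-element storage is chosen only for clarity. A real patch would more likely fix the edge cases inside `BitWriter::PutValue` itself, and whether the surrounding RLE code would need further changes is not addressed here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bit writer; NOT Kudu's actual BitWriter.
class SketchBitWriter {
 public:
  // Appends the low num_bits bits of v, least significant bit first.
  // The 32-bit cap mirrors the DCHECK_LE(num_bits, 32) quoted in the thread.
  void PutValue(uint64_t v, int num_bits) {
    assert(num_bits >= 0 && num_bits <= 32);
    for (int i = 0; i < num_bits; ++i) {
      bits_.push_back(((v >> i) & 1ULL) != 0);
    }
  }

  // Assumed workaround: split a value of up to 64 bits into a low 32-bit
  // half and a high remainder, each within the 32-bit cap.
  void PutValue64(uint64_t v, int num_bits) {
    assert(num_bits >= 0 && num_bits <= 64);
    if (num_bits <= 32) {
      PutValue(v, num_bits);
      return;
    }
    PutValue(v & 0xFFFFFFFFULL, 32);   // low 32 bits
    PutValue(v >> 32, num_bits - 32);  // remaining high bits
  }

  // Reads num_bits bits starting at bit_offset (for checking the round trip).
  uint64_t GetValue(std::size_t bit_offset, int num_bits) const {
    assert(num_bits >= 0 && num_bits <= 64);
    assert(bit_offset + static_cast<std::size_t>(num_bits) <= bits_.size());
    uint64_t result = 0;
    for (int i = 0; i < num_bits; ++i) {
      if (bits_[bit_offset + i]) {
        result |= (1ULL << i);  // i is at most 63, so the shift is well defined
      }
    }
    return result;
  }

  std::size_t size_in_bits() const { return bits_.size(); }

 private:
  std::vector<bool> bits_;  // one element per bit, for clarity rather than speed
};
```

Because the low half is written first and bits are stored least significant first, reading the same number of bits back in a single `GetValue` call reconstructs the original value. A production writer that packs bits into machine words would need the same ordering guarantee for this split to be transparent to the reader.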
