On Tue, May 13, 2014 at 3:58 PM, Ryan Blue <[email protected]> wrote:

> Here are a few more specific responses.
>
>
...


>
>  OrderedBytes implements a bit-shifting strategy for this.
>> {FixedLength,Terminated}Wrapper are provided to add flexibility. Ryan
>> has suggested a variation of run-length encoding as another alternative,
>> something we could add if there's sufficient need.
>>
>
> We went with the run-length encoding variant because in most cases, it
> decreases the size of the data or doesn't increase it too much. It
> increases the size only when there are single null bytes, in which case it
> adds a byte for each single null. Size is the same or reduced with two or
> more null bytes.
>
>
Out of interest, is anyone asking for run-length encoding at the moment?
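
For anyone following along, the size behaviour described in the quoted
paragraph can be illustrated with a minimal sketch. This is not the encoding
from Ryan's spec: it assumes each run of null bytes is written as a 0x00
marker followed by a one-byte run length, and the function names are
hypothetical. It also makes no attempt to preserve sort order.

```python
def rle_encode_nulls(data: bytes) -> bytes:
    """Replace each run of 0x00 bytes with the pair (0x00, run_length).

    Assumed layout for illustration only; runs longer than 255 are split
    into several pairs so the length always fits in one byte.
    """
    out = bytearray()
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0:
            run = 1
            while i + run < n and data[i + run] == 0 and run < 255:
                run += 1
            out += bytes((0, run))
            i += run
        else:
            out.append(data[i])
            i += 1
    return bytes(out)


def rle_decode_nulls(encoded: bytes) -> bytes:
    """Inverse of rle_encode_nulls: expand each (0x00, run_length) pair."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        if encoded[i] == 0:
            out += b"\x00" * encoded[i + 1]
            i += 2
        else:
            out.append(encoded[i])
            i += 1
    return bytes(out)


if __name__ == "__main__":
    for sample in (b"a\x00b", b"a\x00\x00b", b"a\x00\x00\x00b"):
        encoded = rle_encode_nulls(sample)
        print(len(sample), "->", len(encoded))
```

Under these assumptions a single null costs one extra byte, a pair of nulls
costs nothing, and any longer run shrinks, which matches the trade-off
described above.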




>
>  The above date question is a perfect example of why I think it's
>> important that we have the DataType interface. Having the interface
>> means an application can implement its own types when its needs are
>> too unique to commit to HBase. Other applications can still use that
>> implementation by including the relevant application jars. They enjoy
>> interoperability by agreeing on the DataType implementation, not on
>> something provided out of the box by a particular HBase version.
>>
>
> I think this spec would be a stronger interop guarantee. We should discuss
> whether we can support this spec along with existing data, although I
> suspect we probably can't.



Does "existing data" above mean data already written into HBase tables?
Wouldn't such data be out of scope for this project? Or what are you thinking,
Ryan?

Good stuff,
St.Ack
