[ 
https://issues.apache.org/jira/browse/HBASE-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13502071#comment-13502071
 ] 

Matt Corgan commented on HBASE-7162:
------------------------------------

I've been looking through the read path to see what it would take to extend 
Cell usage out further towards the client, and I haven't found any obstacles 
that can't be overcome with some work.  There are places that do comparisons 
and such on the raw bytes of the KeyValue, which won't work.  I believe it was 
the StoreScanner that holds a reference to the current and previous KV, which 
also won't work, but in that particular case it was just a safety check that 
could probably be removed.  I don't understand the lazy-seek stuff well enough 
yet...  

We just have to go through and start changing the individual uses from KeyValue 
to Cell.  To avoid a major refactoring, in problem areas we can make a quick 
call to KeyValueTool.fromCell(cell) when we need an immutable KeyValue (which 
is basically the 60% copying cost you are seeing above), and then work on 
eliminating those wasteful copying calls over time.  
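
A rough sketch of that stopgap, for illustration only.  The Cell interface, 
the cellOver helper, and the flat layout below are hypothetical 
simplifications, not the actual HBase classes; copyToFlat only stands in for 
the idea behind KeyValueTool.fromCell(cell), which is to copy each field out 
of wherever it lives into one new contiguous array.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CellCopySketch {

    /** Hypothetical, simplified Cell: each field is a range inside some backing array. */
    interface Cell {
        byte[] getRowArray();
        int getRowOffset();
        int getRowLength();
        byte[] getValueArray();
        int getValueOffset();
        int getValueLength();
    }

    /** Builds a Cell whose row and value are views into one shared block (no copying). */
    static Cell cellOver(final byte[] block, final int rowOff, final int rowLen,
                         final int valOff, final int valLen) {
        return new Cell() {
            public byte[] getRowArray() { return block; }
            public int getRowOffset() { return rowOff; }
            public int getRowLength() { return rowLen; }
            public byte[] getValueArray() { return block; }
            public int getValueOffset() { return valOff; }
            public int getValueLength() { return valLen; }
        };
    }

    /**
     * Stand-in for the fromCell idea: copy every field into one new array.
     * The layout is made up for this sketch: [rowLen][valLen][row][value].
     */
    static byte[] copyToFlat(Cell cell) {
        int rowLen = cell.getRowLength();
        int valLen = cell.getValueLength();
        ByteBuffer out = ByteBuffer.allocate(8 + rowLen + valLen);
        out.putInt(rowLen);
        out.putInt(valLen);
        out.put(cell.getRowArray(), cell.getRowOffset(), rowLen);
        out.put(cell.getValueArray(), cell.getValueOffset(), valLen);
        return out.array();
    }

    public static void main(String[] args) {
        // The block holds padding, a row ("row1"), a value ("value"), and more padding.
        byte[] block = "xxrow1valueyy".getBytes(StandardCharsets.US_ASCII);
        Cell cell = cellOver(block, 2, 4, 6, 5);
        byte[] flat = copyToFlat(cell);
        System.out.println("flat copy length: " + flat.length);
    }
}
```

Code that only needs the row or value can read the ranges straight from the 
Cell; only the call sites that still expect one flat array pay for the copy, 
and those are the ones to eliminate over time.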

All of it would be simpler if we don't have to pass around custom comparators 
for ROOT and META tables.

As for scatter/gather collection of values, one thing to keep in mind is that 
the data block encoding is most effective when keys are long and values are 
short.  If your average value is, say, 8 bytes, then setting up a ByteBuffer[] 
to hold references to all the value ranges may be overkill since each 
ByteBuffer probably takes something like 40 bytes of heap.
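
To put rough numbers on that, here is a back-of-the-envelope sketch.  The 40 
bytes per ByteBuffer is the estimate given above, not a measured figure, and 
the 4 bytes per array slot assumes a JVM with compressed references; the 
class and method names are made up for the example.

```java
public class ScatterGatherOverhead {

    /** Heap spent on the references alone: one ByteBuffer object plus one array slot per value. */
    static long referenceOverheadBytes(long valueCount, int bytesPerBuffer, int bytesPerArraySlot) {
        return valueCount * (bytesPerBuffer + bytesPerArraySlot);
    }

    /** Bytes of actual value data being pointed at. */
    static long valuePayloadBytes(long valueCount, int avgValueBytes) {
        return valueCount * avgValueBytes;
    }

    public static void main(String[] args) {
        long valueCount = 1000;     // arbitrary example size
        int avgValueBytes = 8;      // the "say, 8 bytes" case
        int bytesPerBuffer = 40;    // estimate from the comment, not measured
        int bytesPerArraySlot = 4;  // assumption: compressed references

        long overhead = referenceOverheadBytes(valueCount, bytesPerBuffer, bytesPerArraySlot);
        long payload = valuePayloadBytes(valueCount, avgValueBytes);

        System.out.println("payload bytes:  " + payload);
        System.out.println("overhead bytes: " + overhead);
        System.out.println("overhead / payload: " + ((double) overhead / payload));
    }
}
```

Under these assumptions the bookkeeping costs several times more heap than 
the values it points at, so for short values a plain copy is likely cheaper.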
                
> Prefix Compression - Trie data block encoding; hbase-common and hbase-server 
> changes
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-7162
>                 URL: https://issues.apache.org/jira/browse/HBASE-7162
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.96.0
>            Reporter: stack
>            Assignee: Matt Corgan
>             Fix For: 0.96.0
>
>         Attachments: HBASE-4676-common-and-server-v8.patch, 
> HBASE-4676-common-and-server-v9.patch, HBASE-7162-common-and-server-v10.patch
>
>
> These are the hbase-common and hbase-server changes for hbase-4676 Prefix 
> Compression - Trie data block encoding.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
