[ 
https://issues.apache.org/jira/browse/HBASE-11425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356182#comment-14356182
 ] 

ramkrishna.s.vasudevan commented on HBASE-11425:
------------------------------------------------

bq.Should bucket size be same as the hfile block size?
Yes, that would be better in many cases; however, the odd block may exceed 
the hfile block size.
bq.Can MBB be developed in isolation with tests and refcounting tests apart 
from main code base? Is that being done?
We need some tests for the refcounting part. Apart from that, they can be 
individual tasks, as Anoop says.

Regarding the BB and comparators having two paths: per the profiler reports, 
that would be the ideal way. That is because all the KVs coming from the 
HFiles are buffer-backed cells, but the cells in the memstore are byte[] backed. 
So, as mentioned in the doc, if we try to create only BB-based rows, families 
and qualifiers, we would have to wrap these byte[]s, which is a costlier 
operation. Also, when creating fake keys it is always better to create them 
in byte[] rather than in BB, because for BBs we have to do some allocation 
and then copy the contents. All of this is costlier.
Hence, when we create a fake key and compare it against a key from an HFile, we 
have two kinds of cells: one backed by byte[] and another by BB. So if/else 
based comparisons would be better.
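
To illustrate the if/else idea, here is a minimal sketch, not the actual patch. The `Row` holder and the method names are hypothetical; they only stand in for a byte[]-backed cell (memstore, fake keys) and a BB-backed cell (HFile block). The mixed path reads the buffer with absolute gets, so the byte[] side is never wrapped or copied.

```java
import java.nio.ByteBuffer;

final class DualPathRowCompare {

    /** Hypothetical row holder: exactly one of array / buffer is non-null. */
    static final class Row {
        final byte[] array;
        final ByteBuffer buffer;
        final int offset;
        final int length;

        Row(byte[] array, int offset, int length) {
            this.array = array;
            this.buffer = null;
            this.offset = offset;
            this.length = length;
        }

        Row(ByteBuffer buffer, int offset, int length) {
            this.array = null;
            this.buffer = buffer;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Picks the comparison path from the backing type of each side. */
    static int compare(Row a, Row b) {
        if (a.array != null && b.array != null) {
            return compareArrays(a.array, a.offset, a.length, b.array, b.offset, b.length);
        } else if (a.buffer != null && b.buffer != null) {
            return compareBuffers(a.buffer, a.offset, a.length, b.buffer, b.offset, b.length);
        } else if (a.array != null) {
            return compareMixed(a.array, a.offset, a.length, b.buffer, b.offset, b.length);
        } else {
            // Operands swapped, so flip the sign of the result.
            return -compareMixed(b.array, b.offset, b.length, a.buffer, a.offset, a.length);
        }
    }

    /** Unsigned lexicographic compare, byte[] against byte[]. */
    static int compareArrays(byte[] a, int aOff, int aLen, byte[] b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b[bOff + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }

    /** Unsigned lexicographic compare, BB against BB, using absolute gets only. */
    static int compareBuffers(ByteBuffer a, int aOff, int aLen, ByteBuffer b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a.get(aOff + i) & 0xff;
            int y = b.get(bOff + i) & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }

    /** Unsigned lexicographic compare, byte[] against BB, without wrapping the byte[]. */
    static int compareMixed(byte[] a, int aOff, int aLen, ByteBuffer b, int bOff, int bLen) {
        int n = Math.min(aLen, bLen);
        for (int i = 0; i < n; i++) {
            int x = a[aOff + i] & 0xff;
            int y = b.get(bOff + i) & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return aLen - bLen;
    }
}
```

With this shape, a fake key built as a plain byte[] can be compared directly against a row that lives in an offheap block, with no allocation on either side.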


Regarding the Unsafe comparators: 
they are just the same as the ones for byte[] arrays now.  
bq.So, you might want to underline this point. Its BB but WE are managing the 
position and length to save on object creation and to bypass BB range checking, 
etc.
Yes. That is the important decision that we had to make.  One objective is to 
reduce object creation and another is to use the same APIs for offset and 
length.  
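
As a rough sketch of what managing the offset and length ourselves looks like (the `BufferRange` class and `matches` method are hypothetical names, not from the patch): the caller passes the offset and length explicitly and the code uses absolute gets, so no `slice()` or `duplicate()` view object is created and the buffer's own position is never touched. Note that plain `ByteBuffer.get(int)` still does a bounds check; skipping that would need the Unsafe-based path, which is not shown here.

```java
import java.nio.ByteBuffer;

final class BufferRange {

    /**
     * Returns true if buf[offset, offset + length) equals expected.
     * Uses absolute gets only: no view object is allocated and
     * buf.position() / buf.limit() are left unchanged.
     */
    static boolean matches(ByteBuffer buf, int offset, int length, byte[] expected) {
        if (length != expected.length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (buf.get(offset + i) != expected[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Because the offset and length travel as plain ints, the same call shape works for a byte[]-backed cell and a BB-backed cell.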

bq.Client won't be offheaping? If so, could the BB APIs be mixed in to Cell on 
the server only?
We discussed that ServerCell concept. But I would argue against doing that, 
because then the user would have two types of Cells: one on the write path and 
another on the read path.  I would say that would make things more complex 
and harder to use.

I will try to make a trunk-based patch and upload it for reference.


> Cell/DBB end-to-end on the read-path
> ------------------------------------
>
>                 Key: HBASE-11425
>                 URL: https://issues.apache.org/jira/browse/HBASE-11425
>             Project: HBase
>          Issue Type: Umbrella
>          Components: regionserver, Scanners
>    Affects Versions: 0.99.0
>            Reporter: Anoop Sam John
>            Assignee: Anoop Sam John
>         Attachments: Offheap reads in HBase using BBs_final.pdf
>
>
> Umbrella jira to make sure we can have blocks cached in offheap backed cache. 
> In the entire read path, we can refer to this offheap buffer and avoid onheap 
> copying.
> The high level items I can identify as of now are
> 1. Avoid the array() call on BB in read path.. (This is there in many 
> classes. We can handle class by class)
> 2. Support Buffer based getter APIs in cell.  In read path we will create a 
> new Cell with backed by BB. Will need in CellComparator, Filter (like SCVF), 
> CPs etc.
> 3. Avoid KeyValue.ensureKeyValue() calls in read path - This make byte copy.
> 4. Remove all CP hooks (which are already deprecated) which deal with KVs.  
> (In read path)
> Will add subtasks under this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
