[
https://issues.apache.org/jira/browse/HBASE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975391#comment-13975391
]
ramkrishna.s.vasudevan commented on HBASE-10974:
------------------------------------------------
[~mcorgan]
Yes, Matt. The code below, in KeyValueHeap.next(),
{code}
public Cell next() throws IOException {
  if (this.current == null) {
    return null;
  }
  Cell kvReturn = this.current.next();
  Cell kvNext = this.current.peek();
  if (kvNext == null) {
    this.current.close();
    this.current = pollRealKV();
  } else {
    KeyValueScanner topScanner = this.heap.peek();
    // no need to add current back to the heap if it is the only scanner left
    if (topScanner != null && this.comparator.compare(kvNext, topScanner.peek()) >= 0) {
      this.heap.add(this.current);
      this.current = pollRealKV();
    }
  }
  return kvReturn;
}
{code}
internally calls StoreFileScanner.next(). There, we hold a reference to the
current 'cur' in 'retKey', advance 'cur' to the following KV, and return
'retKey'. KeyValueHeap.next() therefore needs both cells to be valid at the
same time: the old one, returned by this.current.next() as kvReturn, and the
new 'cur', returned by peek() as kvNext. Doing the deep clone lazily is
definitely beneficial in places where we need to do comparisons before
fetching the next KV.
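To make the aliasing concrete, here is a minimal sketch. It is not HBase code: ReusingScanner, its 4-byte keys and the copyOnReturn flag are hypothetical stand-ins for a seeker that decodes every key into one reused byte[]. Without a copy, the value returned by next() is overwritten as soon as the scanner advances, so kvReturn and kvNext end up looking identical.

```java
import java.nio.charset.StandardCharsets;

public class SharedBufferAliasing {

    /** Hypothetical scanner that decodes each key into one reused buffer. */
    static final class ReusingScanner {
        private final String[] keys;               // stand-in for an encoded block
        private final byte[] shared = new byte[4]; // assumption: every key is 4 ASCII bytes
        private final boolean copyOnReturn;
        private int pos = 0;

        ReusingScanner(boolean copyOnReturn, String... keys) {
            this.copyOnReturn = copyOnReturn;
            this.keys = keys;
            decode();
        }

        /** "Decodes" the key at pos into the shared buffer, overwriting the previous key. */
        private void decode() {
            if (pos < keys.length) {
                byte[] src = keys[pos].getBytes(StandardCharsets.US_ASCII);
                System.arraycopy(src, 0, shared, 0, shared.length);
            }
        }

        /** Returns the current key without advancing, or null when exhausted. */
        byte[] peek() {
            return pos < keys.length ? shared : null;
        }

        /** Returns the current key and advances to the following one. */
        byte[] next() {
            if (pos >= keys.length) {
                return null;
            }
            byte[] ret = copyOnReturn ? shared.clone() : shared;
            pos++;
            decode(); // overwrites 'shared', and therefore 'ret' when no copy was made
            return ret;
        }
    }

    static String str(byte[] b) {
        return new String(b, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        ReusingScanner aliased = new ReusingScanner(false, "row1", "row2");
        byte[] kvReturn = aliased.next();
        byte[] kvNext = aliased.peek();
        System.out.println(str(kvReturn) + " / " + str(kvNext)); // prints "row2 / row2"

        ReusingScanner copied = new ReusingScanner(true, "row1", "row2");
        kvReturn = copied.next();
        kvNext = copied.peek();
        System.out.println(str(kvReturn) + " / " + str(kvNext)); // prints "row1 / row2"
    }
}
```

The copying variant is correct but pays for a clone on every next(), which is the cost this issue is trying to remove.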
Without DBEs, no deep copying happens today, because every KV refers to the
same internal BB and only the offset and length change. Only the DBE read
path has this issue.
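The non-DBE case can be pictured roughly as follows. This is only an illustrative sketch: CellView and the fixed 4-byte keys are hypothetical names and sizes, not HBase classes. Each cell is a window (offset and length) onto one shared ByteBuffer, so two cells can be live at once without either owning a copy of the bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class OffsetLengthViews {

    /** Hypothetical cell: holds no bytes of its own, only a window onto a shared block. */
    static final class CellView {
        final ByteBuffer block;
        final int offset;
        final int length;

        CellView(ByteBuffer block, int offset, int length) {
            this.block = block;
            this.offset = offset;
            this.length = length;
        }

        /** Copies the window out for display only; the view itself never copies. */
        String asString() {
            byte[] out = new byte[length];
            ByteBuffer dup = block.duplicate(); // independent position, same backing bytes
            dup.position(offset);
            dup.get(out);
            return new String(out, StandardCharsets.US_ASCII);
        }
    }

    public static void main(String[] args) {
        // Stand-in for an unencoded block: keys laid out back to back.
        ByteBuffer block = ByteBuffer.wrap("row1row2row3".getBytes(StandardCharsets.US_ASCII));

        CellView kvReturn = new CellView(block, 0, 4);
        CellView kvNext = new CellView(block, 4, 4);

        // Both views stay valid together because neither overwrites the other.
        System.out.println(kvReturn.asString() + " / " + kvNext.asString()); // prints "row1 / row2"
    }
}
```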
> Improve DBEs read performance by avoiding byte array deep copies for key[]
> and value[]
> --------------------------------------------------------------------------------------
>
> Key: HBASE-10974
> URL: https://issues.apache.org/jira/browse/HBASE-10974
> Project: HBase
> Issue Type: Improvement
> Components: Scanners
> Affects Versions: 0.99.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.99.0
>
> Attachments: HBASE-10974_1.patch
>
>
> As part of HBASE-10801, we tried to reduce the copying of the value[] when
> forming the KV from the DBEs.
> The keys still required copying, which restricted us in using Cells because
> a copy always had to be made.
> The idea here is to replace the key byte[] with a ByteBuffer and create a
> consecutive stream of the keys (currently the same byte[] is reused, hence
> the copy). Use offset and length to track each key in this ByteBuffer.
> Converting the encoded format to the normal key format is definitely needed
> and can't be avoided, but we could always avoid the deep copy of the bytes to
> form a KV and thus use Cells effectively. Working on a patch, will post it soon.