[
https://issues.apache.org/jira/browse/HBASE-10974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13975400#comment-13975400
]
ramkrishna.s.vasudevan commented on HBASE-10974:
------------------------------------------------
[[email protected]]
bq.The "16.7.2.2. Medium Tests" in refguide has medium tests taking < 50
seconds. FYI.
Okie.
bq.So except for the common bytes the remaining bytes are copied for every
next().
I mentioned this in my comment with reference to the existing code.
bq.I just see us copying the full key, not the difference. Am I looking in the
wrong place boss?
In the current patch we do copy the full key, but the existing code does not.
{code}
current.ensureSpaceForKey();
currentBuffer.get(current.keyBuffer, current.lastCommonPrefix,
current.keyLength - current.lastCommonPrefix);
{code}
See here. The current.keyBuffer is filled starting at the offset given by
current.lastCommonPrefix. This means that if 25 bytes are in common with the
previous KV, we do not read those 25 bytes from the current buffer (because it
does not contain them), so current.keyBuffer is not refilled with those 25
common bytes. It is filled from the 26th byte onwards.
You can see the same in the other encoders as well.
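To make the point about the partial copy concrete, here is a minimal, self-contained sketch. It is not the HBase code: the class PrefixKeyDecodeSketch, the SeekerState holder and the toy wire format (one byte of prefix length, one byte of suffix length, then the suffix) are assumptions made only for illustration. Only the final get() call follows the snippet above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PrefixKeyDecodeSketch {

    /** Hypothetical stand-in for the seeker state; field names mirror the snippet above. */
    static final class SeekerState {
        byte[] keyBuffer = new byte[16];
        int keyLength;
        int lastCommonPrefix;

        void ensureSpaceForKey() {
            if (keyLength > keyBuffer.length) {
                // Grow but keep the old bytes: the shared prefix must survive.
                keyBuffer = Arrays.copyOf(keyBuffer, Math.max(keyLength, keyBuffer.length * 2));
            }
        }
    }

    /** Assumed toy format per entry: [prefixLen:1][suffixLen:1][suffix bytes]. */
    static ByteBuffer encode(String... keys) {
        ByteBuffer out = ByteBuffer.allocate(1024);
        byte[] prev = new byte[0];
        for (String k : keys) {
            byte[] cur = k.getBytes(StandardCharsets.US_ASCII);
            int common = 0;
            int max = Math.min(prev.length, cur.length);
            while (common < max && prev[common] == cur[common]) {
                common++;
            }
            out.put((byte) common);
            out.put((byte) (cur.length - common));
            out.put(cur, common, cur.length - common);
            prev = cur;
        }
        out.flip();
        return out;
    }

    /** Decodes the next key and returns how many key bytes were actually copied. */
    static int decodeNext(ByteBuffer currentBuffer, SeekerState current) {
        current.lastCommonPrefix = currentBuffer.get() & 0xff;
        int suffixLength = currentBuffer.get() & 0xff;
        current.keyLength = current.lastCommonPrefix + suffixLength;
        current.ensureSpaceForKey();
        // Only the non-shared tail is read; bytes [0, lastCommonPrefix) stay
        // in keyBuffer from the previous key.
        currentBuffer.get(current.keyBuffer, current.lastCommonPrefix,
            current.keyLength - current.lastCommonPrefix);
        return suffixLength;
    }

    static String currentKey(SeekerState s) {
        return new String(s.keyBuffer, 0, s.keyLength, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        ByteBuffer encoded = encode("row-0001/cf:a", "row-0001/cf:b");
        SeekerState state = new SeekerState();
        while (encoded.hasRemaining()) {
            int copied = decodeNext(encoded, state);
            System.out.println(currentKey(state) + " copied=" + copied);
        }
    }
}
```

In this sketch the second next() copies only the one byte that differs, because the shared prefix is already sitting in the reused keyBuffer.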
The 'math' that I saw was definitely better for scanning; in the case of
'gets' it does have an impact. But the main advantage is that you can work with
Cells without having to copy the values, in cases where the values are
significantly larger than the keys in terms of number of bytes.
> Improve DBEs read performance by avoiding byte array deep copies for key[]
> and value[]
> --------------------------------------------------------------------------------------
>
> Key: HBASE-10974
> URL: https://issues.apache.org/jira/browse/HBASE-10974
> Project: HBase
> Issue Type: Improvement
> Components: Scanners
> Affects Versions: 0.99.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.99.0
>
> Attachments: HBASE-10974_1.patch
>
>
> As part of HBASE-10801, we tried to reduce copying of the value[] when
> forming the KV from the DBEs.
> The keys still required copying, and this restricted us in using Cells,
> since a copy always had to be done.
> The idea here is to replace the key byte[] with a ByteBuffer and create a
> consecutive stream of the keys (currently the same byte[] is reused, hence
> the copy). Use offset and length to track each key in this ByteBuffer.
> Copying from the encoded format to the normal key format is definitely needed
> and can't be avoided, but we could always avoid the deep copy of the bytes to
> form a KV and thus use Cells effectively. Working on a patch, will post it soon.
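The idea in the description above (a consecutive stream of keys in one ByteBuffer, each key tracked by offset and length) could look roughly like the following. This is a hedged sketch under stated assumptions and not the attached patch: KeyStreamSketch, KeyRef and append are hypothetical names, the buffer has a fixed capacity, and bounds handling is left out.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class KeyStreamSketch {

    /** Hypothetical view of one key: an (offset, length) pair into the shared stream. */
    static final class KeyRef {
        final ByteBuffer stream;
        final int offset;
        final int length;

        KeyRef(ByteBuffer stream, int offset, int length) {
            this.stream = stream;
            this.offset = offset;
            this.length = length;
        }

        /** Copies the bytes out for display only; readers could use offset/length directly. */
        String asString() {
            byte[] tmp = new byte[length];
            ByteBuffer dup = stream.duplicate();
            dup.position(offset);
            dup.get(tmp);
            return new String(tmp, StandardCharsets.US_ASCII);
        }
    }

    private final ByteBuffer keys;

    KeyStreamSketch(int capacity) {
        // Assumption: capacity is large enough; overflow handling is omitted.
        this.keys = ByteBuffer.allocate(capacity);
    }

    /**
     * Appends a decoded key after the previous one. The shared prefix comes from
     * the previous key in the stream, the suffix from the encoded data.
     */
    KeyRef append(KeyRef previous, int commonPrefix, byte[] suffix) {
        int offset = keys.position();
        for (int i = 0; i < commonPrefix; i++) {
            keys.put(keys.get(previous.offset + i));
        }
        keys.put(suffix);
        return new KeyRef(keys, offset, commonPrefix + suffix.length);
    }

    public static void main(String[] args) {
        KeyStreamSketch stream = new KeyStreamSketch(256);
        KeyRef k1 = stream.append(null, 0, "row-0001/cf:a".getBytes(StandardCharsets.US_ASCII));
        KeyRef k2 = stream.append(k1, 12, "b".getBytes(StandardCharsets.US_ASCII));
        // k1 is still readable after k2 was decoded, because nothing was overwritten.
        System.out.println(k1.asString() + " " + k2.asString());
    }
}
```

The trade-off in this sketch is that the decode step writes the full key once into the stream, but an earlier KeyRef stays valid after the next key is decoded, so no further deep copy is needed to hand it out.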