[
https://issues.apache.org/jira/browse/HBASE-10801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13973831#comment-13973831
]
ramkrishna.s.vasudevan commented on HBASE-10801:
------------------------------------------------
I tested this patch with a minor modification: the SeekerState is no longer
passed to the KeyOnlyClonedSeekerState, so the latter holds only the primitive
member variables (passing the SeekerState was a bit more costly).
I combined this with HBASE-10929 and added a filter, FilterAllFilter, that
filters out every row that would be returned to the client. This ensures that
nowhere in the scan path is there a need to create a KV object (which also
involves copying the value part), so the comparisons happen purely on Cells.
Note that in this patch the key part is copied in shallowCopy().
A full table scan with 1 thread over 2000000 rows gave the following results.
With patch
========
{code}
hbase(main):002:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 9.6820 seconds
hbase(main):003:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 2.8490 seconds
hbase(main):004:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 2.7680 seconds
hbase(main):005:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 2.5470 seconds
{code}
Without patch
=========
{code}
hbase(main):002:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 19.4020 seconds
hbase(main):003:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 6.1450 seconds
hbase(main):004:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 2.8520 seconds
hbase(main):005:0> scan 'TestTable',{FILTER=>org.apache.hadoop.hbase.filter.FilterAllFilter.new()}
ROW COLUMN+CELL
0 row(s) in 2.6900 seconds
{code}
I used the Performance Evaluation tool, so the value is 1000 bytes per row.
As you can see, when the experiment starts the scan without the patch takes
about twice as long (19.4 s vs. 9.7 s for the first scan, 6.1 s vs. 2.8 s for
the second). Once the cache is fully loaded the scans are not too costly, and
the values even out with a small deviation. Changing the value size may have a
much bigger impact than this; I can also test with a much larger value.
This difference in performance during the first scans remains consistent.
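To make the two sets of runs easier to compare, here is a small sketch that
computes, for each scan, the ratio of the time without the patch to the time
with the patch. The class and method names are made up for illustration and
are not part of the patch; the numbers are copied from the shell output above.

```java
import java.util.Locale;

public class ScanTimingRatios {
    // Elapsed seconds copied from the four scans reported in the shell output.
    static final double[] WITH_PATCH = {9.6820, 2.8490, 2.7680, 2.5470};
    static final double[] WITHOUT_PATCH = {19.4020, 6.1450, 2.8520, 2.6900};

    /** Ratio of unpatched to patched time; above 1.0 means the patched run was faster. */
    static double ratio(double withoutPatch, double withPatch) {
        return withoutPatch / withPatch;
    }

    /** Ratios for every run, in the order the scans were issued. */
    static double[] ratios() {
        double[] out = new double[WITH_PATCH.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = ratio(WITHOUT_PATCH[i], WITH_PATCH[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] r = ratios();
        for (int i = 0; i < r.length; i++) {
            System.out.printf(Locale.ROOT, "scan %d: %.4f s vs %.4f s, ratio %.2f%n",
                    i + 1, WITHOUT_PATCH[i], WITH_PATCH[i], r[i]);
        }
    }
}
```

The ratios come out at roughly 2.0 and 2.2 for the first two scans and about
1.03 and 1.06 for the last two, i.e. the gap is large only before the cache is
warm.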
> Ensure DBE interfaces can work with Cell
> ----------------------------------------
>
> Key: HBASE-10801
> URL: https://issues.apache.org/jira/browse/HBASE-10801
> Project: HBase
> Issue Type: Sub-task
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.99.0
>
> Attachments: HBASE-10801.patch, HBASE-10801_1.patch,
> HBASE-10801_2.patch, HBASE-10801_3.patch
>
>
> Some changes to the interfaces may be needed for DBEs, or maybe the way it
> works currently needs to be modified in order to make DBEs work with
> Cells. Suggestions and ideas welcome.