[
https://issues.apache.org/jira/browse/HBASE-24754?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17250798#comment-17250798
]
ramkrishna.s.vasudevan commented on HBASE-24754:
------------------------------------------------
[~stack] and [~bharathv]
Thanks for chiming in. I believe that, at least in the MR case, the Puts
generated here (in PutSortReducer) will only ever contain KVs, since the Cells
we expose through the client-facing Put API are always KVs. If we can generate
KV-specific comparator code, I agree that would be the best approach. I don't
have much experience with this kind of code generation. I can look into the
options and see whether it can be used here.
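
To make the idea concrete, here is a rough sketch of what a comparator
specialized to a single concrete type could look like. It is not HBase code:
SimpleKv and KV_ONLY are hypothetical names made up for illustration, the
cell type is left out, and it assumes Java 9+ for Arrays.compareUnsigned. It
only shows the ordering a KV-only path would need (row, family and qualifier
as unsigned bytes ascending, then timestamp descending) without going through
the generic Cell interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class KvComparatorSketch {

    /** Hypothetical stand-in for a KeyValue; this is NOT an HBase class. */
    static final class SimpleKv {
        final byte[] row;
        final byte[] family;
        final byte[] qualifier;
        final long timestamp;

        SimpleKv(String row, String family, String qualifier, long timestamp) {
            this.row = row.getBytes(StandardCharsets.UTF_8);
            this.family = family.getBytes(StandardCharsets.UTF_8);
            this.qualifier = qualifier.getBytes(StandardCharsets.UTF_8);
            this.timestamp = timestamp;
        }
    }

    /**
     * Comparator that only knows about SimpleKv, so every field access is a
     * direct read on a final class rather than a call through an interface.
     */
    static final Comparator<SimpleKv> KV_ONLY = (a, b) -> {
        int c = Arrays.compareUnsigned(a.row, b.row);
        if (c != 0) {
            return c;
        }
        c = Arrays.compareUnsigned(a.family, b.family);
        if (c != 0) {
            return c;
        }
        c = Arrays.compareUnsigned(a.qualifier, b.qualifier);
        if (c != 0) {
            return c;
        }
        // Newer timestamps sort first, so the operands are swapped.
        return Long.compare(b.timestamp, a.timestamp);
    };

    public static void main(String[] args) {
        List<SimpleKv> kvs = new ArrayList<>();
        kvs.add(new SimpleKv("row2", "f", "q1", 5L));
        kvs.add(new SimpleKv("row1", "f", "q1", 5L));
        kvs.add(new SimpleKv("row1", "f", "q1", 9L));
        kvs.sort(KV_ONLY);
        for (SimpleKv kv : kvs) {
            System.out.println(new String(kv.row, StandardCharsets.UTF_8)
                + " ts=" + kv.timestamp);
        }
    }
}
```

Whether a specialized comparator like this actually recovers the lost time
would still need to be measured against the flame graphs attached to the
issue.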
> Bulk load performance is degraded in HBase 2
> ---------------------------------------------
>
> Key: HBASE-24754
> URL: https://issues.apache.org/jira/browse/HBASE-24754
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Affects Versions: 2.2.3
> Reporter: Ajeet Rai
> Assignee: ramkrishna.s.vasudevan
> Priority: Major
> Fix For: 3.0.0-alpha-1, 2.5.0
>
> Attachments: Branc2_withComparator_atKeyValue.patch,
> Branch1.3_putSortReducer_sampleCode.patch,
> Branch2_putSortReducer_sampleCode.patch, flamegraph_branch-1_new.svg,
> flamegraph_branch-2.svg, flamegraph_branch-2_afterpatch.svg
>
>
> In our test, we observed that bulk load performance is degraded in HBase 2.
> Test Input:
> 1: Table with 500 regions (300 column families)
> 2: Data = 2 TB
> Data Sample
> 18600000001201502051000000068110,18600000001,20150205,5,404,735412,2938,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111,111111111111111111111111111111111111111111111111111111111111111111111111111111111
> 3: Cluster: 7 nodes (2 masters + 5 RegionServers)
> 4: The number of containers launched is the same in both cases
> HBase 2 took about 10% more time than HBase 1.3, with the same test input
> for both clusters
>
> |Feature|HBase 2.2.3 Time(Sec)|HBase 1.3.1 Time(Sec)|Diff%|Snappy lib|
> |BulkLoad|21837|19686.16|-10.93|HBase 2.2.3: 1.4, HBase 1.3.1: 1.4|
--
This message was sent by Atlassian Jira
(v8.3.4#803005)