[
https://issues.apache.org/jira/browse/HBASE-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16741392#comment-16741392
]
stack commented on HBASE-21657:
-------------------------------
v5 is nice and clean. Do you need the change in NoTagsByteBufferKeyValue? It
inherits from ByteBufferKeyValue, which already has your change.
Is it safe to return this.length in SizeCachedKeyValue? It caches
rowLen and keyLen... Does it cache this.length? Maybe it's caching what is
passed in on construction?
That addition to addSize in RSRpcServices is cryptic. Do we need that, sir? Say more
about why you are doing the accounting on the outside.
Patch is great. +1. Should it go back to branch-2.0?
I've not done the testing on my side. I'll keep trying, but let's not have that
hold up this fix.
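To make the caching question above concrete, here is a minimal sketch (hypothetical class and field names, not HBase's actual implementation): re-deriving a cell's serialized size from its component lengths on every call costs arithmetic per cell, while a cell that knows its total length at construction can return it directly.

```java
// Hypothetical sketch illustrating the difference between re-deriving a
// cell's serialized size per call and returning a length cached at
// construction. Field layout and overheads are simplified assumptions.
public class SizeSketch {

    // Re-derives the serialized size from the cell's parts on every call,
    // roughly what a PrivateCellUtil#estimatedSerializedSizeOf-style helper
    // does for each cell in a scan.
    static int computedSize(int rowLen, int famLen, int qualLen, int valLen) {
        // key = rowLen field + row + famLen field + family + qualifier
        //       + timestamp + type byte (simplified layout)
        int keyLen = 2 + rowLen + 1 + famLen + qualLen + 8 + 1;
        // total = keyLen field + valLen field + key + value
        return 4 + 4 + keyLen + valLen;
    }

    // Caches the total once at construction, then returns it with no
    // per-call arithmetic, as returning this.length would.
    static final class CachedCell {
        private final int length; // total serialized length, fixed up front

        CachedCell(int length) {
            this.length = length;
        }

        int getSerializedSize() {
            return this.length;
        }
    }

    public static void main(String[] args) {
        int total = computedSize(3, 1, 2, 10);
        CachedCell cell = new CachedCell(total);
        System.out.println(cell.getSerializedSize() == total); // prints "true"
    }
}
```

The safety question in the comment is exactly whether the cached value and the re-derived value always agree; if the length passed in on construction is the true serialized length, returning it is both correct and cheaper.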
> PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100%
> scan case.
> ------------------------------------------------------------------------------------
>
> Key: HBASE-21657
> URL: https://issues.apache.org/jira/browse/HBASE-21657
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
> Attachments: HBASE-21657.v1.patch, HBASE-21657.v2.patch,
> HBASE-21657.v3.patch, HBASE-21657.v3.patch, HBASE-21657.v4.patch,
> HBASE-21657.v5.patch, HBASE-21657.v5.patch, HBASE-21657.v5.patch,
> HBase1.4.9-ssd-10000000-rows-flamegraph.svg,
> HBase1.4.9-ssd-10000000-rows-qps-latency.png,
> HBase2.0.4-patch-v2-ssd-10000000-rows-qps-and-latency.png,
> HBase2.0.4-patch-v2-ssd-10000000-rows.svg,
> HBase2.0.4-patch-v3-ssd-10000000-rows-flamegraph.svg,
> HBase2.0.4-patch-v3-ssd-10000000-rows-qps-and-latency.png,
> HBase2.0.4-patch-v4-ssd-10000000-rows-flamegraph.svg,
> HBase2.0.4-ssd-10000000-rows-flamegraph.svg,
> HBase2.0.4-ssd-10000000-rows-qps-latency.png, HBase2.0.4-with-patch.v2.png,
> HBase2.0.4-without-patch-v2.png, debug-the-ByteBufferKeyValue.diff,
> hbase2.0.4-ssd-scan-traces.2.svg, hbase2.0.4-ssd-scan-traces.svg,
> hbase20-ssd-100-scan-traces.svg, image-2019-01-07-19-03-37-930.png,
> image-2019-01-07-19-03-55-577.png, overview-statstics-1.png, run.log
>
>
> We are evaluating the performance of branch-2 and found that the throughput
> of scan in an SSD cluster is almost the same as in an HDD cluster. So I made a
> FlameGraph on the RS and found that
> PrivateCellUtil#estimatedSerializedSizeOf costs about 29% of CPU. Obviously, it
> is the bottleneck in the 100% scan case.
> See the [^hbase20-ssd-100-scan-traces.svg]
> BTW, in our XiaoMi branch, we introduced an
> HRegion#updateReadRequestsByCapacityUnitPerSecond to sum up the size of cells
> (for metric monitoring), so it seems the performance loss was amplified.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)