[jira] [Commented] (HBASE-21657) PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100% scan case.

stack (JIRA) Thu, 03 Jan 2019 22:28:16 -0800


    [ 
https://issues.apache.org/jira/browse/HBASE-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16733856#comment-16733856
 ]


stack commented on HBASE-21657:
-------------------------------

Thanks for turning up this one.

What does the HDD flamegraph look like? Was it the same version? Where is it 
spending time? In the same place?

bq. The complicated condition sentences which may lead to the JVM inline did 
not work.... 

Which are these, [~openinx]? And when you say "did not work", is it that they 
are not inlining? (StoreScanner#next is a massive method. It for sure does 
not inline, so anything under it, such as the estimated size, will not inline 
either...)

What does the flamegraph look like now? Why do you think there is a speedup?

Adding serialized size to the Cell interface is a radical change, but having 
defaults makes it easier, and it is hard to argue with a 40% speedup. (Do we 
have to have two methods, one with tags and one without? Can't the Cell figure 
out whether it has tags or not?)
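
To make the single-method idea concrete, here is a rough sketch of what I mean. 
It is not the patch and not the real Cell API: SizedCell, TaggedCell, plain() 
and the byte layout (two int length prefixes plus an optional short tags length) 
are all made up for illustration. The default method lets existing 
implementations keep working, and the cell itself decides whether tags count.

```java
import java.nio.charset.StandardCharsets;

public class SerializedSizeSketch {

    /** Hypothetical Cell-like interface; NOT HBase's real Cell API. */
    interface SizedCell {
        byte[] key();
        byte[] value();

        /** Cells without tags simply inherit this empty default. */
        default byte[] tags() {
            return new byte[0];
        }

        /**
         * Single entry point: the cell decides whether tags are counted.
         * The layout used here is an assumption made for this sketch.
         */
        default int serializedSize() {
            int size = 2 * Integer.BYTES + key().length + value().length;
            int tagsLength = tags().length;
            return tagsLength == 0 ? size : size + Short.BYTES + tagsLength;
        }
    }

    /** A cell that relies entirely on the interface defaults. */
    static SizedCell plain(byte[] key, byte[] value) {
        return new SizedCell() {
            public byte[] key() { return key; }
            public byte[] value() { return value; }
        };
    }

    /** A cell that overrides the default with a size computed once. */
    static final class TaggedCell implements SizedCell {
        private final byte[] key;
        private final byte[] value;
        private final byte[] tags;
        private final int size;

        TaggedCell(byte[] key, byte[] value, byte[] tags) {
            this.key = key;
            this.value = value;
            this.tags = tags;
            // Reuse the default computation once, then serve the cached result.
            this.size = SizedCell.super.serializedSize();
        }

        public byte[] key() { return key; }
        public byte[] value() { return value; }
        @Override public byte[] tags() { return tags; }
        @Override public int serializedSize() { return size; }
    }

    public static void main(String[] args) {
        byte[] k = "row1".getBytes(StandardCharsets.UTF_8);
        byte[] v = "v".getBytes(StandardCharsets.UTF_8);
        byte[] t = "t:1".getBytes(StandardCharsets.UTF_8);
        System.out.println(plain(k, v).serializedSize());
        System.out.println(new TaggedCell(k, v, t).serializedSize());
    }
}
```

Callers then only ever call serializedSize(), with no separate with-tags and 
without-tags variants to choose between.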







> PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100% 
> scan case.
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-21657
>                 URL: https://issues.apache.org/jira/browse/HBASE-21657
>             Project: HBase
>          Issue Type: Bug
>          Components: Performance
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
>         Attachments: HBASE-21657.v1.patch, HBASE-21657.v2.patch, 
> HBase2.0.4-with-patch.v2.png, HBase2.0.4-without-patch-v2.png, 
> hbase2.0.4-ssd-scan-traces.2.svg, hbase2.0.4-ssd-scan-traces.svg, 
> hbase20-ssd-100-scan-traces.svg
>
>
> We are evaluating the performance of branch-2 and found that scan throughput 
> on the SSD cluster is almost the same as on the HDD cluster. So I made a 
> FlameGraph on the RS and found that 
> PrivateCellUtil#estimatedSerializedSizeOf costs about 29% of CPU. Obviously, it 
> has become the bottleneck in the 100% scan case.
> See the [^hbase20-ssd-100-scan-traces.svg]
> BTW, in our XiaoMi branch, we introduced 
> HRegion#updateReadRequestsByCapacityUnitPerSecond to sum up the size of cells 
> (for metrics monitoring), so it seems the performance loss was amplified.



