[
https://issues.apache.org/jira/browse/HBASE-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731757#comment-16731757
]
Zheng Hu commented on HBASE-21657:
----------------------------------
bq. I think this method is only called if we actually return some Cells to the
client
That's right.
bq. So I guess the assumption was that when the Cell need to ship over the
network to the client anyway, that some CPU won't hurt. No longer true, I guess.
I don't think so. If the bottleneck were the network or RPC,
estimatedSerializedSizeOf shouldn't cost so much in the flame graph, and the
RPC-related methods should account for a higher share.
bq. The cells being scanned not of type ExtendedCell?
I've checked the code path and added some logging. All the cells passed to
PrivateCellUtil#estimatedSerializedSizeOf were SizeCachedKeyValue* or
ByteBufferedKeyValue (see HFileReaderImpl#getCell), so all of them should be
instanceof ExtendedCell. The complicated conditional statements seem to have
prevented the JVM from inlining the method. Anyway, I'll provide a new
performance report after applying patch.v1, which moves getSerializedSize from
ExtendedCell to Cell to avoid the frequent instanceof checks. It's not a
production patch, just for verification.
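
To make the two call shapes concrete, here is a minimal Java sketch. It does not use the real HBase classes: BaseCell, SizedCell, PatchedCell, Kv and the size formula are all hypothetical stand-ins, chosen only to contrast an instanceof check on every cell with a direct interface call, which is the idea behind patch.v1 as I understand it. It says nothing about actual JIT behaviour; that still has to be measured.

```java
import java.util.List;

// Hypothetical sketch; none of these types are the real HBase API.
class CellSizeSketch {

    // Stand-in for Cell before the patch: no size method on the base type.
    interface BaseCell {
        int keyLength();
        int valueLength();
    }

    // Stand-in for ExtendedCell: the size method lives on the sub-interface.
    interface SizedCell extends BaseCell {
        int serializedSize();
    }

    // Stand-in for Cell after the patch: the size method is on the base type.
    interface PatchedCell {
        int serializedSize();
    }

    // One concrete cell that fits both shapes, so the results are comparable.
    static final class Kv implements SizedCell, PatchedCell {
        private final int keyLength;
        private final int valueLength;

        Kv(int keyLength, int valueLength) {
            this.keyLength = keyLength;
            this.valueLength = valueLength;
        }

        @Override public int keyLength() { return keyLength; }
        @Override public int valueLength() { return valueLength; }

        // Assumed layout: key bytes + value bytes + two int length fields.
        @Override public int serializedSize() {
            return keyLength + valueLength + 2 * Integer.BYTES;
        }
    }

    // Shape 1: a type test and a cast for every cell, with a fallback branch.
    static long sumWithTypeCheck(List<? extends BaseCell> cells) {
        long total = 0;
        for (BaseCell cell : cells) {
            if (cell instanceof SizedCell) {
                total += ((SizedCell) cell).serializedSize();
            } else {
                total += cell.keyLength() + cell.valueLength() + 2 * Integer.BYTES;
            }
        }
        return total;
    }

    // Shape 2: a plain interface call, with no type test in the loop.
    static long sumDirect(List<? extends PatchedCell> cells) {
        long total = 0;
        for (PatchedCell cell : cells) {
            total += cell.serializedSize();
        }
        return total;
    }

    public static void main(String[] args) {
        List<Kv> cells = List.of(new Kv(10, 100), new Kv(5, 20));
        System.out.println("with type check: " + sumWithTypeCheck(cells));
        System.out.println("direct call:     " + sumDirect(cells));
    }
}
```

Both methods return the same total; only the shape of the per-cell call differs, which is what the verification run with patch.v1 is meant to compare.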
> PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100%
> scan case.
> ------------------------------------------------------------------------------------
>
> Key: HBASE-21657
> URL: https://issues.apache.org/jira/browse/HBASE-21657
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
> Attachments: HBASE-21657.v1.patch, hbase20-ssd-100-scan-traces.svg
>
>
> We are evaluating the performance of branch-2 and found that the scan
> throughput on an SSD cluster is almost the same as on an HDD cluster. So I
> made a FlameGraph on the RS and found that
> PrivateCellUtil#estimatedSerializedSizeOf costs about 29% of CPU. Obviously,
> it has become the bottleneck in the 100% scan case.
> See the [^hbase20-ssd-100-scan-traces.svg]
> BTW, in our XiaoMi branch, we introduced
> HRegion#updateReadRequestsByCapacityUnitPerSecond to sum up the size of cells
> (for metrics monitoring), so the performance loss seems to be amplified there.