[
https://issues.apache.org/jira/browse/HBASE-21657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zheng Hu updated HBASE-21657:
-----------------------------
Hadoop Flags: Incompatible change
Release Note:
In HBASE-21657, I simplified the path of estimatedSerializedSize() &
estimatedSerializedSizeOfCell() by moving the general getSerializedSize()
and heapSize() from the ExtendedCell interface to the Cell interface. The
patch also includes some other improvements:
1. In 99% of cases our cells have no tags, so HFileScannerImpl now returns a
NoTagsByteBufferKeyValue when a cell has no tags. This saves a lot of CPU
time when sending no-tags cells over RPC, because the serialized size is
just the cell's stored length instead of being computed from the
offset/length of each field (row/cf/cq/...). See the first sketch after
this list.
2. Moved each subclass's getSerializedSize() implementation from
ExtendedCell into the subclass itself, so we no longer need to call
ExtendedCell's getSerializedSize() first and then forward to the
subclass's getSerializedSize(withTags).
3. Pre-size the result ArrayList to avoid frequent list growth during a big
scan; we now estimate the array size as min(scan.rows, 512). This also
helps a lot (see the second sketch below).
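
As a rough illustration of points 1 and 2, here is a minimal sketch, not
the actual HBase classes: SketchCell and NoTagsKeyValueSketch are made-up
stand-ins for Cell and NoTagsByteBufferKeyValue.

{code:java}
import java.nio.ByteBuffer;

// Stand-in for the Cell interface after the patch: getSerializedSize()
// sits on the base interface, so callers need a single virtual call
// instead of going through ExtendedCell and then forwarding to the
// subclass's getSerializedSize(withTags).
interface SketchCell {
    int getSerializedSize();
}

// Stand-in for NoTagsByteBufferKeyValue: when the scanner knows the cell
// carries no tags, the serialized size is simply the stored length --
// no per-field (row/cf/cq/...) offset/length arithmetic is needed.
final class NoTagsKeyValueSketch implements SketchCell {
    private final ByteBuffer buf; // the cell's full serialized bytes
    private final int length;     // total serialized length

    NoTagsKeyValueSketch(ByteBuffer buf, int length) {
        this.buf = buf;
        this.length = length;
    }

    @Override
    public int getSerializedSize() {
        return length; // O(1) fast path for the 99% no-tags case
    }
}

public class SerializedSizeSketch {
    public static void main(String[] args) {
        SketchCell cell =
            new NoTagsKeyValueSketch(ByteBuffer.allocate(64), 64);
        System.out.println("serialized size = " + cell.getSerializedSize());
    }
}
{code}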
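And a minimal sketch of the list pre-sizing in point 3: the
min(scan.rows, 512) rule is from this note, while the method and
variable names are illustrative only.

{code:java}
import java.util.ArrayList;
import java.util.List;

public class ResultListSizingSketch {
    private static final int MAX_INITIAL_RESULT_SIZE = 512;

    // Pre-size the result list to min(scan.rows, 512): a big scan avoids
    // repeated ArrayList growth (each growth copies the backing array),
    // while a small scan does not over-allocate.
    static <T> List<T> newResultList(int scanRows) {
        return new ArrayList<>(Math.min(scanRows, MAX_INITIAL_RESULT_SIZE));
    }

    public static void main(String[] args) {
        List<byte[]> results = newResultList(10_000); // capacity hint: 512
        System.out.println("pre-sized for " + Math.min(10_000, 512) + " results");
    }
}
{code}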
We gain almost ~40% throughput improvement in the 100% scan case for
branch-2 (cacheHitRatio ~100%) [1], which is a good result. It is, however,
an incompatible change in some cases: if downstream users have implemented
their own Cells (rare, but it can happen), their code will no longer
compile.
> PrivateCellUtil#estimatedSerializedSizeOf has been the bottleneck in 100%
> scan case.
> ------------------------------------------------------------------------------------
>
> Key: HBASE-21657
> URL: https://issues.apache.org/jira/browse/HBASE-21657
> Project: HBase
> Issue Type: Bug
> Components: Performance
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.3, 2.0.5
>
> Attachments: HBASE-21657.v1.patch, HBASE-21657.v2.patch,
> HBASE-21657.v3.patch, HBASE-21657.v3.patch, HBASE-21657.v4.patch,
> HBASE-21657.v5.patch, HBASE-21657.v5.patch, HBASE-21657.v5.patch,
> HBASE-21657.v6.patch, HBASE-21657.v7.patch,
> HBase1.4.9-ssd-10000000-rows-flamegraph.svg,
> HBase1.4.9-ssd-10000000-rows-qps-latency.png,
> HBase2.0.4-patch-v2-ssd-10000000-rows-qps-and-latency.png,
> HBase2.0.4-patch-v2-ssd-10000000-rows.svg,
> HBase2.0.4-patch-v3-ssd-10000000-rows-flamegraph.svg,
> HBase2.0.4-patch-v3-ssd-10000000-rows-qps-and-latency.png,
> HBase2.0.4-patch-v4-ssd-10000000-rows-flamegraph.svg,
> HBase2.0.4-ssd-10000000-rows-flamegraph.svg,
> HBase2.0.4-ssd-10000000-rows-qps-latency.png, HBase2.0.4-with-patch.v2.png,
> HBase2.0.4-without-patch-v2.png, debug-the-ByteBufferKeyValue.diff,
> hbase2.0.4-ssd-scan-traces.2.svg, hbase2.0.4-ssd-scan-traces.svg,
> hbase20-ssd-100-scan-traces.svg, image-2019-01-07-19-03-37-930.png,
> image-2019-01-07-19-03-55-577.png, overview-statstics-1.png, run.log
>
>
> We are evaluating the performance of branch-2 and found that scan
> throughput on an SSD cluster is almost the same as on an HDD cluster, so
> I made a FlameGraph on the RS and found that
> PrivateCellUtil#estimatedSerializedSizeOf costs about 29% of CPU.
> Obviously, it has become the bottleneck in the 100% scan case.
> See the [^hbase20-ssd-100-scan-traces.svg]
> BTW, in our XiaoMi branch we introduced an
> HRegion#updateReadRequestsByCapacityUnitPerSecond to sum up the size of
> cells (for metrics monitoring), so the performance loss was amplified
> there.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)