Wellington Chevreuil created HBASE-30225:
--------------------------------------------
Summary: Performance degradation observed on ycsb reads benchmark
after HBASE-29727
Key: HBASE-30225
URL: https://issues.apache.org/jira/browse/HBASE-30225
Project: HBase
Issue Type: Bug
Reporter: Wellington Chevreuil
Assignee: Wellington Chevreuil
Attachments: flame-graph-zoomed.png, flamegraph-high-level.png
In HBASE-29727 we replaced a single Path attribute with three String fields
holding the region, column family and file names, respectively. Since these
values tend to be highly redundant in large caches (the same region, family and
file names are shared by many different blocks), we introduced a string pool to
avoid repeating string values and to reduce heap allocation.
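As a rough illustration of the string pool idea described above, here is a minimal sketch. The class and method names (StringPoolSketch, intern, poolSize) are hypothetical and are not taken from the HBASE-29727 patch; the sketch only shows how equal name strings can be collapsed into one shared instance.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical sketch, not the actual HBase implementation.
class StringPoolSketch {
  // Maps each distinct value to the single canonical instance kept for it.
  private static final ConcurrentMap<String, String> POOL = new ConcurrentHashMap<>();

  // Returns the canonical instance for the given value, registering it if absent.
  static String intern(String value) {
    String existing = POOL.putIfAbsent(value, value);
    return existing != null ? existing : value;
  }

  static int poolSize() {
    return POOL.size();
  }

  public static void main(String[] args) {
    // Two distinct String objects with the same content, as many cached blocks
    // of the same column family would produce.
    String a = intern(new String("cf1"));
    String b = intern(new String("cf1"));
    System.out.println(a == b);     // true: both refer to the pooled instance
    System.out.println(poolSize()); // 1
  }
}
```

With such a pool, many cache keys for blocks of the same file can share three string instances instead of each holding its own copies.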
When executing ycsb read workloads, we observed a ~30% latency degradation.
The problem is that we added logic for parsing the file Path into region name
and family name, as well as checks for archiving, all in the BlockCacheKey
constructor used by HFileReaderImpl at the beginning of each block read. As seen
in the attached flame graphs, covering a five-minute window on one of the RSes,
around 30% of the CPU time was spent in the BlockCacheKey constructor, calling
either Path.getParent() or HFileUtils.isHFileArchived().
!flame-graph-zoomed.png!
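To make the cost pattern concrete, the sketch below contrasts a key that parses the path on every construction with one that reuses names parsed once per file. This is only one possible mitigation, shown under assumptions: it is not the fix proposed for this issue, all class names (KeyParsingSketch, FileNames, EagerKey, CheapKey) are hypothetical, java.nio.file.Path stands in for the Hadoop Path class to keep the example self-contained, and the archive check is a simplified placeholder rather than the real HFileUtils.isHFileArchived() logic.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch, not the actual HBase classes.
class KeyParsingSketch {

  // Names derived from a path of the form .../<region>/<family>/<file>.
  static final class FileNames {
    final String region;
    final String family;
    final String file;
    final boolean archived;

    FileNames(Path hfilePath) {
      Path familyDir = hfilePath.getParent();
      Path regionDir = familyDir.getParent();
      this.file = hfilePath.getFileName().toString();
      this.family = familyDir.getFileName().toString();
      this.region = regionDir.getFileName().toString();
      this.archived = isArchived(hfilePath);
    }
  }

  // Simplified placeholder: treats any path with an "archive" component as archived.
  static boolean isArchived(Path path) {
    for (Path part : path) {
      if (part.toString().equals("archive")) {
        return true;
      }
    }
    return false;
  }

  // Convenience entry point so callers can pass a plain string.
  static FileNames parse(String hfilePath) {
    return new FileNames(Paths.get(hfilePath));
  }

  // Costly pattern: every key construction walks the path again.
  static final class EagerKey {
    final FileNames names;
    final long offset;

    EagerKey(Path hfilePath, long offset) {
      this.names = new FileNames(hfilePath); // repeated on each block read
      this.offset = offset;
    }
  }

  // Cheaper pattern: the reader parses once and every key reuses the result.
  static final class CheapKey {
    final FileNames names;
    final long offset;

    CheapKey(FileNames names, long offset) {
      this.names = names; // no parsing on the hot path
      this.offset = offset;
    }
  }

  public static void main(String[] args) {
    FileNames names = parse("/hbase/data/default/usertable/region1/cf/hfile1");
    CheapKey first = new CheapKey(names, 0L);
    CheapKey second = new CheapKey(names, 65536L);
    System.out.println(first.names.region + " " + first.names.family + " " + first.names.file);
    System.out.println(first.names == second.names);
  }
}
```

In this sketch the per-read work in CheapKey is reduced to two field assignments, whereas EagerKey repeats the getParent() calls and the archive check for every block.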