Wellington Chevreuil created HBASE-30225:
--------------------------------------------

             Summary: Performance degradation observed on ycsb reads benchmark 
after HBASE-29727
                 Key: HBASE-30225
                 URL: https://issues.apache.org/jira/browse/HBASE-30225
             Project: HBase
          Issue Type: Bug
            Reporter: Wellington Chevreuil
            Assignee: Wellington Chevreuil
         Attachments: flame-graph-zoomed.png, flamegraph-high-level.png

In HBASE-29727 we replaced a single Path attribute with three String fields for 
the region, column family and file names, respectively. Since these values tend 
to be highly redundant in large caches (the same region, family and file names 
for many different blocks), we introduced a string pool to avoid repeating 
string values and to reduce heap allocation.
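
As a rough illustration of the string pool idea, here is a minimal sketch. It 
is not the HBASE-29727 implementation; the class name StringPoolSketch and the 
intern() helper are hypothetical. It only shows how equal names can be mapped 
to one shared String instance.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the HBase code: a tiny pool that returns one
// canonical String instance per distinct value.
class StringPoolSketch {
    private static final ConcurrentHashMap<String, String> POOL =
        new ConcurrentHashMap<>();

    // Returns the pooled instance equal to s, adding s if it is new.
    static String intern(String s) {
        String previous = POOL.putIfAbsent(s, s);
        return previous != null ? previous : s;
    }

    public static void main(String[] args) {
        // Two distinct String objects with the same content...
        String a = intern(new String("region-abc"));
        String b = intern(new String("region-abc"));
        // ...resolve to the same pooled reference.
        System.out.println(a == b); // prints true
    }
}
```

With such a pool, many cache keys for blocks of the same file can share the 
same three String instances instead of each holding its own copies.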

When executing YCSB read workloads, we observed a ~30% latency degradation.

The problem was that we added logic for parsing the file Path into region name 
and family name, as well as checks for archiving, all in the BlockCacheKey 
constructor used by HFileReaderImpl at the beginning of each block read. As seen 
in the attached flame graphs, covering a five-minute window on one of the RSes, 
around 30% of the CPU time was spent in the BlockCacheKey constructor, calling 
either Path.getParent() or HFileUtils.isHFileArchived().
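
To make the cost pattern concrete, here is a simplified sketch. It assumes the 
usual .../<region>/<family>/<hfile> layout and uses java.nio.file.Path in place 
of Hadoop's Path so that it is self-contained. The class BlockKeySketch and both 
constructors are hypothetical; they are not the real BlockCacheKey API, and the 
second constructor is only one possible way to keep parsing off the read path, 
not necessarily the fix adopted for this issue.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical stand-in for a block cache key; not the HBase class.
class BlockKeySketch {
    final String region;
    final String family;
    final String file;
    final long offset;

    // Costly variant: walks the path on every call. If this runs once per
    // block read, the parsing work is repeated for every block of a file.
    BlockKeySketch(Path hfilePath, long offset) {
        Path familyDir = hfilePath.getParent();
        Path regionDir = familyDir.getParent();
        this.file = hfilePath.getFileName().toString();
        this.family = familyDir.getFileName().toString();
        this.region = regionDir.getFileName().toString();
        this.offset = offset;
    }

    // Cheap variant: takes names that were parsed once (for example when
    // the file reader was opened) and only stores the references.
    BlockKeySketch(String region, String family, String file, long offset) {
        this.region = region;
        this.family = family;
        this.file = file;
        this.offset = offset;
    }

    public static void main(String[] args) {
        Path p = Paths.get("/hbase/data/default/t1/r1/cf/f1");
        // Parse once per file...
        BlockKeySketch first = new BlockKeySketch(p, 0L);
        // ...then reuse the parsed names for every later block of that file.
        BlockKeySketch next =
            new BlockKeySketch(first.region, first.family, first.file, 65536L);
        System.out.println(next.region + "/" + next.family + "/" + next.file);
    }
}
```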

!flame-graph-zoomed.png!

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
