Bryan Beaudreault created HBASE-29012:
-----------------------------------------

             Summary: Performance regression in hot reads after split/merge
                 Key: HBASE-29012
                 URL: https://issues.apache.org/jira/browse/HBASE-29012
             Project: HBase
          Issue Type: Bug
            Reporter: Bryan Beaudreault


We noticed a significant performance regression introduced by HBASE-27474. 
In that ticket, logic was added so that we don't cache blocks that belong to 
a reference or a link if compactions are enabled.
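
As a rough illustration of the rule as described above (this is not the actual 
HBase code; the class, enum, and method names are hypothetical and exist only 
for this sketch), the caching decision amounts to something like:

```java
public class ReferenceCacheRuleSketch {
    /** Hypothetical classification of a store file; not an HBase type. */
    enum FileKind { REGULAR, REFERENCE, LINK }

    /**
     * Models the rule as described in this report: blocks that come from a
     * reference or a link are not cached while compactions are enabled.
     * Every other combination is cached as usual.
     */
    static boolean shouldCacheBlock(FileKind kind, boolean compactionsEnabled) {
        boolean referenceOrLink = kind == FileKind.REFERENCE || kind == FileKind.LINK;
        return !(referenceOrLink && compactionsEnabled);
    }

    public static void main(String[] args) {
        // Print the decision for every combination of file kind and compaction state.
        for (FileKind kind : FileKind.values()) {
            for (boolean compactions : new boolean[] { true, false }) {
                System.out.println(kind + " compactionsEnabled=" + compactions
                    + " -> cache=" + shouldCacheBlock(kind, compactions));
            }
        }
    }
}
```

Note that nothing in this rule looks at how long the reference has existed or 
how hot the block is, which is why delayed compactions matter here.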

The issue we noticed is that we had a cluster which had compactions enabled, 
but compactions were a bit delayed. During that time, some recently 
split/merged regions still contained references. This cluster serves very hot 
reads and relies heavily on bloom filters. I noticed through profiles that we 
were spending a lot of time fetching BLOOM_CHUNK blocks from HDFS. This is 
almost never the case, since we continually right-size the block cache to 
ensure all blooms are cached. In fact, we had no evictions at the time. So why 
weren't they getting cached?

With trace logging enabled I noticed that all of the blocks being read over and 
over happened to come from hfiles that looked to be references. This led me to 
the ticket in question.

This feels like a very serious regression, as it has a substantial impact on 
both HDFS and HBase in terms of request times and GC time, and the host 
becomes fully hosed. I sort of wonder if we should revert that change, or at 
the very least make it configurable. I'm not sure how to preserve the intended 
behavior of the ticket while also protecting the regionserver performance. In 
our case this happened for bloom blocks, but it could just as easily happen to 
a hot data block.
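
To make the "configurable" option concrete, here is a minimal sketch of what 
such a gate could look like. The configuration key below is invented for this 
example and is not an existing HBase setting, and a plain Map stands in for the 
real configuration object:

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigurableReferenceCachingSketch {
    // Hypothetical key, made up for this sketch; not a real HBase property.
    static final String SKIP_KEY = "example.blockcache.skip.reference.blocks";

    /**
     * Decides whether a block should be cached. When the hypothetical flag is
     * true (the default here, preserving the behavior described in this
     * report), blocks from references/links are skipped while compactions are
     * enabled. When the flag is false, every block is cached.
     */
    static boolean shouldCacheBlock(Map<String, String> conf,
                                    boolean referenceOrLink,
                                    boolean compactionsEnabled) {
        boolean skipEnabled = Boolean.parseBoolean(conf.getOrDefault(SKIP_KEY, "true"));
        if (!skipEnabled) {
            return true;
        }
        return !(referenceOrLink && compactionsEnabled);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // Default: reference blocks are skipped while compactions are enabled.
        System.out.println("default:  cache=" + shouldCacheBlock(conf, true, true));
        // Operator opts out on a read-heavy cluster.
        conf.put(SKIP_KEY, "false");
        System.out.println("opt-out:  cache=" + shouldCacheBlock(conf, true, true));
    }
}
```

A single on/off flag like this would not by itself reconcile the two goals, but 
it would let a read-heavy cluster opt out while a better approach is worked out.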



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
