[ 
https://issues.apache.org/jira/browse/HBASE-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13694361#comment-13694361
 ] 

Varun Sharma commented on HBASE-8370:
-------------------------------------

Having a data block cache hit ratio of 80 % means that at least 80 % of my 
requests are fast (assuming GC is out of the picture). In the current 
scenario, that may map to an overall number like 99.9 %, and if tomorrow I had 
0 % cache hits for data blocks, the overall number would only come down to 
99.5 %. I am able to calculate this from the numbers I pasted above. The 
calculation assumes a certain distribution between the number of accesses to 
Index blocks and Data blocks. Tomorrow, if the distribution changes, it may 
well be that a 99.5 % overall cache hit ratio corresponds to a 90 % hit rate 
on data blocks. So, I don't think that "Overall cache hit ratio" is a good 
proxy for "Data block cache hit ratio".
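
To make the arithmetic concrete, here is a minimal sketch of how the overall 
ratio blends the per-type ratios. The inputs are illustrative assumptions, not 
the measured numbers: data blocks are taken to be 0.5 % of block cache 
accesses and every non-data block (index, bloom) is taken to hit 100 % of the 
time, which happens to reproduce the percentages above. The function names are 
hypothetical.

```python
def overall_hit_ratio(data_fraction, data_hit_ratio, other_hit_ratio=1.0):
    """Blend per-type hit ratios, weighted by each type's share of accesses.

    data_fraction is the share of block cache accesses that go to data blocks;
    the rest (index, bloom) is assumed to hit at other_hit_ratio.
    """
    if not 0.0 <= data_fraction <= 1.0:
        raise ValueError("data_fraction must be in [0, 1]")
    return (1.0 - data_fraction) * other_hit_ratio + data_fraction * data_hit_ratio


def data_hit_ratio_from_overall(overall, data_fraction, other_hit_ratio=1.0):
    """Invert the blend: recover the data block hit ratio from the overall one."""
    if not 0.0 < data_fraction <= 1.0:
        raise ValueError("data_fraction must be in (0, 1]")
    return (overall - (1.0 - data_fraction) * other_hit_ratio) / data_fraction


# Assumed access mix: 0.5 % of accesses are data blocks.
print(f"{overall_hit_ratio(0.005, 0.80):.1%}")  # 80 % data block hits
print(f"{overall_hit_ratio(0.005, 0.00):.1%}")  # 0 % data block hits

# Same overall ratio, two different (assumed) access mixes.
print(f"{data_hit_ratio_from_overall(0.995, 0.005):.0%}")
print(f"{data_hit_ratio_from_overall(0.995, 0.05):.0%}")
```

Under these assumptions the overall ratio only moves from about 99.9 % to 
99.5 % when data block hits drop from 80 % to 0 %, and the same 99.5 % overall 
ratio corresponds to a 90 % data block hit rate once data blocks make up 5 % 
of accesses.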

As far as derivatives go, the miss count derivative can go up along with other 
things like the read request count, so now we would also need to take a 
derivative of that counter and compare the two, etc. On 0.94, that number has 
been overflowing for us all the time and is negative. Is that being fixed in 
trunk?
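
For illustration, this is roughly what the monitoring side would have to do 
with raw counters: take the delta of both the miss counter and the request 
counter over the same interval, tolerate a wrapped (negative) sample, and only 
then divide. This is a sketch, not HBase code; the 32-bit width is an 
assumption, and the function names and sample values are hypothetical.

```python
def counter_delta(prev, curr, bits=32):
    """Increase between two samples of a counter that wraps at 2**bits.

    Works even if the counter overflowed into negative values between the two
    samples, as long as it advanced by fewer than 2**bits in the interval.
    """
    return (curr - prev) % (1 << bits)


def interval_miss_ratio(prev_miss, curr_miss, prev_reqs, curr_reqs, bits=32):
    """Misses per request over one sampling interval, or None if no requests."""
    misses = counter_delta(prev_miss, curr_miss, bits)
    requests = counter_delta(prev_reqs, curr_reqs, bits)
    return misses / requests if requests else None


# The request counter wrapped past the signed 32-bit maximum between samples,
# so the second sample is negative even though only 100 requests arrived.
print(counter_delta(2147483600, -2147483596))
print(interval_miss_ratio(500, 520, 2147483600, -2147483596))
```

Even with the overflow handled, a rising miss delta means little without the 
matching request delta, which is why both counters are needed.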

I don't think this is about counters vs. gauges. I am fine with exposing 
counters per block type. Right now, I just don't have any insight into the 
block cache, which plays an important role in serving reads. When a compaction 
happens and new files are written, I don't know the number of cache misses for 
Index blocks vs. Data blocks vs. Bloom blocks. Nor would I know how many Data 
blocks are being accessed and how many Index blocks, etc.
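
As a rough sketch of what per-block-type counters would give us, the snippet 
below keeps separate hit and miss counts keyed by block type and derives both 
the aggregate and the per-type hit ratio from them. The class name, method 
names, and the "DATA" / "INDEX" / "BLOOM" labels are hypothetical and do not 
reflect the actual HBase metrics API.

```python
from collections import Counter


class BlockCacheStats:
    """Hit and miss counters kept separately for each block type."""

    def __init__(self):
        self.hits = Counter()
        self.misses = Counter()

    def record(self, block_type, hit):
        """Count one block cache access of the given type."""
        (self.hits if hit else self.misses)[block_type] += 1

    def accesses(self, block_type):
        """Total number of accesses seen for one block type."""
        return self.hits[block_type] + self.misses[block_type]

    def hit_ratio(self, block_type=None):
        """Hit ratio for one block type, or the aggregate if none is given."""
        if block_type is None:
            hits = sum(self.hits.values())
            misses = sum(self.misses.values())
        else:
            hits = self.hits[block_type]
            misses = self.misses[block_type]
        total = hits + misses
        return hits / total if total else None


stats = BlockCacheStats()
# Three lookups: index and bloom blocks always hit, data blocks hit only once.
for data_hit in (True, False, False):
    stats.record("INDEX", True)
    stats.record("BLOOM", True)
    stats.record("DATA", data_hit)

print(stats.hit_ratio())        # aggregate ratio, inflated by index/bloom hits
print(stats.hit_ratio("DATA"))  # data block ratio alone
```

With counters split this way, the aggregate number is still available, but the 
data block ratio and the access counts per type can be read off directly.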

                
> Report data block cache hit rates apart from aggregate cache hit rates
> ----------------------------------------------------------------------
>
>                 Key: HBASE-8370
>                 URL: https://issues.apache.org/jira/browse/HBASE-8370
>             Project: HBase
>          Issue Type: Improvement
>          Components: metrics
>            Reporter: Varun Sharma
>            Assignee: Varun Sharma
>            Priority: Minor
>
> Attaching from mail to [email protected]
> I am wondering whether the HBase cachingHitRatio metrics that the region 
> server UI shows can get me a breakdown by data blocks. I always see this 
> number to be very high and that could be exaggerated by the fact that each 
> lookup hits the index blocks and bloom filter blocks in the block cache 
> before retrieving the data block. This could be artificially bloating up the 
> cache hit ratio.
> Assuming the above is correct, do we already have a cache hit ratio for data 
> blocks alone which is more obscure? If not, my sense is that it would be 
> pretty valuable to add one.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
