[ https://issues.apache.org/jira/browse/HBASE-20045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388643#comment-16388643 ]
Vladimir Rodionov commented on HBASE-20045:
-------------------------------------------

{quote}
After investigation, we found the JVM garbage collector (GC) contributed a lot to the latency spikes. We defined a metric called GC stall percentage to measure the percentage of time a Cassandra server was doing stop-the-world GC (Young Gen GC) and could not serve client requests. Here’s another graph that shows the GC stall percentage on our production Cassandra servers. It was 1.25% during the lowest traffic time windows, and could be as high as 2.5% during peak hours.

The graph shows that a Cassandra server instance could spend 2.5% of runtime on garbage collections instead of serving client requests. The GC overhead obviously had a big impact on our P99 latency, so if we could lower the GC stall percentage, we would be able to reduce our P99 latency significantly.
{quote}

Original is here: https://engineering.instagram.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589

Instagram engineers confirm what I said already: *Java GC is the major source of p99 (and up) latency*

> When running compaction, cache recent blocks.
> ---------------------------------------------
>
>                 Key: HBASE-20045
>                 URL: https://issues.apache.org/jira/browse/HBASE-20045
>             Project: HBase
>          Issue Type: New Feature
>          Components: BlockCache, Compaction
>    Affects Versions: 2.0.0-beta-1
>            Reporter: Jean-Marc Spaggiari
>            Priority: Major
>
> HBase already allows caching blocks on flush. This is very useful for
> use cases where most queries are against recent data. However, as soon as
> there is a compaction, those blocks are evicted. It would be interesting to
> have a table-level parameter to say "When compacting, cache blocks less than
> 24 hours old". That way, when running compaction, all blocks where some data
> is less than 24h old will be automatically cached.
>
> Very useful for table designs where there is a TS in the key but a long history
> (like a year of sensor data).
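
The rule proposed in the issue description ("cache blocks less than 24 hours old" when compacting) could be sketched roughly as below. This is only an illustration of the age check, not HBase code: `BlockInfo`, `should_cache_on_compaction`, and the `max_age_ms` parameter are hypothetical names, and it assumes the newest cell timestamp of each block is known when the block is written.

```python
from dataclasses import dataclass

DAY_MS = 24 * 60 * 60 * 1000  # 24 hours in milliseconds


@dataclass
class BlockInfo:
    """Hypothetical summary of one block written during compaction."""
    min_timestamp_ms: int  # oldest cell timestamp in the block
    max_timestamp_ms: int  # newest cell timestamp in the block


def should_cache_on_compaction(block: BlockInfo, now_ms: int,
                               max_age_ms: int = DAY_MS) -> bool:
    """Return True if at least some data in the block is younger than max_age_ms.

    Only the newest cell matters: if it is younger than the cutoff, the block
    holds "some data" that is recent, which is the condition in the issue.
    """
    cutoff_ms = now_ms - max_age_ms
    return block.max_timestamp_ms > cutoff_ms


if __name__ == "__main__":
    now = 10 * DAY_MS
    recent = BlockInfo(min_timestamp_ms=0, max_timestamp_ms=now - 1000)
    old = BlockInfo(min_timestamp_ms=0, max_timestamp_ms=now - 2 * DAY_MS)
    print(should_cache_on_compaction(recent, now))
    print(should_cache_on_compaction(old, now))
```

With a year of sensor data and timestamps in the key, a check like this would keep only the blocks touching the last day in the cache after compaction, instead of evicting everything.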
-- This message was sent by Atlassian JIRA (v7.6.3#76005)