[jira] [Commented] (HBASE-20045) When running compaction, cache recent blocks.

Vladimir Rodionov (JIRA) Tue, 06 Mar 2018 14:38:36 -0800

    [ https://issues.apache.org/jira/browse/HBASE-20045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388643#comment-16388643 ]


Vladimir Rodionov commented on HBASE-20045:
-------------------------------------------

{quote}
After investigation, we found the JVM garbage collector (GC) contributed a lot 
to the latency spikes. We defined a metric called GC stall percentage to 
measure the percentage of time a Cassandra server was doing stop-the-world GC 
(Young Gen GC) and could not serve client requests. Here’s another graph that 
shows the GC stall percentage on our production Cassandra servers. It was 1.25% 
during the lowest traffic time windows, and could be as high as 2.5% during 
peak hours.

The graph shows that a Cassandra server instance could spend 2.5% of runtime on 
garbage collections instead of serving client requests. The GC overhead 
obviously had a big impact on our P99 latency, so if we could lower the GC 
stall percentage, we would be able to reduce our P99 latency significantly.
{quote}

Original is here:
https://engineering.instagram.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589
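
For reference, the "GC stall percentage" described in the quote can be approximated on any JVM from the standard management beans. This is only a rough sketch, not Instagram's actual metric: the class and method names are made up for illustration, and getCollectionTime() reports accumulated collection time for every collector the JVM exposes, which may include time that is not strictly stop-the-world.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of HBase or Cassandra.
class GcStall {
    /** Percentage of a time window spent in GC; 0 if the window is empty. */
    static double stallPercent(long gcMillis, long windowMillis) {
        if (windowMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / windowMillis;
    }

    /** Sum of accumulated collection time across all collectors, in ms. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Window here is the whole JVM uptime; a real metric would sample deltas.
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC stall: %.2f%% of %d ms uptime%n",
                stallPercent(totalGcMillis(), uptime), uptime);
    }
}
```

With these numbers, 25 ms of GC in a 1000 ms window corresponds to the 2.5% peak figure quoted above.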

The Instagram engineers confirm what I have already said:
*Java GC is the major source of p99 (and higher-percentile) latency.*

> When running compaction, cache recent blocks.
> ---------------------------------------------
>
>                 Key: HBASE-20045
>                 URL: https://issues.apache.org/jira/browse/HBASE-20045
>             Project: HBase
>          Issue Type: New Feature
>          Components: BlockCache, Compaction
>    Affects Versions: 2.0.0-beta-1
>            Reporter: Jean-Marc Spaggiari
>            Priority: Major
>
> HBase already allows caching blocks on flush. This is very useful for 
> use cases where most queries are against recent data. However, as soon as 
> there is a compaction, those blocks are evicted. It would be useful to 
> have a table-level parameter to say "When compacting, cache blocks less than 
> 24 hours old". That way, when running compaction, all blocks in which some 
> data is less than 24 hours old will be automatically cached. 
>  
> Very useful for table designs where there is a timestamp (TS) in the key but 
> a long history (like a year of sensor data).
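
The rule proposed in the description ("when compacting, cache blocks less than 24 hours old") could be sketched roughly as below. This is only an illustration of the decision, not HBase code: RecentBlockCachePolicy and shouldCacheOnCompaction are hypothetical names, and it assumes the compaction writer knows the newest cell timestamp in each block it writes.

```java
import java.time.Duration;

// Hypothetical policy class; no such class or table parameter exists in HBase.
class RecentBlockCachePolicy {
    private final long maxAgeMillis;

    RecentBlockCachePolicy(Duration maxAge) {
        this.maxAgeMillis = maxAge.toMillis();
    }

    /**
     * True when at least some data in the block is younger than the
     * configured age, judged by the newest cell timestamp in the block.
     */
    boolean shouldCacheOnCompaction(long newestCellTimestampMillis, long nowMillis) {
        return nowMillis - newestCellTimestampMillis < maxAgeMillis;
    }
}
```

With a 24-hour window, a block whose newest cell is one hour old would be cached during compaction, while a block whose newest cell is 25 hours old would not.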



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
