[
https://issues.apache.org/jira/browse/HBASE-20045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388643#comment-16388643
]
Vladimir Rodionov commented on HBASE-20045:
-------------------------------------------
{quote}
After investigation, we found the JVM garbage collector (GC) contributed a lot
to the latency spikes. We defined a metric called GC stall percentage to
measure the percentage of time a Cassandra server was doing stop-the-world GC
(Young Gen GC) and could not serve client requests. Here’s another graph that
shows the GC stall percentage on our production Cassandra servers. It was 1.25%
during the lowest traffic time windows, and could be as high as 2.5% during
peak hours.
The graph shows that a Cassandra server instance could spend 2.5% of runtime on
garbage collections instead of serving client requests. The GC overhead
obviously had a big impact on our P99 latency, so if we could lower the GC
stall percentage, we would be able to reduce our P99 latency significantly.
{quote}
Original is here:
https://engineering.instagram.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589
Instagram engineers confirm what I have already said:
*Java GC is the major source of p99 (and higher) latency.*
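For reference, the "GC stall percentage" described in the quote can be approximated inside any JVM with the standard java.lang.management beans. This is a minimal sketch, not the Instagram implementation. The class and method names (GcStall, stallPercent, totalGcMillis) are made up for illustration. It also sums the collection time of all collectors rather than Young Gen only, and with concurrent collectors the reported time may include phases that are not stop-the-world.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of HBase or Cassandra.
public class GcStall {

    /** Percentage of elapsed time spent in GC; 0 if no time has elapsed. */
    static double stallPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    /** Accumulated collection time across all collectors, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        // Assumption: JVM uptime is used as the measurement window.
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC stall: %.2f%%%n", stallPercent(totalGcMillis(), uptimeMillis));
    }
}
```

To get a figure per time window, as in the graph the quote refers to, sample totalGcMillis() periodically and apply stallPercent to the deltas instead of the totals since JVM start.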
> When running compaction, cache recent blocks.
> ---------------------------------------------
>
> Key: HBASE-20045
> URL: https://issues.apache.org/jira/browse/HBASE-20045
> Project: HBase
> Issue Type: New Feature
> Components: BlockCache, Compaction
> Affects Versions: 2.0.0-beta-1
> Reporter: Jean-Marc Spaggiari
> Priority: Major
>
> HBase already allows caching blocks on flush. This is very useful for
> use cases where most queries are against recent data. However, as soon as
> there is a compaction, those blocks are evicted. It would be useful to
> have a table-level parameter that says "When compacting, cache blocks less than
> 24 hours old". That way, when a compaction runs, all blocks containing some data
> less than 24 hours old will be cached automatically.
>
> This is very useful for table designs where there is a timestamp (TS) in the
> key but a long history (like a year of sensor data).
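The rule requested in the description ("cache blocks less than 24 hours old" during compaction) comes down to an age check on the newest cell in each block. The sketch below only illustrates that check. It is not an existing HBase API, and the class, method, and parameter names are hypothetical.

```java
import java.time.Duration;

// Hypothetical policy class, not part of HBase.
public class CompactionCachePolicy {

    /**
     * Returns true when a block written by compaction should be cached,
     * i.e. when its newest cell is strictly younger than maxAge.
     */
    static boolean shouldCacheOnCompaction(long newestCellTsMillis, long nowMillis, Duration maxAge) {
        return nowMillis - newestCellTsMillis < maxAge.toMillis();
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        Duration maxAge = Duration.ofHours(24); // assumed table-level setting
        long oneHourAgo = now - Duration.ofHours(1).toMillis();
        long twoDaysAgo = now - Duration.ofDays(2).toMillis();
        System.out.println(shouldCacheOnCompaction(oneHourAgo, now, maxAge)); // true
        System.out.println(shouldCacheOnCompaction(twoDaysAgo, now, maxAge)); // false
    }
}
```

Using the newest timestamp in the block matches the wording "all blocks where some data are less than 24h old": a block qualifies as soon as any of its cells is recent.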