Joseph Lynch created CASSANDRA-14898:
----------------------------------------

             Summary: Key cache loading is very slow when there are many 
SSTables
                 Key: CASSANDRA-14898
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14898
             Project: Cassandra
          Issue Type: Bug
          Components: Local Write-Read Paths
         Environment: AWS i3.2xlarge, 4 physical cores (8 threads), 60GB of 
RAM, loading about 8MB of KeyCache with 10k keys in it.
            Reporter: Joseph Lynch
         Attachments: key_cache_load_slow.svg

While dealing with a production issue today where some 3.0.17 nodes had close 
to 8k sstables on disk due to excessive write pressure, we had a few nodes 
crash due to OOM and then take close to 17 minutes to load the key cache and 
recover. This slow key cache load significantly extended the duration of the 
outage (to mitigate, we just removed the saved key cache files). For example, 
here is one node taking 17 minutes to load ~10k keys, or about 10 keys per 
second (which is ... very slow):
{noformat}
INFO  [pool-3-thread-1] 2018-11-15 21:50:21,885 AutoSavingCache.java:190 - reading saved cache /mnt/data/cassandra/saved_caches/KeyCache-d.db
INFO  [pool-3-thread-1] 2018-11-15 22:07:16,490 AutoSavingCache.java:166 - Completed loading (1014606 ms; 10103 keys) KeyCache cache
{noformat}
I've witnessed similar behavior in the past with large LCS clusters, and indeed 
it appears that any time the number of sstables is large, KeyCache loading 
takes a _really_ long time. Today I captured a flame graph and I believe I 
found the issue; it should be reasonably easy to fix. From what I can tell, the 
{{KeyCacheSerializer::deserialize}} [method|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L445], 
which is called for every key, is linear in the number of sstables due to the 
[call|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L459] 
to {{ColumnFamilyStore::getSSTables}}, which ends up calling {{View::select}} 
[here|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139]. 
The {{View::select}} call is linear in the number of sstables and causes a 
_lot_ of {{HashSet}} 
[resizing|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139] 
when the number of sstables is much greater than 16 (the default capacity of 
the backing {{HashMap}}). Since this happens once per cached key, the overall 
load is effectively O(keys * sstables).
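
To make the shape of the problem concrete, here is a minimal sketch of the 
pattern as I understand it (simplified, hypothetical names; not the actual 
Cassandra classes): each per-key deserialize copies the live sstables into a 
freshly allocated, default-sized {{HashSet}}, so with ~8k sstables the backing 
map is resized roughly ten times for every single cached key.
{code:java}
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical simplification of the hot path; the real code lives in
// CacheService.KeyCacheSerializer#deserialize and View#select.
class KeyCacheLoadSketch
{
    // Stand-in for an SSTableReader; only the generation matters here.
    static class SSTable { final int generation; SSTable(int g) { generation = g; } }

    // Roughly what View::select does today: copy the live sstables into a
    // HashSet that starts at the default capacity of 16, so with thousands of
    // sstables the backing HashMap is resized repeatedly on every call.
    static Set<SSTable> select(List<SSTable> liveSSTables)
    {
        Set<SSTable> selected = new HashSet<>(); // default capacity, grows ~10 times for 8k entries
        for (SSTable sstable : liveSSTables)
            selected.add(sstable);
        return selected;
    }

    // Roughly what the key cache deserializer does: one select() per cached
    // key, making the whole load O(keys * sstables).
    static void loadKeyCache(int cachedKeys, List<SSTable> liveSSTables)
    {
        for (int key = 0; key < cachedKeys; key++)
        {
            Set<SSTable> sstables = select(liveSSTables); // rebuilt from scratch for every key
            // ... locate the sstable for this key's descriptor and attach the cached entry ...
        }
    }
}
{code}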

As we can see in the attached flame graph, we spend 50% of our CPU time in 
these {{getSSTables}} calls, of which 36% is spent adding sstables to the 
{{HashSet}} in {{View::select}} and 17% is spent just iterating the sstables in 
the first place. A full 16% of CPU time is spent _just resizing the HashMap_. 
Then another 4% is spent calling {{CacheService::findDesc}}, which does [a 
linear search|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L475] 
for the sstable generation.
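
For completeness, the generation lookup has the same shape as the sketch below 
(again simplified, hypothetical names; not the actual method): a per-key linear 
scan over all live sstables.
{code:java}
import java.util.List;

// Simplified sketch of the CacheService::findDesc pattern (hypothetical names).
class FindDescSketch
{
    static class SSTable { final int generation; SSTable(int g) { generation = g; } }

    // For each cached key we scan every live sstable until the generation
    // matches, so this too is linear in the number of sstables.
    static SSTable findByGeneration(int generation, List<SSTable> liveSSTables)
    {
        for (SSTable sstable : liveSSTables)
            if (sstable.generation == generation)
                return sstable;
        return null; // the sstable may have been compacted away since the cache was saved
    }
}
{code}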

I believe this affects at least Cassandra 3.0.17 and trunk, and could be fixed 
pretty easily by either caching the result of the {{getSSTables}} call across 
the load, or at the very least pre-sizing the {{HashSet}} in {{View::select}} 
to the size of the sstables map.
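
A minimal sketch of both mitigations, using the same simplified, hypothetical 
names as above (this is not a patch against the actual classes):
{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class KeyCacheLoadFixSketch
{
    static class SSTable { final int generation; SSTable(int g) { generation = g; } }

    // Mitigation A: pre-size the HashSet so copying the live sstables never
    // triggers a resize (a HashSet grows once size exceeds capacity * 0.75).
    static Set<SSTable> select(List<SSTable> liveSSTables)
    {
        Set<SSTable> selected = new HashSet<>((int) (liveSSTables.size() / 0.75f) + 1);
        selected.addAll(liveSSTables);
        return selected;
    }

    // Mitigation B: fetch the sstables (and index them by generation) once
    // before the per-key loop, instead of once per cached key.
    static void loadKeyCache(int[] generationsFromSavedCache, List<SSTable> liveSSTables)
    {
        Map<Integer, SSTable> byGeneration = new HashMap<>();
        for (SSTable sstable : select(liveSSTables))
            byGeneration.put(sstable.generation, sstable);

        for (int generation : generationsFromSavedCache)
        {
            SSTable target = byGeneration.get(generation); // O(1) instead of a linear scan
            // ... deserialize the cache entry against `target`, skipping it if no longer live ...
        }
    }
}
{code}
Note that pre-sizing alone only removes the resizing overhead; caching the 
result of {{getSSTables}} across the load is what actually removes the per-key 
linear work.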


