[ 
https://issues.apache.org/jira/browse/CASSANDRA-14898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17267071#comment-17267071
 ] 

Venkata Harikrishna Nukala commented on CASSANDRA-14898:
--------------------------------------------------------

[~jolynch] I took your micro bench test and added it my branch: 
[https://github.com/nvharikrishna/cassandra/commits/14898-keycache-performance-fix]
 . Ran the test in my laptop without and with my changes and here are the 
results:

 

Without fix:
{noformat}
[java] Result 
"org.apache.cassandra.test.microbench.CacheLoaderBench.keyCacheLoadTest":
 [java] N = 33
 [java] mean = 125.885 ?(99.9%) 5.067 ms/op
 [java]
 [java] Histogram, ms/op:
 [java] [110.000, 115.000) = 0
 [java] [115.000, 120.000) = 2
 [java] [120.000, 125.000) = 19
 [java] [125.000, 130.000) = 8
 [java] [130.000, 135.000) = 2
 [java] [135.000, 140.000) = 0
 [java] [140.000, 145.000) = 1
 [java] [145.000, 150.000) = 0
 [java] [150.000, 155.000) = 0
 [java] [155.000, 160.000) = 0
 [java] [160.000, 165.000) = 1
 [java]
 [java] Percentiles, ms/op:
 [java] p(0.0000) = 118.358 ms/op
 [java] p(50.0000) = 123.863 ms/op
 [java] p(90.0000) = 131.990 ms/op
 [java] p(95.0000) = 148.347 ms/op
 [java] p(99.0000) = 163.578 ms/op
 [java] p(99.9000) = 163.578 ms/op
 [java] p(99.9900) = 163.578 ms/op
 [java] p(99.9990) = 163.578 ms/op
 [java] p(99.9999) = 163.578 ms/op
 [java] p(100.0000) = 163.578 ms/op{noformat}
 

With fix:
{noformat}
[java] Result 
"org.apache.cassandra.test.microbench.CacheLoaderBench.keyCacheLoadTest":
 [java] N = 277
 [java] mean = 14.093 ?(99.9%) 0.530 ms/op
 [java]
 [java] Histogram, ms/op:
 [java] [10.000, 11.250) = 0
 [java] [11.250, 12.500) = 118
 [java] [12.500, 13.750) = 50
 [java] [13.750, 15.000) = 27
 [java] [15.000, 16.250) = 16
 [java] [16.250, 17.500) = 20
 [java] [17.500, 18.750) = 23
 [java] [18.750, 20.000) = 16
 [java] [20.000, 21.250) = 6
 [java] [21.250, 22.500) = 1
 [java] [22.500, 23.750) = 0
 [java] [23.750, 25.000) = 0
 [java] [25.000, 26.250) = 0
 [java] [26.250, 27.500) = 0
 [java] [27.500, 28.750) = 0
 [java]
 [java] Percentiles, ms/op:
 [java] p(0.0000) = 11.289 ms/op
 [java] p(50.0000) = 13.238 ms/op
 [java] p(90.0000) = 18.468 ms/op
 [java] p(95.0000) = 19.218 ms/op
 [java] p(99.0000) = 20.588 ms/op
 [java] p(99.9000) = 22.053 ms/op
 [java] p(99.9900) = 22.053 ms/op
 [java] p(99.9990) = 22.053 ms/op
 [java] p(99.9999) = 22.053 ms/op
 [java] p(100.0000) = 22.053 ms/op{noformat}
 

> Key cache loading is very slow when there are many SSTables
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-14898
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14898
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/Local Write-Read Paths
>         Environment: AWS i3.2xlarge, 4 physical cores (8 threads), 60GB of 
> RAM, loading about 8MB of KeyCache with 10k keys in it.
>            Reporter: Joey Lynch
>            Assignee: Venkata Harikrishna Nukala
>            Priority: Low
>              Labels: Performance, low-hanging-fruit
>         Attachments: key_cache_load_slow.svg
>
>
> While dealing with a production issue today where some 3.0.17 nodes had close 
> to ~8k sstables on disk due to excessive write pressure, we had a few nodes 
> crash due to OOM and then they took close to 17 minutes to load the key cache 
> and recover. This excessive key cache load significantly increased the 
> duration of the outage (to mitigate we just removed the saved key cache 
> files). For example here is one example taking 17 minutes to load 10k keys, 
> or about 10 keys per second (which is ... very slow):
> {noformat}
> INFO  [pool-3-thread-1] 2018-11-15 21:50:21,885 AutoSavingCache.java:190 - 
> reading saved cache /mnt/data/cassandra/saved_caches/KeyCache-d.db
> INFO  [pool-3-thread-1] 2018-11-15 22:07:16,490 AutoSavingCache.java:166 - 
> Completed loading (1014606 ms; 10103 keys) KeyCache cache
> {noformat}
> I've witnessed similar behavior in the past with large LCS clusters, and 
> indeed it appears that any time the number of sstables is large, KeyCache 
> loading takes a _really_ long time. Today I got a flame graph and I believe 
> that I found the issue and I think it's reasonably easy to fix. From what I 
> can tell the {{KeyCacheSerializer::deserialize}} [method 
> |https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L445]
>  which is called for every key is linear in the number of sstables due to the 
> [call|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L459]
>  to {{ColumnFamilyStore::getSSTables}} which ends up calling {{View::select}} 
> [here|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139].
>  The {{View::select}} call is linear in the number of sstables and causes a 
> _lot_ of {{HashSet}} 
> [resizing|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/db/lifecycle/View.java#L139]
>  when the number of sstables is much greater than 16 (the default size of the 
> backing {{HashMap}}).
> As we see in the attached flamegraph we spend 50% of our CPU time in these 
> {{getSSTable}} calls, of which 36% is spent adding sstables to the HashSet in 
> {{View::select}} and 17% is spent just iterating the sstables in the first 
> place. A full 16% of CPU time is spent _just resizing the HashMap_. Then 
> another 4% is spend calling {{CacheService::findDesc}} which does [a linear 
> search|https://github.com/apache/cassandra/blob/06209037ea56b5a2a49615a99f1542d6ea1b2947/src/java/org/apache/cassandra/service/CacheService.java#L475]
>  for the sstable generation.
> I believe that this affects at least Cassandra 3.0.17 and trunk, and could be 
> pretty easily fixed by either caching the getSSTables call or at the very 
> least pre-sizing the {{HashSet}} in {{View::select}} to be the size of the 
> sstables map.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to