[ 
https://issues.apache.org/jira/browse/CASSANDRA-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825614#comment-13825614
 ] 

Rick Branson commented on CASSANDRA-6345:
-----------------------------------------

I like the simpler approach. I still think the callbacks for invalidation are 
asking for it ;) I also think perhaps the stampede lock should be more explicit 
than a synchronized lock on "this" to prevent unintended blocking from future 
modifications.

Either way, the only material concern I have is the order in which
TokenMetadata changes are applied to the caches in AbstractReplicationStrategy
instances. Shouldn't the invalidation take place on all threads in all
instances of AbstractReplicationStrategy before an endpoint-mutating write
operation in TokenMetadata returns? It seems as if just setting the cache to
empty would allow a window in which TokenMetadata write methods have returned
but not all threads have seen the mutation yet, because they are still holding
on to the old clone of TM. This might be all right though, I'm not sure.
Thoughts?
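
To make the two concerns above concrete, here is a minimal sketch, not the
patch under discussion. It uses a dedicated lock object instead of
synchronizing on "this", and a version stamp so that any lookup that starts
after a write has returned rebuilds its snapshot. VersionedRing and
SnapshotCache are hypothetical stand-ins for TokenMetadata and the cache in
AbstractReplicationStrategy; none of these names come from Cassandra.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for TokenMetadata: mutable ring state plus a version. */
final class VersionedRing {
    private final Object writeLock = new Object();
    private final List<String> endpoints = new ArrayList<>();
    private volatile long version = 0;

    void addEndpoint(String endpoint) {
        synchronized (writeLock) {
            endpoints.add(endpoint);
            // Bumped before the write returns, so later readers see a new version.
            version++;
        }
    }

    long version() {
        return version;
    }

    List<String> snapshot() {
        synchronized (writeLock) {
            return new ArrayList<>(endpoints);
        }
    }
}

/** Hypothetical stand-in for the cached clone held by a replication strategy. */
final class SnapshotCache {
    private static final class Entry {
        final long version;
        final List<String> snapshot;

        Entry(long version, List<String> snapshot) {
            this.version = version;
            this.snapshot = snapshot;
        }
    }

    private final VersionedRing ring;
    // Explicit private lock: outside code cannot block on it by accident.
    private final Object rebuildLock = new Object();
    private volatile Entry cached;

    SnapshotCache(VersionedRing ring) {
        this.ring = ring;
    }

    List<String> get() {
        Entry e = cached;
        if (e != null && e.version == ring.version()) {
            return e.snapshot; // fast path: snapshot is current
        }
        synchronized (rebuildLock) {
            // Read the version before cloning. If a write races with the clone,
            // the entry is stamped with the older version and the next get()
            // rebuilds again.
            long current = ring.version();
            e = cached;
            if (e == null || e.version != current) {
                e = new Entry(current, Collections.unmodifiableList(ring.snapshot()));
                cached = e;
            }
            return e.snapshot;
        }
    }
}
```

With this shape, a call to get() that begins after addEndpoint() has returned
never reuses the old snapshot. A thread that fetched the snapshot before the
write and is still working with it keeps the old view until its next get(),
which is the remaining window described above.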

> Endpoint cache invalidation causes CPU spike (on vnode rings?)
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-6345
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6345
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: 30 nodes total, 2 DCs
> Cassandra 1.2.11
> vnodes enabled (256 per node)
>            Reporter: Rick Branson
>            Assignee: Jonathan Ellis
>             Fix For: 1.2.12, 2.0.3
>
>         Attachments: 6345-rbranson-v2.txt, 6345-rbranson.txt, 6345-v2.txt, 
> 6345-v3.txt, 6345.txt, half-way-thru-6345-rbranson-patch-applied.png
>
>
> We've observed that events which cause invalidation of the endpoint cache 
> (update keyspace, add/remove nodes, etc) in AbstractReplicationStrategy 
> result in several seconds of thundering herd behavior on the entire cluster. 
> A thread dump shows over a hundred threads (I stopped counting at that point) 
> with a backtrace like this:
>         at java.net.Inet4Address.getAddress(Inet4Address.java:288)
>         at org.apache.cassandra.locator.TokenMetadata$1.compare(TokenMetadata.java:106)
>         at org.apache.cassandra.locator.TokenMetadata$1.compare(TokenMetadata.java:103)
>         at java.util.TreeMap.getEntryUsingComparator(TreeMap.java:351)
>         at java.util.TreeMap.getEntry(TreeMap.java:322)
>         at java.util.TreeMap.get(TreeMap.java:255)
>         at com.google.common.collect.AbstractMultimap.put(AbstractMultimap.java:200)
>         at com.google.common.collect.AbstractSetMultimap.put(AbstractSetMultimap.java:117)
>         at com.google.common.collect.TreeMultimap.put(TreeMultimap.java:74)
>         at com.google.common.collect.AbstractMultimap.putAll(AbstractMultimap.java:273)
>         at com.google.common.collect.TreeMultimap.putAll(TreeMultimap.java:74)
>         at org.apache.cassandra.utils.SortedBiMultiValMap.create(SortedBiMultiValMap.java:60)
>         at org.apache.cassandra.locator.TokenMetadata.cloneOnlyTokenMap(TokenMetadata.java:598)
>         at org.apache.cassandra.locator.AbstractReplicationStrategy.getNaturalEndpoints(AbstractReplicationStrategy.java:104)
>         at org.apache.cassandra.service.StorageService.getNaturalEndpoints(StorageService.java:2671)
>         at org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:375)
>
> It looks like there is a large cost in the TokenMetadata.cloneOnlyTokenMap
> call that AbstractReplicationStrategy.getNaturalEndpoints makes each time
> there is a cache miss for an endpoint. It seems as if this would only impact
> clusters with large numbers of tokens, so it's probably a vnodes-only issue.
>
> Proposal: In AbstractReplicationStrategy.getNaturalEndpoints(), cache the
> cloned TokenMetadata instance returned by TokenMetadata.cloneOnlyTokenMap(),
> wrapping it with a lock to prevent stampedes, and clearing it in
> clearEndpointCache(). Thoughts?
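
The proposal quoted above can be sketched roughly as follows. This is an
illustration under assumptions, not the attached patches. CachedClone is a
hypothetical helper: its Supplier stands in for
TokenMetadata.cloneOnlyTokenMap(), and clear() stands in for the work that
clearEndpointCache() would do.

```java
import java.util.function.Supplier;

/** Hypothetical holder for an expensive clone, rebuilt by one thread at a time. */
final class CachedClone<T> {
    private final Supplier<T> cloner;            // assumed to be the expensive clone call
    private final Object cloneLock = new Object();
    private volatile T cached;

    CachedClone(Supplier<T> cloner) {
        this.cloner = cloner;
    }

    T get() {
        T local = cached;
        if (local != null) {
            return local;                        // common case: no locking
        }
        synchronized (cloneLock) {
            // Re-check under the lock: threads that queued up behind the first
            // one reuse its result instead of cloning again (no stampede).
            local = cached;
            if (local == null) {
                local = cloner.get();
                cached = local;
            }
            return local;
        }
    }

    void clear() {
        // Taking the lock means a clone that is still in progress cannot be
        // stored after this invalidation has completed.
        synchronized (cloneLock) {
            cached = null;
        }
    }
}
```

After an invalidation, only the first caller pays for the clone. Every other
thread either waits briefly on the lock or takes the lock-free path once the
clone has been stored.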



--
This message was sent by Atlassian JIRA
(v6.1#6144)
