[ 
https://issues.apache.org/jira/browse/CASSANDRA-9258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056216#comment-15056216
 ] 

Branimir Lambov commented on CASSANDRA-9258:
--------------------------------------------

The solution looks good overall.

{{PendingRangeMaps}} could be declared iterable so that the for-in syntax can 
be used in e.g. {{TokenMetadata.getPendingRangesMM}}.

The comparators should not be created by the constructor as they can just as 
well be static final constants. It is not very clear why you need 4 of them, a 
comment explaining what each ordering does would help a lot (e.g. sorts ends 
ascending, and its secondary comparison for starts is chosen in such a way that 
and <end, end> ranges come before non-wraparound intervals with the same end).

{{PendingRangeMaps.addPendingRange}} needs a comment on why adding the address 
is done only once in the found case. Or restate it as finding the list to add 
to, creating if necessary, then adding the new address, for example:
{code}
List<InetAddress> addresses = ascending.get(range);
if (addresses == null)
{
    addresses = new ArrayList<InetAddress>(1);
    ascending.put(range, addresses);
    descending.put(range, addresses);
}
addresses.add(address);
{code}

In {{PendingRangeMaps.pendingEndpointsFor}} the {{keySet, containsKey, get}} 
sequence is a bit more expensive than it could be: either use {{entrySet}} to 
avoid {{smaller.get}}, or {{bigger.get}} instead of {{containsKey}} to directly 
use the value if it's not null. Similarly, the two wrap-around loops should be 
on the {{entrySet}} s to avoid unnecessarily searching for the entry again.


> Range movement causes CPU & performance impact
> ----------------------------------------------
>
>                 Key: CASSANDRA-9258
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9258
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: Cassandra 2.1.4
>            Reporter: Rick Branson
>            Assignee: Dikang Gu
>             Fix For: 2.1.x
>
>         Attachments: 0001-pending-ranges-map.patch
>
>
> Observing big CPU & latency regressions when doing range movements on 
> clusters with many tens of thousands of vnodes. See CPU usage increase by 
> ~80% when a single node is being replaced.
> Top methods are:
> 1) Ljava/math/BigInteger;.compareTo in 
> Lorg/apache/cassandra/dht/ComparableObjectToken;.compareTo 
> 2) Lcom/google/common/collect/AbstractMapBasedMultimap;.wrapCollection in 
> Lcom/google/common/collect/AbstractMapBasedMultimap$AsMap$AsMapIterator;.next
> 3) Lorg/apache/cassandra/db/DecoratedKey;.compareTo in 
> Lorg/apache/cassandra/dht/Range;.contains
> Here's a sample stack from a thread dump:
> {code}
> "Thrift:50673" daemon prio=10 tid=0x00007f2f20164800 nid=0x3a04af runnable 
> [0x00007f2d878d0000]
>    java.lang.Thread.State: RUNNABLE
>       at org.apache.cassandra.dht.Range.isWrapAround(Range.java:260)
>       at org.apache.cassandra.dht.Range.contains(Range.java:51)
>       at org.apache.cassandra.dht.Range.contains(Range.java:110)
>       at 
> org.apache.cassandra.locator.TokenMetadata.pendingEndpointsFor(TokenMetadata.java:916)
>       at 
> org.apache.cassandra.service.StorageProxy.performWrite(StorageProxy.java:775)
>       at 
> org.apache.cassandra.service.StorageProxy.mutate(StorageProxy.java:541)
>       at 
> org.apache.cassandra.service.StorageProxy.mutateWithTriggers(StorageProxy.java:616)
>       at 
> org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1101)
>       at 
> org.apache.cassandra.thrift.CassandraServer.doInsert(CassandraServer.java:1083)
>       at 
> org.apache.cassandra.thrift.CassandraServer.batch_mutate(CassandraServer.java:976)
>       at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3996)
>       at 
> org.apache.cassandra.thrift.Cassandra$Processor$batch_mutate.getResult(Cassandra.java:3980)
>       at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>       at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>       at 
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:205)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:745){code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to