[
https://issues.apache.org/jira/browse/CASSANDRA-11724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15275359#comment-15275359
]
Jeffrey F. Lukman commented on CASSANDRA-11724:
-----------------------------------------------
Hi Sylvain,
I've tried to dig deeper into why this bug happens at large cluster scale.
I found out that when we start 256 Cassandra instances (or more) simultaneously,
the {{applyStateLocally()}} function of Gossiper.java can take a long time during
the initialization phase.
On some nodes, I saw {{applyStateLocally()}} take *60+ seconds*.
The worst {{applyStateLocally()}} running time that I have seen so far is *120
seconds*.
The average worst-case running time of {{applyStateLocally()}} across all nodes is
*28.5 seconds*.
I believe these numbers can get even worse when bootstrapping 512 Cassandra
instances.
I'm still trying to understand why this happens, and why it only happens during
the initialization phase.
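For anyone who wants to reproduce the measurement, here is a minimal sketch of
the kind of timing wrapper one can drop around the {{applyStateLocally()}} call
sites in Gossiper.java. The class name, threshold, and logging are my own
illustrative choices, not part of the Cassandra codebase:
{code:java}
import java.util.concurrent.TimeUnit;

// Illustrative timing harness; not part of Cassandra. Wrap the
// applyStateLocally() call sites in Gossiper.java with timed(...).
public final class GossipTiming
{
    private GossipTiming() {}

    public static void timed(String label, Runnable body)
    {
        long start = System.nanoTime();
        try
        {
            body.run();
        }
        finally
        {
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            if (elapsedMs > 1000) // flag anything slower than one second
                System.err.printf("%s took %d ms%n", label, elapsedMs);
        }
    }
}
{code}
For example: {{GossipTiming.timed("applyStateLocally", () -> applyStateLocally(epStateMap));}}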
> False Failure Detection in Big Cassandra Cluster
> ------------------------------------------------
>
> Key: CASSANDRA-11724
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11724
> Project: Cassandra
> Issue Type: Bug
> Components: Core
> Reporter: Jeffrey F. Lukman
> Labels: gossip, node-failure
> Attachments: Workload1.jpg, Workload2.jpg, Workload3.jpg,
> Workload4.jpg
>
>
> We are running some tests on Cassandra v2.2.5 stable in a big cluster. In our
> setup, each machine has 16 cores and runs 8 Cassandra instances, and we test
> clusters of 32, 64, 128, 256, and 512 Cassandra instances. Each instance uses
> the default number of vnodes, which is 256. The data and log directories are
> on an in-memory tmpfs filesystem.
> We run several types of workloads on this Cassandra cluster:
> Workload1: Just start the cluster
> Workload2: Start half of the cluster, wait until it reaches a stable state,
> and then start the other half
> Workload3: Start half of the cluster, wait until it reaches a stable state,
> load some data, and then start the other half
> Workload4: Start the whole cluster, wait until it reaches a stable state,
> load some data, and decommission one node
> For this testing, we measure the total number of false failure detections
> inside the cluster. By a false failure detection we mean, for example, that
> instance-1 marks instance-2 as down even though instance-2 is not actually
> down. Digging into the root cause, we found that instance-1 had not received
> any heartbeat from instance-2 for some time, because instance-2 was running a
> long computation (see the failure-detector sketch after this description).
> Here I attach the graphs of each workload result.
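To illustrate the conviction mechanism behind these false detections:
Cassandra's FailureDetector is a phi-accrual detector, and the simplified
sketch below shows how a long silence from a peer drives phi past the
conviction threshold. The class name and structure are illustrative; the only
value taken from Cassandra is the default {{phi_convict_threshold}} of 8:
{code:java}
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified phi-accrual-style failure check, loosely modeled on
// Cassandra's FailureDetector. Constants and structure are illustrative.
public class PhiAccrualSketch
{
    private static final double PHI_FACTOR = 1.0 / Math.log(10.0); // ~0.434, converts ln to log10
    private static final double CONVICT_THRESHOLD = 8.0;           // cassandra.yaml default phi_convict_threshold
    private static final int WINDOW = 1000;                        // heartbeat intervals to remember

    private final Deque<Long> intervalsMs = new ArrayDeque<>();
    private long lastHeartbeatMs = -1;

    // Called whenever a gossip heartbeat arrives from the peer.
    public synchronized void report(long nowMs)
    {
        if (lastHeartbeatMs >= 0)
        {
            if (intervalsMs.size() >= WINDOW)
                intervalsMs.removeFirst();
            intervalsMs.addLast(nowMs - lastHeartbeatMs);
        }
        lastHeartbeatMs = nowMs;
    }

    // phi grows linearly with the silence since the last heartbeat,
    // scaled by the mean observed interval (exponential approximation).
    public synchronized double phi(long nowMs)
    {
        if (intervalsMs.isEmpty())
            return 0.0;
        double sum = 0.0;
        for (long interval : intervalsMs)
            sum += interval;
        double mean = sum / intervalsMs.size();
        return PHI_FACTOR * (nowMs - lastHeartbeatMs) / mean;
    }

    public synchronized boolean shouldConvict(long nowMs)
    {
        return phi(nowMs) > CONVICT_THRESHOLD;
    }
}
{code}
With a mean gossip interval of about 1 second, a 60-second pause (as observed
in {{applyStateLocally()}} above) gives phi ~ 0.434 * 60 ~ 26, far above the
default threshold of 8, so the silent node is convicted even though it never
crashed.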