[jira] [Commented] (CASSANDRA-11615) cassandra-stress blocks when connecting to a big cluster

Andy Tolbert (JIRA) Fri, 22 Apr 2016 09:34:43 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-11615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15254175#comment-15254175
 ]


Andy Tolbert commented on CASSANDRA-11615:
------------------------------------------

I was digging into this with [~eduard.tudenhoefner], and I suspect it is 
caused by [JAVA-1002|https://datastax-oss.atlassian.net/browse/JAVA-1002], 
which will be fixed in Java driver 3.0.1.  I tried this out with a 100-node 
simulated cluster (not using stress in this case) and a single-threaded netty 
event loop group in the driver (to amplify the impact), and timed how long a 
session.prepare takes when the keyspace is set on the connection.  It took 
203ms with the fix for JAVA-1002.  Without the fix it did not finish in a 
reasonable time: the driver logged a "Timeout while setting keyspace" warning 
roughly every 12 seconds, one connection after another.  I think this is the 
source of the issue, but we'll need to confirm when we have a large cluster up 
again in the next week or so.

{noformat}
3.0.0 - 100 nodes, no keyspace set on session - 206ms

42776  [main] INFO  OneHundredNodeSimulation - Done Initing Cluster..Preparing 
Statement
42982  [main] INFO  OneHundredNodeSimulation - Done Preparing 
Statement...making query

3.0.0 - 100 nodes, keyspace set on session - too long..ms

46276  [main] INFO  OneHundredNodeSimulation - Done Initing Cluster..Preparing 
Statement
58429  [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.1:9042-3, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
70510  [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.3:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
82609  [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.4:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
94725  [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.5:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
106818 [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.6:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
118908 [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.7:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
131008 [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.8:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
143109 [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.9:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
155207 [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.10:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
167308 [cluster1-nio-worker-0] WARN  com.datastax.driver.core.Connection - 
Timeout while setting keyspace on Connection[/127.0.1.11:9042-1, inFlight=1, 
closed=false]. This should not happen but is not critical (it will be retried)
...

3.0.1rc (has JAVA-1002 fix) - 100 nodes, keyspace set on session - 203ms

46000  [main] INFO  OneHundredNodeSimulation - Done Initing Cluster..Preparing 
Statement
46203  [main] INFO  OneHundredNodeSimulation - Done Preparing 
Statement...making query
{noformat}
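
For context on why one slow prepare stalls every stress thread: the thread 
dump in the issue description shows threads BLOCKED in JavaDriverClient.prepare, 
waiting to lock a ConcurrentHashMap.  The sketch below illustrates that pattern 
under stated assumptions.  It is not the actual stress code: the class name, 
the String stand-in for a prepared statement, and the Thread.sleep that stands 
in for a slow session.prepare are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class PrepareContentionSketch {
    // Hypothetical stand-in for the statement cache that the thread dump
    // shows being used as a lock (a java.util.concurrent.ConcurrentHashMap).
    private final ConcurrentHashMap<String, String> stmts = new ConcurrentHashMap<>();
    private final AtomicInteger slowPrepares = new AtomicInteger();
    private final long prepareDelayMs;

    PrepareContentionSketch(long prepareDelayMs) {
        this.prepareDelayMs = prepareDelayMs;
    }

    String prepare(String query) throws InterruptedException {
        String stmt = stmts.get(query);
        if (stmt != null)
            return stmt;
        // Every thread that misses the cache queues up on this monitor.
        synchronized (stmts) {
            stmt = stmts.get(query);
            if (stmt != null)
                return stmt;
            // Assumption: simulates a slow session.prepare(query) call.
            Thread.sleep(prepareDelayMs);
            slowPrepares.incrementAndGet();
            stmt = "prepared:" + query;
            stmts.put(query, stmt);
        }
        return stmt;
    }

    // Starts the given number of threads, all preparing the same query, and
    // returns how many of them actually executed the slow prepare.
    static int run(int threads, long prepareDelayMs) throws InterruptedException {
        PrepareContentionSketch client = new PrepareContentionSketch(prepareDelayMs);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                try {
                    client.prepare("SELECT * FROM t WHERE key = ?");
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers)
            t.join();
        return client.slowPrepares.get();
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.nanoTime();
        int prepares = run(20, 200);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("slow prepares: " + prepares + ", elapsed ms: " + elapsedMs);
    }
}
```

Only one thread runs the slow prepare; the others wait on the monitor for as 
long as that single call takes.  If the real prepare follows this pattern, a 
driver-side prepare that keeps timing out would keep every other stress thread 
blocked, which would be consistent with the thread dump.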

> cassandra-stress blocks when connecting to a big cluster
> --------------------------------------------------------
>
>                 Key: CASSANDRA-11615
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11615
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Eduard Tudenhoefner
>            Assignee: Eduard Tudenhoefner
>             Fix For: 3.0.x
>
>         Attachments: 11615-3.0-2nd.patch, 11615-3.0.patch
>
>
> I had a *100* node cluster and was running 
> {code}
> cassandra-stress read n=100 no-warmup cl=LOCAL_QUORUM -rate 'threads=20' 
> 'limit=1000/s'
> {code}
> Based on the thread dump it looks like it's been blocked at 
> https://github.com/apache/cassandra/blob/cassandra-3.0/tools/stress/src/org/apache/cassandra/stress/util/JavaDriverClient.java#L96
> {code}
> "Thread-20" #245 prio=5 os_prio=0 tid=0x00007f3781822000 nid=0x46c4 waiting 
> for monitor entry [0x00007f36cc788000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at 
> org.apache.cassandra.stress.util.JavaDriverClient.prepare(JavaDriverClient.java:96)
>         - waiting to lock <0x00000005c003d920> (a 
> java.util.concurrent.ConcurrentHashMap)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation$JavaDriverWrapper.createPreparedStatement(CqlOperation.java:314)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:77)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:109)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:261)
>         at 
> org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:327)
> "Thread-19" #244 prio=5 os_prio=0 tid=0x00007f3781820000 nid=0x46c3 waiting 
> for monitor entry [0x00007f36cc889000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at 
> org.apache.cassandra.stress.util.JavaDriverClient.prepare(JavaDriverClient.java:96)
>         - waiting to lock <0x00000005c003d920> (a 
> java.util.concurrent.ConcurrentHashMap)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation$JavaDriverWrapper.createPreparedStatement(CqlOperation.java:314)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:77)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:109)
>         at 
> org.apache.cassandra.stress.operations.predefined.CqlOperation.run(CqlOperation.java:261)
>         at 
> org.apache.cassandra.stress.StressAction$Consumer.run(StressAction.java:327)
> {code}
> I tried the same with a smaller cluster (50 nodes) and it worked fine.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
