[ 
https://issues.apache.org/jira/browse/CASSANDRA-10730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15024963#comment-15024963
 ] 

Jim Witschey commented on CASSANDRA-10730:
------------------------------------------

I didn't clarify here what the error actually is about. Basically, the node 
seems to start correctly, but never makes itself available via CQL. The 
timeouts that we've looked at all happen on the first attempt to connect to a 
recently started node. This change to the dtests:

https://github.com/riptano/cassandra-dtest/commit/fe87ae58acfe13ba14866f1c920b908b8b2bf403

hasn't seemed to improve the situation much, even though it doubled the 
connection timeout and increased the number of times the connection would be 
re-attempted.

> periodic timeout errors in dtest
> --------------------------------
>
>                 Key: CASSANDRA-10730
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10730
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jim Witschey
>            Assignee: Jim Witschey
>
> Dtests often fail with connection timeout errors. For example:
> http://cassci.datastax.com/job/cassandra-3.1_dtest/lastCompletedBuild/testReport/upgrade_tests.cql_tests/TestCQLNodes3RF3/deletion_test/
> {code}
> ('Unable to connect to any servers', {'127.0.0.1': 
> OperationTimedOut('errors=Timed out creating connection (10 seconds), 
> last_host=None',)})
> {code}
> We've merged a PR to increase timeouts:
> https://github.com/riptano/cassandra-dtest/pull/663
> It doesn't look like this has improved things:
> http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/363/testReport/
> Next steps here are
> * to scrape Jenkins history to see if and how the number of tests failing 
> this way has increased (it feels like it has). From there we can bisect over 
> the dtests, ccm, or C*, depending on what looks like the source of the 
> problem.
> * to better instrument the dtest/ccm/C* startup process to see why the nodes 
> start but don't successfully make the CQL port available.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to