[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-12-21 Thread Stefania (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15066524#comment-15066524
 ] 

Stefania edited comment on CASSANDRA-8072 at 12/21/15 3:54 PM:
---

Building on [~brandon.williams]'s previous analysis, but taking into account more 
recent changes where we do close sockets, the problem is still that the seed 
node sends the ACK to the old socket, even after it has been closed by the 
decommissioned node. This is because we only send on these sockets, so we 
cannot know when they are closed until the send buffers are exceeded or unless 
we try to read from them as well. However, the problem should now persist only 
until the node is convicted, approx. 10 seconds with a {{phi_convict_threshold}} 
of 8. I verified this by adding a sleep of 15 seconds in my test before 
restarting the node, and it restarted without problems. [~slowenthal] or 
[~rhatch], would you be able to confirm this with your tests?

If we cannot detect when an outgoing socket is closed by its peer, then we need 
an out-of-band notification. This could come from the departing node 
announcing its shutdown at the end of its decommission, but the existing logic 
in {{Gossiper.stop()}} prevents this for the dead states (*removing, removed, 
left and hibernate*) or for *bootstrapping*. This was introduced by 
CASSANDRA-8336, and the same problem has already been raised in CASSANDRA-9630. 
Even if we undo CASSANDRA-8336 there is another issue: since 
CASSANDRA-9765 we can no longer join a cluster in status SHUTDOWN, and I believe 
this is correct. So the answer cannot be to announce a shutdown after 
decommission, at least not without significant changes to the Gossip protocol. 
Closing the socket earlier, say when we get the status LEFT notification, is not 
sufficient, because during the RING_DELAY sleep period we may re-establish the 
connection to the node before it dies, typically for a Gossip update. 

So I think we only have two options:

* read from outgoing sockets purely to detect when they are closed
* send a new GOSSIP flag indicating it is time to close the sockets to a node
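
To illustrate the first option, here is a minimal, self-contained sketch of detecting a peer close by reading from a socket that is otherwise only used for sending. It is not Cassandra code: the class {{PeerCloseProbe}}, the method {{peerHasClosed}} and the timeout values are hypothetical names and numbers chosen for the example, and it uses only plain {{java.net}} sockets on the loopback interface.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration, not part of Cassandra.
public class PeerCloseProbe {

    // Returns true if a short blocking read hits end-of-stream, which means
    // the peer has closed its side of the connection.
    static boolean peerHasClosed(Socket socket, int timeoutMillis) throws IOException {
        int previousTimeout = socket.getSoTimeout();
        socket.setSoTimeout(timeoutMillis);
        try {
            // read() returns -1 at end-of-stream. The assumption is that the
            // peer never sends payload on this socket, so no data is lost.
            return socket.getInputStream().read() == -1;
        } catch (SocketTimeoutException e) {
            // Nothing arrived within the timeout: the peer has not closed.
            return false;
        } finally {
            socket.setSoTimeout(previousTimeout);
        }
    }

    // Opens a loopback connection, probes it, closes the accepting side,
    // then probes again. Returns { closedBefore, closedAfter }.
    static boolean[] demo() throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 1, loopback);
             Socket outgoing = new Socket(loopback, server.getLocalPort());
             Socket accepted = server.accept()) {
            boolean before = peerHasClosed(outgoing, 200);
            accepted.close(); // the "decommissioned node" closes its end
            boolean after = peerHasClosed(outgoing, 2000);
            return new boolean[] { before, after };
        }
    }

    public static void main(String[] args) throws IOException {
        boolean[] result = demo();
        System.out.println("closed before peer close: " + result[0]);
        System.out.println("closed after peer close:  " + result[1]);
    }
}
```

A sender that never reads cannot observe the peer's close until a later write fails, which is the situation described above. The read here exists purely as a liveness probe.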



was (Author: stefania):
Building on [~brandon.williams] previous analysis but taking into account more 
recent changes where we do close sockets, the problem is still that the seed 
node is sending the ACK to the old socket, even after it has been closed by the 
decommissioned node. This is because we only send on these sockets, so we 
cannot know when they are closed until the send buffers are exceeded or unless 
we try to read from them as well. However, the problem should now only be true 
until the node is convicted, approx 10 seconds with a {{phi_convict_threshold}} 
of 8. I verified this by adding a sleep of 15 seconds in my test before 
restarting the node, and it restarted without problems. [~slowenthal] would you 
be able to confirm this with your tests?

If we cannot detect when an outgoing socket is closed by its peer, then we need 
an out-of-bound notification. This could come from the departing node 
announcing its shutdown at the end of its decommission but the existing logic 
in {{Gossiper.stop()}} prevents this for the dead states (*removing, removed, 
left and hibernate*) or for *bootstrapping*. This was introduced by 
CASSANDRA-8336 and the same problem has already been raised in CASSANDRA-9630. 
Even if we undo CASSANDRA-8336 there is then another issue: since 
CASSANDRA-9765 we can no longer join a cluster in status SHUTDOWN and I believe 
this is correct. So the answer cannot be to announce a shutdown after 
decommission, not without significant changes to the Gossip protocol. Closing 
the socket earlier, say when we get the status LEFT notification, is not 
sufficient because during the RING_DELAY sleep period we may re-establish the 
connection to the node before it dies, typically for a Gossip update. 

So I think we only have two options:

* read from outgoing sockets purely to detect when they are closed
* send a new GOSSIP flag indicating it is time to close the sockets to a node


> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
>  Components: Lifecycle
> Reporter: Ryan Springer
> Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
> in either ec2 or locally, an error occurs sometimes with one 

[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985589#comment-14985589
 ] 

Kenneth Failbus edited comment on CASSANDRA-8072 at 11/2/15 5:38 PM:
-

Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.io.InputStream.read(InputStream.java:101)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}


was (Author: kenfailbus):
Upon enabling trace on the one see and re-bootstrapping the new node, I got the 
following exception
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}

> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Ryan Springer
> Assignee: Stefania
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2
>
>
> When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
> in either ec2 or locally, an error occurs sometimes with one of the nodes 
> refusing to start C*.  The error in the /var/log/cassandra/system.log is:
> ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) Exception encountered during startup
> java.lang.RuntimeException: Unable to gossip with any seeds
>     at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
>     at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
>     at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
>     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java (line 1279) Announcing shutdown
>  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 MessagingService.java (line 701) Waiting for messaging service to quiesce
>  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 MessagingService.java (line 941) MessagingService has terminated the accept() thread
> This error does not always occur when provisioning a 2-node cluster, but 
> probably around half of the time, and on only one of the nodes.  I haven't been 
> able to reproduce this error with DSC 2.0.9, and there have been no code or 
> definition file changes in Opscenter.
> I can reproduce locally with the above steps.  I'm happy to test any proposed 
> fixes since I'm the only person able to reproduce reliably so far.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985589#comment-14985589
 ] 

Kenneth Failbus edited comment on CASSANDRA-8072 at 11/2/15 5:49 PM:
-

Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.io.InputStream.read(InputStream.java:101)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}


was (Author: kenfailbus):
Upon enabling trace on the one see and re-bootstrapping the new node, I got the 
following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}



[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-11-02 Thread Kenneth Failbus (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985589#comment-14985589
 ] 

Kenneth Failbus edited comment on CASSANDRA-8072 at 11/2/15 5:50 PM:
-

Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.io.InputStream.read(InputStream.java:101)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}

And then the usual exception that this ticket is about:
{code}
2015-11-02 17:34:55,526 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 entries
2015-11-02 17:35:00,526 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 entries
2015-11-02 17:35:05,527 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 entries
2015-11-02 17:35:10,527 [EXPIRING-MAP-REAPER:1] TRACE ExpiringMap Expired 0 entries
2015-11-02 17:35:11,982 [main] ERROR CassandraDaemon Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1296)
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:457)
    at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:671)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:623)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:515)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:437)
    at com.datastax.bdp.server.DseDaemon.setup(DseDaemon.java:423)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:567)
    at com.datastax.bdp.server.DseDaemon.main(DseDaemon.java:641)
2015-11-02 17:35:11,986 [Thread-7] INFO DseDaemon DSE shutting down...
2015-11-02 17:35:11,987 [StorageServiceShutdownHook] WARN Gossiper No local state or state is in silent shutdown, not announcing shutdown
2015-11-02 17:35:11,987 [StorageServiceShutdownHook] INFO MessagingService Waiting for messaging service to quiesce
2015-11-02 17:35:11,987 [StorageServiceShutdownHook] DEBUG MessagingService Closing accept() thread
2015-11-02 17:35:11,988 [ACCEPT-/10.22.168.53] DEBUG MessagingService Asynchronous close seen by server thread
2015-11-02 17:35:11,988 [ACCEPT-/10.22.168.53] INFO MessagingService MessagingService has terminated the accept() thread
2015-11-02 17:35:12,068 [Thread-7] ERROR CassandraDaemon Exception in thread Thread[Thread-7,5,main]
{code}


was (Author: kenfailbus):
Upon enabling trace on the one seed node and re-bootstrapping the new node, I 
got the following exception on the node that was bootstrapping.
{code}
2015-11-02 17:34:52,150 [ACCEPT-/10.22.168.53] DEBUG MessagingService Error 
reading the socket Socket[addr=/10.xx.xx.xx,port=46678,localport=10xxx]
java.net.SocketTimeoutException
at 
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
at java.io.InputStream.read(InputStream.java:101)
at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at 
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:916)
{code}


[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-09-13 Thread Steven Lowenthal (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14742914#comment-14742914
 ] 

Steven Lowenthal edited comment on CASSANDRA-8072 at 9/14/15 4:49 AM:
--

I can easily reproduce this with my automated launcher. I even tried to better 
randomize when non-seed nodes come in to join. I recently noticed that the 
seed node has a socket stuck in CLOSE_WAIT for the nodes that report that they 
can't gossip with any seeds. Perhaps the solution lies in ensuring that both 
ends of the connection properly close the connection. It's likely that the 
client (the node asking to join) throws an exception and dies without 
gracefully closing the connection. See the screenshot. Also note that the 
well-known port name afs3-fileserver is port 7000.
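
For anyone who wants to look for the same symptom: a socket sits in CLOSE_WAIT when the remote end has closed the connection but the local process has not yet closed its own socket. Below is a minimal sketch of a filter over {{netstat}}-style lines that keeps the CLOSE_WAIT entries touching the storage port, whether it is printed as 7000 or as afs3-fileserver. The class {{CloseWaitFilter}}, the method name and the assumed six-column layout (proto, recv-q, send-q, local address, foreign address, state) are assumptions made for this example, not part of Cassandra or of any particular {{netstat}} implementation.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only.
public class CloseWaitFilter {

    // Keeps lines whose state column is CLOSE_WAIT and whose local or foreign
    // address ends in ":<port>" or ":<serviceName>".
    static List<String> closeWaitOnPort(List<String> netstatLines, int port, String serviceName) {
        String numericSuffix = ":" + port;
        String namedSuffix = ":" + serviceName;
        return netstatLines.stream()
                .filter(line -> {
                    String[] fields = line.trim().split("\\s+");
                    // Assumed layout: proto recv-q send-q local foreign state
                    if (fields.length < 6 || !"CLOSE_WAIT".equals(fields[5])) {
                        return false;
                    }
                    return matches(fields[3], numericSuffix, namedSuffix)
                            || matches(fields[4], numericSuffix, namedSuffix);
                })
                .collect(Collectors.toList());
    }

    private static boolean matches(String address, String numericSuffix, String namedSuffix) {
        return address.endsWith(numericSuffix) || address.endsWith(namedSuffix);
    }

    public static void main(String[] args) {
        // Made-up sample lines, only to show the shape of the input.
        List<String> sample = Arrays.asList(
                "tcp 0 0 10.0.0.1:afs3-fileserver 10.0.0.2:46678 ESTABLISHED",
                "tcp 1 0 10.0.0.1:afs3-fileserver 10.0.0.3:51234 CLOSE_WAIT",
                "tcp 1 0 10.0.0.1:9042 10.0.0.4:40000 CLOSE_WAIT",
                "tcp 1 0 10.0.0.1:38000 10.0.0.5:7000 CLOSE_WAIT");
        closeWaitOnPort(sample, 7000, "afs3-fileserver").forEach(System.out::println);
    }
}
```

Matching on both the numeric port and the service name avoids missing entries when {{netstat}} is run without numeric output.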


was (Author: slowenthal):
I can easily reproduce this with my automated launcher  I even tried to better 
randomize when non-seed nodes come in to join.   I recently noticed that the 
seed node has a socket stuck in CLOSE_WAIT for the nodes that report can't 
gossip with any seeds.  Perhaps the solution lies in ensuring that both ends of 
the connection properly close the connection.  It's likely the client (the node 
asking to join) exceptions out and dies without elegantly closing the 
connection.   

> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
>  Issue Type: Bug
> Reporter: Ryan Springer
> Assignee: Brandon Williams
> Fix For: 2.1.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, screenshot-1.png, 
> trace_logs.tar.bz2


[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-06-10 Thread Andreas Schnitzerling (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580341#comment-14580341
 ] 

Andreas Schnitzerling edited comment on CASSANDRA-8072 at 6/10/15 10:15 AM:


Hello, I took the following steps: I decommissioned a 2.0.15 node with 128 
vnodes and tried to bootstrap 2.1.6-R on the same node w/ 256 vnodes in write 
survey mode to test. 2.1.6 doesn't bootstrap because of the "Unable to gossip 
with any seeds" exception, but the old 2.0.15 does it w/o problems. Even if I 
use cassandra.yaml from 2.0.15 (with the properties that are invalid for 2.1.6 
deleted) it doesn't start. I have 14 nodes running 2.0.15 on Windows 7.
{panel:title=system.log}
ERROR [main] 2015-06-10 12:03:22,200 CassandraDaemon.java:553 - Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
    at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1307) ~[apache-cassandra-2.1.6.jar:2.1.6]
    at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:530) ~[apache-cassandra-2.1.6.jar:2.1.6]
    at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:774) ~[apache-cassandra-2.1.6.jar:2.1.6]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:711) ~[apache-cassandra-2.1.6.jar:2.1.6]
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:602) ~[apache-cassandra-2.1.6.jar:2.1.6]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:394) [apache-cassandra-2.1.6.jar:2.1.6]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:536) [apache-cassandra-2.1.6.jar:2.1.6]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:625) [apache-cassandra-2.1.6.jar:2.1.6]
WARN  [StorageServiceShutdownHook] 2015-06-10 12:03:22,200 Gossiper.java:1418 - No local state or state is in silent shutdown, not announcing shutdown
INFO  [StorageServiceShutdownHook] 2015-06-10 12:03:22,200 MessagingService.java:708 - Waiting for messaging service to quiesce
INFO  [ACCEPT-PC5771/10.2.0.61] 2015-06-10 12:03:22,200 MessagingService.java:958 - MessagingService has terminated the accept() thread
{panel}


was (Author: andie78):
Hello, I made following steps: I decom a 2.0.15 node with 128 vnodes and tried 
to bootstrap 2.1.6-R on the same node w/ 256 vnodes in write survey mode to 
test. 2.1.6 doesn't bootstrap because of the unable gossib exception but the 
old 2.0.15 does it w/o problems. Even if i use cassandra.yaml from 2.0.15 
(deletetd properties invalid for 2.1.6) it doesn't start. I have 14 nodes 
2.0.15 running on Windows 7.

> Exception during startup: Unable to gossip with any seeds
> -
>
> Key: CASSANDRA-8072
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
> Project: Cassandra
> Issue Type: Bug
> Reporter: Ryan Springer
> Assignee: Brandon Williams
> Fix For: 2.1.x, 2.0.x
>
> Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
> cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
> casandra-system-log-with-assert-patch.log, trace_logs.tar.bz2



[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-06-10 Thread Andreas Schnitzerling (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580409#comment-14580409
 ] 

Andreas Schnitzerling edited comment on CASSANDRA-8072 at 6/10/15 11:16 AM:


I tried several times to switch between 2.0.15 and 2.1.6 (start bootstrap on 
2.0.15, stop, copy data to 2.1.6) but 2.1.6 doesn't start properly. One time, 
2.1.6 saw only the other nodes, not itself. Now I have deleted all data again 
on that node and removed the node with nodetool, and now the bootstrap w/ 
2.0.15 works well in write survey mode. I'll wait until the bootstrap has 
finished and then try to upgrade the same node to 2.1.6 w/ write survey as 
well. It seems that 2.1.6 is not backward compatible when bootstrapping into a 
2.0.x cluster.


was (Author: andie78):
I tried severel times to switch between 2.0.15 and 2.1.6 (start bootstr at 
2.0.15, stop, copy data to 2.1.6) but 2.1.6 don't start probably. One time, 
2.1.6 have seen only other nodes, not himself. Now I deleted all again on that 
node, removed the node with nodetool and now the bootstrap w/ 2.0.15 works well 
in write survey mode. I'll wait until bootstrap finished and then I try to 
upgrade the same node to 2.1.6 w/ write survey as well. Seems like 2.1.6 is not 
backward compatible bootstrapping to a 2.0.x cluster.



[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-06-10 Thread Andreas Schnitzerling (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580409#comment-14580409
 ] 

Andreas Schnitzerling edited comment on CASSANDRA-8072 at 6/10/15 11:17 AM:


I tried several times to switch between 2.0.15 and 2.1.6 (start bootstrap on 
2.0.15, stop, copy data to 2.1.6) but 2.1.6 doesn't start properly. One time, 
2.1.6 saw only the other nodes, not itself. Now I have deleted all data again 
on that node and removed the node with nodetool, and now the bootstrap w/ 
2.0.15 works well in write survey mode. I'll wait until the bootstrap has 
finished and then try to upgrade the same node to 2.1.6 w/ write survey as 
well. It seems that 2.1.6 is not backward compatible when bootstrapping into a 
2.0.x cluster. Can you confirm that behavior?


was (Author: andie78):
I tried several times to switch between 2.0.15 and 2.1.6 (start the bootstrap 
on 2.0.15, stop, copy the data to 2.1.6), but 2.1.6 doesn't start properly. One 
time, 2.1.6 saw only the other nodes, not itself. I have now deleted all data 
on that node again and removed the node with nodetool, and the bootstrap with 
2.0.15 works well in write survey mode. I'll wait until the bootstrap has 
finished and then try to upgrade the same node to 2.1.6, also in write survey 
mode. It seems that 2.1.6 is not backward compatible when bootstrapping into a 
2.0.x cluster.

 Exception during startup: Unable to gossip with any seeds
 -

 Key: CASSANDRA-8072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan Springer
Assignee: Brandon Williams
 Fix For: 2.1.x, 2.0.x

 Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
 cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
 cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
 casandra-system-log-with-assert-patch.log, trace_logs.tar.bz2


 When Opscenter 4.1.4 or 5.0.1 tries to provision a 2-node DSC 2.0.10 cluster 
 in either ec2 or locally, an error sometimes occurs, with one of the nodes 
 refusing to start C*.  The error in /var/log/cassandra/system.log is:
 ERROR [main] 2014-10-06 15:54:52,292 CassandraDaemon.java (line 513) Exception encountered during startup
 java.lang.RuntimeException: Unable to gossip with any seeds
     at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1200)
     at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:444)
     at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:609)
     at org.apache.cassandra.service.StorageService.initServer(StorageService.java:502)
     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:52,326 Gossiper.java (line 1279) Announcing shutdown
  INFO [StorageServiceShutdownHook] 2014-10-06 15:54:54,326 MessagingService.java (line 701) Waiting for messaging service to quiesce
  INFO [ACCEPT-localhost/127.0.0.1] 2014-10-06 15:54:54,327 MessagingService.java (line 941) MessagingService has terminated the accept() thread
 This error does not always occur when provisioning a 2-node cluster, but 
 probably around half of the time, and on only one of the nodes.  I haven't 
 been able to reproduce this error with DSC 2.0.9, and there have been no code 
 or definition file changes in Opscenter.
 I can reproduce locally with the above steps.  I'm happy to test any proposed 
 fixes since I'm the only person able to reproduce reliably so far.





[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-04-19 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14500545#comment-14500545
 ] 

Brandon Williams edited comment on CASSANDRA-8072 at 4/19/15 6:39 PM:
--

After deep packet inspection, I believe I've found the root cause of the part 
of this issue that affects non-reconnectable snitches.  When you decommission 
a node, it never correctly tears down its ITC (IncomingTcpConnection) pools, 
which leaves the other side with a dead OTC (OutboundTcpConnection) pool:

{noformat}
tcp        1      0 10.208.8.123:33441      10.208.8.63:7000        CLOSE_WAIT  18401/java
{noformat}

Now when you try to bootstrap with the same IP, the shadow SYN is correctly 
sent and the ACK reply is built and queued, but MessagingService tries to use 
the now-defunct OTC pool and the message never makes it back to the node: the 
node just answers with TCP RSTs, which finally kill the connection.  But since 
the gossip SYN is only sent once, the seed has nothing else to send to the node 
and never re-establishes the connection, leaving the bootstrapping node 
thinking it never talked to a seed and throwing this error.
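
The stale-connection problem described above comes down to a property of TCP 
sockets: a side that only ever writes has no direct way to learn that its peer 
has closed, whereas a read on the same socket reports end-of-stream. The 
following is a minimal, self-contained Java sketch of that property on a 
loopback connection. It is not Cassandra code: the class PeerCloseProbe, its 
methods, and the 200 ms timeout are hypothetical and chosen only for 
illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical illustration only; none of these names exist in Cassandra.
class PeerCloseProbe {

    // Returns true if a read hits end-of-stream (the peer closed its end),
    // false if the read times out (the connection still looks open).
    // Assumes a send-only socket: a byte that did arrive would be consumed here.
    static boolean peerHasClosed(Socket socket, int timeoutMillis) throws IOException {
        socket.setSoTimeout(timeoutMillis);
        try {
            return socket.getInputStream().read() == -1;
        } catch (SocketTimeoutException e) {
            return false;
        }
    }

    // Opens a loopback connection, optionally closes the accepting side
    // (standing in for the node that went away), then probes the outbound side
    // (standing in for the seed's outbound connection).
    static boolean probeLoopback(boolean closeRemote) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket outbound = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket remote = server.accept()) {
            if (closeRemote) {
                remote.close();
            }
            return peerHasClosed(outbound, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("peer still open -> closed=" + probeLoopback(false));
        System.out.println("peer closed     -> closed=" + probeLoopback(true));
    }
}
```

On a typical loopback setup the first probe should time out and the second 
should see end-of-stream. Without such a read, the sender only finds out when a 
later write fails, which is the window in which the queued ACK is lost.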


was (Author: brandon.williams):
After deep packet inspection, I believe I've found the root non-reconnectable 
snitch part of this issue.  When you decom a node, it never correctly tears 
down its ITC pools, which leaves the other side with a dead OTC pool:

{noformat}
tcp        1      0 10.208.8.123:33441      10.208.8.63:7000        CLOSE_WAIT  18401/java
{noformat}

Now when you try to bootstrap with the same IP, the shadow syn is correctly 
sent and the ack reply is built and queued, but MS tries to use the now default 
OTC pool and the message never makes it back to the node, since it just sends 
RSTs which finally kills the connection.  But since the syn is only sent once, 
the seed has nothing else to send the node and never reestablishes the 
connection, leaving the bootstrapping node thinking it never talked to a seed 
and throwing this error.

 Exception during startup: Unable to gossip with any seeds
 -

 Key: CASSANDRA-8072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan Springer
Assignee: Brandon Williams
 Fix For: 2.0.15, 2.1.5

 Attachments: cas-dev-dt-01-uw1-cassandra-seed01_logs.tar.bz2, 
 cas-dev-dt-01-uw1-cassandra-seed02_logs.tar.bz2, 
 cas-dev-dt-01-uw1-cassandra02_logs.tar.bz2, 
 casandra-system-log-with-assert-patch.log, trace_logs.tar.bz2







[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2015-04-10 Thread John Alberts (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14490383#comment-14490383
 ] 

John Alberts edited comment on CASSANDRA-8072 at 4/10/15 9:23 PM:
--

Logs from a Cassandra cluster with logging set to TRACE.  They are from a newly 
launched node on which Cassandra failed to start.
The cluster runs on EC2 using the Ec2MultiRegionSnitch.
I was able to reproduce this issue on a new cluster: I decommissioned a node, 
shut it down, and brought up a new node with the same EIP, and the new node 
failed to start.



was (Author: albertsj1):
Logs from cassandra cluster with logging set to TRACE.  This is from a new node 
launched and cassandra failed to start.

 Exception during startup: Unable to gossip with any seeds
 -

 Key: CASSANDRA-8072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan Springer
Assignee: Brandon Williams
 Fix For: 2.0.15, 2.1.5

 Attachments: casandra-system-log-with-assert-patch.log, 
 trace_logs.tar.bz2







[jira] [Comment Edited] (CASSANDRA-8072) Exception during startup: Unable to gossip with any seeds

2014-11-06 Thread Joseph Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-8072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14201236#comment-14201236
 ] 

Joseph Clark edited comment on CASSANDRA-8072 at 11/6/14 11:52 PM:
---

[CASSANDRA-8274|https://issues.apache.org/jira/browse/CASSANDRA-8274] appears 
to me to be the root cause, in my situation at least, of 
[CASSANDRA-7292|https://issues.apache.org/jira/browse/CASSANDRA-7292]. I'm 
still not convinced that 7292 and 8072 are duplicate issues.


was (Author: jw.clark):
[CASSANDRA-8274|https://issues.apache.org/jira/browse/CASSANDRA-8274] appears 
to me to be the root cause, in my situation at least, to 
[CASSANDRA-7292|https://issues.apache.org/jira/browse/CASSANDRA-7292]. I'm 
still not convinced that 7292 is a duplicate issue.

 Exception during startup: Unable to gossip with any seeds
 -

 Key: CASSANDRA-8072
 URL: https://issues.apache.org/jira/browse/CASSANDRA-8072
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan Springer
Assignee: Brandon Williams



