It's worth checking connectivity on each node to see whether the gossip connections are established.
For example:

# netstat -ant | awk 'NR==2;/7001/'
Proto Recv-Q Send-Q Local Address          Foreign Address        State
tcp        0      0 172.31.10.93:7001      0.0.0.0:*              LISTEN
tcp        0      0 172.31.10.93:56771     172.31.10.93:7001      ESTABLISHED
tcp        0      0 172.31.10.93:7001      54.183.204.110:42231   ESTABLISHED
tcp        0      0 172.31.10.93:52031     54.183.204.110:7001    ESTABLISHED
tcp        0      0 172.31.10.93:50759     54.183.204.110:7001    ESTABLISHED
tcp        0      0 172.31.10.93:38986     172.31.10.93:7001      ESTABLISHED
tcp        0      0 172.31.10.93:7001      172.31.10.93:42408     ESTABLISHED
tcp        0      0 172.31.10.93:7001      172.31.10.93:38986     ESTABLISHED
tcp        0      0 172.31.10.93:42408     172.31.10.93:7001      ESTABLISHED
tcp        0      0 172.31.10.93:7001      172.31.10.93:56771     ESTABLISHED
tcp        0      0 172.31.10.93:7001      54.183.204.110:37491   ESTABLISHED

Note: I'm using 7001 here because my cluster uses SSL; you can use 7000 for the standard gossip port.

Thanks,
Mark

On 21 January 2016 at 14:08, Bernardino Mota <bernardino.m...@knowledgeworks.pt> wrote:

> Nothing strange in the logs, but "nodetool gossipinfo" seems OK:
>
> ./nodetool gossipinfo
> /192.168.1.10
> generation:1453316804
> heartbeat:206518
> STATUS:18:NORMAL,-1003341236369672970
> LOAD:206420:4.3533596E7
> SCHEMA:14:6f97097b-45ce-3479-8b2f-af2fef4967e7
> DC:8:DC2
> RACK:10:rack1
> RELEASE_VERSION:4:2.2.4
> INTERNAL_IP:6:192.168.1.10
> RPC_ADDRESS:3:127.0.0.1
> SEVERITY:206517:0.0
> NET_VERSION:1:9
> HOST_ID:2:51650afd-84dd-4e25-a6f0-13627858d5dc
> RPC_READY:49:true
> TOKENS:17:<hidden>
> /192.168.1.102
> generation:1453316986
> heartbeat:84622
> STATUS:28:NORMAL,-1085177681742913545
> LOAD:84535:1.2606418E7
> SCHEMA:14:6f97097b-45ce-3479-8b2f-af2fef4967e7
> DC:8:DC1
> RACK:10:rack1
> RELEASE_VERSION:4:2.2.4
> INTERNAL_IP:6:10.0.2.10
> RPC_ADDRESS:3:127.0.0.1
> SEVERITY:84624:0.0
> NET_VERSION:1:9
> HOST_ID:2:ff906882-8224-40ac-8cdb-98f5e725814d
> RPC_READY:98:true
> TOKENS:27:<hidden>
>
>
> On 21 Jan 2016, at 13:17, Adil <adil.cha...@gmail.com> wrote:
>
> Hi,
> do you see any message related to gossip info?
>
> 2016-01-21 14:09 GMT+01:00 Bernardino Mota <bernardino.m...@knowledgeworks.pt>:
>
>> Using Cassandra 2.2.4 on Ubuntu.
>>
>> We have a cluster with two nodes that failed to connect with each other
>> for several hours due to network problems. The database continued to be
>> used on one of the nodes, with writes being stored in the hints file as
>> expected.
>>
>> But now that the network is OK again and each machine can communicate,
>> each node indicates the other is DOWN and does not replicate.
>>
>> When the network came up we started to see "Convicting /192.168.1.102
>> with status NORMAL - alive false" in the log files.
>>
>> It seems each node convicts the other and later fails to reconnect.
>>
>> Is there some configuration that we might be missing? Any help would be
>> much appreciated.
>>
>>
>> - NODE 192.168.1.10 - "nodetool status"
>>
>> Datacenter: DC1
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address        Load      Tokens  Owns  Host ID                               Rack
>> DN  192.168.1.102  12.02 MB  256     ?     ff906882-8224-40ac-8cdb-98f5e725814d  rack1
>>
>> Datacenter: DC2
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address        Load      Tokens  Owns  Host ID                               Rack
>> UN  192.168.1.10   41.87 MB  256     ?     51650afd-84dd-4e25-a6f0-13627858d5dc  rack1
>>
>>
>> - NODE 192.168.1.102 - "nodetool status"
>>
>> Datacenter: DC1
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address        Load      Tokens  Owns  Host ID                               Rack
>> UN  192.168.1.102  12.4 MB   256     ?     ff906882-8224-40ac-8cdb-98f5e725814d  rack1
>>
>> Datacenter: DC2
>> ===============
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address        Load      Tokens  Owns  Host ID                               Rack
>> DN  192.168.1.10   26.31 MB  256     ?     51650afd-84dd-4e25-a6f0-13627858d5dc  rack1
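P.S. The netstat check at the top of this message can be scripted if you want to run it across several nodes. The sketch below is only illustrative: the `count_gossip` helper name and the sample lines are hypothetical, and it assumes POSIX sh/awk and the `netstat -ant` column layout shown above (state in column 6, foreign address in column 5).

```shell
#!/bin/sh
# Count ESTABLISHED connections on the gossip port, per peer, from
# "netstat -ant" style output on stdin. 7001 is used here because this
# cluster runs SSL; pass 7000 for the standard gossip port.
# Real usage would be: netstat -ant | count_gossip 7001
count_gossip() {
  awk -v port="$1" '
    $1 ~ /^tcp/ && $6 == "ESTABLISHED" &&
    ($4 ~ (":" port "$") || $5 ~ (":" port "$")) {
      peer = $5                       # foreign address, e.g. 1.2.3.4:42231
      sub(/:[0-9]+$/, "", peer)       # strip the port suffix
      n[peer]++
    }
    END { for (p in n) print p, n[p] }'
}

# Hypothetical sample (three lines modelled on the output above):
count_gossip 7001 <<'EOF'
tcp 0 0 172.31.10.93:52031 54.183.204.110:7001 ESTABLISHED
tcp 0 0 172.31.10.93:7001 54.183.204.110:42231 ESTABLISHED
tcp 0 0 172.31.10.93:7001 0.0.0.0:* LISTEN
EOF
# prints: 54.183.204.110 2
```

A node with zero established gossip-port connections to a peer that "nodetool status" claims is up is a good hint that the two sides disagree, as in the Convicting messages above.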