Gossiper ConcurrentModificationException after Decommissioning
--------------------------------------------------------------
Key: CASSANDRA-1494
URL: https://issues.apache.org/jira/browse/CASSANDRA-1494
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 0.6.5
Environment: Linux 2.6.33.8-149.fc13.x86_64 #1 SMP Tue Aug 17 22:53:15
UTC 2010 x86_64 x86_64 x86_64 GNU/Linux
Reporter: Dan Retzlaff
Priority: Critical
After decommissioning 192.168.2.147, the Gossiper threw a
ConcurrentModificationException on 192.168.2.55. This cascaded into
192.168.2.55 repeatedly marking 192.168.2.148 and 192.168.2.149 as UP and
then DOWN. Eventually this left so many internode (storage port) TCP
connections in CLOSE_WAIT that other nodes started failing with "too many open
files" exceptions.
INFO [Timer-0] 2010-09-08 17:00:02,398 Gossiper.java (line 402) FatClient /192.168.2.147 has been silent for 3600000ms, removing from gossip
ERROR [Timer-0] 2010-09-08 17:00:02,418 Gossiper.java (line 99) Gossip error
java.util.ConcurrentModificationException
    at java.util.Hashtable$Enumerator.next(Hashtable.java:1031)
    at org.apache.cassandra.gms.Gossiper.doStatusCheck(Gossiper.java:383)
    at org.apache.cassandra.gms.Gossiper$GossipTimerTask.run(Gossiper.java:93)
    at java.util.TimerThread.mainLoop(Timer.java:512)
    at java.util.TimerThread.run(Timer.java:462)
INFO [Timer-0] 2010-09-08 17:00:12,398 Gossiper.java (line 180) InetAddress /192.168.2.148 is now dead.
INFO [Timer-0] 2010-09-08 17:00:14,399 Gossiper.java (line 180) InetAddress /192.168.2.149 is now dead.
INFO [GMFD:1] 2010-09-08 17:00:19,400 Gossiper.java (line 578) InetAddress /192.168.2.149 is now UP
INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:19,400 HintedHandOffManager.java (line 165) Started hinted handoff for endPoint /192.168.2.149
INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:19,401 HintedHandOffManager.java (line 222) Finished hinted handoff of 0 rows to endpoint /192.168.2.149
INFO [Timer-0] 2010-09-08 17:00:20,399 Gossiper.java (line 180) InetAddress /192.168.2.149 is now dead.
INFO [GMFD:1] 2010-09-08 17:00:43,409 Gossiper.java (line 578) InetAddress /192.168.2.148 is now UP
INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:43,409 HintedHandOffManager.java (line 165) Started hinted handoff for endPoint /192.168.2.148
INFO [HINTED-HANDOFF-POOL:1] 2010-09-08 17:00:43,410 HintedHandOffManager.java (line 222) Finished hinted handoff of 0 rows to endpoint /192.168.2.148
INFO [Timer-0] 2010-09-08 17:00:44,404 Gossiper.java (line 180) InetAddress /192.168.2.148 is now dead.
INFO [GMFD:1] 2010-09-08 17:01:18,415 Gossiper.java (line 578) InetAddress /192.168.2.149 is now UP
(UP/DOWN cycle repeats until the target node *really* goes DOWN due to too many
TCP sockets in CLOSE_WAIT.)
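
The stack trace above shows the exception coming from a Hashtable iterator
inside doStatusCheck, right after a silent endpoint is removed from gossip.
One plausible way to get that trace, even on a single thread, is removing
entries from a Hashtable while iterating over it. The sketch below reproduces
that pattern in isolation. It is not the Cassandra code: the class name,
method names, addresses, and the lastHeard table are all hypothetical, and
only the 3600000ms timeout is taken from the log.

```java
import java.util.ConcurrentModificationException;
import java.util.Hashtable;
import java.util.Iterator;
import java.util.Map;

public class GossipSweepSketch {
    // Silence threshold, matching the 3600000ms reported in the log.
    static final long TIMEOUT_MS = 3_600_000L;

    // Hypothetical endpoint table: address -> time last heard from (ms).
    // Two endpoints are long silent, one is fresh.
    static Map<String, Long> sampleTable() {
        Map<String, Long> lastHeard = new Hashtable<>();
        lastHeard.put("10.0.0.1", 0L);
        lastHeard.put("10.0.0.2", 0L);
        lastHeard.put("10.0.0.3", 3_599_000L);
        return lastHeard;
    }

    // Unsafe: removes from the table while a for-each loop iterates its keys.
    // The iterator is fail-fast, so the next call to next() throws.
    static boolean unsafeSweepThrows(Map<String, Long> lastHeard, long now) {
        try {
            for (String endpoint : lastHeard.keySet()) {
                if (now - lastHeard.get(endpoint) >= TIMEOUT_MS) {
                    lastHeard.remove(endpoint); // structural change mid-iteration
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Safer on a single thread: remove through the iterator itself,
    // which keeps the iterator's expected modification count in sync.
    static int safeSweep(Map<String, Long> lastHeard, long now) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastHeard.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue() >= TIMEOUT_MS) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        long now = 3_600_000L;
        System.out.println("unsafe sweep threw: "
                + unsafeSweepThrows(sampleTable(), now));
        Map<String, Long> table = sampleTable();
        System.out.println("safe sweep removed: " + safeSweep(table, now)
                + ", remaining: " + table.keySet());
    }
}
```

Iterator.remove() only protects against modification by the iterating thread
itself. If another thread can touch the table during the sweep, the loop would
additionally need to hold a lock on the table, iterate over a snapshot copy,
or use a concurrent map.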