cluster split due to schema disagreement
----------------------------------------
Key: CASSANDRA-3463
URL: https://issues.apache.org/jira/browse/CASSANDRA-3463
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8.7
Reporter: Radim Kolar
I found an interesting situation in a 2-node cluster. Replication factor is 1.
On both nodes, gossip (nodetool ring) reports that both nodes are up.
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               99070591730234615865843651857942052864
****.104.18     datacenter1 rack1       Up     Normal  19.36 GB        41.77%  0
****.99.40      datacenter1 rack1       Up     Normal  26.24 GB        58.23%
One node works fine, while the second thinks the other node is down even though
its gossip correctly recognizes the other node as up. The problem is in schema
agreement, but I don't know whether the logs contain enough information to
discover why the nodes could not reach schema agreement.
[default@test] describe cluster;
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
9f2b5be0-06e2-11e1-0000-d14dd490cdf6: [****.104.18]
UNREACHABLE: [****.99.40]
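The disagreement above can also be checked mechanically. What follows is a minimal sketch, assuming the `describe cluster` output has been captured as text in the layout shown above; the function names `parse_schema_versions` and `schema_in_agreement` are hypothetical helpers, not part of Cassandra or its CLI.

```python
import re

def parse_schema_versions(describe_output):
    """Map each schema version (or UNREACHABLE) to the hosts listed for it."""
    versions = {}
    in_section = False
    for line in describe_output.splitlines():
        line = line.strip()
        if line.startswith("Schema versions:"):
            in_section = True
            continue
        if not in_section:
            continue
        # Expected shape: "<version-or-UNREACHABLE>: [host1, host2, ...]"
        m = re.match(r"^([0-9a-fA-F-]+|UNREACHABLE):\s*\[(.*)\]$", line)
        if m:
            hosts = [h.strip() for h in m.group(2).split(",") if h.strip()]
            versions[m.group(1)] = hosts
    return versions

def schema_in_agreement(versions):
    """True only if exactly one version is reported and no host is unreachable."""
    reported = [v for v in versions if v != "UNREACHABLE"]
    return len(reported) == 1 and not versions.get("UNREACHABLE")

# Sample copied from the output in this report.
SAMPLE = """\
Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
9f2b5be0-06e2-11e1-0000-d14dd490cdf6: [****.104.18]
UNREACHABLE: [****.99.40]
"""

versions = parse_schema_versions(SAMPLE)
print(versions)
print("in agreement:", schema_in_agreement(versions))
```

For the output in this report the check reports no agreement, because one host is listed under UNREACHABLE.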
INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:02,331 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:06,444 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:07,336 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:11,544 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:12,341 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:16,644 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:17,347 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:31,944 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:32,362 Gossiper.java (line 716) InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:37,044 Gossiper.java (line 702) InetAddress /*****99.40 is now UP
ERROR [HintedHandoff:6] 2011-11-06 18:50:42,010 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
    at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
    at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
ERROR [HintedHandoff:6] 2011-11-06 18:50:42,028 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[HintedHandoff:6,1,main]
java.lang.RuntimeException: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:679)
Caused by: java.lang.RuntimeException: Could not reach schema agreement with /*****99.40 in 60000ms
    at org.apache.cassandra.db.HintedHandOffManager.waitForSchemaAgreement(HintedHandOffManager.java:293)
    at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:304)
    at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:89)
    at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:397)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
    ... 3 more
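The log above shows the peer flapping between UP and dead every few seconds. A rough way to quantify that is sketched below. It assumes the Gossiper entries keep the wording and timestamp format shown in this report; the helper names `gossip_events` and `up_durations` are hypothetical. The pattern accepts any whitespace, including a line break, before "InetAddress", so it works whether or not the entries were wrapped.

```python
import re
from datetime import datetime

# Matches the two Gossiper messages seen in this report.
EVENT = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) Gossiper\.java \(line \d+\)"
    r"\s+InetAddress (\S+) is now (dead|UP)"
)

def gossip_events(log_text):
    """Return (timestamp, address, state) tuples in log order."""
    events = []
    for stamp, addr, state in EVENT.findall(log_text):
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S,%f")
        events.append((when, addr, state))
    return events

def up_durations(events):
    """Seconds each UP state lasted before the same address was marked dead."""
    durations = []
    last_up = {}
    for when, addr, state in events:
        if state == "UP":
            last_up[addr] = when
        elif addr in last_up:
            durations.append((when - last_up.pop(addr)).total_seconds())
    return durations

# First four gossip entries from the log in this report.
SAMPLE_LOG = """\
INFO [GossipTasks:1] 2011-11-06 18:49:56,325 Gossiper.java (line 716)
InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:01,345 Gossiper.java (line 702)
InetAddress /*****99.40 is now UP
INFO [GossipTasks:1] 2011-11-06 18:50:02,331 Gossiper.java (line 716)
InetAddress /*****99.40 is now dead.
INFO [GossipStage:1] 2011-11-06 18:50:06,444 Gossiper.java (line 702)
InetAddress /*****99.40 is now UP
"""

events = gossip_events(SAMPLE_LOG)
print(len(events), "events")
print("UP durations (s):", up_durations(events))
```

Run over the full log, this makes it easy to see how briefly the peer stays UP before being marked dead again, which may help when correlating the flapping with the schema agreement timeout.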