[ https://issues.apache.org/jira/browse/CASSANDRA-4323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559406#comment-13559406 ]
Mike Heffner commented on CASSANDRA-4323:
-----------------------------------------

We are seeing this while trying to join a new 1.1.9 node to a running 1.1.6 cluster (which was upgraded earlier from 1.1.0). The new node goes straight from Joining->Normal without streaming the full sstable load and QUORUM writes were failing to the ring.

This is a describe cluster after the offending new node was forcibly removed from the ring after it failed to join (10.241.3.3 was the 1.1.9 node that was removed):

{{
[default@unknown] describe Metrics;
Keyspace: Metrics:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
    Options: [us-east:2]
[default@unknown] describe cluster;
Cluster Information:
   Snitch: org.apache.cassandra.locator.Ec2Snitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        af3cac9b-e8a4-3a3c-abc4-b8bbe24e0493: [10.124.1.1, 10.241.2.2]
        UNREACHABLE: [10.241.3.3]
}}

Next I'll try removing the schema_ datafiles listed above and restarting. Anything else that would help?
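
For anyone who wants to check the same condition without reading the CLI output by eye, the "Schema versions" section of `describe cluster` can be summarized with a small script. This is only a sketch: the function names `parse_schema_versions` and `summarize` are made up for illustration, and it assumes the output has the same shape as the paste above (a UUID or the word UNREACHABLE, then a bracketed host list). It does not talk to Cassandra itself.

```python
import re
from typing import Dict, List

# Sample text copied from the `describe cluster` output in this comment.
SAMPLE = """\
Cluster Information:
   Snitch: org.apache.cassandra.locator.Ec2Snitch
   Partitioner: org.apache.cassandra.dht.RandomPartitioner
   Schema versions:
        af3cac9b-e8a4-3a3c-abc4-b8bbe24e0493: [10.124.1.1, 10.241.2.2]
        UNREACHABLE: [10.241.3.3]
"""

# A schema version UUID (or the literal UNREACHABLE) followed by "[host, host, ...]".
_ENTRY = re.compile(
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}|UNREACHABLE):\s*\[([^\]]*)\]"
)


def parse_schema_versions(describe_output: str) -> Dict[str, List[str]]:
    """Map each schema version (or 'UNREACHABLE') to the hosts listed for it."""
    versions: Dict[str, List[str]] = {}
    for label, hosts in _ENTRY.findall(describe_output):
        versions[label] = [h.strip() for h in hosts.split(",") if h.strip()]
    return versions


def summarize(versions: Dict[str, List[str]]) -> dict:
    """Report whether reachable nodes agree on one schema, and who is unreachable."""
    live = {v: hosts for v, hosts in versions.items() if v != "UNREACHABLE"}
    return {
        "live_versions": len(live),
        "agreed": len(live) == 1,
        "unreachable": versions.get("UNREACHABLE", []),
    }


if __name__ == "__main__":
    print(summarize(parse_schema_versions(SAMPLE)))
```

Run against the paste above, the reachable nodes share a single schema version and only the removed node shows up as unreachable, which matches what the output says in prose.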

> Joining new node to cluster fails with error in add column family
> -----------------------------------------------------------------
>
>                 Key: CASSANDRA-4323
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4323
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 1.1.1
>         Environment: CentOS 6, Java 1.6
>            Reporter: Bryce Godfrey
>
> I tried joining a new node to the cluster, and before bootstrap started it reported this error:
>
>  INFO 08:20:51,584 Enqueuing flush of Memtable-schema_columns@1493418651(0/0 serialized/live bytes, 1 ops)
>  INFO 08:20:51,584 Writing Memtable-schema_columns@1493418651(0/0 serialized/live bytes, 1 ops)
>  INFO 08:20:51,589 Completed flushing /opt/cassandra/data/system/schema_columns/system-schema_columns-hc-1-Data.db (61 bytes)
> ERROR 08:20:51,889 Exception in thread Thread[MigrationStage:1,5,main]
> java.lang.IllegalArgumentException: value already present: 1015
>         at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
>         at com.google.common.collect.AbstractBiMap.putInBothMaps(AbstractBiMap.java:111)
>         at com.google.common.collect.AbstractBiMap.put(AbstractBiMap.java:96)
>         at com.google.common.collect.HashBiMap.put(HashBiMap.java:84)
>         at org.apache.cassandra.config.Schema.load(Schema.java:385)
>         at org.apache.cassandra.db.DefsTable.addColumnFamily(DefsTable.java:426)
>         at org.apache.cassandra.db.DefsTable.mergeColumnFamilies(DefsTable.java:361)
>         at org.apache.cassandra.db.DefsTable.mergeSchema(DefsTable.java:270)
>         at org.apache.cassandra.db.DefsTable.mergeRemoteSchema(DefsTable.java:248)
>         at org.apache.cassandra.service.MigrationManager$MigrationTask.runMayThrow(MigrationManager.java:416)
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
>         at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source)
>         at java.util.concurrent.FutureTask.run(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
>  INFO 08:20:51,931 Enqueuing flush of Memtable-schema_keyspaces@833041663(943/1178 serialized/live bytes, 20 ops)
>  INFO 08:20:51,932 Writing Memtable-schema_keyspaces@833041663(943/1178 serialized/live bytes, 20 ops)
>
> And continued on, then started writing these errors non-stop:
>
> ERROR 08:21:45,959 Error in row mutation
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=1019
>         at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
>         at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)
>         at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)
>         at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)
>         at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
> ERROR 08:21:45,814 Error in row mutation
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=1019
>         at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
>         at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)
>         at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)
>         at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)
>         at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
> ERROR 08:21:45,813 Error in row mutation
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=1020
>         at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:126)
>         at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:439)
>         at org.apache.cassandra.db.RowMutation$RowMutationSerializer.deserialize(RowMutation.java:447)
>         at org.apache.cassandra.db.RowMutation.fromBytes(RowMutation.java:395)
>         at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:42)
>         at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.lang.Thread.run(Unknown Source)
> ERROR 08:21:45,813 Error in row mutation
>
> I had a problem a while ago of someone trying to create a new column family while a node was hung. The new node never picked up the new column family, so we deleted it and tried again and everything was fine. Not sure if it's related though.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira