During a recent upgrade of our Cassandra ring from 0.6.8 to 0.7.3, and prior to a drain on the 0.6.8 nodes, we lost a node for reasons unrelated to Cassandra. We decided to push forward with the drain on the remaining healthy nodes. The upgrade completed successfully for the remaining nodes and the ring was healthy. However, we're unable to bootstrap a new node. The bootstrap process starts and we can see streaming activity in the logs for the node giving up tokens, but the bootstrapping node encounters the following:

 INFO [main] 2011-03-07 10:37:32,671 StorageService.java (line 505) Joining: sleeping 30000 ms for pending range setup
 INFO [main] 2011-03-07 10:38:02,679 StorageService.java (line 505) Bootstrapping
 INFO [HintedHandoff:1] 2011-03-07 10:38:02,899 HintedHandOffManager.java (line 304) Started hinted handoff for endpoint /10.211.14.200
 INFO [HintedHandoff:1] 2011-03-07 10:38:02,900 HintedHandOffManager.java (line 360) Finished hinted handoff of 0 rows to endpoint /10.211.14.200
 INFO [CompactionExecutor:1] 2011-03-07 10:38:04,924 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuff-f-1
 INFO [CompactionExecutor:1] 2011-03-07 10:38:05,390 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuff-f-2
 INFO [CompactionExecutor:1] 2011-03-07 10:38:05,768 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-1
 INFO [CompactionExecutor:1] 2011-03-07 10:38:06,389 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-2
 INFO [CompactionExecutor:1] 2011-03-07 10:38:06,581 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-3
ERROR [CompactionExecutor:1] 2011-03-07 10:38:07,056 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.io.EOFException
        at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:65)
        at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:303)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:923)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:916)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
 INFO [CompactionExecutor:1] 2011-03-07 10:38:08,480 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid-f-5
 INFO [CompactionExecutor:1] 2011-03-07 10:38:08,582 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid_reg_idx-f-1
ERROR [CompactionExecutor:1] 2011-03-07 10:38:08,635 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.io.EOFException
        at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:65)
        at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:303)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:923)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:916)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
ERROR [CompactionExecutor:1] 2011-03-07 10:38:08,666 AbstractCassandraDaemon.java (line 114) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
java.io.EOFException
        at org.apache.cassandra.io.sstable.IndexHelper.skipIndex(IndexHelper.java:65)
        at org.apache.cassandra.io.sstable.SSTableWriter$Builder.build(SSTableWriter.java:303)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:923)
        at org.apache.cassandra.db.CompactionManager$9.call(CompactionManager.java:916)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
 INFO [CompactionExecutor:1] 2011-03-07 10:38:08,855 SSTableReader.java (line 154) Opening /mnt/services/cassandra/var/data/0.7.3/data/Stuff/stuffid_reg_idx-f-4

The same behavior occurred on both attempts. Logs from the node giving up tokens show activity by the StreamStage thread, but after the failure on the bootstrapping node there is little else related to the stream.

Lastly, in both cases the problem seems to involve the third data file: files f-1, f-2 and f-4 are present but f-3 is not.

Any help would be appreciated.

-erik
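
For anyone wanting to reproduce the check for the absent f-3 file, here is a minimal Python sketch that lists missing generation numbers per column family. The helper name `missing_generations` and the filename pattern `<name>-f-<generation>` (optionally followed by a component suffix) are assumptions inferred only from the paths in the log above, not a statement of Cassandra's on-disk format. A gap is not necessarily an error, since compaction also removes older generations; the output is only a starting point for comparison against the log.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred from the log paths: <name>-f-<generation>[-<suffix>]
SSTABLE_RE = re.compile(r"^(?P<cf>.+?)-f-(?P<gen>\d+)(?:-.*)?$")


def missing_generations(filenames):
    """Return {name: [absent generation numbers]} for gaps below the highest one seen."""
    seen_by_cf = defaultdict(set)
    for filename in filenames:
        match = SSTABLE_RE.match(filename)
        if match:  # ignore files that do not fit the assumed pattern
            seen_by_cf[match.group("cf")].add(int(match.group("gen")))

    gaps = {}
    for cf, seen in seen_by_cf.items():
        absent = sorted(set(range(1, max(seen) + 1)) - seen)
        if absent:
            gaps[cf] = absent
    return gaps
```

It could be run against a directory listing, for example `missing_generations(os.listdir(data_dir))` after `import os`, where `data_dir` stands for the keyspace data directory on the bootstrapping node.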