[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420685#comment-13420685 ]

Jonathan Ellis commented on CASSANDRA-4456:
-------------------------------------------

I think this was introduced by CASSANDRA-3721: getOverlappingSSTables assumes that the sstables we check for overlaps are part of the live set, but now we can validate over a snapshot instead.

AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
--------------------------------------------------------------------------

Key: CASSANDRA-4456
URL: https://issues.apache.org/jira/browse/CASSANDRA-4456
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.2
Environment: Ubuntu 11.04 64-bit
Reporter: Mike Heffner
Assignee: Sylvain Lebresne

We have hit the following exception on several nodes while running repairs across our 1.1.2 ring. We've observed it happen on either the node executing the repair or a participating replica in the repair operation. The result in either case is that the repair hangs.

ERROR [ValidationExecutor:9] 2012-07-21 01:54:03,019 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ValidationExecutor:9,1,main]
java.lang.AssertionError
    at org.apache.cassandra.db.ColumnFamilyStore.getOverlappingSSTables(ColumnFamilyStore.java:874)
    at org.apache.cassandra.db.compaction.CompactionController.<init>(CompactionController.java:69)
    at org.apache.cassandra.db.compaction.CompactionManager$ValidationCompactionIterable.<init>(CompactionManager.java:834)
    at org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:698)
    at org.apache.cassandra.db.compaction.CompactionManager.access$600(CompactionManager.java:68)
    at org.apache.cassandra.db.compaction.CompactionManager$8.call(CompactionManager.java:438)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

In building this ring we migrated sstables from an identical 0.8.8 ring by:

1. Creating the schema on our new 1.1.2 ring.
2. Rsyncing over sstables from the 0.8.8 ring.
3. Renaming the sstables to match the directory and file naming structure of 1.1.x.
4. Running nodetool refresh keyspace cf for each CF across each node.
5. Running nodetool upgradesstables for each CF across each node.

When those steps had completed, we began rolling repairs. Not all of the repair operations have hit the exception -- some of the repairs have completed successfully.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
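The live-set assumption described in the comment above can be illustrated with a small model. This is only a sketch, not the Cassandra source: the class LiveSetSketch, its methods, and the use of plain strings as sstable names are all hypothetical, and an explicit throw stands in for the `assert` so the behavior does not depend on the -ea JVM flag.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a column family store; not the Cassandra source.
class LiveSetSketch {
    // Stand-in for the store's live sstables, identified here by name only.
    private final Set<String> live = new HashSet<>();

    LiveSetSketch(String... names) {
        live.addAll(Arrays.asList(names));
    }

    // When a compaction finishes, its inputs leave the live set and its output joins it.
    void finishCompaction(Set<String> inputs, String output) {
        live.removeAll(inputs);
        live.add(output);
    }

    // Models the assumption from the comment: every candidate must still be live.
    Set<String> overlapping(Set<String> candidates) {
        if (!live.containsAll(candidates)) {
            throw new AssertionError("candidate sstable is not in the live set");
        }
        Set<String> others = new HashSet<>(live);
        others.removeAll(candidates);
        return others; // a real implementation would also filter by key-range overlap
    }
}
```

In this model, a candidate set taken from a snapshot, or one chosen just before a compaction replaces one of its members, no longer satisfies the check, which would match the intermittent failures reported in this ticket.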
[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420746#comment-13420746 ]

Sylvain Lebresne commented on CASSANDRA-4456:
---------------------------------------------

Actually, I think this can happen even when snapshots are not used, since an sstable can finish being compacted between the time we choose the sstables for repair and the time we create the CompactionController for the validation compaction. In particular, I wonder whether Michael and Mike used -snapshot for their compaction. It's true, though, that repair on a snapshot makes this way more likely to happen.

But I don't think we need to call getOverlappingSSTables at all for repair in the first place: it is used only to decide whether we can purge, and repair does not purge. Attaching a simple patch to skip the call entirely.

AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
--------------------------------------------------------------------------

Key: CASSANDRA-4456
URL: https://issues.apache.org/jira/browse/CASSANDRA-4456
Project: Cassandra
Issue Type: Bug
Components: Core
Affects Versions: 1.1.2
Environment: Ubuntu 11.04 64-bit
Reporter: Mike Heffner
Assignee: Sylvain Lebresne
Fix For: 1.1.3
Attachments: 4456.txt
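The idea in the comment above, skipping the overlap lookup for validation because validation never purges, might look roughly like the following. This is a sketch under stated assumptions and is not the attached 4456.txt patch: ControllerSketch, ValidationControllerSketch, the Supplier-based lookup, and the simplified shouldPurge() rule are all hypothetical names and shapes chosen for illustration.

```java
import java.util.Collections;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical stand-in for a compaction controller; not the Cassandra class.
class ControllerSketch {
    private final Set<String> overlapping;

    // Regular compaction: run the overlap lookup so purge decisions are safe.
    ControllerSketch(Supplier<Set<String>> overlapLookup) {
        this.overlapping = overlapLookup.get();
    }

    // For subclasses that have no use for the lookup.
    protected ControllerSketch() {
        this.overlapping = Collections.emptySet();
    }

    // Simplification: purge only when no other sstable overlaps at all.
    boolean shouldPurge() {
        return overlapping.isEmpty();
    }
}

// Hypothetical controller for validation compaction during repair.
class ValidationControllerSketch extends ControllerSketch {
    ValidationControllerSketch() {
        super(); // never invokes the overlap lookup, so its assertion cannot fire
    }

    @Override
    boolean shouldPurge() {
        return false; // validation only reads data for repair; it never purges
    }
}
```

The design point is that the assertion sits inside a lookup whose result validation never uses, so avoiding the lookup removes the failure without weakening the check for regular compactions.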
[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420755#comment-13420755 ]

Jonathan Ellis commented on CASSANDRA-4456:
-------------------------------------------

You need to wire VCC in to ValidationCompactionIterable, but otherwise +1.
[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419912#comment-13419912 ]

Michael Theroux commented on CASSANDRA-4456:
--------------------------------------------

I just hit this problem myself today, on a single node in a six-node cluster. I was running nodetool repair, and it halted with this exception in the log. I was monitoring the repair pretty closely. A couple of observations:

1) It happened while a compaction of the same column family was running.
2) When I re-ran it, it worked.

Note: I am not a Cassandra developer, but I looked at the code. A highly uneducated guess is that an sstable was compacted and deleted while validation was expecting it to be there?
[jira] [Commented] (CASSANDRA-4456) AssertionError in ColumnFamilyStore.getOverlappingSSTables() during repair
[ https://issues.apache.org/jira/browse/CASSANDRA-4456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419913#comment-13419913 ]

Michael Theroux commented on CASSANDRA-4456:
--------------------------------------------

I am also on 1.1.2.