[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077429#comment-14077429 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

We can add some explanation in the error message, suggesting that in the event of power failure it is safe to ignore these errors, and can also make them more robust under power failure (e.g. we could write a check-summed value instead of zeros to the next marker on completion of sync, which differs from the marker present once the next section has begun writing, so that we can ignore any errors occurring on a section without a fully-synced marker).

The exception in general can mention that the message is _likely_ the result of corruption, either from power or hardware failure. As an end user I would still prefer in this scenario to know that it happened, and opt-in to repairing it by ignoring the errors.

What hole are we talking about? We *shouldn't* in any version of C* see any CL records for dropped tables since we checkpoint and discard the commit log after a DROP. So we'd only see this message if there's a bug anyway, unless you're using PIT restore.

2.1 multi-dc upgrade errors
---------------------------

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.1

Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing.

Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0:

{code}
ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table. This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line
ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup
java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na]
Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na]
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na]
    at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na]
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na]
    at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na]
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na]
    at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na]
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
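The marker idea in Benedict's comment above could look roughly like the following. This is only a minimal Python sketch for illustration, not Cassandra's actual commit log code: the names `marker`, `section_fully_synced`, `may_ignore_replay_error` and the two tags are hypothetical, and the real marker layout is assumed rather than taken from the source. It shows a marker value that is a checksum (never plain zeros), so that a section whose closing marker matches neither expected value can be treated as never fully synced.

```python
import struct
import zlib

# Hypothetical 4-byte state tags of equal length (assumptions for this sketch).
SYNCED_TAG = b"SYNC"  # written when a section's sync has completed
OPEN_TAG = b"OPEN"    # written once the next section has begun writing


def marker(segment_id: int, position: int, tag: bytes) -> int:
    """Checksum over the segment id, the marker position and a state tag."""
    payload = struct.pack(">qi", segment_id, position) + tag
    return zlib.crc32(payload) & 0xFFFFFFFF


def section_fully_synced(segment_id: int, position: int, stored: int) -> bool:
    """True if the marker closing a section shows the section was synced."""
    return stored in (
        marker(segment_id, position, SYNCED_TAG),
        marker(segment_id, position, OPEN_TAG),
    )


def may_ignore_replay_error(segment_id: int, position: int, stored: int) -> bool:
    """A replay error is ignorable only if the section was never fully synced."""
    return not section_fully_synced(segment_id, position, stored)
```

With this shape, zeros or garbage at the marker position mean "not synced", so an error in that section would be expected after a power failure, while an error in a section with a valid marker would still be reported.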
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1406#comment-1406 ]

Jonathan Ellis commented on CASSANDRA-7582:
-------------------------------------------

bq. We can add some explanation in the error message, suggesting that in the event of power failure it is safe to ignore these errors

This is a nonstarter for me. Power failures happen; forcing people to troubleshoot a non-bug as if it were a potential bug is a terrible plan. So I'm back to the opt in for tests idea.

bq. What hole are we talking about?

The hole where dropped tables aren't remembered but are in the CL on upgrade, as in this ticket.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077792#comment-14077792 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

Right, and this hole is down to a CL bug we would likely have caught had we had this enabled previously. A correctly functioning drain would not permit this to happen. However this is really a non-issue, since the only plausibly affected table is going to be this system table we dropped, as otherwise there would have to be a table drop immediately prior to shutting down the node for upgrade. Especially since schema disagreement during upgrade is a no-no, this should not happen. Since drain does sometimes work, this should reduce the risk profile even further.

Further, we can relatively safely prevent almost all power-failure exceptions by introducing the change I suggested, and ignoring any errors on CLS reading if the header hashes are consistent with the header's id (which we now have available to us), and this id is in the past, as this is obviously a recycled segment that had not yet had its header reset before a power failure. This leaves only those that managed to write only a partial block during a power failure, which will be dealt with by OS journalling (and should be impossible on SSDs anyway). So I don't think there are any power-off risk scenarios left to warn about.
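The recycled-segment check described in the comment above can be illustrated with a small Python sketch. It is a hedged illustration only: `header_crc` and `is_stale_recycled_section` are hypothetical names, and the header layout (a CRC over a segment id and a position) is an assumption, not the actual commit log segment format. The rule it encodes is the one stated in the comment: a header whose hash is consistent with its own id, where that id is older than the segment currently being read, is leftover data from before the segment was recycled.

```python
import struct
import zlib


def header_crc(segment_id: int, position: int) -> int:
    """Hash a section header would carry for a given segment id and position."""
    return zlib.crc32(struct.pack(">qi", segment_id, position)) & 0xFFFFFFFF


def is_stale_recycled_section(
    current_id: int, header_id: int, position: int, stored_crc: int
) -> bool:
    """True if the header is self-consistent but belongs to an older segment id.

    Such a section would be leftover content from a recycled segment whose
    header had not been reset yet, so read errors in it could be ignored.
    """
    consistent = stored_crc == header_crc(header_id, position)
    return consistent and header_id < current_id
```

A header that is inconsistent with its own id, or that carries the current id, does not qualify, so errors there would still surface.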
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077828#comment-14077828 ]

Jonathan Ellis commented on CASSANDRA-7582:
-------------------------------------------

I'm not convinced, because all those sanity-checks resulted from actual problems people hit in the wild (pre-commitlog recycling). Hardware is weird and will bite you.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077878#comment-14077878 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

It wouldn't be foolproof at suppressing all errors, that would require two fsyncs, which is too costly. Anyway, I've made the best case I can.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077907#comment-14077907 ]

Jeremiah Jordan commented on CASSANDRA-7582:
--------------------------------------------

We just need to make sure we throw out big giant ERROR messages in the logs when the CL has issues. We shouldn't stop. The only thing for someone to do is run repair. So put out a giant message saying run repair. Don't stop the service from actually starting. This will be a huge annoyance for operators who will just add the flag and move on the first time they see it.
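The "log loudly, keep starting" behaviour argued for in the comment above might be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Cassandra replay code: `replay`, the `(table_id, payload)` mutation shape and the `apply` callback are all hypothetical. It shows replay skipping mutations for unknown tables, emitting ERROR messages that tell the operator to run repair, and returning normally instead of aborting startup.

```python
import logging
from typing import Any, Callable, Iterable, Set, Tuple

log = logging.getLogger("commitlog.replay")


def replay(
    mutations: Iterable[Tuple[str, Any]],
    known_tables: Set[str],
    apply: Callable[[str, Any], None],
) -> int:
    """Apply mutations for known tables; skip and log the rest.

    Returns the number of skipped mutations instead of raising, so the
    caller can continue starting up.
    """
    skipped = 0
    for table_id, payload in mutations:
        if table_id not in known_tables:
            skipped += 1
            log.error("Skipping mutation for unknown table %s", table_id)
            continue
        apply(table_id, payload)
    if skipped:
        log.error(
            "COMMIT LOG REPLAY SKIPPED %d MUTATION(S); RUN REPAIR ON THIS NODE",
            skipped,
        )
    return skipped
```

The trade-off debated in this thread is visible here: nothing forces the operator to act, so the message has to be prominent enough to be noticed.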
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077922#comment-14077922 ]

T Jake Luciani commented on CASSANDRA-7582:
-------------------------------------------

Seems like for an upgrade we shouldn't have a CL at all, is drain not working?
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077929#comment-14077929 ]

Aleksey Yeschenko commented on CASSANDRA-7582:
----------------------------------------------

bq. Seems like for an upgrade we shouldn't have a CL at all, is drain not working?

CL code explicitly supports replaying previous-version encoded entries. drain is recommended, but not required.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077930#comment-14077930 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

bq. drain is recommended, but not required.

It is also broken in 2.0
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077946#comment-14077946 ]

T Jake Luciani commented on CASSANDRA-7582:
-------------------------------------------

[~benedict] is there a ticket for broken drain?
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077955#comment-14077955 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

CASSANDRA-5911 (fixed in CASSANDRA-3578)
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077970#comment-14077970 ]

Jonathan Ellis commented on CASSANDRA-7582:
---

bq. We just need to make sure we throw out big giant ERROR messages in the logs when the CL has issues. We shouldn't stop. The only thing for someone to do is run repair. So put out a giant message saying run repair. Don't stop the service from actually starting. This will be a huge annoyance for operators who will just add the flag and move on the first time they see it.

Exactly. I will reopen CASSANDRA-7125.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078035#comment-14078035 ]

Ryan McGuire commented on CASSANDRA-7582:
---

Are there any commits with lasting effect from this ticket, or were they all reverted?

2.1 multi-dc upgrade errors
---

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Priority: Critical
Fix For: 2.1.1
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078039#comment-14078039 ]

Jonathan Ellis commented on CASSANDRA-7582:
---

No commits except the 7125 revert (a5bc52eee90e342efcdc53282612008d3dbaeaeb).

2.1 multi-dc upgrade errors
---

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Priority: Critical
Fix For: 2.1.1
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078159#comment-14078159 ]

Benedict commented on CASSANDRA-7582:
---

It's worth pointing out a problem with this strategy of logging an ERROR: even ignoring the fact that operators may miss it, with the recommended RF=3 it leaves a gap wherein broken data can be returned by QUORUM reads, unless the node is brought up in a non-participating state until it is repaired.

2.1 multi-dc upgrade errors
---

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Priority: Critical
Fix For: 2.1.1
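To make the RF=3 / QUORUM concern concrete, here is a minimal sketch (not Cassandra code; the class and method names are hypothetical). It assumes a write was acknowledged at QUORUM by replicas A and B, and that A later loses it because the failed commit log records were skipped. It then lists which two-replica read sets contain no replica that still holds the write.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Cassandra.
class QuorumGapSketch {
    // For RF=3, a QUORUM read contacts 2 replicas. Returns every pair of
    // replicas in which neither member still holds the write.
    static List<String> staleReadQuorums(List<String> replicas, Set<String> holdersOfWrite) {
        List<String> stale = new ArrayList<>();
        for (int i = 0; i < replicas.size(); i++) {
            for (int j = i + 1; j < replicas.size(); j++) {
                String a = replicas.get(i);
                String b = replicas.get(j);
                if (!holdersOfWrite.contains(a) && !holdersOfWrite.contains(b)) {
                    stale.add(a + "+" + b);
                }
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        List<String> replicas = List.of("A", "B", "C");
        // Write acknowledged by A and B: every read quorum overlaps a holder.
        System.out.println(staleReadQuorums(replicas, Set.of("A", "B")));
        // A dropped the write during replay: the pair A+C can now miss it.
        System.out.println(staleReadQuorums(replicas, Set.of("B")));
    }
}
```

Under these assumptions the read set A+C no longer overlaps any replica holding the write, which is the gap described above until A has been repaired.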
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076202#comment-14076202 ]

Jonathan Ellis commented on CASSANDRA-7582:
---

How do we know what's in the system ks when all we have is a cfid that doesn't match anything known?

More generally, I'm not sure how stop on unknown cfid is going to be a useful feature. It's definitely going to happen if you replay a commitlog after dropping a table, for instance, if we have an unclean shutdown in between. This is normal behavior and not a bug per se, so whacking users and not starting up is definitely antisocial.

On the other hand I can't picture a scenario where the user *can* take meaningful action based on failing startup here. Put another way, ignoring the mutations is the Right Thing to do in every scenario I can think of. So I propose we just log it at info and ignore.

2.1 multi-dc upgrade errors
---

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.0
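The two behaviours under discussion (abort startup on an unknown cfId versus log and skip) can be sketched as follows. This is a hypothetical model, not the actual `CommitLogReplayer`: a mutation is reduced to its cfId, and the class and method names are invented for illustration. The property name is taken from the log message in the ticket; treating its default as `true` is an assumption inferred from that message.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch; this is not Cassandra's replay code.
class ReplayPolicySketch {
    // Property name as printed in the ticket's log output.
    static final String PROP = "cassandra.commitlog.stop_on_missing_tables";

    // Replays mutations, each represented only by its cfId, and returns the
    // cfIds that were applied. With stopOnMissingTables=true an unknown cfId
    // aborts replay; with false it is logged and skipped.
    static List<UUID> replay(List<UUID> mutationCfIds, Set<UUID> knownTables, boolean stopOnMissingTables) {
        List<UUID> applied = new ArrayList<>();
        for (UUID cfId : mutationCfIds) {
            if (knownTables.contains(cfId)) {
                applied.add(cfId);
            } else if (stopOnMissingTables) {
                throw new IllegalStateException("Couldn't find cfId=" + cfId);
            } else {
                System.out.println("INFO skipping mutation for unknown cfId=" + cfId);
            }
        }
        return applied;
    }

    public static void main(String[] args) {
        // Assumption: the flag defaults to true (fail fast) when not supplied.
        boolean stop = Boolean.parseBoolean(System.getProperty(PROP, "true"));
        UUID live = UUID.randomUUID();
        UUID missing = UUID.fromString("a1b676f3-0c5d-3276-bfd5-07cf43397004");
        try {
            int applied = replay(List.of(live, missing, live), Set.of(live), stop).size();
            System.out.println(applied + " mutations applied");
        } catch (IllegalStateException e) {
            System.out.println("startup aborted: " + e.getMessage());
        }
    }
}
```

Running the sketch with `-Dcassandra.commitlog.stop_on_missing_tables=false` takes the skip path; without the flag it takes the abort path, which mirrors the choice operators face in the ticket.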
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076203#comment-14076203 ]

Aleksey Yeschenko commented on CASSANDRA-7582:
---

Indeed, there is no obvious way to recover from it that I can think of. +1 on logging it and going on. -Dcassandra.commitlog.stop_on_missing_tables should also go.

2.1 multi-dc upgrade errors
---

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.0
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076399#comment-14076399 ]

Jonathan Ellis commented on CASSANDRA-7582:
---

This was introduced by CASSANDRA-7125 for 2.1.1 and is not in the 2.1.0 branch. Is this actually a problem with rc4 [~enigmacurry]?

2.1 multi-dc upgrade errors
---

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.0
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076465#comment-14076465 ]

Ryan McGuire commented on CASSANDRA-7582:
---

I think this was version tagged incorrectly. I'm seeing CASSANDRA-7593 on rc4 instead of this one.

2.1 multi-dc upgrade errors
---

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
Fix For: 2.1.0
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076469#comment-14076469 ]

Jonathan Ellis commented on CASSANDRA-7582:
---

Thanks, Ryan.

Benedict, I'm starting to think 7125 was misguided. If the CL errors out there just isn't much you can do about it except pass the flag and try again, so why not cut out the extra step?
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076509#comment-14076509 ]

Benedict commented on CASSANDRA-7582:
---

I'm -1000 on encountering an error and silently swallowing it on something as core to correctness as the commit log - this at least gives the user a big red flag they may want to seek expert help.

I think there are two distinct problems here - there are the 'unexpected' errors which should almost certainly involve the user seeking help from an expert to diagnose (or perhaps JIRA, since it possibly means a bug), and the unknown table exceptions. The latter are debatably more ok to ignore, but I would much rather we simply retain information about dropped tables, much as we do truncated tables, so that we can suppress those known to have been dropped (with knowledge of exactly _when_ they were dropped, so if we see CL records past that time we can still fail and ask the user to at least file a bug report).

Consider the following (pretty plausible scenario):
* User turns on CL saving
* User creates table X, populates it with some data (let's say it's a fairly static dataset)
* User uses the database for a period, mostly changing other tables
* At time T, user drops table X, recreates it (instead of, e.g., truncate, which is separately also subtly dangerous in this scenario), and repopulates it with subtly but business-wise importantly different data
* Some time after T, user has to restore the cluster, and restores the schema from prior to T by mistake (let's say the team member restoring doesn't realise the table was recreated since then), then performs a PIT restore

The user now has no idea they have stale business data in their tables.

Now, assuming we have saved the ids of all dropped tables we could report to the user that they are likely restoring data from a future schema, and they could then decide if this was safe or not; in this case they would be able to restore a newer schema (assuming they had saved it) and a major business error would have been averted.

In general this fail-fast is likely to result in an increase in JIRA filing, and possibly for relatively benign bugs, but on the whole I would prefer that scenario to leaving subtle bugs in the CL. We've already caught at least one as a result of this, and we've had long-standing bugs with respect to drain that still affect 2.0 that would have been caught a long time ago with better reporting.
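The proposal in this comment (remember dropped table ids together with their drop time, suppress records known to predate the drop, and still fail on anything else) could look roughly like the sketch below. It is a hypothetical illustration, not an existing Cassandra API: the class, the enum, and the use of a millisecond timestamp to order commit log records against the drop are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch of "retain information about dropped tables".
class DroppedTableSketch {
    enum Action { REPLAY, SKIP, FAIL }

    private final Set<UUID> liveTables;
    // cfId of each dropped table mapped to the time it was dropped.
    private final Map<UUID, Long> droppedAtMillis = new HashMap<>();

    DroppedTableSketch(Set<UUID> liveTables) {
        this.liveTables = liveTables;
    }

    void recordDrop(UUID cfId, long droppedAt) {
        droppedAtMillis.put(cfId, droppedAt);
    }

    // Decides what replay should do with one commit log record.
    Action classify(UUID cfId, long mutationMillis) {
        if (liveTables.contains(cfId)) {
            return Action.REPLAY;
        }
        Long droppedAt = droppedAtMillis.get(cfId);
        if (droppedAt != null && mutationMillis <= droppedAt) {
            return Action.SKIP; // known drop, record predates it: safe to ignore
        }
        // Either a cfId never seen before, or a record written after the drop.
        return Action.FAIL;
    }

    public static void main(String[] args) {
        UUID live = UUID.randomUUID();
        UUID dropped = UUID.randomUUID();
        DroppedTableSketch registry = new DroppedTableSketch(Set.of(live));
        registry.recordDrop(dropped, 1_000L);
        System.out.println(registry.classify(live, 500L));
        System.out.println(registry.classify(dropped, 900L));
        System.out.println(registry.classify(dropped, 1_500L));
        System.out.println(registry.classify(UUID.randomUUID(), 500L));
    }
}
```

In the PIT restore scenario above, records for the recreated table would carry a cfId that the restored (older) schema has never seen, so this sketch would return `FAIL` for them instead of silently dropping the newer data.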
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076516#comment-14076516 ]

Benedict commented on CASSANDRA-7582:
---

Hmm. Separately this scenario also points out that TRUNCATE is even more broken than I thought - since it doesn't get logged to the CL, if you restore a schema prior to a TRUNCATE you will simply get the old data supplemented with the new data.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076540#comment-14076540 ]

Jonathan Ellis commented on CASSANDRA-7582:
-------------------------------------------

I see two actual classes of CL errors:

# Table is dropped and we are replaying stale data that should also have been dropped. Blocking startup is the Wrong Solution.
# Hardware problem caused a checksum mismatch. Blocking startup is the Wrong Solution.

Granted that blocking startup can help prevent user errors during PIT recovery, that's an entirely hypothetical situation today; PIT is only nominally usable. (Fork the JVM every time a CL segment finishes? Yeah.) So let's not optimize for that at the expense of scenarios we see frequently.

I think we should roll back 7125 until we can do it right. Doing it right probably means remembering old cfids in 2.1.x, then we can get paranoid about seeing them in the CL for 3.0. (Getting paranoid in the same version as we start remembering is bad for obvious reasons.)
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076586#comment-14076586 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

3. We've busted something

This is the main type I'm trying to catch with this behaviour. I would prefer to know earlier if 2.1 is broken instead of corrupting the user's CL in some way without realising it. We've had several bugs in the 2.1 release cycle that would have been caught earlier had we had this feature enabled, and I would be surprised if we don't see some more once it gets released into the wild as a result of this. There are still bugs in 2.0 that we've fixed in 2.1 that we would certainly have caught earlier.

Enforcing correctness from other avenues is a strong secondary concern. This isn't a point of optimisation, we're talking about providing an unsafe PIT feature (and we've already got a ticket filed for removing forking), and also more importantly risking an unsafe regular _replay_.

I disagree that a hardware problem causing a checksum mismatch shouldn't block startup - in this case you may have alternative copies of the data that are not corrupted, or can choose to analyse the logs yourself to establish what is happening. If you don't care, you set the don't care flag; but without the failure you maybe don't even know there are records that haven't been replayed (possibly whole files).
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076721#comment-14076721 ]

Jonathan Ellis commented on CASSANDRA-7582:
-------------------------------------------

But that's still not a common *production* scenario. So we're still optimizing bassackwards.

How about this? Leave the checks in, but backwards: they're disabled, *unless* there's a flag. Then we set the flag in utest and dtest.
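The "disabled unless there's a flag" idea in the comment above can be sketched roughly as follows. This is a minimal illustration and not Cassandra code: the class `ReplayStrictness`, the property name `example.commitlog.strict_replay`, and both methods are hypothetical names invented here. The only point is the default: strict replay checks are off unless a system property turns them on, which a test harness could do.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch: these names are invented and are not Cassandra APIs.
final class ReplayStrictness {
    // Assumed property name, for illustration only.
    static final String STRICT_REPLAY_PROPERTY = "example.commitlog.strict_replay";

    // Strict checks stay off unless the property is explicitly "true".
    static boolean strictReplayEnabled(Properties props) {
        return Boolean.parseBoolean(props.getProperty(STRICT_REPLAY_PROPERTY, "false"));
    }

    // Strict mode rethrows, which would block startup; otherwise the problem
    // is recorded as a warning and replay continues.
    static void handleReplayError(RuntimeException error, Properties props, List<String> warnings) {
        if (strictReplayEnabled(props)) {
            throw error;
        }
        warnings.add("ignored during replay: " + error.getMessage());
    }

    public static void main(String[] args) {
        List<String> warnings = new ArrayList<>();
        handleReplayError(new IllegalStateException("unknown cfId"), System.getProperties(), warnings);
        System.out.println(warnings);
    }
}
```

Under this sketch a test run would pass `-Dexample.commitlog.strict_replay=true` so that any replay error fails fast, while a production node started without the property would only log the problem.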
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076746#comment-14076746 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

What isn't a common production scenario? Commit Log bugs? We know there are some still in 2.0. There are potentially some in 2.1 too, and we probably won't spot them without something like this to help users know they encountered them and report them.

Optimizing != Correctness. I am very negative on disabling this.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076797#comment-14076797 ]

Jonathan Ellis commented on CASSANDRA-7582:
-------------------------------------------

I'm skeptical. Looking at the 2.0 changelog, we've fixed CASSANDRA-6652 and CASSANDRA-6714 since 2.0.0 final, and this wouldn't have helped catch those.

So, I'm not saying that ignoring errors is a Good Thing, but when there are more false positives than true positives, users will learn to ignore it anyway and we're not actually helping anyone.

At the very least, this is demonstrably broken in 2.1.1 given this ticket right here. So I see two reasonable courses of action:

# remembering old cfids in 2.1.x, then we can get paranoid about seeing them in the CL for 3.0.
# using the checks as a kind of assert that we enable for tests but not (without opt-in) for production

I'm open to alternatives, but leaving things the way they are now is not one of them.
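One possible reading of the first course of action above (remember the ids of dropped tables, then be strict about everything else) is sketched below. It is a hypothetical illustration, not the Cassandra implementation: `CfIdReplayCheck`, `Decision`, and `classify` are invented names, and the two sets stand in for whatever schema state a real node would consult.

```java
import java.util.Collections;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch: these names are invented and are not Cassandra APIs.
final class CfIdReplayCheck {
    enum Decision { APPLY, SKIP_DROPPED, FAIL_UNKNOWN }

    // Classifies a commit log mutation by the table id it targets:
    //  - a live table: replay the mutation
    //  - a table remembered as dropped: stale data, skip it
    //  - anything else: cannot be explained by a DROP, so surface it
    static Decision classify(UUID cfId, Set<UUID> liveTables, Set<UUID> droppedTables) {
        if (liveTables.contains(cfId)) {
            return Decision.APPLY;
        }
        if (droppedTables.contains(cfId)) {
            return Decision.SKIP_DROPPED;
        }
        return Decision.FAIL_UNKNOWN;
    }

    public static void main(String[] args) {
        UUID seen = UUID.fromString("a1b676f3-0c5d-3276-bfd5-07cf43397004");
        Set<UUID> none = Collections.emptySet();
        System.out.println(classify(seen, none, Collections.singleton(seen)));
        System.out.println(classify(seen, none, none));
    }
}
```

The sketch also shows why enabling the strict branch in the same release that starts recording dropped ids is risky: ids dropped before the upgrade would not be in the remembered set, so they would land in `FAIL_UNKNOWN`.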
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14076820#comment-14076820 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

I am less adamant about CfId checks than I am about failing on commit log checksum/mutation replay failures. I could just about live with (2), but naturally we will get better coverage by enabling this with all users. We don't know what bugs we might catch with it.

So, I would prefer one of:

1) Start remembering old cfids in 2.1.1 along with this feature, so we can start complaining immediately; or
2) For now simply assert on non-CfId errors (i.e. make that opt-in rather than opt-out), introduce CfId recording at some point and make it opt-out at some point after
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077317#comment-14077317 ]

Jonathan Ellis commented on CASSANDRA-7582:
-------------------------------------------

I suppose we can compromise on enabling the check as soon as we remember the cfids, even though that leaves a hole where we can false-positive on upgrade.

How sure are you that the MalformedCommitLogException checks aren't going to false-positive on power failure? On first inspection all of those except the serializedSize check look like they will be prone to that.
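To make the power-failure concern above concrete, here is a small sketch of why an interrupted write can be hard to tell apart from corruption, and one simple heuristic for separating the two. It is an assumption-laden illustration, not how Cassandra's replayer works: `TailTolerance`, `Verdict`, and the rule "a checksum mismatch on the final record only is treated as an incomplete write" are all invented here.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical sketch: these names and the tail heuristic are invented here.
final class TailTolerance {
    enum Verdict { OK, TORN_TAIL, CORRUPT }

    // CRC32 of a record payload.
    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    // records[i] is verified against storedChecksums[i]. A mismatch on the
    // last record is reported as a torn tail (plausibly a write cut short by
    // power loss); a mismatch anywhere earlier is reported as corruption.
    static Verdict verify(byte[][] records, long[] storedChecksums) {
        if (records.length != storedChecksums.length) {
            throw new IllegalArgumentException("one checksum per record is required");
        }
        for (int i = 0; i < records.length; i++) {
            if (checksum(records[i]) != storedChecksums[i]) {
                return i == records.length - 1 ? Verdict.TORN_TAIL : Verdict.CORRUPT;
            }
        }
        return Verdict.OK;
    }

    public static void main(String[] args) {
        byte[][] records = {
            "first".getBytes(StandardCharsets.UTF_8),
            "second".getBytes(StandardCharsets.UTF_8)
        };
        long[] sums = { checksum(records[0]), checksum(records[1]) };
        System.out.println(verify(records, sums));
    }
}
```

A check that raises on every mismatch would treat both non-OK verdicts the same way, which is the false-positive risk raised in the comment; the heuristic here is only one way such cases might be told apart.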
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14071468#comment-14071468 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

I would feel safer ignoring a specific whitelist, in case of some weird unexpected bug around some later table in the system keyspace.

2.1 multi-dc upgrade errors
---------------------------

Key: CASSANDRA-7582
URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
Project: Cassandra
Issue Type: Bug
Components: Core
Reporter: Ryan McGuire
Assignee: Russ Hatch
Priority: Critical
Fix For: 2.1.0
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14069907#comment-14069907 ]

Benedict commented on CASSANDRA-7582:
-------------------------------------

Looks like this was probably broken all along, and the default fail startup on CL failure is what exposed it.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14070655#comment-14070655 ]

Ryan McGuire commented on CASSANDRA-7582:
-----------------------------------------

This also affects the snapshot_test.TestArchiveCommitlog tests.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070668#comment-14070668 ]

Benedict commented on CASSANDRA-7582:

Well, sort of. We knew that was broken already - whether or not it's the same root cause I'm not convinced. That failure is because we never fixed the tests after deciding we weren't going to 'fix' behaviour in CASSANDRA-6694 (related to CASSANDRA-5202)
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070913#comment-14070913 ]

Benedict commented on CASSANDRA-7582:

We've dropped system.NodeIdInfo, but drain on 2.0 is a little bit broken and so we don't end up with the clean CL we should do, and we barf when we see mutations for this table. We should probably create a whitelist of cfIds we'll accept being missing, as this will no doubt crop up in future as well.
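The whitelist idea in the comment above could look roughly like the following. This is a minimal sketch under stated assumptions, not the patch for this ticket: the class MissingTableWhitelist and its methods are hypothetical, and treating the cfId from the stack trace as the id of the dropped system.NodeIdInfo table is an assumption.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

// Hypothetical helper, not part of Cassandra: holds the cfIds whose absence
// is tolerated when the commit log is replayed.
final class MissingTableWhitelist {
    private final Set<UUID> ignorable;

    MissingTableWhitelist(Set<UUID> ignorable) {
        // Defensive copy so later changes to the caller's set have no effect.
        this.ignorable = Collections.unmodifiableSet(new HashSet<>(ignorable));
    }

    // True if a mutation for this unknown cfId may be skipped silently.
    boolean maySkip(UUID cfId) {
        return ignorable.contains(cfId);
    }

    // Skips whitelisted cfIds and fails for any other unknown table.
    void onUnknownTable(UUID cfId) {
        if (!maySkip(cfId)) {
            throw new IllegalStateException("Couldn't find cfId=" + cfId);
        }
    }

    public static void main(String[] args) {
        // cfId from the stack trace; assumed to belong to system.NodeIdInfo.
        UUID dropped = UUID.fromString("a1b676f3-0c5d-3276-bfd5-07cf43397004");
        MissingTableWhitelist whitelist =
                new MissingTableWhitelist(Collections.singleton(dropped));
        whitelist.onUnknownTable(dropped); // skipped without error
        System.out.println("maySkip(dropped) = " + whitelist.maySkip(dropped));
    }
}
```

Keeping the list explicit means an unknown cfId that is not on it still stops replay, so a genuinely unexpected missing table is not hidden.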
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071099#comment-14071099 ]

Brandon Williams commented on CASSANDRA-7582:

Seems like anything in the system ks is fair game to silently ignore.
[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors
[ https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069788#comment-14069788 ]

Ryan McGuire commented on CASSANDRA-7582:

[~rhatch] can you repro and bisect this? It was passing last week.