[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077429#comment-14077429
 ] 

Benedict commented on CASSANDRA-7582:
-

We can add some explanation in the error message, suggesting that in the event 
of power failure it is safe to ignore these errors, and can also make them more 
robust under power failure (e.g. we could write a check-summed value instead of 
zeros to the next marker on completion of sync, which differs from the marker 
present once the next section has begun writing, so that we can ignore any 
errors occurring on a section without a fully-synced marker). More generally, 
the exception can mention that the error is _likely_ the result of corruption, 
from either power or hardware failure. As an end user, I would still prefer in 
this scenario to know that it happened, and to opt in to repairing it by 
ignoring the errors.
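
A minimal sketch of one reading of the marker idea above. The marker layout (one kind byte plus a CRC32 over segment id, offset and kind), the constants and all function names are assumptions made for illustration; the thread does not describe the real on-disk format.

```python
import struct
import zlib

# Hypothetical marker kinds; not taken from the Cassandra source.
SYNCED = 1       # written to the next marker slot when a sync completes
IN_PROGRESS = 2  # present once the next section has begun writing


def make_marker(segment_id: int, offset: int, kind: int) -> bytes:
    """Pack a 5-byte marker: kind byte + CRC32 over (segment id, offset, kind)."""
    crc = zlib.crc32(struct.pack(">qiB", segment_id, offset, kind))
    return struct.pack(">BI", kind, crc)


def read_marker(raw: bytes, segment_id: int, offset: int):
    """Return the marker kind, or None if the bytes are zeros or garbage."""
    if len(raw) != 5:
        return None
    kind, crc = struct.unpack(">BI", raw)
    expected = zlib.crc32(struct.pack(">qiB", segment_id, offset, kind))
    if crc != expected or kind not in (SYNCED, IN_PROGRESS):
        return None
    return kind


def errors_ignorable(marker_raw: bytes, segment_id: int, offset: int) -> bool:
    """Errors in a section are ignorable when the marker that follows it
    does not prove that the section was fully synced."""
    return read_marker(marker_raw, segment_id, offset) is None
```

Under this reading, a slot that still holds zeros fails the checksum, so a replay error in the preceding section could be treated as the tail of an interrupted write rather than as corruption.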

What hole are we talking about? We *shouldn't* in any version of C* see any CL 
records for dropped tables since we checkpoint and discard the commit log after 
a DROP. So we'd only see this message if there's a bug anyway, unless you're 
using PIT restore.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1


 Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing.
 Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table. This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup
 java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na]
     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na]
     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
     at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na]
     at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na]
     at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na]
     at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na]
 {code}
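
 The behaviour named in the first log line can be illustrated with a toy replay loop. This is only a sketch: the `stop_on_missing_tables` parameter mirrors the flag in the log message, while `replay`, `UnknownTableError` and the `(table_id, payload)` mutation shape are hypothetical names that do not come from the Cassandra code base.

```python
class UnknownTableError(Exception):
    """Raised when a mutation refers to a table id that is not in the schema."""


def replay(mutations, known_tables, stop_on_missing_tables=True):
    """Replay (table_id, payload) pairs and return (applied, skipped).

    With stop_on_missing_tables=True the first mutation for an unknown table
    aborts the replay; with False it is recorded as skipped and replay goes on.
    """
    applied, skipped = [], []
    for table_id, payload in mutations:
        if table_id not in known_tables:
            if stop_on_missing_tables:
                raise UnknownTableError(f"Couldn't find cfId={table_id}")
            skipped.append((table_id, payload))
            continue
        applied.append((table_id, payload))
    return applied, skipped
```

 In the strict mode a single stale mutation for a dropped table stops startup, which is the failure reported here; in the lenient mode the same mutation is skipped and the rest of the log is still applied.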



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1406#comment-1406
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

bq. We can add some explanation in the error message, suggesting that in the 
event of power failure it is safe to ignore these errors

This is a nonstarter for me.  Power failures happen; forcing people to 
troubleshoot a non-bug as if it were a potential bug is a terrible plan.  So 
I'm back to the opt-in-for-tests idea.

bq. What hole are we talking about?

The hole where dropped tables aren't remembered but are in the CL on upgrade, 
as in this ticket.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077792#comment-14077792
 ] 

Benedict commented on CASSANDRA-7582:
-

Right, and this hole is down to a CL bug we would likely have caught had we had 
this enabled previously. A correctly functioning drain would not permit this to 
happen. However, this is really a non-issue, since the only plausibly affected 
table is going to be this system table we dropped, as otherwise there would 
have to be a table drop immediately prior to shutting down the node for 
upgrade. Especially since schema disagreement during upgrade is a no-no, this 
should not happen. Since drain does sometimes work, this should reduce the risk 
profile even further.

Further, we can relatively safely prevent almost all power-failure exceptions 
by introducing the change I suggested, and ignoring any errors on CLS reading 
if the header hashes are consistent with the header's id (which we now have 
available to us), and this id is in the past, as this is obviously a recycled 
segment that had not yet had its header reset before a power failure. This 
leaves only those that managed to write only a partial block during a power 
failure, which will be dealt with by OS journalling (and should be impossible 
on SSDs anyway). So I don't think there are any power-off risk scenarios left 
to warn about.
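
The recycled-segment check described above could look roughly like this. The 12-byte header layout (an 8-byte segment id followed by a CRC32 of that id) and the function names are assumptions for illustration only, not the actual commit log segment format.

```python
import struct
import zlib


def pack_header(segment_id: int) -> bytes:
    """Hypothetical header: 8-byte segment id followed by a CRC32 of that id."""
    return struct.pack(">qI", segment_id, zlib.crc32(struct.pack(">q", segment_id)))


def is_stale_recycled_header(header: bytes, current_id: int) -> bool:
    """True when the header hash is consistent with the header's own id and
    that id is older than the id the segment is expected to carry, i.e. the
    file was recycled but its header was not reset before the failure."""
    if len(header) != 12:
        return False
    header_id, crc = struct.unpack(">qI", header)
    consistent = crc == zlib.crc32(struct.pack(">q", header_id))
    return consistent and header_id < current_id
```

A header that fails its own checksum is not classified as stale by this sketch, so real corruption would still surface as an error.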




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077828#comment-14077828
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

I'm not convinced, because all those sanity-checks resulted from actual 
problems people hit in the wild (pre-commitlog recycling).  Hardware is weird 
and will bite you.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077878#comment-14077878
 ] 

Benedict commented on CASSANDRA-7582:
-

It wouldn't be foolproof at suppressing all errors; that would require two 
fsyncs, which is too costly. Anyway, I've made the best case I can.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077907#comment-14077907
 ] 

Jeremiah Jordan commented on CASSANDRA-7582:


We just need to make sure we throw out big giant ERROR messages in the logs 
when the CL has issues.  We shouldn't stop.  The only thing for someone to do 
is run repair.  So put out a giant message saying run repair.  Don't stop the 
service from actually starting.  This will be a huge annoyance for operators 
who will just add the flag and move on the first time they see it.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077922#comment-14077922
 ] 

T Jake Luciani commented on CASSANDRA-7582:
---

Seems like for an upgrade we shouldn't have a CL at all, is drain not working?



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077929#comment-14077929
 ] 

Aleksey Yeschenko commented on CASSANDRA-7582:
--

bq. Seems like for an upgrade we shouldn't have a CL at all, is drain not 
working?

CL code explicitly supports replaying previous-version encoded entries. drain 
is recommended, but not required.
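
As a rough illustration of what replaying previous-version entries involves, a replayer can choose a decoder from the version recorded with each entry. The two encodings and the version numbers 20 and 21 below are toy values invented for this sketch; they are not Cassandra's actual formats or version identifiers.

```python
import json


def decode_v20(raw: bytes) -> dict:
    """Toy 'old' encoding: key=value as UTF-8 text."""
    key, value = raw.decode("utf-8").split("=", 1)
    return {"key": key, "value": value}


def decode_v21(raw: bytes) -> dict:
    """Toy 'new' encoding: a JSON object."""
    return json.loads(raw.decode("utf-8"))


# Hypothetical version numbers mapped to their decoders.
DECODERS = {20: decode_v20, 21: decode_v21}


def replay_entry(version: int, raw: bytes) -> dict:
    """Decode one entry with the decoder that matches its recorded version."""
    try:
        decoder = DECODERS[version]
    except KeyError:
        raise ValueError(f"unsupported commit log version {version}") from None
    return decoder(raw)
```

With a dispatch of this kind, a node started on the newer version can still read entries left behind by the older one, which is why an undrained log is replayed at all after an upgrade.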



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077930#comment-14077930
 ] 

Benedict commented on CASSANDRA-7582:
-

bq.  drain is recommended, but not required.

It is also broken in 2.0.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread T Jake Luciani (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077946#comment-14077946
 ] 

T Jake Luciani commented on CASSANDRA-7582:
---

[~benedict] is there a ticket for broken drain?



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077955#comment-14077955
 ] 

Benedict commented on CASSANDRA-7582:
-

CASSANDRA-5911 (fixed in CASSANDRA-3578)

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077970#comment-14077970
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

bq. We just need to make sure we throw out big giant ERROR messages in the logs 
when the CL has issues. We shouldn't stop. The only thing for someone to do is 
run repair. So put out a giant message saying run repair. Don't stop the 
service from actually starting. This will be a huge annoyance for operators who 
will just add the flag and move on the first time they see it.

Exactly.

I will reopen CASSANDRA-7125.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078035#comment-14078035
 ] 

Ryan McGuire commented on CASSANDRA-7582:
-

Are there any commits with lasting effect from this ticket, or were they all 
reverted?

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Priority: Critical
 Fix For: 2.1.1




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078039#comment-14078039
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

No commits except the 7125 revert (a5bc52eee90e342efcdc53282612008d3dbaeaeb).

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Priority: Critical
 Fix For: 2.1.1




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-29 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14078159#comment-14078159
 ] 

Benedict commented on CASSANDRA-7582:
-

It's worth pointing out that, even ignoring the fact that operators may miss 
the ERROR, this strategy of logging and carrying on leaves a gap with the 
recommended RF=3 wherein broken data can be returned by QUORUM reads, unless 
the node is brought up in a non-participating state until it is repaired.
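
As a rough illustration of that gap (a sketch only, not Cassandra code): with 
RF=3 a quorum is 2 replicas, so a write acknowledged at QUORUM may exist on 
only two nodes. If one of those two then drops the mutation during commit log 
replay, a QUORUM read served by that node plus the third replica sees no copy 
of the write. The class and method names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class QuorumGap {
    // Quorum size for a given replication factor: floor(rf / 2) + 1.
    static int quorum(int rf) {
        return rf / 2 + 1;
    }

    // Enumerates every possible read quorum (replica indexes 0..rf-1) and
    // returns those that contain no replica still holding the write.
    static List<List<Integer>> staleReadQuorums(int rf, Set<Integer> holders) {
        List<List<Integer>> stale = new ArrayList<>();
        int q = quorum(rf);
        for (int mask = 0; mask < (1 << rf); mask++) {
            if (Integer.bitCount(mask) != q) continue;
            List<Integer> members = new ArrayList<>();
            boolean seesWrite = false;
            for (int i = 0; i < rf; i++) {
                if ((mask & (1 << i)) != 0) {
                    members.add(i);
                    seesWrite |= holders.contains(i);
                }
            }
            if (!seesWrite) stale.add(members);
        }
        return stale;
    }

    public static void main(String[] args) {
        // Write acknowledged by replicas 0 and 1: every read quorum overlaps.
        System.out.println(staleReadQuorums(3, new HashSet<>(Arrays.asList(0, 1)))); // prints []
        // Replica 0 lost the mutation on replay: reads from {0, 2} miss it.
        System.out.println(staleReadQuorums(3, new HashSet<>(Arrays.asList(1))));    // prints [[0, 2]]
    }
}
```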

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Priority: Critical
 Fix For: 2.1.1




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076202#comment-14076202
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

How do we know what's in the system ks when all we have is a cfid that doesn't 
match anything known?

More generally, I'm not sure how stop on unknown cfid is going to be a useful 
feature.  It's definitely going to happen if you replay a commitlog after 
dropping a table, for instance, if we have an unclean shutdown in between.  
This is normal behavior and not a bug per se, so whacking users and not 
starting up is definitely antisocial.

On the other hand I can't picture a scenario where the user *can* take 
meaningful action based on failing startup here.  Put another way, ignoring the 
mutations is the Right Thing to do in every scenario I can think of.

So I propose we just log it at info and ignore.
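
A minimal sketch of the two behaviours being compared, assuming a simplified 
replay loop (this is not the actual CommitLogReplayer code). The property name 
is the one printed in the error message quoted in the issue; the default of 
"true", the Record type and the replay method are assumptions made for 
illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.UUID;

class ReplaySketch {
    // Hypothetical stand-in for one commit log record.
    static final class Record {
        final UUID cfId;
        final String payload;

        Record(UUID cfId, String payload) {
            this.cfId = cfId;
            this.payload = payload;
        }
    }

    // Replays records for known tables. For an unknown cfId it either fails
    // (the behaviour reported in this ticket) or counts and skips the record.
    static List<String> replay(List<Record> log, Set<UUID> knownTables, boolean stopOnMissingTables) {
        List<String> applied = new ArrayList<>();
        int skipped = 0;
        for (Record r : log) {
            if (knownTables.contains(r.cfId)) {
                applied.add(r.payload);
            } else if (stopOnMissingTables) {
                throw new IllegalStateException("Couldn't find cfId=" + r.cfId);
            } else {
                skipped++;
            }
        }
        if (skipped > 0) {
            System.out.println("INFO: skipped " + skipped + " mutation(s) for unknown tables");
        }
        return applied;
    }

    public static void main(String[] args) {
        // Assumed default: stop unless the flag is explicitly set to false.
        boolean stop = Boolean.parseBoolean(
                System.getProperty("cassandra.commitlog.stop_on_missing_tables", "true"));
        System.out.println("stop on missing tables: " + stop);
    }
}
```

The proposal above amounts to always taking the skip branch and dropping the 
flag.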

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Aleksey Yeschenko (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076203#comment-14076203
 ] 

Aleksey Yeschenko commented on CASSANDRA-7582:
--

Indeed, there is no obvious way to recover from it that I can think of. +1 on 
logging it and going on.

-Dcassandra.commitlog.stop_on_missing_tables should also go.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076399#comment-14076399
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

This was introduced by CASSANDRA-7125 for 2.1.1 and is not in the 2.1.0 branch. 
 Is this actually a problem with rc4 [~enigmacurry]?

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076465#comment-14076465
 ] 

Ryan McGuire commented on CASSANDRA-7582:
-

I think this was version tagged incorrectly. I'm seeing CASSANDRA-7593 on rc4 
instead of this one.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076469#comment-14076469
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

Thanks, Ryan.

Benedict, I'm starting to think 7125 was misguided.  If the CL errors out, 
there just isn't much you can do about it except pass the flag and try again, 
so why not cut out the extra step?

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076509#comment-14076509
 ] 

Benedict commented on CASSANDRA-7582:
-

I'm -1000 on encountering an error and silently swallowing it on something as 
core to correctness as the commit log - failing at least raises a big red flag 
that the user may want to seek expert help. I think there are two distinct problems 
here - there are the 'unexpected' errors which should almost certainly involve 
the user seeking help from an expert to diagnose (or perhaps JIRA, since it 
possibly means a bug), and the unknown table exceptions. The latter are 
debatably more ok to ignore, but I would much rather we simply retain 
information about dropped tables, much as we do truncated tables, so that we 
can suppress those known to have been dropped (with knowledge of exactly _when_ 
they were dropped, so if we see CL records past that time we can still fail and 
ask the user to at least file a bug report). 

Consider the following (pretty plausible scenario):

* User turns on CL saving
* User creates table X, populates it with some data (let's say it's a fairly 
static dataset) 
* User uses the database for a period, mostly changing other tables
* At time T, user drops table X, recreates it (instead of, e.g., truncating it, 
which is separately also subtly dangerous in this scenario), and repopulates it 
with subtly but business-wise importantly different data
* Some time after T, user has to restore the cluster, and restores the schema 
from prior to T by mistake (let's say the team member restoring doesn't realise 
the table was recreated since then), then performs a PIT restore

The user now has no idea they have stale business data in their tables. Now, 
assuming we have saved the ids of all dropped tables we could report to the 
user that they are likely restoring data from a future schema, and they could 
then decide if this was safe or not; in this case they would be able to restore 
a newer schema (assuming they had saved it) and a major business error would 
have been averted.

In general this fail-fast is likely to result in an increase in JIRA filing, 
possibly for relatively benign bugs, but on the whole I would prefer that 
scenario to leaving subtle bugs in the CL. We've already caught at least one 
as a result of this, and we've had long-standing bugs with respect to drain 
that still affect 2.0 and that would have been caught a long time ago with 
better reporting.



 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Benedict
Priority: Critical
 Fix For: 2.1.1


 Multi-dc upgrade [was working from 2.0 - 2.1 fairly 
 recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
  but is currently failing.
 Running 
 upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
  I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
 failed due to replaying a mutation for a missing table. This error can be 
 ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
 the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
 encountered during startup
 java.lang.RuntimeException: 
 org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
 cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
 [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
  [main/:na]
 at 
 org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
 [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
 find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
  ~[main/:na]
 at 
 org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
  ~[main/:na]
 at 
 org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
  ~[main/:na]
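 at 
 org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
  ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
 ~[main/:na]
 at 
 org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
 ~[main/:na]
 {code}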

[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076516#comment-14076516
 ] 

Benedict commented on CASSANDRA-7582:
-

Hmm. Separately, this scenario also points out that TRUNCATE is even more 
broken than I thought: since it doesn't get logged to the CL, if you restore a 
schema from prior to a TRUNCATE you will simply get the old data supplemented 
with the new data.
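
To illustrate the point, here is a toy model (not Cassandra code) in which 
every write is appended to an archived log but a truncate is not. The class 
and method names are hypothetical, and the restore is assumed to replay the 
whole archive onto an empty table.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a table plus an archived commit log; not Cassandra code. */
final class TruncateReplayModel {
    final List<String> liveRows = new ArrayList<>();
    final List<String> archivedLog = new ArrayList<>();

    void write(String row) {
        archivedLog.add(row);   // every write is logged
        liveRows.add(row);
    }

    void truncate() {
        liveRows.clear();       // data is removed, but nothing is appended to the log
    }

    /** Rebuilds the table from empty by replaying the whole archive. */
    List<String> restoreFromArchive() {
        return new ArrayList<>(archivedLog);
    }
}
```

Writing old-1 and old-2, truncating, then writing new-1 leaves only new-1 
live, but restoreFromArchive() returns old-1, old-2 and new-1: the replay has 
no record that the truncate ever happened.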



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076540#comment-14076540
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

I see two actual classes of CL errors:

# Table is dropped and we are replaying stale data that should also have been 
dropped.  Blocking startup is the Wrong Solution.
# Hardware problem caused a checksum mismatch.  Blocking startup is the Wrong 
Solution.

Granted that blocking startup can help prevent user errors during PIT recovery, 
that's an entirely hypothetical situation today; PIT is only nominally usable.  
(Fork the JVM every time a CL segment finishes?  Yeah.)  So let's not optimize 
for that at the expense of scenarios we see frequently.

I think we should roll back 7125 until we can do it right.  Doing it right 
probably means remembering old cfids in 2.1.x; then we can get paranoid about 
seeing them in the CL for 3.0.  (Getting paranoid in the same version as we 
start remembering is bad for obvious reasons.)



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076586#comment-14076586
 ] 

Benedict commented on CASSANDRA-7582:
-

3. We've busted something

This is the main type I'm trying to catch with this behaviour. I would prefer 
to know earlier if 2.1 is broken instead of corrupting the user's CL in some 
way without realising it. We've had several bugs in the 2.1 release cycle that 
would have been caught earlier had we had this feature enabled, and I would be 
surprised if we don't see some more as a result of this once it gets released 
into the wild. There are still bugs in 2.0, since fixed in 2.1, that we would 
certainly have caught earlier.

Enforcing correctness from other avenues is a strong secondary concern. This 
isn't a point of optimisation: we're talking about providing an unsafe PIT 
feature (and we've already got a ticket filed for removing forking), and also, 
more importantly, risking an unsafe regular _replay_. I disagree that a 
hardware problem causing a checksum mismatch shouldn't block startup - in this 
case you may have alternative copies of the data that are not corrupted, or can 
choose to analyse the logs yourself to establish what is happening. If you 
don't care, you set the don't-care flag; but without the failure you may not 
even know there are records that haven't been replayed (possibly whole files).
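
A rough sketch of the behaviour argued for here: verify each record's 
checksum, and on a mismatch either stop or skip depending on a "don't care" 
setting. CRC32 is used only because it ships with the JDK; the record layout, 
class name and exception type are assumptions and do not describe the actual 
commit log format.

```java
import java.util.zip.CRC32;

/** Hypothetical checksum gate for replayed records; not Cassandra's replayer. */
final class ChecksummedRecord {
    static long checksum(byte[] payload) {
        CRC32 crc = new CRC32();
        crc.update(payload, 0, payload.length);
        return crc.getValue();
    }

    /**
     * Returns true if the record should be replayed and false if it is skipped.
     * Throws when the checksum does not match and failOnCorruption is set.
     */
    static boolean shouldReplay(byte[] payload, long storedChecksum, boolean failOnCorruption) {
        if (checksum(payload) == storedChecksum)
            return true;
        if (failOnCorruption)
            throw new IllegalStateException(
                "Commit log record failed its checksum; likely power or hardware failure");
        return false; // the operator opted in to ignoring corrupt records
    }
}
```

With failOnCorruption set, the operator at least learns that some records were 
not replayed before deciding whether to ignore them.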




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076721#comment-14076721
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

But that's still not a common *production* scenario.  So we're still optimizing 
bassackwards.

How about this?  Leave the checks in, but backwards: they're disabled, *unless* 
there's a flag.  Then we set the flag in utest and dtest.
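
For concreteness, the difference between the current opt-out flag and the 
opt-in flag proposed here could look roughly like the sketch below. The first 
property name is the one printed in the error message on this ticket; the 
second, and the class itself, are hypothetical, and this is not how Cassandra 
actually reads its settings.

```java
/** Hypothetical helper contrasting opt-out and opt-in checks. */
final class ReplayCheckFlags {
    // Named in the error message on this ticket: strict unless set to false.
    static final String STOP_ON_MISSING_TABLES = "cassandra.commitlog.stop_on_missing_tables";
    // Made-up name for the proposed test-only switch.
    static final String STRICT_REPLAY = "example.commitlog.strict_replay";

    /** Opt-out: the check runs unless the property is set to something other than "true". */
    static boolean optOut(String property) {
        return Boolean.parseBoolean(System.getProperty(property, "true"));
    }

    /** Opt-in: the check runs only if the property is explicitly set to "true". */
    static boolean optIn(String property) {
        return Boolean.getBoolean(property);
    }
}
```

Under the opt-in variant, utest and dtest would pass 
-Dexample.commitlog.strict_replay=true while production nodes would skip the 
check by default.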



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076746#comment-14076746
 ] 

Benedict commented on CASSANDRA-7582:
-

What isn't a common production scenario? Commit Log bugs? We know there are 
some still in 2.0. There are potentially some in 2.1 too, and we probably won't 
spot them without something like this to help users know they encountered them 
and report them. Optimizing != Correctness.

I am very negative on disabling this.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076797#comment-14076797
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

I'm skeptical.  Looking at the 2.0 changelog, we've fixed CASSANDRA-6652 and 
CASSANDRA-6714 since 2.0.0 final, and this wouldn't have helped catch those.

So, I'm not saying that ignoring errors is a Good Thing, but when there are 
more false positives than true positives, users will learn to ignore it anyway 
and we're not actually helping anyone.

At the very least, this is demonstrably broken in 2.1.1 given this ticket right 
here.  So I see two reasonable courses of action:

# remembering old cfids in 2.1.x, then we can get paranoid about seeing them in 
the CL for 3.0.
# using the checks as a kind of assert that we enable for tests but not 
(without opt-in) for production 

I'm open to alternatives, but leaving things the way they are now is not one of 
them.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14076820#comment-14076820
 ] 

Benedict commented on CASSANDRA-7582:
-

I am less adamant about CfId checks than I am about failing on commit log 
checksum/mutation replay failures. I could just about live with (2), but 
naturally we will get better coverage by enabling this for all users. We don't 
know what bugs we might catch with it. So, I would prefer one of:

1) Start remembering old cfids in 2.1.1 along with this feature, so we can 
start complaining immediately; or
2) For now simply assert on non-CfId errors (i.e. make that opt-in rather than 
opt-out), introduce CfId recording at some point and make it opt-out at some 
point after



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-28 Thread Jonathan Ellis (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14077317#comment-14077317
 ] 

Jonathan Ellis commented on CASSANDRA-7582:
---

I suppose we can compromise on enabling the check as soon as we remember the 
cfids, even though that leaves a hole where we can false-positive on upgrade.

How sure are you that the MalformedCommitLogException checks aren't going to 
false-positive on power failure?  On first inspection all of those except the 
serializedSize check look like they will be prone to that.



[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-23 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071468#comment-14071468
 ] 

Benedict commented on CASSANDRA-7582:
-

I would feel safer ignoring a specific whitelist, in case of some weird 
unexpected bug around some later table in the system keyspace.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Russ Hatch
Priority: Critical
 Fix For: 2.1.0


 Multi-dc upgrade [was working from 2.0 - 2.1 fairly recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/], but is currently failing.
 Running upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test I get the following errors when starting 2.1 upgraded from 2.0:
 {code}
 ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay failed due to replaying a mutation for a missing table. This error can be ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on the command line
 ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception encountered during startup
 java.lang.RuntimeException: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) [main/:na]
     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457) [main/:na]
     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) [main/:na]
 Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
     at org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164) ~[main/:na]
     at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97) ~[main/:na]
     at org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353) ~[main/:na]
     at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) ~[main/:na]
     at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) ~[main/:na]
 {code}





[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069907#comment-14069907
 ] 

Benedict commented on CASSANDRA-7582:
-

Looks like this was probably broken all along, and the default of failing 
startup on CL failure is what exposed it.



 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Russ Hatch
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-22 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070655#comment-14070655
 ] 

Ryan McGuire commented on CASSANDRA-7582:
-

This also affects the snapshot_test.TestArchiveCommitlog tests.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Russ Hatch
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070668#comment-14070668
 ] 

Benedict commented on CASSANDRA-7582:
-

Well, sort of. We knew that was broken already; I'm not convinced it has the 
same root cause. That failure is because we never fixed the tests after 
deciding we weren't going to 'fix' the behaviour in CASSANDRA-6694 (related to 
CASSANDRA-5202).

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Russ Hatch
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-22 Thread Benedict (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14070913#comment-14070913
 ] 

Benedict commented on CASSANDRA-7582:
-

We've dropped system.NodeIdInfo, but drain on 2.0 is a little bit broken and so 
we don't end up with the clean CL we should do, and we barf when we see 
mutations for this table. We should probably create a whitelist of cfIds we'll 
accept being missing, as this will no doubt crop up in future as well. 
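
A minimal sketch of the whitelist idea above, not Cassandra's actual replay code. The class name, the Action enum and the method names are hypothetical. The cfId is the one from the stack trace, which this comment attributes to system.NodeIdInfo. The default of true for the stop_on_missing_tables property is an assumption inferred from the error message.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Hypothetical policy object: decides what replay does with a mutation for an unknown cfId. */
final class MissingTableReplayPolicy
{
    enum Action { SKIP_SILENTLY, SKIP_WITH_WARNING, FAIL }

    // cfIds of tables we accept being missing (e.g. system tables dropped in a newer version).
    private final Set<UUID> knownDropped;
    // Mirrors the flag named in the error message; true means an unknown cfId aborts startup.
    private final boolean stopOnMissingTables;

    MissingTableReplayPolicy(Set<UUID> knownDropped, boolean stopOnMissingTables)
    {
        this.knownDropped = Collections.unmodifiableSet(new HashSet<>(knownDropped));
        this.stopOnMissingTables = stopOnMissingTables;
    }

    /** Reads the system property from the error message; the default of "true" is assumed. */
    static boolean stopOnMissingTablesFromSystemProperty()
    {
        return Boolean.parseBoolean(
            System.getProperty("cassandra.commitlog.stop_on_missing_tables", "true"));
    }

    /** Whitelisted cfIds are skipped quietly; anything else follows the flag. */
    Action onMissingTable(UUID cfId)
    {
        if (knownDropped.contains(cfId))
            return Action.SKIP_SILENTLY;
        return stopOnMissingTables ? Action.FAIL : Action.SKIP_WITH_WARNING;
    }

    public static void main(String[] args)
    {
        UUID droppedTable = UUID.fromString("a1b676f3-0c5d-3276-bfd5-07cf43397004");
        MissingTableReplayPolicy policy = new MissingTableReplayPolicy(
            Collections.singleton(droppedTable), stopOnMissingTablesFromSystemProperty());
        System.out.println(policy.onMissingTable(droppedTable));
    }
}
```

With an explicit whitelist, a mutation for any other missing table still fails startup unless the operator opts out with the flag, so an unexpected bug in another table is not hidden.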

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Russ Hatch
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-22 Thread Brandon Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14071099#comment-14071099
 ] 

Brandon Williams commented on CASSANDRA-7582:
-

Seems like anything in the system ks is fair game to silently ignore.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
  Components: Core
Reporter: Ryan McGuire
Assignee: Russ Hatch
Priority: Critical
 Fix For: 2.1.0




[jira] [Commented] (CASSANDRA-7582) 2.1 multi-dc upgrade errors

2014-07-21 Thread Ryan McGuire (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14069788#comment-14069788
 ] 

Ryan McGuire commented on CASSANDRA-7582:
-

[~rhatch] can you repro and bisect this? It was passing last week.

 2.1 multi-dc upgrade errors
 ---

 Key: CASSANDRA-7582
 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
 Project: Cassandra
  Issue Type: Bug
Reporter: Ryan McGuire
Assignee: Russ Hatch
