[ 
https://issues.apache.org/jira/browse/CASSANDRA-7582?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077792#comment-14077792
 ] 

Benedict commented on CASSANDRA-7582:
-------------------------------------

Right, and this hole is down to a CL bug we would likely have caught had we had 
this enabled previously. A correctly functioning drain would not permit this to 
happen. However this is really a non-issue, since the only plausibly affected 
table is going to be this system table we dropped, as otherwise there would 
have to be a table drop immediately prior to shutting down the node for 
upgrade. Especially since schema disagreement during upgrade is a no-no, this 
should not happen. Since drain does sometimes work, this should reduce the risk 
profile even further.

Further, we can relatively safely prevent almost all power-failure exceptions 
by introducing the change I suggested, and ignoring any errors on CLS reading 
if the header hashes are consistent with the header's id (which we now have 
available to us), and this id is in the past, as this is obviously a recycled 
segment that had not yet had its header reset before a power failure. This 
leaves only those that managed to write only a partial block during a power 
failure, which will be dealt with by OS journalling (and should be impossible 
on SSDs anyway). So I don't think there are any power-off risk scenarios left 
to warn about.


> 2.1 multi-dc upgrade errors
> ---------------------------
>
>                 Key: CASSANDRA-7582
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7582
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Ryan McGuire
>            Assignee: Benedict
>            Priority: Critical
>             Fix For: 2.1.1
>
>
> Multi-dc upgrade [was working from 2.0 -> 2.1 fairly 
> recently|http://cassci.datastax.com/job/cassandra_upgrade_dtest/55/testReport/upgrade_through_versions_test/TestUpgrade_from_cassandra_2_0_latest_tag_to_cassandra_2_1_HEAD/],
>  but is currently failing.
> Running 
> upgrade_through_versions_test.py:TestUpgrade_from_cassandra_2_0_HEAD_to_cassandra_2_1_HEAD.bootstrap_multidc_test
>  I get the following errors when starting 2.1 upgraded from 2.0:
> {code}
> ERROR [main] 2014-07-21 23:54:20,862 CommitLog.java:143 - Commit log replay 
> failed due to replaying a mutation for a missing table. This error can be 
> ignored by providing -Dcassandra.commitlog.stop_on_missing_tables=false on 
> the command line
> ERROR [main] 2014-07-21 23:54:20,869 CassandraDaemon.java:474 - Exception 
> encountered during startup
> java.lang.RuntimeException: 
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find 
> cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
>         at 
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:300) 
> [main/:na]
>         at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:457)
>  [main/:na]
>         at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:546) 
> [main/:na]
> Caused by: org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't 
> find cfId=a1b676f3-0c5d-3276-bfd5-07cf43397004
>         at 
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeCfId(ColumnFamilySerializer.java:164)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:97)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserializeOneCf(Mutation.java:353)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:333)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:365)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(CommitLogReplayer.java:98)
>  ~[main/:na]
>         at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:137) 
> ~[main/:na]
>         at 
> org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:115) 
> ~[main/:na]
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to