[ 
https://issues.apache.org/jira/browse/CASSANDRA-13559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16031243#comment-16031243
 ] 

Stefania commented on CASSANDRA-13559:
--------------------------------------

bq. could you please elaborate a bit further on the .13-.14 upgrade path?

When an upgraded node joins the ring, all non-upgraded nodes will pull the 
schema from it after 1 minute. They should not pull from this node again 
unless a real schema change happens. Since the schema is identical, a schema 
pull results in all schema mutations being applied and the schema tables being 
flushed, and nothing else, since the delta would be nil. When another node is 
upgraded and joins, the process repeats itself. 

Furthermore, the upgraded nodes will see the schema of the non-upgraded nodes 
as different and pull from them. The process eventually converges: once all 
nodes are upgraded, the schema versions will be the same. 
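
As a rough illustration of why the cluster only converges once every node is 
upgraded, here is a toy model in Python. It assumes that each node reports one 
of two version ids for the same logical schema, depending on whether it runs 
the old or the new digest code. The labels {{old-digest}} and {{new-digest}} 
and both function names are made up for this sketch; they are not Cassandra 
identifiers, and the model ignores gossip timing and the pulls themselves.

```python
def reported_versions(upgraded_flags):
    """Return the set of schema version ids the cluster reports.

    upgraded_flags: one bool per node, True once that node is upgraded.
    Assumption: the schema content is identical everywhere; only the
    digest code differs between old and new binaries.
    """
    return {"new-digest" if up else "old-digest" for up in upgraded_flags}


def rolling_upgrade(node_count):
    """Upgrade nodes one at a time, yielding (upgraded_count, versions)."""
    flags = [False] * node_count
    yield 0, reported_versions(flags)
    for i in range(node_count):
        flags[i] = True
        yield i + 1, reported_versions(flags)


if __name__ == "__main__":
    for upgraded, versions in rolling_upgrade(3):
        state = "agreement" if len(versions) == 1 else "disagreement"
        print(upgraded, sorted(versions), state)
```

In this model two version ids coexist from the first upgraded node until the 
last one, which is the window in which the pulls described above keep 
happening.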

The things visible to the operator should be the schema migrations in the logs. 
Other noticeable things will be the logged schema version, which will be 
different after the upgrade, and different schema versions appearing in 
{{nodetool describecluster|describering}} and in the system local table. 
This means that if any problems occur, they will most likely be 
client side, i.e. clients will not see schema agreement until the upgrade is 
completed. This might cause issues in some applications. 
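
For operators who want to script a check during the upgrade, a minimal Python 
sketch follows. It assumes the schema versions have already been collected 
into a mapping from version id to the hosts reporting it (for example, copied 
by hand from the describecluster output); the function names, version labels 
and addresses are hypothetical, and no parsing of real tool output is 
attempted here.

```python
def has_schema_agreement(hosts_by_version):
    """True when exactly one schema version is reported."""
    return len(hosts_by_version) == 1


def minority_hosts(hosts_by_version):
    """Hosts that are not on the most widely reported version.

    On a tie, the version that appears first in the mapping is treated
    as the majority.
    """
    if not hosts_by_version:
        return []
    majority = max(hosts_by_version, key=lambda v: len(hosts_by_version[v]))
    return sorted(
        host
        for version, hosts in hosts_by_version.items()
        if version != majority
        for host in hosts
    )


if __name__ == "__main__":
    # Hypothetical snapshot taken in the middle of a rolling upgrade.
    snapshot = {
        "version-a": ["10.0.0.1", "10.0.0.2"],
        "version-b": ["10.0.0.3"],
    }
    print(has_schema_agreement(snapshot))
    print(minority_hosts(snapshot))
```

During the upgrade a disagreement reported by such a check is expected; it 
only becomes a concern if it persists after the last node has been upgraded.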

To limit the flow of schema pulls, the schema should not be changed during the 
upgrade, if at all possible. I'll add a section to NEWS.txt. 

That's all I could find from code inspection. [~spo...@gmail.com] rightly 
pointed out on the dev mailing list that the only urgent thing is to pull 
3.0.13. We can take our time before releasing 3.0.14; at a minimum we should 
write an upgrade test.

The startup problem is not related to the digest mismatch: the {{ALL}} fields 
exist, so I don't understand it at all. I also struggle to explain the unknown 
cf exceptions; they may be unrelated, and the upgrade test should shed some 
light.

> Schema version id mismatch while upgrading to 3.0.13
> ----------------------------------------------------
>
>                 Key: CASSANDRA-13559
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-13559
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>            Reporter: Jay Zhuang
>            Assignee: Stefania
>            Priority: Blocker
>
> As the order of SchemaKeyspace is changed ([6991556|https://github.com/apache/cassandra/commit/6991556e431a51575744248a4c484270c4f918c9],
> CASSANDRA-12213), the result of function 
> [{{calculateSchemaDigest}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/schema/SchemaKeyspace.java#L311]
> is also changed for the same schema, which causes a schema mismatch while 
> upgrading 3.0.x -> 3.0.13.
> It could cause Cassandra to fail to start because of an Unknown CF exception, 
> and streaming will fail:
> {noformat}
> ERROR [main] 2017-05-26 18:58:57,572 CassandraDaemon.java:709 - Exception encountered during startup
> java.lang.IllegalArgumentException: Unknown CF 83c8eae0-3a65-11e7-9a27-e17fd11571e3
> {noformat}
> {noformat}
> WARN  [MessagingService-Incoming-/IP] 2017-05-26 19:27:11,523 IncomingTcpConnection.java:101 - UnknownColumnFamilyException reading from socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for cfId 922b7940-3a65-11e7-adf3-a3ff55d9bcf1. If a table was just created, this is likely due to the schema not being fully propagated.  Please wait for schema agreement on table creation.
> {noformat}
> Restarting the new node will cause:
> {noformat}
> Exception (java.lang.NoSuchFieldError) encountered during startup: ALL
> java.lang.NoSuchFieldError: ALL
>         at org.apache.cassandra.service.ClientState.<clinit>(ClientState.java:67)
>         at org.apache.cassandra.cql3.QueryProcessor$InternalStateInstance.<init>(QueryProcessor.java:155)
>         at org.apache.cassandra.cql3.QueryProcessor$InternalStateInstance.<clinit>(QueryProcessor.java:149)
>         at org.apache.cassandra.cql3.QueryProcessor.internalQueryState(QueryProcessor.java:163)
>         at org.apache.cassandra.cql3.QueryProcessor.prepareInternal(QueryProcessor.java:286)
>         at org.apache.cassandra.cql3.QueryProcessor.executeInternal(QueryProcessor.java:294)
>         at org.apache.cassandra.db.SystemKeyspace.checkHealth(SystemKeyspace.java:900)
>         at org.apache.cassandra.service.StartupChecks$9.execute(StartupChecks.java:354)
>         at org.apache.cassandra.service.StartupChecks.verify(StartupChecks.java:110)
>         at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:179)
>         at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:569)
>         at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:697)
> {noformat}
> I would suggest restoring the older list for digest calculation and 
> releasing 3.0.14.
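
The order dependence described in the issue can be shown in miniature with a 
short Python sketch. This is not the Cassandra implementation: it simply feeds 
a list of names into MD5 in list order, and the two orderings below are 
illustrative only, not the actual old and new lists. The function name 
{{schema_digest}} is hypothetical.

```python
import hashlib


def schema_digest(table_names):
    """Hex MD5 over the given names, fed to the hash in list order."""
    md5 = hashlib.md5()
    for name in table_names:
        md5.update(name.encode("utf-8"))
    return md5.hexdigest()


if __name__ == "__main__":
    # Same contents, different order (illustrative orderings only).
    order_a = ["keyspaces", "tables", "columns"]
    order_b = ["columns", "keyspaces", "tables"]

    # The digests differ even though the set of names is identical.
    print(schema_digest(order_a) == schema_digest(order_b))

    # Hashing in a canonical (sorted) order removes the dependence.
    print(schema_digest(sorted(order_a)) == schema_digest(sorted(order_b)))
```

Sorting is only one way to make such a digest stable. The suggestion in the 
issue is different: keep the previous list order for the digest calculation so 
that existing nodes and upgraded nodes compute the same value.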



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

