[ 
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186245#comment-17186245
 ] 

Ekaterina Dimitrova commented on CASSANDRA-16063:
-------------------------------------------------

 These are the four branches I worked on for this patch:

[C* 
3.0|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.0] 
| [C* 
3.11|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.11]
 | [trunk 
|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063] | 
[DTests|https://github.com/ekaterinadimitrova2/cassandra-dtest/tree/CASSANDRA-16063]
 

4 things have been done:
 1) Check SSTables for latest version before dropping compact storage commits - 
[3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/9ff9130808c751c9253bdecaa27c453bb5e7a71c]
 and 
[3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/c0c43e90644b28b9b363fa7aba55adbf95dd5bd7]
 2) "Allow skipping commit log replay" was not failing on descriptor errors; 
this was corrected on all branches to support the strategy described in the 
next point.
 
[3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/dd6207e6639576a8090ea5930f46c3cf2a4a5971]
 | 
[3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/ad6f1f9d5106b07a001a847a63bb4b3068b0c599]
 | 
[trunk|https://github.com/ekaterinadimitrova2/cassandra/commit/62affe6e9e5dc654ba729a97d136e16c37f392fd]
 3) Move compact storage validation earlier in the startup process, with detailed 
[instructions|https://github.com/ekaterinadimitrova2/cassandra/commit/c4835158f2815ab90bf6fdb907e95861984f2c72#diff-2220885f2835194f87cce0d27ac87c73R915]
 to the users 
[here|https://github.com/ekaterinadimitrova2/cassandra/commit/c4835158f2815ab90bf6fdb907e95861984f2c72]

With this change we no longer write out sstable data (only a CL segment).

4) Two upgrade tests were created 
[here|https://github.com/ekaterinadimitrova2/cassandra-dtest/commits/CASSANDRA-16063]
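
To illustrate the idea behind point 1, here is a minimal sketch of a guard that refuses to drop compact storage while any sstable is still on an older format. It is not the code from the linked commits: the class name, the method names, and the version strings used as input are all hypothetical and only serve the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not the code from the linked commits. */
final class DropCompactStorageGuardSketch {

    /** Returns the sstable format versions that differ from the latest one. */
    static List<String> notOnLatest(List<String> sstableVersions, String latestVersion) {
        List<String> stale = new ArrayList<>();
        for (String version : sstableVersions) {
            if (!version.equals(latestVersion)) {
                stale.add(version);
            }
        }
        return stale;
    }

    /** Rejects the drop while any sstable is still on an older format. */
    static void validateCanDropCompactStorage(List<String> sstableVersions, String latestVersion) {
        List<String> stale = notOnLatest(sstableVersions, latestVersion);
        if (!stale.isEmpty()) {
            throw new IllegalStateException(
                "Cannot DROP COMPACT STORAGE: sstables not on latest version "
                + latestVersion + ": " + stale);
        }
    }

    public static void main(String[] args) {
        // Illustrative inputs only: one sstable is on an older format.
        try {
            validateCanDropCompactStorage(Arrays.asList("mc", "ka"), "mc");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the guard is ordering: the version check runs before the schema change, so a rejected drop leaves the table untouched.
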
Trunk CI run: 

[Java 
8|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/338/workflows/324e7e64-097e-4cb3-8521-d5c4eacaca9c]
 and [Java 
11|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/338/workflows/559d91f6-d80a-4eb9-8211-7220db8618a5]
 CI runs did not show anything concerning. The failures that did appear look 
related to the recent timeouts and incremental repair failures that the 
community is tackling this week. I will check tomorrow whether any of them 
requires a new ticket to be raised. 

[~slebresne] would you mind reviewing this, please? Also, I need the upgrade 
tests to be pushed to Jenkins, as they did not appear in CircleCI and I am 
having some trouble running them locally. Would you mind helping me with that 
too? (I have no access there.)

> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-16063
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/CQL
>            Reporter: Sylvain Lebresne
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> The code to handle compact tables has been removed from 4.0, and the intended 
> upgrade path to 4.0 for users having compact tables on 3.x is that they must 
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables 
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table) 
> and may try upgrading despite still having compact tables. If they do so, the 
> intent is that the node will _not_ start, with a message clearly indicating 
> the pre-upgrade step the user has missed. The user will then downgrade the 
> node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and 
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables 
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within 
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
>   But by then, we've _at least_ called 
> {{SystemKeyspace.persistLocalMetadata()}} and 
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log, 
> and even possibly flush new {{na}} format sstables. As a result, a user 
> might not be able to seamlessly restart the node on 3.x (to drop compact 
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0 
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0 
> with compact storage, you can downgrade back with no intervention whatsoever).
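
The ordering requested in the description above can be sketched as follows. This is a simplified, self-contained illustration, not Cassandra's actual startup code: {{PreflightCheck}}, {{TableInfo}}, {{StartupOrderSketch}} and the step names written to the log are hypothetical stand-ins. It only shows the principle that every read-only check runs, and may abort startup, before any step that writes to disk.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the names are illustrative, not Cassandra's classes. */
final class StartupOrderSketch {

    /** Thrown when a pre-flight check decides the node must not start. */
    static final class StartupException extends Exception {
        StartupException(String message) {
            super(message);
        }
    }

    /** A read-only check that runs before the node writes anything. */
    interface PreflightCheck {
        void execute() throws StartupException;
    }

    /** Minimal stand-in for a table definition read from the existing schema. */
    static final class TableInfo {
        final String name;
        final boolean compactStorage;

        TableInfo(String name, boolean compactStorage) {
            this.name = name;
            this.compactStorage = compactStorage;
        }
    }

    /** Builds a check that fails if any table still uses compact storage. */
    static PreflightCheck compactStorageCheck(List<TableInfo> tables) {
        return () -> {
            List<String> offenders = new ArrayList<>();
            for (TableInfo table : tables) {
                if (table.compactStorage) {
                    offenders.add(table.name);
                }
            }
            if (!offenders.isEmpty()) {
                throw new StartupException(
                    "Compact tables found: " + offenders
                    + ". Downgrade, run ALTER ... DROP COMPACT STORAGE, then upgrade again.");
            }
        };
    }

    /** Runs all checks first; the write steps are recorded only if every check passes. */
    static void start(List<PreflightCheck> checks, List<String> writeLog) throws StartupException {
        for (PreflightCheck check : checks) {
            check.execute(); // may abort before anything is written
        }
        writeLog.add("persist local metadata");
        writeLog.add("migrate system keyspace");
        writeLog.add("load schema");
    }

    public static void main(String[] args) {
        List<TableInfo> tables = new ArrayList<>();
        tables.add(new TableInfo("ks.legacy", true));
        List<PreflightCheck> checks = new ArrayList<>();
        checks.add(compactStorageCheck(tables));
        List<String> writeLog = new ArrayList<>();
        try {
            start(checks, writeLog);
        } catch (StartupException e) {
            System.out.println("Startup aborted: " + e.getMessage());
        }
        System.out.println("Write steps performed: " + writeLog.size());
    }
}
```

Because the check throws before the first write step, a failed start leaves nothing behind that a 3.x node would have to cope with on downgrade, which is the property the requested test is meant to verify.
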



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

