[ https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186245#comment-17186245 ]
Ekaterina Dimitrova commented on CASSANDRA-16063:
-------------------------------------------------

These are the four branches I worked on for this patch:
[C* 3.0|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.0] | [C* 3.11|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.11] | [trunk|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063] | [DTests|https://github.com/ekaterinadimitrova2/cassandra-dtest/tree/CASSANDRA-16063]

Four things have been done:

1) Check SSTables for the latest version before dropping compact storage. Commits: [3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/9ff9130808c751c9253bdecaa27c453bb5e7a71c] and [3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/c0c43e90644b28b9b363fa7aba55adbf95dd5bd7]

2) The "allow skipping commit log replay" behaviour was not failing on descriptor errors; this was corrected on all branches in order to support the strategy described in the next point. [3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/dd6207e6639576a8090ea5930f46c3cf2a4a5971] | [3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/ad6f1f9d5106b07a001a847a63bb4b3068b0c599] | [trunk|https://github.com/ekaterinadimitrova2/cassandra/commit/62affe6e9e5dc654ba729a97d136e16c37f392fd]

3) Move the compact storage validation earlier in the startup process, with detailed [instructions|https://github.com/ekaterinadimitrova2/cassandra/commit/c4835158f2815ab90bf6fdb907e95861984f2c72#diff-2220885f2835194f87cce0d27ac87c73R915] for users; the commit is [here|https://github.com/ekaterinadimitrova2/cassandra/commit/c4835158f2815ab90bf6fdb907e95861984f2c72]. With this change we no longer write out sstable data (only a CL segment).

4) Two upgrade tests were created [here|https://github.com/ekaterinadimitrova2/cassandra-dtest/commits/CASSANDRA-16063]

Trunk CI run: [Java 8|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/338/workflows/324e7e64-097e-4cb3-8521-d5c4eacaca9c] and [Java 11|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/338/workflows/559d91f6-d80a-4eb9-8211-7220db8618a5]

The CI runs did not show anything disturbing. The failures look related to the recent timeouts and incremental repair failures, which the community is tackling this week. I will check tomorrow whether any of them requires a new ticket to be raised.

[~slebresne] would you mind reviewing this, please? Also, I need the upgrade tests to be pushed to Jenkins, as they did not appear in CircleCI and I am having trouble running them locally. Would you mind helping me with that too? (I have no access there.)

> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
>                 Key: CASSANDRA-16063
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Legacy/CQL
>            Reporter: Sylvain Lebresne
>            Assignee: Ekaterina Dimitrova
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> The code to handle compact tables has been removed from 4.0, and the intended
> upgrade path to 4.0 for users having compact tables on 3.x is that they must
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or will miss a table)
> and may try upgrading despite still having compact tables. If they do so, the
> intent is that the node will _not_ start, with a message clearly indicating
> the pre-upgrade step the user has missed. The user will then downgrade
> the node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and
> then upgrade again.
> But while 4.0 does currently fail startup with a decent message when it finds
> any compact tables, I believe the check is done too late during startup.
> Namely, that check is done as we read the table schemas, so within
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called
> {{SystemKeyspace.persistLocalMetadata()}} and
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log,
> and even possibly flush new {{na}} format sstables. As a result, a user
> might not be able to seamlessly restart the node on 3.x (to drop compact
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0
> with compact storage, you can downgrade back with no intervention whatsoever).
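
The ordering argued for in the issue description (run the compact-table check before any step that writes to the commit log or flushes sstables) can be illustrated with a minimal, self-contained Java sketch. This is not Cassandra's code: the class {{CompactStorageStartupCheckSketch}}, the exception {{StartupCheckFailure}}, the map of table names to a "compact" flag, and the error message text are all hypothetical stand-ins chosen for illustration. The write steps are modelled as plain log entries named after the methods mentioned in the description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; none of these types are Cassandra's real classes.
class CompactStorageStartupCheckSketch {

    /** Hypothetical failure type standing in for a startup-check error. */
    static class StartupCheckFailure extends Exception {
        StartupCheckFailure(String message) {
            super(message);
        }
    }

    /** Returns the names of all tables whose (assumed) compact flag is set. */
    static List<String> findCompactTables(Map<String, Boolean> compactFlagByTable) {
        List<String> compact = new ArrayList<>();
        for (Map.Entry<String, Boolean> entry : compactFlagByTable.entrySet()) {
            if (Boolean.TRUE.equals(entry.getValue())) {
                compact.add(entry.getKey());
            }
        }
        return compact;
    }

    /** Read-only check: throws if any compact table remains, writes nothing. */
    static void checkNoCompactTables(Map<String, Boolean> compactFlagByTable)
            throws StartupCheckFailure {
        List<String> compact = findCompactTables(compactFlagByTable);
        if (!compact.isEmpty()) {
            // Hypothetical message text, not the one used in the patch.
            throw new StartupCheckFailure(
                "Compact tables found: " + compact
                + ". Downgrade to 3.x, run ALTER ... DROP COMPACT STORAGE"
                + " on each of them, then upgrade again.");
        }
    }

    /**
     * Runs the check first and the write steps only if it passes.
     * Returns the list of steps that were actually executed.
     */
    static List<String> startup(Map<String, Boolean> compactFlagByTable) {
        List<String> executed = new ArrayList<>();
        try {
            checkNoCompactTables(compactFlagByTable);
            executed.add("startup check passed");
            // Steps that touch the commit log or sstables come after the check.
            executed.add("SystemKeyspace.persistLocalMetadata()");
            executed.add("SystemKeyspaceMigrator40.migrate()");
            executed.add("Schema.instance.loadFromDisk()");
        } catch (StartupCheckFailure failure) {
            executed.add("aborted: " + failure.getMessage());
        }
        return executed;
    }

    public static void main(String[] args) {
        Map<String, Boolean> tables = new LinkedHashMap<>();
        tables.put("ks.users", false);
        tables.put("ks.legacy", true); // still a compact table
        for (String step : startup(tables)) {
            System.out.println(step);
        }
    }
}
```

In this sketch a node with a remaining compact table aborts with a single entry and never reaches the steps that write, which is the property that would let the user restart on 3.x without intervention.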