[
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17186245#comment-17186245
]
Ekaterina Dimitrova edited comment on CASSANDRA-16063 at 8/28/20, 4:25 AM:
---------------------------------------------------------------------------
These are the four branches I worked on for this patch:
[C* 3.0|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.0]
| [C* 3.11|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.11]
| [trunk|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063]
| [DTests|https://github.com/ekaterinadimitrova2/cassandra-dtest/tree/CASSANDRA-16063]
4 things have been done:
1) Check SSTables for latest version before dropping compact storage commits -
[3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/9ff9130808c751c9253bdecaa27c453bb5e7a71c]
and
[3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/c0c43e90644b28b9b363fa7aba55adbf95dd5bd7]
2) Allow skipping commit log replay was not failing on descriptor errors; this
was corrected on all branches to support the strategy described in the next
point.
[3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/dd6207e6639576a8090ea5930f46c3cf2a4a5971]
|
[3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/ad6f1f9d5106b07a001a847a63bb4b3068b0c599]
|
[trunk|https://github.com/ekaterinadimitrova2/cassandra/commit/62affe6e9e5dc654ba729a97d136e16c37f392fd]
3) Move compact storage validation earlier in the startup process, with detailed
[instructions|https://github.com/ekaterinadimitrova2/cassandra/commit/c4835158f2815ab90bf6fdb907e95861984f2c72#diff-2220885f2835194f87cce0d27ac87c73R915]
for users
[here|https://github.com/ekaterinadimitrova2/cassandra/commit/c4835158f2815ab90bf6fdb907e95861984f2c72].
With this change we no longer write out sstable data (only a CL segment).
4) Two upgrade tests created
[here|https://github.com/ekaterinadimitrova2/cassandra-dtest/commits/CASSANDRA-16063]
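To illustrate the idea behind point 1) only, here is a minimal, hypothetical sketch; it is not the code from the linked commits. It assumes sstable format versions can be treated as opaque strings and that the drop should be refused while any sstable is not on the current version. The class name, method names, error text, and the example version strings are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: these names are NOT taken from the linked commits.
final class DropCompactStorageGuardSketch {

    /** Returns every version string that differs from the current format version. */
    static List<String> outdated(List<String> sstableVersions, String currentVersion) {
        List<String> result = new ArrayList<>();
        for (String version : sstableVersions) {
            if (!version.equals(currentVersion)) {
                result.add(version);
            }
        }
        return result;
    }

    /** Refuses the drop while any sstable is still on a different format version. */
    static void validateBeforeDrop(List<String> sstableVersions, String currentVersion) {
        List<String> old = outdated(sstableVersions, currentVersion);
        if (!old.isEmpty()) {
            // Illustrative message only, not the wording used by the patch.
            throw new IllegalStateException(
                "Upgrade sstables first; found other format versions: " + old);
        }
    }

    public static void main(String[] args) {
        // Example version strings, treated as opaque values.
        validateBeforeDrop(Arrays.asList("mc", "mc"), "mc"); // passes silently
        try {
            validateBeforeDrop(Arrays.asList("mc", "ka"), "mc");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the check runs before the schema change and rejects it outright, rather than attempting the change and failing part-way.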
Trunk CI run:
[Java 8|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/338/workflows/324e7e64-097e-4cb3-8521-d5c4eacaca9c]
and
[Java 11|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/338/workflows/559d91f6-d80a-4eb9-8211-7220db8618a5]
CI runs did not show anything disturbing. The failures that did appear look
related to the recent timeouts and incremental repair failures, which the
community is tackling this week. I will check tomorrow whether any of them
requires a new ticket.
[~slebresne] would you mind reviewing this, please? Also, I need the upgrade
tests to be pushed to Jenkins, as they did not appear in CircleCI and I am
having trouble running them locally. Would you mind helping me with that too?
(I have no access there.)
_About the order of commits, I would say:
first the tests, then 1) -> 2) -> 3)._
> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/CQL
> Reporter: Sylvain Lebresne
> Assignee: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.0-beta
>
>
> The code to handle compact tables has been removed from 4.0, and the intended
> upgrade path to 4.0 for users having compact tables on 3.x is that they must
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table)
> and may try upgrading despite still having compact tables. If they do so, the
> intent is that the node will _not_ start, with a message clearly indicating
> the pre-upgrade step the user has missed. The user will then downgrade the
> node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called
> {{SystemKeyspace.persistLocalMetadata()}} and
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log,
> and even possibly flush new {{na}} format sstables. As a result, a user
> might not be able to seamlessly restart the node on 3.x (to drop compact
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0
> with compact storage, you can downgrade back with no intervention whatsoever).
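
A minimal sketch of the ordering the description asks for, assuming a simplified startup sequence: read-only checks run first, and the steps that write state run only if every check passes. All names here ({{StartupOrderSketch}}, {{PreflightCheck}}, {{noCompactTables}}, {{start}}) and the error text are hypothetical and are not Cassandra's actual classes or messages; the step names are plain strings standing in for the real calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: none of these names are Cassandra's real classes.
final class StartupOrderSketch {

    /** Thrown when a pre-flight check refuses to let the node start. */
    static final class StartupException extends RuntimeException {
        StartupException(String message) {
            super(message);
        }
    }

    /** A check that may only read state, never write it. */
    interface PreflightCheck {
        void execute();
    }

    /** Builds a check that fails if any table is still flagged as compact. */
    static PreflightCheck noCompactTables(List<String> compactTables) {
        return () -> {
            if (!compactTables.isEmpty()) {
                // Illustrative message only.
                throw new StartupException("Compact tables found: " + compactTables
                    + ". Go back to 3.x and run ALTER ... DROP COMPACT STORAGE first.");
            }
        };
    }

    /** Runs all checks first; returns the mutating steps that were then executed. */
    static List<String> start(List<PreflightCheck> checks, List<String> mutatingSteps) {
        for (PreflightCheck check : checks) {
            check.execute(); // throws before anything has been written
        }
        // Stand-in for the steps that touch the commit log or flush sstables.
        return new ArrayList<>(mutatingSteps);
    }

    public static void main(String[] args) {
        List<String> steps = Arrays.asList("persistLocalMetadata", "migrate");
        List<String> compact = Collections.singletonList("ks.legacy_table");
        try {
            start(Collections.singletonList(noCompactTables(compact)), steps);
        } catch (StartupException e) {
            System.out.println("Startup aborted: " + e.getMessage());
        }
    }
}
```

Because the failing check throws before any mutating step runs, nothing new is written, which is the property needed for the node to restart on 3.x without intervention.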