[
https://issues.apache.org/jira/browse/CASSANDRA-16063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17201519#comment-17201519
]
Ekaterina Dimitrova edited comment on CASSANDRA-16063 at 9/24/20, 1:28 PM:
---------------------------------------------------------------------------
A detailed code review showed that the error I was getting was not related. Empty
segments should be skipped on startup.
I recreated my test environment and managed to fix my issues with running the
upgrade tests locally.
No flag is needed.
This is how the solution works now:
These are the four branches I worked on for this patch:
[C* 3.0|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.0]
| [C* 3.11|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063-3.11]
| [trunk|https://github.com/ekaterinadimitrova2/cassandra/tree/CASSANDRA-16063]
| [DTests|https://github.com/ekaterinadimitrova2/cassandra-dtest/tree/CASSANDRA-16063]
1) Check that SSTables are on the latest version before dropping compact storage. Commits:
[3.0|https://github.com/ekaterinadimitrova2/cassandra/commit/9ff9130808c751c9253bdecaa27c453bb5e7a71c]
and
[3.11|https://github.com/ekaterinadimitrova2/cassandra/commit/c0c43e90644b28b9b363fa7aba55adbf95dd5bd7]
2) Move compact storage
[validation|https://github.com/ekaterinadimitrova2/cassandra/commit/1a8b3ea2823d8424e2018c686fb2d6e5d67270f7#diff-a5df240149285ae528cdd3c41aa59360R104]
earlier in the startup process.
3) Two new upgrade tests were created and an old one was fixed
[here|https://github.com/ekaterinadimitrova2/cassandra-dtest/commits/CASSANDRA-16063]
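The idea behind item 1 can be sketched as a toy Python model. This is only an illustration under stated assumptions, not the code from the linked commits: the names {{SSTable}}, {{sstables_not_on_latest}} and {{drop_compact_storage}} are hypothetical, the version strings are placeholders, and the sketch assumes the check refuses the drop while any SSTable is still on an older format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an on-disk SSTable; only the format version matters here."""
    name: str
    version: str


def sstables_not_on_latest(sstables, latest_version):
    """Return the SSTables whose format version differs from latest_version."""
    return [s for s in sstables if s.version != latest_version]


def drop_compact_storage(table, sstables, latest_version):
    """Refuse the (modeled) drop while any SSTable is still on an older format."""
    stale = sstables_not_on_latest(sstables, latest_version)
    if stale:
        names = ", ".join(s.name for s in stale)
        raise RuntimeError(
            f"Cannot drop compact storage on {table}: "
            f"SSTables not on version {latest_version}: {names}"
        )
    return f"compact storage dropped on {table}"
```

In this model the operator would first rewrite the stale SSTables to the latest format and only then retry the drop.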
Trunk CI run:
[Java 8|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/349/workflows/04bccc52-4e3e-41e2-9c04-93501ea4ce77]
and
[Java 11|https://app.circleci.com/pipelines/github/ekaterinadimitrova2/cassandra/349/workflows/04bccc52-4e3e-41e2-9c04-93501ea4ce77]
The CI runs did not show any newly introduced issues. The two failing tests
already have corresponding open tickets:
_test_tracing_does_not_interfere_with_digest_calculation - cql_tracing_test.TestCqlTracing - CASSANDRA-14157_
_testMessagePurging - org.apache.cassandra.net.ConnectionTest - CASSANDRA-15958_
Attached is the log of the upgrade tests passing successfully.
[~slebresne] do you have time to review it again? Or maybe [~adelapena] can
help here?
> Fix user experience when upgrading to 4.0 with compact tables
> -------------------------------------------------------------
>
> Key: CASSANDRA-16063
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16063
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/CQL
> Reporter: Sylvain Lebresne
> Assignee: Ekaterina Dimitrova
> Priority: Normal
> Fix For: 4.0-beta
>
> Attachments: Compact_storage_upgrade_tests.txt
>
>
> The code to handle compact tables has been removed from 4.0, and the intended
> upgrade path to 4.0 for users having compact tables on 3.x is that they must
> execute {{ALTER ... DROP COMPACT STORAGE}} on all of their compact tables
> *before* attempting the upgrade.
> Obviously, some users won't read the upgrade instructions (or miss a table)
> and may try upgrading despite still having compact tables. If they do so, the
> intent is that the node will _not_ start, with a message clearly indicating
> the pre-upgrade step the user has missed. The user will then downgrade the
> node(s) back to 3.x, run the proper {{ALTER ... DROP COMPACT STORAGE}}, and
> then upgrade again.
> But while 4.0 does currently fail startup when finding any compact tables
> with a decent message, I believe the check is done too late during startup.
> Namely, that check is done as we read the tables schema, so within
> [{{Schema.instance.loadFromDisk()}}|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/CassandraDaemon.java#L241].
> But by then, we've _at least_ called
> {{SystemKeyspace.persistLocalMetadata()}} and
> {{SystemKeyspaceMigrator40.migrate()}}, which will get into the commit log,
> and even possibly flush new {{na}} format sstables. As a result, a user
> might not be able to seamlessly restart the node on 3.x (to drop compact
> storage on the appropriate tables).
> Basically, we should make sure the check for compact tables done at 4.0
> startup is done as a {{StartupCheck}}, before the node does anything.
> We should also add a test for this (checking that if you try upgrading to 4.0
> with compact storage, you can downgrade back with no intervention whatsoever).
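The ordering problem described above can be illustrated with a small Python model. This is a sketch under stated assumptions, not Cassandra's startup code: the {{Node}} class, its methods and the {{check_first}} flag are hypothetical, and {{disk_writes}} merely stands in for commit log and sstable side effects.

```python
class StartupError(Exception):
    """Raised when the modeled node refuses to start."""


class Node:
    """Toy node: `disk_writes` records every side effect made during startup."""

    def __init__(self, tables):
        # tables maps a table name to True when it still uses compact storage
        self.tables = dict(tables)
        self.disk_writes = []

    def check_no_compact_tables(self):
        compact = sorted(name for name, is_compact in self.tables.items() if is_compact)
        if compact:
            raise StartupError(
                "Compact tables found, run ALTER ... DROP COMPACT STORAGE on 3.x first: "
                + ", ".join(compact)
            )

    def persist_local_metadata(self):
        self.disk_writes.append("local metadata")

    def migrate_system_keyspace(self):
        self.disk_writes.append("system keyspace migration")

    def start(self, check_first):
        # The only difference between the two modes is where the check runs.
        steps = [self.persist_local_metadata, self.migrate_system_keyspace]
        if check_first:
            steps.insert(0, self.check_no_compact_tables)
        else:
            steps.append(self.check_no_compact_tables)
        for step in steps:
            step()
```

With {{check_first=False}} the node still fails, but only after both writes have happened. With {{check_first=True}} it fails with {{disk_writes}} empty, which is the property a downgrade test would want to assert.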