[
https://issues.apache.org/jira/browse/CASSANDRA-16387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17277729#comment-17277729
]
Caleb Rackliffe edited comment on CASSANDRA-16387 at 2/3/21, 10:40 PM:
-----------------------------------------------------------------------
As an experiment, I changed {{UpgradeTestBase}} to use {{auto_bootstrap}} by
default, and so far I've been unable to trigger the failure locally (dozens of
runs) or on
[Circle|https://app.circleci.com/pipelines/github/maedhroz/cassandra?branch=CASSANDRA-16181],
where the failure was almost constant beforehand. I'm not sure that's a
solution, though, if we want to preserve the performance improvement I'm
guessing we get from concurrent {{Instance}} startup.
Perhaps all we need to do is avoid {{pushSchemaMutation()}} when we're creating
the traces keyspace on every instance. It seems redundant anyway, and I'm
not even sure why we push the mutations to other nodes via
{{setUpDistributedSystemKeyspaces()}} in a production scenario, given that each
node has already applied the mutations locally on join. (Like forcing serial
startup, avoiding the push also stabilizes the failing upgrade tests locally.)
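To illustrate the redundancy argument, here is a toy model, not Cassandra code. {{ToyNode}}, {{applyLocally}}, {{receive}} and {{simulate}} are hypothetical names made up for this sketch, and it assumes every node applies the same mutation locally before any push arrives. Under that assumption, every pushed mutation is a no-op at the receiver:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names are Cassandra classes or methods.
public class RedundantPushSketch {
    static final class ToyNode {
        final Set<String> schema = new HashSet<>();
        int redundantReceives = 0;

        // Stands in for "each node applies the mutation locally on join".
        void applyLocally(String mutation) {
            schema.add(mutation);
        }

        // Stands in for receiving a pushed mutation from a peer;
        // counts the push as redundant if the mutation is already present.
        void receive(String mutation) {
            if (!schema.add(mutation)) {
                redundantReceives++;
            }
        }
    }

    // Returns how many pushed mutations changed nothing at the receiver.
    static int simulate(int nodeCount, boolean pushToPeers) {
        String mutation = "create traces keyspace";
        List<ToyNode> nodes = new ArrayList<>();
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new ToyNode());
        }
        // Assumption: all local applications happen before any push is delivered.
        for (ToyNode node : nodes) {
            node.applyLocally(mutation);
        }
        if (pushToPeers) {
            for (ToyNode sender : nodes) {
                for (ToyNode peer : nodes) {
                    if (peer != sender) {
                        peer.receive(mutation);
                    }
                }
            }
        }
        int redundant = 0;
        for (ToyNode node : nodes) {
            redundant += node.redundantReceives;
        }
        return redundant;
    }

    public static void main(String[] args) {
        System.out.println(simulate(3, true));   // 6: every push is a no-op
        System.out.println(simulate(3, false));  // 0: no pushes are sent
    }
}
```

The model is deliberately serial, so it says nothing about the race itself. It only shows that with N nodes the push adds N * (N - 1) messages that change no state once each node has applied the mutation locally.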
CC [~aleksey]
> UpgradeTest sporadically failing on schema updates
> --------------------------------------------------
>
> Key: CASSANDRA-16387
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16387
> Project: Cassandra
> Issue Type: Bug
> Components: Test/dtest/java
> Reporter: Caleb Rackliffe
> Assignee: Caleb Rackliffe
> Priority: Normal
> Fix For: 3.0.x, 3.11.x, 4.0-rc
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> We’ve observed {{UpgradeTest}} failing during what appears to be a schema
> change:
> https://app.circleci.com/pipelines/github/maedhroz/cassandra/192/workflows/ed5305e6-e4f9-420e-9f0a-6153333746dc/jobs/1068
> It almost looks like the Gossiper can’t find its own endpoint state in the
> endpoint state map, and the failure is not consistent, which might suggest a
> race.