[
https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193009#comment-17193009
]
Stefan Miklosovic commented on CASSANDRA-15158:
-----------------------------------------------
I have improved the original work of mine and I wrote a test for that. Jenkins
build does not fail anymore so I believe I have totally on par solution when it
comes to dtests as I do not have time to fix dtests which the other solution
breaks. While I admit that the improved version is technicaly more superior,
the necessity to have clean build and same behaviour when it comes to dtests is
more important to me at this moment. It would be awesome if dtests and issues I
spotted are resolved thought.
The test is here (1), the main logic is that a cluster of two nodes is started,
the third node is started afterwards and I am dropping all migration messages
to the other two, simulating some communication error between them. After some
time, migration messages starts to flow again. So by doing this, I ll test the
internals of the logic I wrote and it seems to do its job.
One issue I am little bit concerned of is that StorageService is issuing schema
migration requests on "onAlive, onJoin ..." in StorageService and these
requests are not part of the waitForSchema() logic. It is understandable that
it is like that as we need to track migration requests after a node fully
bootstraps but we should skip this from happening when a node is under
bootstrapping. I wrapped the bodies of these methods into "if (hasJoined())"
but it was invoked anyway. However, it does not matter too much if this is
outside of the logic I did because if schema migration was sucessful, the
rewritten logic in waitForSchema does not have anything to deal with so we are
done anyway. For skipping this in test, I used ByteBuddy to intercept
MigrationManager#scheduleSchemaPull to do nothing hence I effectively skip
migration schemas to be sent outside of the change I did.
(1)
https://github.com/instaclustr/cassandra/blob/0ceb1d6edb55916e68ae436e99c932e5ce28f68a/test/distributed/org/apache/cassandra/distributed/test/BootstrappingSchemaAgreementTest.java
> Wait for schema agreement rather than in flight schema requests when
> bootstrapping
> ----------------------------------------------------------------------------------
>
> Key: CASSANDRA-15158
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15158
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Gossip, Cluster/Schema
> Reporter: Vincent White
> Assignee: Blake Eggleston
> Priority: Normal
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently when a node is bootstrapping we use a set of latches
> (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of
> in-flight schema pull requests, and we don't proceed with
> bootstrapping/stream until all the latches are released (or we timeout
> waiting for each one). One issue with this is that if we have a large schema,
> or the retrieval of the schema from the other nodes was unexpectedly slow
> then we have no explicit check in place to ensure we have actually received a
> schema before we proceed.
> While it's possible to increase "migration_task_wait_in_seconds" to force the
> node to wait on each latche longer, there are cases where this doesn't help
> because the callbacks for the schema pull requests have expired off the
> messaging service's callback map
> (org.apache.cassandra.net.MessagingService#callbacks) after
> request_timeout_in_ms (default 10 seconds) before the other nodes were able
> to respond to the new node.
> This patch checks for schema agreement between the bootstrapping node and the
> rest of the live nodes before proceeding with bootstrapping. It also adds a
> check to prevent the new node from flooding existing nodes with simultaneous
> schema pull requests as can happen in large clusters.
> Removing the latch system should also prevent new nodes in large clusters
> getting stuck for extended amounts of time as they wait
> `migration_task_wait_in_seconds` on each of the latches left orphaned by the
> timed out callbacks.
>
> ||3.11||
> |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]|
> |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]|
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]