[ https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193009#comment-17193009 ]
Stefan Miklosovic edited comment on CASSANDRA-15158 at 9/9/20, 4:19 PM: ------------------------------------------------------------------------ I have improved the original work of mine and I wrote a test for that. Jenkins build does not fail anymore so I believe I have totally on par solution when it comes to dtests as I do not have time to fix dtests which the other solution breaks. While I admit that the improved version is technicaly more superior, the necessity to have clean build and same behaviour when it comes to dtests is more important to me at this moment. It would be awesome if dtests and issues I spotted are resolved though. The test is here (1), the main logic is that a cluster of two nodes is started, the third node is started afterwards and I am dropping all migration messages to the other two, simulating some communication error between them. After some time, migration messages starts to flow again. So by doing this, I ll test the internals of the logic I wrote and it seems to do its job. One issue I am little bit concerned of is that StorageService is issuing schema migration requests on "onAlive, onJoin ..." in StorageService and these requests are not part of the waitForSchema() logic. It is understandable that it is like that as we need to track migration requests after a node fully bootstraps but we should skip this from happening when a node is under bootstrapping. I wrapped the bodies of these methods into "if (hasJoined())" but it was invoked anyway. However, it does not matter too much if this is outside of the logic I did because if schema migration was sucessful, the rewritten logic in waitForSchema does not have anything to deal with so we are done anyway. For skipping this in test, I used ByteBuddy to intercept MigrationManager#scheduleSchemaPull to do nothing hence I effectively skip migration schemas to be sent outside of the change I did. (1) https://github.com/instaclustr/cassandra/blob/0ceb1d6edb55916e68ae436e99c932e5ce28f68a/test/distributed/org/apache/cassandra/distributed/test/BootstrappingSchemaAgreementTest.java was (Author: stefan.miklosovic): I have improved the original work of mine and I wrote a test for that. Jenkins build does not fail anymore so I believe I have totally on par solution when it comes to dtests as I do not have time to fix dtests which the other solution breaks. While I admit that the improved version is technicaly more superior, the necessity to have clean build and same behaviour when it comes to dtests is more important to me at this moment. It would be awesome if dtests and issues I spotted are resolved thought. The test is here (1), the main logic is that a cluster of two nodes is started, the third node is started afterwards and I am dropping all migration messages to the other two, simulating some communication error between them. After some time, migration messages starts to flow again. So by doing this, I ll test the internals of the logic I wrote and it seems to do its job. One issue I am little bit concerned of is that StorageService is issuing schema migration requests on "onAlive, onJoin ..." in StorageService and these requests are not part of the waitForSchema() logic. It is understandable that it is like that as we need to track migration requests after a node fully bootstraps but we should skip this from happening when a node is under bootstrapping. I wrapped the bodies of these methods into "if (hasJoined())" but it was invoked anyway. However, it does not matter too much if this is outside of the logic I did because if schema migration was sucessful, the rewritten logic in waitForSchema does not have anything to deal with so we are done anyway. For skipping this in test, I used ByteBuddy to intercept MigrationManager#scheduleSchemaPull to do nothing hence I effectively skip migration schemas to be sent outside of the change I did. (1) https://github.com/instaclustr/cassandra/blob/0ceb1d6edb55916e68ae436e99c932e5ce28f68a/test/distributed/org/apache/cassandra/distributed/test/BootstrappingSchemaAgreementTest.java > Wait for schema agreement rather than in flight schema requests when > bootstrapping > ---------------------------------------------------------------------------------- > > Key: CASSANDRA-15158 > URL: https://issues.apache.org/jira/browse/CASSANDRA-15158 > Project: Cassandra > Issue Type: Bug > Components: Cluster/Gossip, Cluster/Schema > Reporter: Vincent White > Assignee: Blake Eggleston > Priority: Normal > Time Spent: 10m > Remaining Estimate: 0h > > Currently when a node is bootstrapping we use a set of latches > (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of > in-flight schema pull requests, and we don't proceed with > bootstrapping/stream until all the latches are released (or we timeout > waiting for each one). One issue with this is that if we have a large schema, > or the retrieval of the schema from the other nodes was unexpectedly slow > then we have no explicit check in place to ensure we have actually received a > schema before we proceed. > While it's possible to increase "migration_task_wait_in_seconds" to force the > node to wait on each latche longer, there are cases where this doesn't help > because the callbacks for the schema pull requests have expired off the > messaging service's callback map > (org.apache.cassandra.net.MessagingService#callbacks) after > request_timeout_in_ms (default 10 seconds) before the other nodes were able > to respond to the new node. > This patch checks for schema agreement between the bootstrapping node and the > rest of the live nodes before proceeding with bootstrapping. It also adds a > check to prevent the new node from flooding existing nodes with simultaneous > schema pull requests as can happen in large clusters. > Removing the latch system should also prevent new nodes in large clusters > getting stuck for extended amounts of time as they wait > `migration_task_wait_in_seconds` on each of the latches left orphaned by the > timed out callbacks. > > ||3.11|| > |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]| > |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]| > -- This message was sent by Atlassian Jira (v8.3.4#803005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org