[ https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17193009#comment-17193009 ]

Stefan Miklosovic edited comment on CASSANDRA-15158 at 9/9/20, 4:19 PM:
------------------------------------------------------------------------

I have improved my original work and wrote a test for it. The Jenkins build 
does not fail anymore, so I believe the solution is fully on par when it comes 
to dtests, as I do not have time to fix the dtests which the other solution 
breaks. While I admit that the improved version is technically superior, having 
a clean build and the same dtest behaviour is more important to me at this 
moment. It would be great if the dtests and the issues I spotted were resolved, 
though.

The test is here (1). The main logic: a cluster of two nodes is started, a 
third node is started afterwards, and I drop all migration messages to the 
other two, simulating a communication error between them. After some time, 
migration messages start to flow again. By doing this, I test the internals of 
the logic I wrote, and it seems to do its job.
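
For illustration, a rough sketch of the shape of that test using the in-jvm 
dtest framework. The class and verb names below are approximations, not the 
actual test (the migration verb differs between branches, e.g. 
MIGRATION_REQUEST on 3.x vs SCHEMA_PULL_REQ on trunk), and a real bootstrap 
test also needs token/topology configuration on the builder; see (1) for the 
authoritative code:

{code:java}
import org.junit.Test;

import org.apache.cassandra.distributed.Cluster;
import org.apache.cassandra.distributed.api.IInstanceConfig;
import org.apache.cassandra.distributed.api.IInvokableInstance;
import org.apache.cassandra.net.MessagingService;

import static org.apache.cassandra.distributed.api.Feature.GOSSIP;
import static org.apache.cassandra.distributed.api.Feature.NETWORK;

public class BootstrapSchemaAgreementSketchTest
{
    @Test
    public void bootstrapWithDroppedMigrationMessages() throws Throwable
    {
        // Start a two-node cluster with gossip and internode networking enabled.
        try (Cluster cluster = Cluster.build(2)
                                      .withConfig(c -> c.with(GOSSIP, NETWORK))
                                      .start())
        {
            // Drop all schema migration messages, simulating a communication
            // error between the bootstrapping node and the existing nodes.
            cluster.filters()
                   .verbs(MessagingService.Verb.MIGRATION_REQUEST.ordinal())
                   .drop();

            // Bootstrap a third node while migration messages are dropped.
            IInstanceConfig config = cluster.newInstanceConfig();
            IInvokableInstance node3 = cluster.bootstrap(config);

            // In the real test the startup runs concurrently and the filter is
            // lifted after a delay; shown sequentially here for brevity.
            cluster.filters().reset(); // migration messages start to flow again
            node3.startup();
        }
    }
}
{code}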

One issue I am a little concerned about is that StorageService issues schema 
migration requests in "onAlive, onJoin ..." and these requests are not part of 
the waitForSchema() logic. It is understandable that it is like that, as we 
need to track migration requests after a node fully bootstraps, but we should 
prevent this from happening while a node is still bootstrapping. I wrapped the 
bodies of these methods in "if (hasJoined())" but they were invoked anyway. 
However, it does not matter too much that this is outside of the logic I wrote, 
because if the schema migration was successful, the rewritten logic in 
waitForSchema() has nothing left to deal with, so we are done either way. To 
skip this in the test, I used ByteBuddy to intercept 
MigrationManager#scheduleSchemaPull and make it do nothing, effectively 
preventing schema pulls from being scheduled outside of the change I made.
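
The interception itself can be installed through the dtest builder's 
withInstanceInitializer hook. A minimal sketch, with a hypothetical BBHelper 
class (the stubbing approach is what matters; see (1) for the actual helper):

{code:java}
import net.bytebuddy.ByteBuddy;
import net.bytebuddy.dynamic.loading.ClassLoadingStrategy;
import net.bytebuddy.implementation.StubMethod;

import org.apache.cassandra.service.MigrationManager;

import static net.bytebuddy.matcher.ElementMatchers.named;

// Hypothetical helper, registered per instance via
// Cluster.build(...).withInstanceInitializer(BBHelper::install)
public class BBHelper
{
    public static void install(ClassLoader classLoader, int instanceId)
    {
        // Rebase MigrationManager inside the instance's class loader so that
        // every scheduleSchemaPull overload becomes a no-op. This keeps pulls
        // triggered from StorageService (onAlive, onJoin, ...) from
        // interfering with the waitForSchema() logic under test.
        new ByteBuddy().rebase(MigrationManager.class)
                       .method(named("scheduleSchemaPull"))
                       .intercept(StubMethod.INSTANCE)
                       .make()
                       .load(classLoader, ClassLoadingStrategy.Default.INJECTION);
    }
}
{code}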

(1) 
https://github.com/instaclustr/cassandra/blob/0ceb1d6edb55916e68ae436e99c932e5ce28f68a/test/distributed/org/apache/cassandra/distributed/test/BootstrappingSchemaAgreementTest.java




> Wait for schema agreement rather than in flight schema requests when 
> bootstrapping
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15158
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15158
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Cluster/Schema
>            Reporter: Vincent White
>            Assignee: Blake Eggleston
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently when a node is bootstrapping we use a set of latches 
> (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of 
> in-flight schema pull requests, and we don't proceed with 
> bootstrapping/streaming until all the latches are released (or we time out 
> waiting for each one). One issue with this is that if we have a large schema, 
> or the retrieval of the schema from the other nodes was unexpectedly slow 
> then we have no explicit check in place to ensure we have actually received a 
> schema before we proceed.
> While it's possible to increase "migration_task_wait_in_seconds" to force the 
> node to wait on each latch longer, there are cases where this doesn't help 
> because the callbacks for the schema pull requests have expired off the 
> messaging service's callback map 
> (org.apache.cassandra.net.MessagingService#callbacks) after 
> request_timeout_in_ms (default 10 seconds) before the other nodes were able 
> to respond to the new node.
> This patch checks for schema agreement between the bootstrapping node and the 
> rest of the live nodes before proceeding with bootstrapping. It also adds a 
> check to prevent the new node from flooding existing nodes with simultaneous 
> schema pull requests as can happen in large clusters.
> Removing the latch system should also prevent new nodes in large clusters 
> from getting stuck for extended amounts of time as they wait 
> `migration_task_wait_in_seconds` on each of the latches left orphaned by the 
> timed-out callbacks.
>  
> ||3.11||
> |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]|
> |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]|
>  
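
For context, the agreement check described in the quoted ticket essentially 
amounts to comparing the local schema version against the schema versions 
gossip reports for the other live members. A minimal sketch, assuming 3.x-era 
APIs (the actual check lives in the linked branches above):

{code:java}
import java.net.InetAddress;
import java.util.UUID;

import org.apache.cassandra.config.Schema;
import org.apache.cassandra.gms.ApplicationState;
import org.apache.cassandra.gms.EndpointState;
import org.apache.cassandra.gms.Gossiper;
import org.apache.cassandra.gms.VersionedValue;
import org.apache.cassandra.utils.FBUtilities;

public final class SchemaAgreementSketch
{
    // True when every live member gossips the same schema version we hold locally.
    public static boolean schemaAgreesWithLiveNodes()
    {
        UUID localVersion = Schema.instance.getVersion();
        for (InetAddress endpoint : Gossiper.instance.getLiveMembers())
        {
            if (endpoint.equals(FBUtilities.getBroadcastAddress()))
                continue;

            EndpointState state = Gossiper.instance.getEndpointStateForEndpoint(endpoint);
            VersionedValue schema = state == null ? null : state.getApplicationState(ApplicationState.SCHEMA);

            // No schema state yet, or a different version: no agreement.
            if (schema == null || !localVersion.equals(UUID.fromString(schema.value)))
                return false;
        }
        return true;
    }
}
{code}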


