[ 
https://issues.apache.org/jira/browse/CASSANDRA-15158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17165619#comment-17165619
 ] 

Stefan Miklosovic commented on CASSANDRA-15158:
-----------------------------------------------

In general this looks good to me, although I got a bit lost in the actual schema migration response:

{code:java}
Future<Void> response(Collection<Mutation> mutations)
{
    synchronized (info)
    {
        if (shouldApplySchemaFrom(endpoint, info))
            mergeSchemaFrom(endpoint, mutations);
        return pullComplete(endpoint, info, true);  /// why?
    }
}
{code}
 

I am not completely sure why we are pulling again here. I would rewrite the 
whole solution in such a way that this callable just does one thing on a 
successful response (merging the schema), and the actual "retry" is handled 
from the outside. The reader has to do quite a mental exercise to visualise 
that this callback might actually call another callback inside it until some 
"version" is completed, and so on. At least for me, it was quite tedious to 
track.

I understand the motivation behind that, but a callback should just do one 
task and that's it; it shouldn't be responsible for potentially scheduling 
another callback recursively until some conditions on some signals (or what 
have you) are met. Just my 2 cents.
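To make the shape I have in mind concrete, here is a minimal, purely hypothetical sketch (the names {{VersionInfo}}, {{onResponse}} and {{pullUntilReceived}} are stand-ins, not the patch's actual API): the callback only merges and marks the version received, while the retry loop is an explicit loop in the caller.

{code:java}
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: the response callback does exactly one thing (merge),
// while the retry policy lives outside, in the code that drives the pull.
public class SchemaPullSketch
{
    // Stand-in for the coordinator's per-version state.
    static class VersionInfo
    {
        volatile boolean received = false;
    }

    // Callback with a single responsibility: merge the mutations and mark
    // the version as received. No recursive re-pulling from in here.
    static void onResponse(VersionInfo info, List<String> mutations)
    {
        // mergeSchemaFrom(endpoint, mutations) would happen here
        info.received = true;
    }

    // The retry is explicit and external: keep issuing pull requests until
    // the version is received or the attempts run out.
    static int pullUntilReceived(VersionInfo info, int maxAttempts)
    {
        int attempts = 0;
        while (!info.received && attempts < maxAttempts)
        {
            attempts++;
            // sendMigrationRequest(endpoint) would be issued here; for the
            // sketch, pretend the peer answers on the second attempt.
            if (attempts == 2)
                onResponse(info, Collections.emptyList());
        }
        return attempts;
    }

    public static void main(String[] args)
    {
        VersionInfo info = new VersionInfo();
        int attempts = pullUntilReceived(info, 5);
        System.out.println("attempts=" + attempts + ", received=" + info.received);
    }
}
{code}

This way the reader can see the whole retry policy in one place instead of tracing callbacks that schedule further callbacks.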

There is also this:

 
{code:java}
synchronized Future<Void> reportEndpointVersion(InetAddress endpoint, UUID version)
{
    UUID current = endpointVersions.get(endpoint);
    if (current != null && current.equals(version))
        return FINISHED_FUTURE;

    VersionInfo info = versionInfo.computeIfAbsent(version, VersionInfo::new);
    if (isLocalVersion(version))
        info.markReceived();
    info.endpoints.add(endpoint);
    info.requestQueue.add(endpoint);
    endpointVersions.put(endpoint, version);

    removeEndpointFromVersion(endpoint, current); ///// why?
    return maybePullSchema(info);
}
{code}
 

To be honest, that is quite counterintuitive too; maybe renaming that method 
would help.
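For instance (a toy model with illustrative names, not the patch's actual types), extracting the confusing step into a helper whose name says which version the endpoint is leaving might read like this:

{code:java}
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Toy model of the endpoint/version bookkeeping. The point is the rename:
// removeEndpointFromVersion(endpoint, current) is really detaching the
// endpoint from the version it previously advertised.
public class VersionBookkeeping
{
    final Map<String, UUID> endpointVersions = new HashMap<>();
    final Map<UUID, Set<String>> versionEndpoints = new HashMap<>();

    void reportEndpointVersion(String endpoint, UUID version)
    {
        UUID previous = endpointVersions.put(endpoint, version);
        versionEndpoints.computeIfAbsent(version, v -> new HashSet<>()).add(endpoint);
        detachEndpointFromPreviousVersion(endpoint, previous);
    }

    // Renamed helper: the name now makes it obvious that the endpoint is
    // being removed from the version it reported before this one.
    void detachEndpointFromPreviousVersion(String endpoint, UUID previous)
    {
        if (previous == null)
            return;
        Set<String> endpoints = versionEndpoints.get(previous);
        if (endpoints != null)
            endpoints.remove(endpoint);
    }

    public static void main(String[] args)
    {
        VersionBookkeeping book = new VersionBookkeeping();
        UUID v1 = UUID.randomUUID();
        UUID v2 = UUID.randomUUID();
        book.reportEndpointVersion("10.0.0.1", v1);
        book.reportEndpointVersion("10.0.0.1", v2);
        // after the second report, the endpoint no longer appears under v1
        System.out.println(book.versionEndpoints.get(v1));
    }
}
{code}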

 

The test has failed for me (repeatedly):

{code:java}
java.lang.AssertionError: java.lang.AssertionError:
Expected :2
Actual   :0
	at org.junit.Assert.fail(Assert.java:92)
	at org.junit.Assert.failNotEquals(Assert.java:689)
	at org.junit.Assert.assertEquals(Assert.java:127)
	at org.junit.Assert.assertEquals(Assert.java:514)
	at org.junit.Assert.assertEquals(Assert.java:498)
	at org.apache.cassandra.service.MigrationCoordinatorTest.testWeKeepSendingRequests(MigrationCoordinatorTest.java:278)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:44)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:180)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:41)
	at org.junit.runners.ParentRunner$1.evaluate(ParentRunner.java:173)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:220)
	at org.junit.runner.JUnitCore.run(JUnitCore.java:159)
	at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
	at com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
	at com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
	at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
{code}
 

> Wait for schema agreement rather than in flight schema requests when 
> bootstrapping
> ----------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15158
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15158
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Gossip, Cluster/Schema
>            Reporter: Vincent White
>            Assignee: Ben Bromhead
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently when a node is bootstrapping we use a set of latches 
> (org.apache.cassandra.service.MigrationTask#inflightTasks) to keep track of 
> in-flight schema pull requests, and we don't proceed with 
> bootstrapping/stream until all the latches are released (or we timeout 
> waiting for each one). One issue with this is that if we have a large schema, 
> or the retrieval of the schema from the other nodes was unexpectedly slow 
> then we have no explicit check in place to ensure we have actually received a 
> schema before we proceed.
> While it's possible to increase "migration_task_wait_in_seconds" to force the 
> node to wait on each latch longer, there are cases where this doesn't help 
> because the callbacks for the schema pull requests have expired off the 
> messaging service's callback map 
> (org.apache.cassandra.net.MessagingService#callbacks) after 
> request_timeout_in_ms (default 10 seconds) before the other nodes were able 
> to respond to the new node.
> This patch checks for schema agreement between the bootstrapping node and the 
> rest of the live nodes before proceeding with bootstrapping. It also adds a 
> check to prevent the new node from flooding existing nodes with simultaneous 
> schema pull requests as can happen in large clusters.
> Removing the latch system should also prevent new nodes in large clusters 
> getting stuck for extended amounts of time as they wait 
> `migration_task_wait_in_seconds` on each of the latches left orphaned by the 
> timed out callbacks.
>  
> ||3.11||
> |[PoC|https://github.com/apache/cassandra/compare/cassandra-3.11...vincewhite:check_for_schema]|
> |[dtest|https://github.com/apache/cassandra-dtest/compare/master...vincewhite:wait_for_schema_agreement]|
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
