[ 
https://issues.apache.org/jira/browse/CASSANDRA-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845360#comment-17845360
 ] 

Cameron Zemek commented on CASSANDRA-18866:
-------------------------------------------

found a bug with this patch,
{code:java}
    private void handleMajorStateChange(InetAddressAndPort ep, EndpointState 
epState)
    {
        // omitted for brevity
        endpointStateMap.put(ep, epState);
        if (localEpState != null)
        {   // the node restarted: it is up to the subscriber to take whatever 
action is necessary
            for (IEndpointStateChangeSubscriber subscriber : subscribers)
                subscriber.onRestart(ep, localEpState);
        }
        if (!isDeadState(epState))
            markAlive(ep, epState);
 {code}
markAlive is passed the remote epState that just got put into the 
endpointStateMap which has isAlive = true on it.
{code:java}
     private void markAlive(final InetAddressAndPort addr, final EndpointState 
localState)
     {
        if (inflightEcho.contains(addr))
        {
            return;
        }
        inflightEcho.add(addr);

        localState.markDead(); {code}
 But we don't enter markAlive when already inflight echo request. So 
endpointStateMap now has entry with isAlive = true, but unreachableEndpoints 
has the down node. So now `nodetool status` and down endpoint count do not 
match.

 

The fix is have the onResponse to ECHO update the entry currently in the map. 
And always update the passed in state to dead.
{code:java}
    private void markAlive(final InetAddressAndPort addr, final EndpointState 
localState)
    {
        localState.markDead();
        if (!inflightEcho.add(addr))
        {
            return;
        }
        Message<NoPayload> echoMessage = Message.out(ECHO_REQ, noPayload);
        logger.trace("Sending ECHO_REQ to {}", addr);
        RequestCallback echoHandler = new RequestCallback()
        {
            @Override
            public void onResponse(Message msg)
            {
                // force processing of the echo response onto the gossip stage, 
as it comes in on the REQUEST_RESPONSE stage
                runInGossipStageBlocking(() -> {
                    try
                    {
                        EndpointState localEpStatePtr = 
endpointStateMap.get(addr);
                        realMarkAlive(addr, localEpStatePtr);
                    }
                    finally
                    {
                        inflightEcho.remove(addr);
                    }
                });
            }
 {code}
Not sure if this allows for race condition around endpointStateMap (eg. you 
have a call to handleMajorChange putting a new entry that gets marked dead 
after the call to get localEpStatePtr in the onResponse callback)

> Node sends multiple inflight echos
> ----------------------------------
>
>                 Key: CASSANDRA-18866
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-18866
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Cluster/Gossip
>            Reporter: Cameron Zemek
>            Assignee: Cameron Zemek
>            Priority: Normal
>             Fix For: 5.x
>
>         Attachments: 18866-regression.patch, duplicates.log, echo.log
>
>
> CASSANDRA-18854 rolled back the changes from CASSANDRA-18845. In particular, 
> 18845 had change to only allow 1 inflight ECHO request at a time. As per 
> 18854 some tests have an error rate due to this change. Creating this ticket 
> to discuss this further. As the current state also does not have retry logic, 
> it just allowing multiple ECHO requests inflight at the same time so less 
> likely that all ECHO will timeout or get lost.
> With the change from 18845 adding in some extra logging to track what is 
> going on, I do see it retrying ECHOs. Likewise, I patched a node to drop ECHO 
> requests from a node and also see it retrying ECHOs when it doesn't get a 
> reply.
> Therefore, I think the problem is more specific than the dropping of one ECHO 
> request. Yes there no retry logic for failed ECHO requests, but this is the 
> case even both before and after 18845. ECHO requests are only sent via gossip 
> verb handlers calling applyStateLocally. In these failed tests I therefore 
> assuming their cases where it won't call markAlive when other nodes consider 
> the node UP but its marked DOWN by a node.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to