[
https://issues.apache.org/jira/browse/CASSANDRA-18866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17845360#comment-17845360
]
Cameron Zemek commented on CASSANDRA-18866:
-------------------------------------------
found a bug with this patch,
{code:java}
private void handleMajorStateChange(InetAddressAndPort ep, EndpointState
epState)
{
// omitted for brevity
endpointStateMap.put(ep, epState);
if (localEpState != null)
{ // the node restarted: it is up to the subscriber to take whatever
action is necessary
for (IEndpointStateChangeSubscriber subscriber : subscribers)
subscriber.onRestart(ep, localEpState);
}
if (!isDeadState(epState))
markAlive(ep, epState);
{code}
markAlive is passed the remote epState that just got put into the
endpointStateMap which has isAlive = true on it.
{code:java}
private void markAlive(final InetAddressAndPort addr, final EndpointState
localState)
{
if (inflightEcho.contains(addr))
{
return;
}
inflightEcho.add(addr);
localState.markDead(); {code}
But we don't enter markAlive when already inflight echo request. So
endpointStateMap now has entry with isAlive = true, but unreachableEndpoints
has the down node. So now `nodetool status` and down endpoint count do not
match.
The fix is have the onResponse to ECHO update the entry currently in the map.
And always update the passed in state to dead.
{code:java}
private void markAlive(final InetAddressAndPort addr, final EndpointState
localState)
{
localState.markDead();
if (!inflightEcho.add(addr))
{
return;
}
Message<NoPayload> echoMessage = Message.out(ECHO_REQ, noPayload);
logger.trace("Sending ECHO_REQ to {}", addr);
RequestCallback echoHandler = new RequestCallback()
{
@Override
public void onResponse(Message msg)
{
// force processing of the echo response onto the gossip stage,
as it comes in on the REQUEST_RESPONSE stage
runInGossipStageBlocking(() -> {
try
{
EndpointState localEpStatePtr =
endpointStateMap.get(addr);
realMarkAlive(addr, localEpStatePtr);
}
finally
{
inflightEcho.remove(addr);
}
});
}
{code}
Not sure if this allows for race condition around endpointStateMap (eg. you
have a call to handleMajorChange putting a new entry that gets marked dead
after the call to get localEpStatePtr in the onResponse callback)
> Node sends multiple inflight echos
> ----------------------------------
>
> Key: CASSANDRA-18866
> URL: https://issues.apache.org/jira/browse/CASSANDRA-18866
> Project: Cassandra
> Issue Type: Improvement
> Components: Cluster/Gossip
> Reporter: Cameron Zemek
> Assignee: Cameron Zemek
> Priority: Normal
> Fix For: 5.x
>
> Attachments: 18866-regression.patch, duplicates.log, echo.log
>
>
> CASSANDRA-18854 rolled back the changes from CASSANDRA-18845. In particular,
> 18845 had change to only allow 1 inflight ECHO request at a time. As per
> 18854 some tests have an error rate due to this change. Creating this ticket
> to discuss this further. As the current state also does not have retry logic,
> it just allowing multiple ECHO requests inflight at the same time so less
> likely that all ECHO will timeout or get lost.
> With the change from 18845 adding in some extra logging to track what is
> going on, I do see it retrying ECHOs. Likewise, I patched a node to drop ECHO
> requests from a node and also see it retrying ECHOs when it doesn't get a
> reply.
> Therefore, I think the problem is more specific than the dropping of one ECHO
> request. Yes there no retry logic for failed ECHO requests, but this is the
> case even both before and after 18845. ECHO requests are only sent via gossip
> verb handlers calling applyStateLocally. In these failed tests I therefore
> assuming their cases where it won't call markAlive when other nodes consider
> the node UP but its marked DOWN by a node.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]