[
https://issues.apache.org/jira/browse/CASSANDRA-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206609#comment-13206609
]
Peter Schuller commented on CASSANDRA-3895:
-------------------------------------------
So I talked to Chris and he seemed to agree with you. But I just have trouble
getting it. Here is the code from Gossiper (the version from the parent ticket):
{code}
if (epState != null)
{
    long duration = now - epState.getUpdateTimestamp();

    // check if this is a fat client. fat clients are removed automatically from
    // gossip after FatClientTimeout. Do not remove dead states here.
    if (!isDeadState(epState) && !epState.isAlive()
        && !StorageService.instance.getTokenMetadata().isReadEligibleMember(endpoint)
        && !justRemovedEndpoints.containsKey(endpoint) && (duration > FatClientTimeout))
    {
        logger.info("FatClient " + endpoint + " has been silent for " + FatClientTimeout + "ms, removing from gossip");
        removeEndpoint(endpoint); // will put it in justRemovedEndpoints to respect quarantine delay
        evictFromMembership(endpoint); // can get rid of the state immediately
    }

    // check for dead state removal
    long expireTime = getExpireTimeForEndpoint(endpoint);
    if (!epState.isAlive() && (now > expireTime)
        && (!StorageService.instance.getTokenMetadata().isReadEligibleMember(endpoint)))
    {
        if (logger.isDebugEnabled())
        {
            logger.debug("time is expiring for endpoint : " + endpoint + " (" + expireTime + ")");
        }
        evictFromMembership(endpoint);
    }
}
{code}
So we have two distinct checks: one claims to be removing a fat client, while
the other just expires the endpoint silently (unless debug logging is enabled).
What is the reason for the distinction? Note also that the fat client timeout
is different from the expiration timeout.
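
To make the difference between the two checks concrete, here is a minimal,
self-contained sketch that restates both conditions as pure predicates. It is
not the Gossiper code: the class EvictionChecks, the Endpoint holder and its
field names are hypothetical, the 30 second fat client timeout is an
illustrative value, and the 3 day expire time is only assumed from the
aVeryLongTime value mentioned in the issue description.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified model of the two eviction checks quoted above. */
public class EvictionChecks
{
    // Illustrative value only (assumption); the real timeout comes from Cassandra's configuration.
    static final long FAT_CLIENT_TIMEOUT_MS = TimeUnit.SECONDS.toMillis(30);

    /** Snapshot of the inputs both conditions read. All field names are assumptions. */
    static final class Endpoint
    {
        final boolean deadState;    // stands in for isDeadState(epState)
        final boolean alive;        // stands in for epState.isAlive()
        final boolean readEligible; // stands in for isReadEligibleMember(endpoint)
        final boolean justRemoved;  // stands in for justRemovedEndpoints.containsKey(endpoint)
        final long lastUpdateMs;    // stands in for epState.getUpdateTimestamp()
        final long expireTimeMs;    // stands in for getExpireTimeForEndpoint(endpoint)

        Endpoint(boolean deadState, boolean alive, boolean readEligible,
                 boolean justRemoved, long lastUpdateMs, long expireTimeMs)
        {
            this.deadState = deadState;
            this.alive = alive;
            this.readEligible = readEligible;
            this.justRemoved = justRemoved;
            this.lastUpdateMs = lastUpdateMs;
            this.expireTimeMs = expireTimeMs;
        }
    }

    /** First check: the "fat client" branch, which skips endpoints in a dead state. */
    static boolean fatClientCheck(Endpoint e, long now)
    {
        long duration = now - e.lastUpdateMs;
        return !e.deadState && !e.alive && !e.readEligible
               && !e.justRemoved && duration > FAT_CLIENT_TIMEOUT_MS;
    }

    /** Second check: the expiry branch, which does not look at the dead state at all. */
    static boolean expiryCheck(Endpoint e, long now)
    {
        return !e.alive && now > e.expireTimeMs && !e.readEligible;
    }

    public static void main(String[] args)
    {
        long threeDays = TimeUnit.DAYS.toMillis(3);

        // A silent, non-read-eligible endpoint that is not in a dead state
        // (for example a bootstrapping node that stopped gossiping).
        Endpoint silent = new Endpoint(false, false, false, false, 0L, threeDays);
        long after60s = TimeUnit.SECONDS.toMillis(60);
        System.out.println("silent: fatClient=" + fatClientCheck(silent, after60s)
                           + " expiry=" + expiryCheck(silent, after60s));

        // An endpoint in a dead state, looked at just after its expire time.
        Endpoint dead = new Endpoint(true, false, false, false, 0L, threeDays);
        System.out.println("dead:   fatClient=" + fatClientCheck(dead, threeDays + 1)
                           + " expiry=" + expiryCheck(dead, threeDays + 1));
    }
}
```

Under these assumptions the first endpoint is caught by the fat client check
long before its expire time is reached, while the second is only ever caught
by the expiry check, so the two branches act on different sets of endpoints
and on very different time scales.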
> Gossiper.doStatusCheck() uses isMember() suspiciously
> -----------------------------------------------------
>
> Key: CASSANDRA-3895
> URL: https://issues.apache.org/jira/browse/CASSANDRA-3895
> Project: Cassandra
> Issue Type: Sub-task
> Components: Core
> Reporter: Peter Schuller
> Assignee: Peter Schuller
> Priority: Minor
>
> There is code for fat client removal and "old" endpoint (non-fat) removal
> which uses {{TokenMetadata.isMember()}} which only considers nodes that are
> joined (takes reads) in the cluster.
> aVeryLongTime is set to 3 days.
> I could very well be wrong, but the fat client identification code, the way I
> interpret it, is using isMember() to check basically whether a node is "part
> of the cluster" (in the most vague/broad sense) in order to differentiate a
> "real" node (part of the cluster) from just a fat client. But a node that is
> bootstrapping is not a fat client, nor will it be a member according to
> isMember().
> I'm also a bit scared of, even in the case of there not being a fat client
> identification, simply forgetting an endpoint. It seems that an operator
> request should be relied upon to actively forget an endpoint (i.e., forced
> remove token).