[ 
https://issues.apache.org/jira/browse/CASSANDRA-3895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206609#comment-13206609
 ] 

Peter Schuller commented on CASSANDRA-3895:
-------------------------------------------

So I talked to Chris and he seemed to agree with you. But I just have trouble 
getting it. Here is the code from Gossiper (the version from the parent ticket):

{code}
            if ( epState != null )
            {
                long duration = now - epState.getUpdateTimestamp();

                // check if this is a fat client. fat clients are removed automatically from
                // gossip after FatClientTimeout.  Do not remove dead states here.
                if (!isDeadState(epState)
                        && !epState.isAlive()
                        && !StorageService.instance.getTokenMetadata().isReadEligibleMember(endpoint)
                        && !justRemovedEndpoints.containsKey(endpoint)
                        && (duration > FatClientTimeout))
                {
                    logger.info("FatClient " + endpoint + " has been silent for " + FatClientTimeout + "ms, removing from gossip");
                    removeEndpoint(endpoint); // will put it in justRemovedEndpoints to respect quarantine delay
                    evictFromMembership(endpoint); // can get rid of the state immediately
                }

                // check for dead state removal
                long expireTime = getExpireTimeForEndpoint(endpoint);
                if (!epState.isAlive() && (now > expireTime)
                        && (!StorageService.instance.getTokenMetadata().isReadEligibleMember(endpoint)))
                {
                    if (logger.isDebugEnabled())
                    {
                        logger.debug("time is expiring for endpoint : " + endpoint + " (" + expireTime + ")");
                    }
                    evictFromMembership(endpoint);
                }
            }
{code}

So we have two distinct checks: one claims to be removing a fat client, while 
the other just expires the endpoint silently (unless debug logging is enabled). 
What is the reason for the distinction? Note also that the fat client timeout 
is different from the expiration timeout.
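
To make the difference between the two checks concrete, here is a minimal, self-contained sketch that models each condition as a plain predicate. It is not the Gossiper code: the class, the `Endpoint` holder, and both timeout constants are hypothetical stand-ins chosen for illustration (the 3-day figure is taken from aVeryLongTime in the issue description; the 30-second fat client timeout is an assumed value).

```java
import java.util.concurrent.TimeUnit;

/** Illustrative model of the two removal checks; not Cassandra's actual classes. */
public class GossipCheckSketch
{
    // Assumed values, for illustration only.
    static final long FAT_CLIENT_TIMEOUT_MS = 30_000L;
    static final long EXPIRE_AFTER_MS = TimeUnit.DAYS.toMillis(3);

    /** Hypothetical stand-in for the per-endpoint facts both checks consult. */
    static final class Endpoint
    {
        final boolean deadState;
        final boolean alive;
        final boolean readEligibleMember;
        final boolean justRemoved;
        final long lastUpdateMs;
        final long expireTimeMs;

        Endpoint(boolean deadState, boolean alive, boolean readEligibleMember,
                 boolean justRemoved, long lastUpdateMs, long expireTimeMs)
        {
            this.deadState = deadState;
            this.alive = alive;
            this.readEligibleMember = readEligibleMember;
            this.justRemoved = justRemoved;
            this.lastUpdateMs = lastUpdateMs;
            this.expireTimeMs = expireTimeMs;
        }
    }

    /** Mirrors the shape of the first condition (the "fat client" check). */
    static boolean fatClientCheck(Endpoint e, long nowMs)
    {
        long duration = nowMs - e.lastUpdateMs;
        return !e.deadState
            && !e.alive
            && !e.readEligibleMember
            && !e.justRemoved
            && duration > FAT_CLIENT_TIMEOUT_MS;
    }

    /** Mirrors the shape of the second condition (dead state expiry). */
    static boolean expiryCheck(Endpoint e, long nowMs)
    {
        return !e.alive && nowMs > e.expireTimeMs && !e.readEligibleMember;
    }

    public static void main(String[] args)
    {
        // A node that is still bootstrapping (so not read-eligible) and has
        // been silent for 60 seconds.
        Endpoint bootstrapping =
            new Endpoint(false, false, false, false, 0L, EXPIRE_AFTER_MS);
        long now = 60_000L;

        System.out.println("fat client check fires: " + fatClientCheck(bootstrapping, now));
        System.out.println("expiry check fires:     " + expiryCheck(bootstrapping, now));
    }
}
```

Under these assumptions, a silent bootstrapping node satisfies the first predicate after the short timeout, long before the second predicate could fire, even though it is not a fat client. That is the case the description below worries about.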

                
> Gossiper.doStatusCheck() uses isMember() suspiciously
> -----------------------------------------------------
>
>                 Key: CASSANDRA-3895
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3895
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Core
>            Reporter: Peter Schuller
>            Assignee: Peter Schuller
>            Priority: Minor
>
> There is code for fat client removal and "old" endpoint (non-fat) removal 
> which uses {{TokenMetadata.isMember()}} which only considers nodes that are 
> joined (takes reads) in the cluster.
> aVeryLongTime is set to 3 days.
> I could very well be wrong, but the fat client identification code, the way I 
> interpret it, is using isMember() to check basically whether a node is "part 
> of the cluster" (in the most vague/broad sense) in order to differentiate a 
> "real" node (part of the cluster) from just a fat client. But a node that is 
> bootstrapping is not a fat client, nor will it be a member according to 
> isMember().
> I'm also a bit scared of, even in the case of there not being a fat client 
> identification, simply forgetting an endpoint. It seems that an operator 
> request should be relied upon to actively forget an endpoint (i.e., forced 
> remove token).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
