[jira] [Updated] (CASSANDRA-11038) Is node being restarted treated as node joining?

Sam Tunnicliffe (JIRA) Fri, 20 May 2016 09:45:49 -0700

     [ 
https://issues.apache.org/jira/browse/CASSANDRA-11038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sam Tunnicliffe updated CASSANDRA-11038:
----------------------------------------
    Fix Version/s: 3.x
                   3.0.x
                   2.2.x
           Status: Patch Available  (was: Open)


Pushed branches with fixes for 2.2/3.0/3.7/trunk, though the fix merges 
forward cleanly apart from conflicts where I've cleaned up imports. Basically, 
these preserve the existing behaviour of delivering both {{NEW_NODE}} and 
{{UP}} events when a node first joins the cluster, and of delaying both until 
after the node becomes available for clients. The erroneous {{NEW_NODE}} when a 
known node is restarted has been removed. The tracking of pushed notifications 
in {{EventNotifier}} is still necessary at the moment (because 
[reasons|https://issues.apache.org/jira/browse/CASSANDRA-7816?focusedCommentId=14346387&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14346387]),
 but it will go away with CASSANDRA-9156. See CASSANDRA-11731 for some 
related discussion.
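
For readers who want the intended behaviour in one place, here is a minimal sketch of the rule described above. It is not the patch itself and not the real {{EventNotifier}}: the class {{TopologyNotifierSketch}}, its methods, and the use of plain strings for endpoints and events are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch only; names and structure are not taken from Cassandra.
final class TopologyNotifierSketch
{
    // Endpoints this node has already seen in the cluster
    private final Set<String> knownEndpoints = new HashSet<>();
    // Endpoints that joined for the first time and still need a NEW_NODE push
    private final Set<String> pendingNewNodes = new HashSet<>();
    // Events pushed to clients, in order (stands in for the real client push)
    final List<String> sent = new ArrayList<>();

    TopologyNotifierSketch(Set<String> alreadyKnown)
    {
        knownEndpoints.addAll(alreadyKnown);
    }

    // Called for first joins and restarts alike; nothing is pushed yet.
    void onJoin(String endpoint)
    {
        // add() returns false for a known endpoint, i.e. a restart
        if (knownEndpoints.add(endpoint))
            pendingNewNodes.add(endpoint);
    }

    // Called once the endpoint is available for clients.
    void onClientReady(String endpoint)
    {
        // NEW_NODE is pushed only for a first join, and only at this point
        if (pendingNewNodes.remove(endpoint))
            sent.add("NEW_NODE " + endpoint);
        sent.add("UP " + endpoint);
    }
}
```

In this sketch a first join produces {{NEW_NODE}} followed by {{UP}} once the node is ready for clients, while a restart of a known node produces only {{UP}}.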

dtest branch [here|https://github.com/beobal/cassandra-dtest/tree/11038]

||branch||testall||dtest||
|[11038-2.2|https://github.com/beobal/cassandra/tree/11038-2.2]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-2.2-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-2.2-dtest]|
|[11038-3.0|https://github.com/beobal/cassandra/tree/11038-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.0-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.0-dtest]|
|[11038-3.7|https://github.com/beobal/cassandra/tree/11038-3.7]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.7-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-3.7-dtest]|
|[11038-trunk|https://github.com/beobal/cassandra/tree/11038-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-trunk-testall]|[dtest|http://cassci.datastax.com/view/Dev/view/beobal/job/beobal-11038-trunk-dtest]|

(So far I've only kicked off CI for the 2.2 branch, in case there's some 
problem I didn't run into locally; I'll kick off the other jobs when that 
finishes.)

> Is node being restarted treated as node joining?
> ------------------------------------------------
>
>                 Key: CASSANDRA-11038
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11038
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Distributed Metadata
>            Reporter: cheng ren
>            Assignee: Sam Tunnicliffe
>            Priority: Minor
>             Fix For: 2.2.x, 3.0.x, 3.x
>
>
> Hi, 
> We recently found that every time we restart a node, all other nodes 
> in the cluster treat the restarted node as a new node joining and issue a 
> node-joining notification to clients. We have traced the code path hit when 
> a peer node detects a restarted node:
> src/java/org/apache/cassandra/gms/Gossiper.java
> {code}
>     private void handleMajorStateChange(InetAddress ep, EndpointState epState)
>     {
>         if (!isDeadState(epState))
>         {
>             if (endpointStateMap.get(ep) != null)
>                 logger.info("Node {} has restarted, now UP", ep);
>             else
>                 logger.info("Node {} is now part of the cluster", ep);
>         }
>         if (logger.isTraceEnabled())
>             logger.trace("Adding endpoint state for " + ep);
>         endpointStateMap.put(ep, epState);
>         // the node restarted: it is up to the subscriber to take whatever action is necessary
>         for (IEndpointStateChangeSubscriber subscriber : subscribers)
>             subscriber.onRestart(ep, epState);
>         if (!isDeadState(epState))
>             markAlive(ep, epState);
>         else
>         {
>             logger.debug("Not marking " + ep + " alive due to dead state");
>             markDead(ep, epState);
>         }
>         for (IEndpointStateChangeSubscriber subscriber : subscribers)
>             subscriber.onJoin(ep, epState);
>     }
> {code}
> subscriber.onJoin(ep, epState) ends up calling onJoinCluster in Server.java:
> src/java/org/apache/cassandra/transport/Server.java
> {code}
>         public void onJoinCluster(InetAddress endpoint)
>         {
>             server.connectionTracker.send(Event.TopologyChange.newNode(getRpcAddress(endpoint), server.socket.getPort()));
>         }
> {code}
> We have a full trace of the code path but skip some intermediate function 
> calls here for brevity. 
> Upon receiving the node-joining notification, clients scan the 
> system peers table to fetch the latest topology information. Since we have 
> tens of thousands of client connections, the scans from all of them put an 
> enormous load on our cluster. 
> Although newer versions of the driver skip fetching the peers table if 
> the new node already exists in local metadata, we are still curious why a 
> restarted node is handled as a joining node on the server side. Did we hit a 
> bug, or is this the intended behaviour? Our old Java driver version is 1.0.4 
> and our Cassandra version is 2.0.12.
> Thanks!
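
The client-side mitigation mentioned in the description (skipping the peers table fetch when the node is already in local metadata) can be sketched roughly as follows. This is an illustration only, not the Java driver's code or API: {{PeerRefreshSketch}}, {{onNewNodeEvent}} and the scan counter are hypothetical names.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical client-side sketch; not the Java driver's actual API.
final class PeerRefreshSketch
{
    // Hosts already present in the client's local metadata
    private final Set<String> knownHosts = new HashSet<>();
    // Counts simulated scans of the peers table
    int peerTableScans = 0;

    // Handles a node-joining notification pushed by the server.
    void onNewNodeEvent(String host)
    {
        // add() returns true only for a host not yet in local metadata,
        // so a repeated notification for a known host triggers no scan
        if (knownHosts.add(host))
            peerTableScans++;
    }
}
```

With a check like this, a spurious notification for a restarted node costs the cluster nothing, which is why the load problem is mostly visible with older drivers.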



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
