Hi all,

We are using Akka Cluster with 2 types of nodes, master and worker. There
are 2 master nodes, both of which are also seed nodes, and the actors on
those nodes are cluster singletons. There are 8 worker nodes. All processes
are started and stopped with Wrapper (version 3.2.3,
http://wrapper.tanukisoftware.org), and each node is on its own virtual host.

The issue we are noticing is that if we stop and start a worker, the cluster
ignores its attempt to rejoin. The log message is:
2014-09-03 22:36:35,107  INFO 
[ClusterSystem-akka.actor.default-dispatcher-3] Cluster Node 
[akka.tcp://blah blah blah] - Existing member 
[UniqueAddress(akka.tcp://blah blah blah)] is trying to join, ignoring

We tried waiting for a while before restarting the worker, but it didn't
solve the issue. This doesn't happen in our staging environment, which has 2
workers. That points to a configuration difference between the 2
environments, but I have double-checked the configurations and they are
identical other than the IP addresses and cluster name.

Interestingly, once we stop a worker, our production logs show the cluster
repeating the gated message every 10 seconds or so:
2014-09-03 22:36:11,130  WARN 
[ClusterSystem-akka.actor.default-dispatcher-2] Association with remote 
system [akka.tcp://blah blah blah] has failed, address is now gated for 
[5000] ms. Reason is: [Association failed with [akka.tcp://blah blah blah]].

There's another issue that may be related, and it only happens in our
production environment: if we shut the "active" master process down, the
2nd master's actor does not start up. The log files do show that the cluster
has detected that the "active" master is no longer responding.

Below are the configurations for both Master and Worker.

Any ideas? Thanks.

Regards,

**** MASTER config ****
akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
    debug{
      autoreceive = off
      lifecycle = off
      event-stream = off
    }
  }

  cluster-dispatcher{
    type = "Dispatcher"
    executor = "fork-join-executor"
    fork-join-executor{
      parallelism-min = 2
      parallelism-max = 4
    }
  }

  remote {
    log-remote-lifecycle-events = off
    log-received-messages = off
    netty.tcp {
      hostname = "10.6.206.154"
      port = 40000
    }
  }

  cluster {
    seed-nodes = [
      "akka.tcp://[email protected]:40000",
      "akka.tcp://[email protected]:40000"
    ]
 
    roles=["MASTER", "SCHEDULER"]
    retry-unsuccessful-join-after = 5s

    auto-down-unreachable-after = 10s
    #unreachable-nodes-reaper-interval = 1s

    failure-detector{
      #heartbeat-interval=1s
      threshold = 12.0
      #acceptable-heartbeat-pause=2s
      #expected-response-after=2s
    }

    use-dispatcher = akka.cluster-dispatcher

  }

  loggers = ["akka.event.slf4j.Slf4jLogger"]
  # Options: OFF, ERROR, WARNING, INFO, DEBUG
  loglevel = "DEBUG"
  log-config-on-start = off
  
}

**** WORKER config ****
akka {
  actor {
    provider = "akka.cluster.ClusterActorRefProvider"
    debug{
      autoreceive = off
      lifecycle = off
      event-stream = off
    }
  }

  cluster-dispatcher{
    type = "Dispatcher"
    executor = "fork-join-executor"
    fork-join-executor{
      parallelism-min = 2
      parallelism-max = 4
    }
  }

  remote {
    log-remote-lifecycle-events = off
    log-received-messages = off
    netty.tcp {
      hostname = "10.6.206.136"
      port = 45000
    }
  }

  cluster {
    seed-nodes = [
      "akka.tcp://[email protected]:40000",
      "akka.tcp://[email protected]:40000"]

    roles=["WORKER"]
    retry-unsuccessful-join-after = 5s
    #disable auto-down - worker should never leave the cluster
    #auto-down-unreachable-after = 10s

    use-dispatcher = akka.cluster-dispatcher
  }

  loggers = ["akka.event.slf4j.Slf4jLogger"]
  # Options: OFF, ERROR, WARNING, INFO, DEBUG
  loglevel = "INFO"
  log-config-on-start = off

}
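
Apart from roles, addresses and log level, the one cluster-level difference
between the two files above is that auto-down is enabled on the masters and
commented out on the workers. As far as I understand the docs, auto-downing
is carried out by whichever node is currently the cluster leader, which is
not necessarily a master. So one thing we could try is the variant below. It
is an untested sketch, not a confirmed fix, and the 10s value is simply an
assumption copied from the master config.

**** WORKER config, cluster section with auto-down enabled (untested sketch) ****
akka {
  cluster {
    roles=["WORKER"]
    retry-unsuccessful-join-after = 5s
    # assumption: same value as the master config, not yet tested
    auto-down-unreachable-after = 10s

    use-dispatcher = akka.cluster-dispatcher
  }
}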

