Hi Martynas,

Thanks for the response. I checked the settings and can confirm that staging
and production do not share the same hostname, port, or seed nodes.

Is there a way to force the cluster to let a node rejoin?
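
For reference, the one setting I know of that removes a stopped node
automatically is auto-down, which is currently commented out in the worker
config quoted below. A minimal, untested sketch of enabling it on the
workers, assuming the 10s value from the master config is acceptable there
and that the stale membership entry is what blocks the rejoin:

**** WORKER config (sketch, untested) ****
akka {
  cluster {
    # Assumption: same timeout as on the masters. The leader marks a
    # member as down once it has been unreachable for this long.
    auto-down-unreachable-after = 10s
  }
}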

Regards,



On Saturday, September 6, 2014 2:09:26 AM UTC-7, Martynas Mickevičius wrote:
>
> Hi Joe,
>
> Your configuration seems correct. I ran a small example with it, and it
> works as expected.
>
> Are you sure you do not share the hostname, port, and seed-nodes
> configuration between your staging and production environments? My guess
> is that an ActorSystem from staging interferes with an ActorSystem from
> production. I know it's a long shot, but it is worth checking.
>
>
> On Thu, Sep 4, 2014 at 8:21 PM, Joe Wong <[email protected]> wrote:
>
>> Hi all,
>>
>> We are using Akka Cluster with two types of nodes, master and worker.
>> There are 2 master nodes, both of which are also seed nodes, and the
>> actors on those nodes are cluster singletons. There are 8 worker nodes.
>> All processes are started and stopped with Wrapper (version 3.2.3,
>> http://wrapper.tanukisoftware.org), and each node is on its own virtual
>> host.
>>
>> The issue we are noticing is that if we stop and start a worker, the
>> cluster ignores its attempt to rejoin. The log message is:
>> 2014-09-03 22:36:35,107  INFO 
>> [ClusterSystem-akka.actor.default-dispatcher-3] Cluster Node 
>> [akka.tcp://blah blah blah] - Existing member 
>> [UniqueAddress(akka.tcp://blah blah blah)] is trying to join, ignoring
>>
>> We tried waiting for a while before restarting the worker, but it didn't
>> solve the issue. This doesn't happen in our staging environment, which has
>> 2 workers. This points to a configuration difference between the 2
>> environments, but I have double-checked them and they are identical other
>> than the IP addresses and cluster name.
>>
>> Interestingly, once we stop a worker, our production logs show the cluster
>> repeating the gated message every 10 seconds or so:
>> 2014-09-03 22:36:11,130  WARN 
>> [ClusterSystem-akka.actor.default-dispatcher-2] Association with remote 
>> system [akka.tcp://blah blah blah] has failed, address is now gated for 
>> [5000] ms. Reason is: [Association failed with [akka.tcp://blah blah blah]].
>>
>> There's another issue that may be related, and it only happens in our
>> production environment. If we shut down the "active" master process, the
>> 2nd master actor does not start up. The log files do show that the cluster
>> has detected that the "active" master is no longer responding.
>>
>> Below are the configurations for both Master and Worker.
>>
>> Any ideas? Thanks.
>>
>> Regards,
>>
>> **** MASTER config ****
>> akka {
>>   actor {
>>     provider = "akka.cluster.ClusterActorRefProvider"
>>     debug{
>>       autoreceive = off
>>       lifecycle = off
>>       event-stream = off
>>     }
>>   }
>>
>>   cluster-dispatcher{
>>     type = "Dispatcher"
>>     executor = "fork-join-executor"
>>     fork-join-executor{
>>       parallelism-min = 2
>>       parallelism-max = 4
>>     }
>>   }
>>
>>   remote {
>>     log-remote-lifecycle-events = off
>>     log-received-messages = off
>>     netty.tcp {
>>       hostname = "10.6.206.154"
>>       port = 40000
>>     }
>>   }
>>
>>   cluster {
>>     seed-nodes = [
>>       "akka.tcp://[email protected]:40000",
>>       "akka.tcp://[email protected]:40000"
>>     ]
>>  
>>     roles=["MASTER", "SCHEDULER"]
>>     retry-unsuccessful-join-after = 5s
>>
>>     auto-down-unreachable-after = 10s
>>     #unreachable-nodes-reaper-interval = 1s
>>
>>     failure-detector{
>>       #heartbeat-interval=1s
>>       threshold = 12.0
>>       #acceptable-heartbeat-pause=2s
>>       #expected-response-after=2s
>>     }
>>
>>     use-dispatcher = akka.cluster-dispatcher
>>
>>   }
>>
>>   loggers = ["akka.event.slf4j.Slf4jLogger"]
>>   # Options: OFF, ERROR, WARNING, INFO, DEBUG
>>   loglevel = "DEBUG"
>>   log-config-on-start = off
>>   
>> }
>>
>> **** WORKER config****
>> akka {
>>   actor {
>>     provider = "akka.cluster.ClusterActorRefProvider"
>>     debug{
>>       autoreceive = off
>>       lifecycle = off
>>       event-stream = off
>>     }
>>   }
>>
>>   cluster-dispatcher{
>>     type = "Dispatcher"
>>     executor = "fork-join-executor"
>>     fork-join-executor{
>>       parallelism-min = 2
>>       parallelism-max = 4
>>     }
>>   }
>>
>>   remote {
>>     log-remote-lifecycle-events = off
>>     log-received-messages = off
>>     netty.tcp {
>>       hostname = "10.6.206.136"
>>       port = 45000
>>     }
>>   }
>>
>>   cluster {
>>     seed-nodes = [
>>       "akka.tcp://[email protected]:40000",
>>       "akka.tcp://[email protected]:40000"]
>>
>>     roles=["WORKER"]
>>     retry-unsuccessful-join-after = 5s
>>     #disable auto-down - worker should never leave the cluster
>>     #auto-down-unreachable-after = 10s
>>
>>     use-dispatcher = akka.cluster-dispatcher
>>   }
>>
>>   loggers = ["akka.event.slf4j.Slf4jLogger"]
>>   # Options: OFF, ERROR, WARNING, INFO, DEBUG
>>   loglevel = "INFO"
>>   log-config-on-start = off
>>
>> }
>>
>>
>>  -- 
>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>> >>>>>>>>>> Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>> --- 
>> You received this message because you are subscribed to the Google Groups 
>> "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at http://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Martynas Mickevičius
> Typesafe <http://typesafe.com/> – Reactive 
> <http://www.reactivemanifesto.org/> Apps on the JVM
>  

