Thanks Steven.

We have an application setup that is similar to yours and we are also using 
death watch as you described. 

Do your SrvA actors have state that they don't want to lose in case of a 
quarantine? If so, in your remoting actor system on SrvA do you just use 
forwarding actors to pass messages between application actors in SrvA and 
actors in SrvB? I am thinking this strategy may need special care if 
messages can have embedded actor references to SrvA's application actors.
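
To make the question concrete, here is a minimal sketch of the forwarding idea as I understand it. The `Bridge` class and its constructor parameters are hypothetical names of mine, not something from either of our code bases. It lives in the remoting actor system and sends outbound messages with itself as the sender, so no ActorRef from the non-remoting system travels as sender(). References embedded inside the message payload are not handled here and would still need to be rewritten separately.

```scala
import akka.actor.{Actor, ActorRef, ActorSelection, Props}

// Hypothetical bridge actor, created in the remoting ActorSystem on SrvA.
// localTarget: an application actor in SrvA's non-remoting ActorSystem (same JVM).
// remotePeer:  a selection pointing at the counterpart actor on SrvB.
class Bridge(localTarget: ActorRef, remotePeer: ActorSelection) extends Actor {
  def receive = {
    // Outbound: use `!` rather than `forward`, so the sender SrvB sees is this
    // bridge (remotely addressable) and not the local application actor.
    case msg if sender() == localTarget => remotePeer ! msg
    // Inbound: anything from anyone else is handed to the local application actor.
    case msg => localTarget ! msg
  }
}

object Bridge {
  def props(localTarget: ActorRef, remotePeer: ActorSelection): Props =
    Props(new Bridge(localTarget, remotePeer))
}
```

Replies from the local actor to sender() come back to the bridge and go out again through the first case, so the round trip closes without exposing a local ref.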

Thanks again.

Azad

On Friday, August 29, 2014 4:29:12 PM UTC-7, Steven Scott wrote:
>
> In my setup for the issue in question at the time of my first post, I have 
> 2 long-lived services, SrvA & SrvB - SrvB is a "server"-type service (SrvA 
> initiates requests, as needed, to SrvB, and expects replies to those 
> requests), so in my case it was easiest to create one "Remoting" 
> ActorSystem on SrvA; SrvB only runs one ActorSystem, ever. Theoretically we 
> may run many copies of SrvA, all connecting to SrvB, which made it more 
> imperative that SrvB never restart its ActorSystem. SrvA subscribes to the 
> QuarantinedEvent, and if it receives one then a guardian actor restarts the 
> "Remoting" ActorSystem (which is able to re-connect to SrvB since it has a 
> new UID).
>
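
A rough sketch of the guardian described above, assuming Akka 2.3.x (where `shutdown()` and `awaitTermination()` are the ActorSystem lifecycle calls). The `RemotingGuardian` name and the `startRemoting` factory parameter are hypothetical; the factory is expected to build the "Remoting" ActorSystem from whatever configuration applies. The guardian itself would run in the main, non-remoting ActorSystem.

```scala
import akka.actor.{Actor, ActorSystem}
import akka.remote.QuarantinedEvent

// Hypothetical guardian: owns the remoting ActorSystem and recreates it
// whenever that system reports a quarantine.
class RemotingGuardian(startRemoting: () => ActorSystem) extends Actor {
  private var remoting: ActorSystem = subscribeTo(startRemoting())

  // Register this actor for QuarantinedEvent on the given system's event stream.
  private def subscribeTo(sys: ActorSystem): ActorSystem = {
    sys.eventStream.subscribe(self, classOf[QuarantinedEvent])
    sys
  }

  def receive = {
    case _: QuarantinedEvent =>
      // Stop the old system and wait, so its port is released before the
      // replacement binds. This blocks the guardian, which is acceptable
      // only because it does nothing else.
      remoting.shutdown()
      remoting.awaitTermination()
      // The new system gets a new UID, so the peer will accept it again.
      remoting = subscribeTo(startRemoting())
  }

  override def postStop(): Unit = remoting.shutdown()
}
```

One caveat with this sketch: QuarantinedEvent does not say which local system published it, so an event that was already queued from the old system would trigger a second restart. Real code may want to debounce.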
> As a kind-of-related aside, in the issue my first post was about, SrvB 
> also pushes messages to SrvA, after SrvA sends a 
> SendThisActorPushMessages(receiver: ActorRef) message. SrvB subscribes to 
> <receiver>'s DeathWatch (context watch <receiver>), and when it receives a 
> Terminated(<receiver>) message it simply removes <receiver> from its list 
> of push destinations. So SrvB doesn't really care about Quarantines, but it 
> does care about Terminated/DeathWatch. In the Akka Remoting case, any time 
> a remote is Quarantined you get a Terminated message for any of its actors 
> that you're DeathWatch'ing.
>
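
The push bookkeeping described above could be sketched like this. `SendThisActorPushMessages` is the message named in the post; the `PushRegistry` actor and the `Push` trigger message are hypothetical names added only so the example is complete.

```scala
import akka.actor.{Actor, ActorRef, Terminated}

// Message named in the post: asks SrvB to start pushing to `receiver`.
case class SendThisActorPushMessages(receiver: ActorRef)
// Hypothetical message that makes the registry push a payload to all receivers.
case class Push(payload: Any)

// Hypothetical actor on SrvB that keeps the list of push destinations.
class PushRegistry extends Actor {
  private var receivers = Set.empty[ActorRef]

  def receive = {
    case SendThisActorPushMessages(receiver) =>
      context.watch(receiver)   // DeathWatch; works for remote refs too
      receivers += receiver
    case Terminated(receiver) =>
      // Arrives when the receiver stops, or when its system is quarantined.
      receivers -= receiver
    case Push(payload) =>
      receivers.foreach(_ ! payload)
  }
}
```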
> On Friday, August 29, 2014 6:29:51 PM UTC-4, Azad Bolour wrote:
>>
>> Thanks Endre.
>>
>> My take from this is that the QuarantinedEvent will eventually be fired 
>> on both sides, so if I subscribe to it on one side only and recycle the 
>> actor system on that side I should be good. The only issue is that it might 
>> take a long time to get notified of the QuarantinedEvent, and for that I 
>> have to go study the timeout settings and adjust them accordingly.
>>
>> Azad
>>
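
For reference, these are the settings I would start by looking at, written as a small sketch that layers overrides on top of the normal configuration. The `RemotingTimeouts` object is a hypothetical name, and the values are placeholders chosen for illustration, not recommendations and not necessarily the defaults; reference.conf for the Akka version in use is the authority on which keys exist and what they mean.

```scala
import com.typesafe.config.{Config, ConfigFactory}

// Hypothetical holder for remoting timeout overrides.
object RemotingTimeouts {
  // Placeholder values only; check reference.conf before relying on them.
  val overrides: Config = ConfigFactory.parseString(
    """
    akka.remote {
      # failure detector behind remote DeathWatch
      watch-failure-detector.acceptable-heartbeat-pause = 10 s
      # failure detector for the transport connection itself
      transport-failure-detector.acceptable-heartbeat-pause = 20 s
      # how long a failed address stays gated before a reconnect attempt
      retry-gate-closed-for = 5 s
    }
    """)

  // Overrides win; everything else falls back to application.conf / reference.conf.
  def load(): Config = overrides.withFallback(ConfigFactory.load())
}
```

The resulting Config can be passed as the second argument to `ActorSystem(name, config)` when the remoting system is created.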
>> On Friday, August 29, 2014 3:04:59 AM UTC-7, Akka Team wrote:
>>>
>>> Hi Azad,
>>>
>>> The quarantine will likely happen on both sides (though it might take a 
>>> long time, depending on timeout settings), but you will get Terminated 
>>> messages only in those actors that are watching a remote actor.
>>>
>>> So if you have nodes A and B, and actor A1 on A watches actor B1 on B, 
>>> and the link between A and B goes away, then you will eventually see 
>>> (it might take long) Quarantined on both A and B, and A1 will receive a 
>>> Terminated for B1. Since B1 did not watch anything on A, it of course 
>>> does not receive any Terminated messages.
>>>
>>> -Endre
>>>
>>>
>>> On Thu, Aug 28, 2014 at 11:36 PM, Azad Bolour <[email protected]> 
>>> wrote:
>>>
>>>> Thank you Endre and Steven for your responses.
>>>>
>>>> A follow-up question. Do we have to set up remote death watch or 
>>>> subscribe to the QuarantinedEvent in both peers? Or do we get the 
>>>> terminated message and the quarantined event on one side no matter which 
>>>> side has quarantined the other? It would be a little simpler if we could 
>>>> just do this on one side, and recycle the actor system only on that side 
>>>> to re-establish the link.
>>>>
>>>> Thanks again.
>>>>
>>>> Azad
>>>>
>>>>
>>>> On Thursday, August 28, 2014 6:42:23 AM UTC-7, Steven Scott wrote:
>>>>>
>>>>> There's also the Remote Events section of 
>>>>> http://doc.akka.io/docs/akka/2.3.4/scala/remoting.html; I subscribe to 
>>>>> akka.remote.QuarantinedEvent events on the remoting ActorSystem.
>>>>>
>>>>> On Thursday, August 28, 2014 5:50:36 AM UTC-4, Akka Team wrote:
>>>>>>
>>>>>> Hi Azad,
>>>>>>
>>>>>>
>>>>>> On Wed, Aug 27, 2014 at 6:18 PM, Azad Bolour <[email protected]> 
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> I am wondering how one peer in Akka remoting can detect that the 
>>>>>>> other peer has been quarantined, so that it can restart the actor 
>>>>>>> system used for remoting to clear the quarantine. Is there an API call 
>>>>>>> for finding out that Akka has quarantined a peer? Or does the 
>>>>>>> application need to use, say, heartbeats to detect that the peer is 
>>>>>>> unreachable and deduce that it has been quarantined?
>>>>>>>
>>>>>>
>>>>>> The easiest way is just to use DeathWatch and watch one of the actors 
>>>>>> on the remote machine. If the other host goes away and gets 
>>>>>> quarantined, you will get a Terminated event. If this actor never 
>>>>>> stops otherwise (i.e. it only stops when the actor system goes away), 
>>>>>> then it basically does what you need. If you use clustering, though, 
>>>>>> you can just listen to cluster membership events, since that handles 
>>>>>> all these things for you.
>>>>>>
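
A minimal sketch of the DeathWatch approach suggested above. `PeerWatcher`, `remotePath` and `onPeerLost` are hypothetical names; the path would be something like the full remote address of a long-lived actor on the other host. The sketch resolves the path with Identify, watches the resulting ref, and treats Terminated as "peer gone or quarantined".

```scala
import akka.actor.{Actor, ActorIdentity, Identify, Terminated}

// Hypothetical watcher: resolves a long-lived actor by path and watches it.
class PeerWatcher(remotePath: String, onPeerLost: () => Unit) extends Actor {
  // Ask the actor at remotePath to identify itself; the path doubles as
  // the correlation id of the request.
  override def preStart(): Unit =
    context.actorSelection(remotePath) ! Identify(remotePath)

  def receive = {
    case ActorIdentity(`remotePath`, Some(ref)) =>
      context.watch(ref)          // from here on, Terminated will be delivered
    case ActorIdentity(`remotePath`, None) =>
      onPeerLost()                // nothing is running at that path
    case Terminated(_) =>
      onPeerLost()                // actor stopped, or its system was quarantined
  }
}
```

If the remote host is unreachable when the watcher starts, the Identify request may simply go unanswered, so a real version would probably retry or time out.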
>>>>>> -Endre
>>>>>>  
>>>>>>
>>>>>>>
>>>>>>> Many thanks.
>>>>>>>
>>>>>>> Azad
>>>>>>>
>>>>>>>
>>>>>>> On Monday, June 2, 2014 10:16:17 PM UTC-7, rkuhn wrote:
>>>>>>>
>>>>>>>> Hi Steven,
>>>>>>>>
>>>>>>>> thanks for this write-up, your analysis is thorough and correct on 
>>>>>>>> all counts.
>>>>>>>>
>>>>>>>> Remoting needs to use a simplistic approach to the coroner problem 
>>>>>>>> (i.e. when to declare another system “dead”; zombies are not 
>>>>>>>> tolerated), which is mostly just a timeout that you should set high 
>>>>>>>> enough to avoid false positives given your expected outages (network 
>>>>>>>> and GC). Using a dedicated (minimal) ActorSystem for the remoting 
>>>>>>>> should be the optimal solution for your use-case; the overhead is a 
>>>>>>>> few hundred milliseconds for starting it up plus its default 
>>>>>>>> dispatcher (which then should run the remoting etc.), saving the 
>>>>>>>> remoting dispatcher on the heavy local ActorSystem behind it. The 
>>>>>>>> added benefit you note is that you can then reconfigure the remoting 
>>>>>>>> part of the application at runtime.
>>>>>>>>
>>>>>>>> One thing to watch out for is that you don’t accidentally share an 
>>>>>>>> ActorRef from the local system in or with a remote message 
>>>>>>>> (including sender()), because that can of course not work if the 
>>>>>>>> originating system does not have remoting enabled. The symptom would 
>>>>>>>> be dropped messages.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Roland
>>>>>>>>
>>>>>>>> On 23 May 2014 at 20:27, Steven Scott <[email protected]> wrote:
>>>>>>>>
>>>>>>>> I've been slowly migrating an application to Akka since scala 2.10 
>>>>>>>> came out and pushed me away from scala actors; sorry for any stupid 
>>>>>>>> questions as I'm always learning.
>>>>>>>>
>>>>>>>> As a general picture, the application consists of multiple 
>>>>>>>> long-lived JVMs communicating over ActiveMQ. The standard deployment 
>>>>>>>> is to a single machine, but with multiple services communicating over 
>>>>>>>> AMQ for the ability to move specific pieces of functionality to other 
>>>>>>>> boxes. As the migration and component rewrites have progressed, I'm 
>>>>>>>> left solely with actors communicating with each other over AMQ using 
>>>>>>>> akka-camel. The natural next step was to explore akka-remote.
>>>>>>>>
>>>>>>>> My questions started out as "is this an abuse/unintended usage of 
>>>>>>>> akka-remote? Is akka-remote meant to be used outside of akka-cluster? 
>>>>>>>> Is it useful for communicating to local JVMs? What about network 
>>>>>>>> hiccups for remote JVMs?"
>>>>>>>>
>>>>>>>> I did as much reading as I could and found that Viktor Klang has 
>>>>>>>> said it's useful for transient networks: 
>>>>>>>> http://stackoverflow.com/questions/6401500/is-akka-suitable-for-systems-with-transient-network-coverage, 
>>>>>>>> and the smoking gun for same-machine inter-JVM communication being an 
>>>>>>>> expected use-case was Dr. Kuhn's comment here: 
>>>>>>>> http://stackoverflow.com/questions/10268613/whats-the-equivalent-of-akka/11787971#comment13246146_10268748
>>>>>>>>
>>>>>>>> I went ahead and implemented a decent amount of code for using 
>>>>>>>> akka-remote to talk to one of the services after bumping our akka 
>>>>>>>> version to 2.3.3, and have to say I'm pleased, especially when 
>>>>>>>> comparing to ActiveMQ. Local machine communication is flawless, but 
>>>>>>>> once I started testing with remote machines and doing "ifdown eth0; 
>>>>>>>> sleep 20; ifup eth0" network disruption tests, I was left with 
>>>>>>>> questions about how to handle quarantines. I looked at reference.conf 
>>>>>>>> and heeded the admonition NOT to change the quarantine timeout from 
>>>>>>>> 5 days, so restarting one of the actor systems is the only 
>>>>>>>> alternative.
>>>>>>>>
>>>>>>>> So - what're the best practices concerning restarting the 
>>>>>>>> ActorSystem? 
>>>>>>>>
>>>>>>>>  - I'm not clustering - these are a few long-lived "heavy" services, 
>>>>>>>>    not just nodes spinning up to do small processing tasks
>>>>>>>>  - Our general deployment is not HA, we don't usually have standbys 
>>>>>>>>    waiting
>>>>>>>>  - Restarting the JVM isn't optimal
>>>>>>>>    * since the services are fairly substantial and there's a 
>>>>>>>>      non-trivial amount of initialization including database hits to 
>>>>>>>>      pre-fill caches, restarting the JVM is a possibility (less time 
>>>>>>>>      than the remoting gate time), but isn't the first route I'd 
>>>>>>>>      choose
>>>>>>>>    * we (very rarely) run on non-linux platforms and so tend to try 
>>>>>>>>      to keep stuff in the JVM instead of relying on 
>>>>>>>>      upstart/launchd/windows services/etc
>>>>>>>>
>>>>>>>> My only other thought is to run an additional ActorSystem for 
>>>>>>>> remoting.
>>>>>>>>
>>>>>>>>  - allows programmatic configuration (our runtime configuration 
>>>>>>>>    system could change remoting settings and restart the remoting 
>>>>>>>>    ActorSystem with the new settings)
>>>>>>>>  - a quarantine situation would just require the remoting 
>>>>>>>>    ActorSystem to be recreated, not a restart of the whole JVM
>>>>>>>>
>>>>>>>> However, one of the very earliest entries in the Akka documentation 
>>>>>>>> states "An ActorSystem is a heavyweight structure that will allocate 
>>>>>>>> 1…N Threads, so create one per logical application." I know creating 
>>>>>>>> multiple dispatchers in the same ActorSystem is fine, and sometimes 
>>>>>>>> (at least historically) a dedicated dispatcher was recommended for 
>>>>>>>> some remoting cases; I also know starting a new ActorSystem takes 
>>>>>>>> some amount of time to create dispatchers, parse configs, etc; so I'm 
>>>>>>>> thinking that the big yellow warning in the documentation is a 
>>>>>>>> general guideline for getting started with Akka, not a hard and fast 
>>>>>>>> rule.
>>>>>>>>
>>>>>>>> Sorry for the long post, can anybody give me some guidance on the 
>>>>>>>> situation?
>>>>>>>>
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>> >>>>>>>>>> Check the FAQ: 
>>>>>>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>> >>>>>>>>>> Search the archives: 
>>>>>>>> https://groups.google.com/group/akka-user
>>>>>>>> --- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "Akka User List" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to [email protected].
>>>>>>>> To post to this group, send email to [email protected].
>>>>>>>>
>>>>>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> *Dr. Roland Kuhn*
>>>>>>>> *Akka Tech Lead*
>>>>>>>> Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
>>>>>>>> twitter: @rolandkuhn
>>>>>>>>  <http://twitter.com/#!/rolandkuhn>
>>>>>>>>  
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- 
>>>>>> Akka Team
>>>>>> Typesafe - The software stack for applications that scale
>>>>>> Blog: letitcrash.com
>>>>>> Twitter: @akkateam
>>>>>>  
>>>>
>>>
>>>
>>>
>>

