On Mon, Nov 10, 2014 at 5:43 PM, Behrad Zari <[email protected]> wrote:
> The funny (and bad) thing is that when I tested my code today, it was
> working! (As I said in my previous post, it also worked at the start, but
> lately I couldn't get it working.)
> I'm confused, since I haven't changed anything related to this :( So, am I
> missing some change of mine, or could it depend on
> 1) bad termination of previous sbt runs during development? (But then how
> could the remoting port be opened if it had not been released?)
> 2) anything related to network/configuration that leads to that
> misbehavior?
> Hmm?
>
> My concerns are two-fold:
>
> 1) I'm really eager to reproduce that, and will push a test case if I
> find one.
>
> 2) There are still unclear points for me in the Akka clustering philosophy:
> I saw node B rejoin my seed node A after B restarted today, but B did not
> become aware of it when seed node A restarted! Why is that happening?
> Here is the conf of both nodes:
>
> remote {
>   log-remote-lifecycle-events = off
>   netty.tcp {
>     hostname = "127.0.0.1"
>     port = 2552
>   }
>
>   transport-failure-detector {
>     heartbeat-interval = 30s
>     acceptable-heartbeat-pause = 35s
>   }
> }
>
> cluster {
>   seed-nodes = [
>     "akka.tcp://[email protected]:2552"
>   ]
>   auto-down-unreachable-after = 10s
> }
>
> P.S. Can we continue this topic on the GitHub issue page? It feels more
> comfortable for me :)
>
Yes, let's continue there <https://github.com/akka/akka/issues/16224>.
Describe step-by-step what you do, and supply log files.
Thanks.
Please remove settings for the transport-failure-detector. Should not
influence this, but I prefer that we debug this with default settings as
much as possible.
/Patrik
>
>
>
> On Monday, November 10, 2014 5:48:21 PM UTC+3:30, Patrik Nordwall wrote:
>>
>> Hi again,
>>
>> My hypothesis of why the node was marked as REACHABLE was wrong. The
>> cluster heartbeat replies include the UID, and replies from the wrong
>> incarnation are ignored.
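
To make the point about UIDs in heartbeat replies concrete, here is a minimal sketch in plain Java. It is a toy model, not Akka's implementation: the class name HeartbeatMonitorSketch and all of its methods are hypothetical and exist only for illustration. It assumes an observer remembers one expected UID per host:port and drops any heartbeat reply that carries a different UID.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (NOT Akka code): tracks which incarnation (UID) is expected at
// each host:port and ignores heartbeat replies from any other incarnation.
final class HeartbeatMonitorSketch {
    // "host:port" -> UID of the incarnation that is currently a member
    private final Map<String, Long> expectedUid = new HashMap<>();
    // "host:port" -> whether the expected incarnation is considered reachable
    private final Map<String, Boolean> reachable = new HashMap<>();

    // Start monitoring a member; it is considered reachable at first.
    void watch(String address, long uid) {
        expectedUid.put(address, uid);
        reachable.put(address, true);
    }

    // Called when heartbeats from a monitored address have been missing too long.
    void markUnreachable(String address) {
        if (expectedUid.containsKey(address)) {
            reachable.put(address, false);
        }
    }

    // Returns true if the reply was accepted. A reply from an unknown address
    // or from a different incarnation is ignored and changes no state.
    boolean onHeartbeatReply(String address, long uid) {
        Long expected = expectedUid.get(address);
        if (expected == null || expected != uid) {
            return false;
        }
        reachable.put(address, true);
        return true;
    }

    boolean isReachable(String address) {
        return reachable.getOrDefault(address, false);
    }

    public static void main(String[] args) {
        HeartbeatMonitorSketch monitor = new HeartbeatMonitorSketch();
        monitor.watch("127.0.0.1:2552", 1L);        // old incarnation, UID 1
        monitor.markUnreachable("127.0.0.1:2552");  // old incarnation was killed

        // The restarted system answers from the same host:port with a new UID.
        System.out.println(monitor.onHeartbeatReply("127.0.0.1:2552", 2L)); // false
        System.out.println(monitor.isReachable("127.0.0.1:2552"));          // false
    }
}
```

In this model a restarted system on the same host:port cannot flip the old incarnation back to reachable, because its replies carry a different UID.
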
>>
>> I have created a test <https://github.com/akka/akka/pull/16266> that
>> simulates the scenario as I have understood it. It behaves as expected,
>> i.e. the restarted node can join after a while when the old incarnation has
>> been removed from the cluster.
>>
>> Behrad, do you have a sample that we can use to reproduce the issue?
>>
>> Regards,
>> Patrik
>>
>> On Wed, Nov 5, 2014 at 10:46 AM, Björn Antonsson <[email protected]>
>> wrote:
>>
>>> Thanks for confirming my suspicion Patrik. A ticket has been created
>>> https://github.com/akka/akka/issues/16224.
>>>
>>> B/
>>>
>>> On 5 November 2014 at 10:20:30, Patrik Nordwall ([email protected])
>>> wrote:
>>>
>>> I think I understand what is going on and what we can consider to
>>> improve.
>>>
>>> The heartbeat messages don't include the system uid, and therefore the
>>> restarted system starts responding to heartbeat messages that are targeted
>>> at the old incarnation. Then the cluster marks it as reachable again
>>> before the auto-down takes effect, i.e. it is never removed from the
>>> cluster. The new system tries to join, but that is not possible because the
>>> cluster already contains the same host:port.
>>>
>>> I think this is best solved by including the system uid in the heartbeat
>>> messages, but that increases the payload size of these messages.
>>>
>>> An issue ticket would be good.
>>>
>>> Regards,
>>> Patrik
>>>
>>> On Wed, Nov 5, 2014 at 10:00 AM, Björn Antonsson <
>>> [email protected]> wrote:
>>>
>>>> Hi Behrad,
>>>>
>>>> On 5 November 2014 at 09:53:00, Behrad ([email protected]) wrote:
>>>>
>>>>
>>>>
>>>> 2014-11-05 11:59 GMT+03:30 Björn Antonsson <[email protected]>:
>>>>
>>>>> Hi Richard,
>>>>>
>>>>> On 5 November 2014 at 00:22:55, richard ([email protected])
>>>>> wrote:
>>>>>
>>>>> I am seeing something similar with this github
>>>>> <https://github.com/searler/akka-datareplication-experimentation> code,
>>>>> based on akka-datareplication, using Akka 2.3.6
>>>>> (That might be a little too complex for a ticket)
>>>>>
>>>>> Note that *auto-down-unreachable-after* is commented out
>>>>>
>>>>>
>>>>> If the old node is never downed and removed from the cluster, then the
>>>>> new node can never join.
>>>>>
>>>>
>>>>
>>>>
>>>> Does this mean we should always set auto-down to a small value so
>>>> that we can recover from (and reconnect after) cluster node crashes?
>>>> What is the "unreachable" -> "reachable" state change for, then? I'd
>>>> expect that a node that went to the unreachable state becomes reachable
>>>> again when it's back up within the failure detection threshold.
>>>>
>>>> It also isn't happening for me, in both cases.
>>>>
>>>>
>>>> Whether you want nodes to be downed automatically is a different
>>>> issue from reachability. The reachable/unreachable states are for a node
>>>> instance that experiences connection failures (network outages etc.) but
>>>> not restarts, while downing is necessary when a new node with the same
>>>> address/port as the old one is joining (in effect a restarted actor
>>>> system).
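
The distinction between reachability and downing can be sketched in plain Java. This is a simplified toy model, not Akka's membership protocol: the class MembershipSketch and its methods are hypothetical, and downing is collapsed into immediate removal here, which is an assumption made to keep the example short. It shows that flipping a member between reachable and unreachable never frees its host:port, while a join from the same host:port is ignored until the old member has been downed and removed.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model (NOT Akka code): reachability is a flag on an existing member,
// while downing removes the member and frees its host:port for a new join.
final class MembershipSketch {
    enum Reachability { REACHABLE, UNREACHABLE }

    // "host:port" -> UID of the incarnation that holds the membership
    private final Map<String, Long> memberUid = new HashMap<>();
    // "host:port" -> current reachability of that member
    private final Map<String, Reachability> state = new HashMap<>();

    // A join is ignored while any incarnation still holds the same host:port.
    boolean join(String address, long uid) {
        if (memberUid.containsKey(address)) {
            return false; // corresponds to "Existing member ... is trying to join, ignoring"
        }
        memberUid.put(address, uid);
        state.put(address, Reachability.REACHABLE);
        return true;
    }

    // Connection failures and recoveries only toggle the flag; the member stays.
    void setReachability(String address, Reachability r) {
        if (memberUid.containsKey(address)) {
            state.put(address, r);
        }
    }

    // Simplification: downing removes the member right away.
    void down(String address) {
        memberUid.remove(address);
        state.remove(address);
    }

    boolean isMember(String address) {
        return memberUid.containsKey(address);
    }

    public static void main(String[] args) {
        MembershipSketch cluster = new MembershipSketch();
        cluster.join("localhost:1234", 1L);                               // original node
        cluster.setReachability("localhost:1234", Reachability.UNREACHABLE); // node killed

        System.out.println(cluster.join("localhost:1234", 2L)); // false: old member not removed
        cluster.down("localhost:1234");                         // manual or automatic downing
        System.out.println(cluster.join("localhost:1234", 2L)); // true: host:port is free again
    }
}
```

In this model, leaving auto-down-unreachable-after commented out means the second join keeps being ignored unless something else downs the old member.
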
>>>>
>>>> B/
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> B/
>>>>>
>>>>>
>>>>> Started two instances, one on 2551 (the seed) and another on 1234.
>>>>> Enter text into each instance, which is correctly replicated to each.
>>>>> Kill and restart the 1234 instance.
>>>>>
>>>>> The new 1234 instance receives the current state (from 2551) and
>>>>> continues to
>>>>> replicate in both directions!
>>>>>
>>>>> The log on 2551 does indicate a problem
>>>>> [INFO] [11/04/2014 17:20:07.309]
>>>>> [ClusterSystem-akka.actor.default-dispatcher-20]
>>>>> [Cluster(akka://ClusterSystem)] Cluster Node
>>>>> [akka.tcp://ClusterSystem@localhost:2551] - Existing member
>>>>> [UniqueAddress(akka.tcp://ClusterSystem@localhost:1234,1772853420)]
>>>>> is trying to join, ignoring
>>>>> [INFO] [11/04/2014 17:20:17.319]
>>>>> [ClusterSystem-akka.actor.default-dispatcher-17]
>>>>> [Cluster(akka://ClusterSystem)] Cluster Node
>>>>> [akka.tcp://ClusterSystem@localhost:2551] - Existing member
>>>>> [UniqueAddress(akka.tcp://ClusterSystem@localhost:1234,1772853420)]
>>>>> is trying to join, ignoring
>>>>> [INFO] [11/04/2014 17:20:28.310]
>>>>> [ClusterSystem-akka.actor.default-dispatcher-3]
>>>>> [Cluster(akka://ClusterSystem)] Cluster Node
>>>>> [akka.tcp://ClusterSystem@localhost:2551] - Existing member
>>>>> [UniqueAddress(akka.tcp://ClusterSystem@localhost:1234,1772853420)]
>>>>> is trying to join, ignoring
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>> >>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Akka User List" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> To post to this group, send email to [email protected].
>>>>> Visit this group at http://groups.google.com/group/akka-user.
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>>>
>>>>> --
>>>>> Björn Antonsson
>>>>> Typesafe <http://typesafe.com/> – Reactive Apps on the JVM
>>>>> twitter: @bantonsson <http://twitter.com/#!/bantonsson>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> --Behrad
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>>
>>> Patrik Nordwall
>>> Typesafe <http://typesafe.com/> - Reactive apps on the JVM
>>> Twitter: @patriknw
>>>
>>>
>>>
>>
>>
>>
>
--
Patrik Nordwall
Typesafe <http://typesafe.com/> - Reactive apps on the JVM
Twitter: @patriknw