Hi Roland,

The cluster is based on https://github.com/wandoulabs/spray-socketio. We, 
Wandou Labs ( http://www.snappea.com/ ), are going to use it for more than 
10 million persistent connections from mobile devices to our service. 
These mobile devices can then share status, push messages, fire real-time 
events, virtually connect to each other, etc.

Feel free to ask more questions :-)

Regards,
Caoyuan Deng ( https://github.com/dcaoyuan )

On Tuesday, August 26, 2014 10:10:27 PM UTC+8, rkuhn wrote:
>
> Hi Caoyuan,
>
> On 26 Aug 2014, at 09:51, Caoyuan <[email protected]> wrote:
>
>
>
> On Monday, August 25, 2014 6:31:15 PM UTC+8, Akka Team wrote:
>>
>> Hi Caoyuan,
>>
>> It is usually dangerous to set the heartbeat-pause to a lesser value than 
>> the heartbeat interval itself. If a heartbeat gets lost, then the next 
>> heartbeat will definitely not make the deadline. I recommend setting it to a 
>> larger value. Also, I would go with a lower heartbeat-interval setting; 10s 
>> seems more appropriate if you want low heartbeat traffic.
>>
>> -Endre
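
The reasoning above can be sketched as a small simulation. This is a
hypothetical model, not Akka's implementation: it assumes, as described
later in this thread, that the remote node is marked unavailable once
heartbeat-interval + acceptable-heartbeat-pause elapses without a heartbeat
arriving. The function name `first_failure` and its parameters are made up
for illustration.

```python
def first_failure(interval_s, pause_s, lost, horizon=10):
    """Return the time (s) at which the detector first trips, or None.

    Heartbeat k is sent at k * interval_s and arrives instantly, except
    for the indices in `lost`, which never arrive.
    """
    deadline = interval_s + pause_s  # assumed rule, taken from the thread
    last_seen = 0                    # heartbeat 0 arrives at t = 0
    for k in range(1, horizon + 1):
        if k in lost:
            continue                 # this heartbeat never arrives
        arrival = k * interval_s
        if arrival - last_seen > deadline:
            return last_seen + deadline
        last_seen = arrival
    # No later heartbeat arrived in time before the end of the horizon.
    if horizon * interval_s - last_seen > deadline:
        return last_seen + deadline
    return None


if __name__ == "__main__":
    # Lose exactly one heartbeat (index 1) under three settings.
    for interval_s, pause_s in [(30, 5), (30, 10), (10, 20)]:
        print(interval_s, pause_s, first_failure(interval_s, pause_s, lost={1}))
```

In this model a single lost heartbeat trips the detector for both the
30 s / 5 s and the 30 s / 10 s settings, while a pause longer than the
interval (10 s / 20 s here) survives it.
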
>>
>
> Got it now. Thanks. 
>
> BTW, our cluster has run for 15 days with 1 million long-lived connections, 
> stable and consistent.
>
>
> That’s great to hear, and it does make me a bit curious about the rest of 
> the story: care to share it privately or even publicly?
>
> Regards,
>
> Roland
>
>  
>
>>
>>
>> On Mon, Aug 25, 2014 at 9:31 AM, Caoyuan <[email protected]> wrote:
>>
>>> Update Aug 25, 2014:
>>>
>>> We changed 
>>> akka.remote.transport-failure-detector.acceptable-heartbeat-pause to 10 s 
>>> instead of 5 s, and the WARN messages are gone. I guess the [Disassociated] 
>>> WARN might be caused by network delay or GC pauses (a Full GC lasts 3+ secs 
>>> now on our system), etc. The setting is now:
>>>
>>> akka.remote {
>>>   transport-failure-detector {
>>>     heartbeat-interval = 30 s          # default 4s
>>>     acceptable-heartbeat-pause = 10 s  # default 10s
>>>   }
>>> }
>>>
>>> But that does not explain the periodic "Disassociated" WARN that occurred 
>>> before, which seemingly could not recover from the Disassociated state.
>>>
>>> On Monday, August 11, 2014 12:08:00 AM UTC+8, Caoyuan wrote:
>>>>
>>>> We have an Akka cluster with 10 nodes. It works almost smoothly, except 
>>>> that it periodically logs a "Disassociated" WARN, from which it seems 
>>>> unable to recover.
>>>>
>>>> The following are the log records.
>>>>
>>>> ......
>>>> 2014-08-10 00:00:09,253 WARN  a.remote.ReliableDeliverySupervisor 
>>>> akka.tcp://[email protected]:2551/system/endpointManager/
>>>> reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
>>>> 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
>>>> [email protected]:2552] has failed, address is now gated for 
>>>> [5000] ms. Reason is: [Disassociated].
>>>>
>>>> 2014-08-10 00:00:44,292 WARN  a.remote.ReliableDeliverySupervisor 
>>>> akka.tcp://[email protected]:2551/system/endpointManager/
>>>> reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
>>>> 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
>>>> [email protected]:2552] has failed, address is now gated for 
>>>> [5000] ms. Reason is: [Disassociated].
>>>>
>>>> 2014-08-10 00:01:49,332 WARN  a.remote.ReliableDeliverySupervisor 
>>>> akka.tcp://[email protected]:2551/system/endpointManager/
>>>> reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
>>>> 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
>>>> [email protected]:2552] has failed, address is now gated for 
>>>> [5000] ms. Reason is: [Disassociated].
>>>>
>>>> 2014-08-10 00:02:24,373 WARN  a.remote.ReliableDeliverySupervisor 
>>>> akka.tcp://[email protected]:2551/system/endpointManager/
>>>> reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
>>>> 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
>>>> [email protected]:2552] has failed, address is now gated for 
>>>> [5000] ms. Reason is: [Disassociated].
>>>>
>>>> 2014-08-10 00:02:59,412 WARN  a.remote.ReliableDeliverySupervisor 
>>>> akka.tcp://[email protected]:2551/system/endpointManager/
>>>> reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
>>>> 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
>>>> [email protected]:2552] has failed, address is now gated for 
>>>> [5000] ms. Reason is: [Disassociated].
>>>>
>>>> 2014-08-10 00:03:34,452 WARN  a.remote.ReliableDeliverySupervisor 
>>>> akka.tcp://[email protected]:2551/system/endpointManager/
>>>> reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
>>>> 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
>>>> [email protected]:2552] has failed, address is now gated for 
>>>> [5000] ms. Reason is: [Disassociated].
>>>> ......
>>>>
>>>>
>>>> The warning recurred almost all day, with a period of 35 seconds 
>>>> (30 + 5 s) or 65 seconds (30 + 30 + 5 s), which matches exactly the 
>>>> settings of akka.remote's transport failure detector:
>>>>
>>>> akka.remote {
>>>>   transport-failure-detector {
>>>>     heartbeat-interval = 30 s         # default 4s
>>>>     acceptable-heartbeat-pause = 5 s  # default 10s
>>>>   }
>>>> }
>>>>
>>>> Here, the failure detector marks the remote node unavailable after a 
>>>> period of heartbeat-interval + acceptable-heartbeat-pause (35 s).
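
The 35 s and 65 s periods can be checked directly against the timestamps in
the log excerpt above. A minimal sketch, with the six timestamps copied from
the log and the variable names chosen only for illustration:

```python
from datetime import datetime

# Timestamps of the six WARN records quoted in the log excerpt.
stamps = [
    "2014-08-10 00:00:09,253",
    "2014-08-10 00:00:44,292",
    "2014-08-10 00:01:49,332",
    "2014-08-10 00:02:24,373",
    "2014-08-10 00:02:59,412",
    "2014-08-10 00:03:34,452",
]

heartbeat_interval = 30         # seconds, from the config in this message
acceptable_heartbeat_pause = 5  # seconds, from the config in this message

times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S,%f") for s in stamps]
# Gap between consecutive warnings, rounded to whole seconds.
gaps = [round((b - a).total_seconds()) for a, b in zip(times, times[1:])]

one_missed = heartbeat_interval + acceptable_heartbeat_pause      # 35
two_missed = 2 * heartbeat_interval + acceptable_heartbeat_pause  # 65

if __name__ == "__main__":
    print(gaps)  # [35, 65, 35, 35, 35]
    print(all(g in (one_missed, two_missed) for g in gaps))
```

Every gap in the excerpt equals either interval + pause or
2 * interval + pause, which is the pattern described above.
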
>>>>
>>>> We're using akka-2.3.3. The node that logged the warnings is at 
>>>> 10.0.69.169:2551, and the remote node is at 10.0.65.3:2552.
>>>>
>>>> I tried to dig through the Akka remoting source code, but made no 
>>>> progress.
>>>>
>>>> Thoughts ?
>>>>
>>>> -Caoyuan Deng
>>>>
>>>>
>>> -- 
>>> >>>>>>>>>> Read the docs: http://akka.io/docs/
>>> >>>>>>>>>> Check the FAQ: 
>>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>> >>>>>>>>>> Search the archives: 
>>> https://groups.google.com/group/akka-user
>>> --- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "Akka User List" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at http://groups.google.com/group/akka-user.
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> -- 
>> Akka Team
>> Typesafe - The software stack for applications that scale
>> Blog: letitcrash.com
>> Twitter: @akkateam
>>  
>
> *Dr. Roland Kuhn*
> *Akka Tech Lead*
> Typesafe <http://typesafe.com/> – Reactive apps on the JVM.
> twitter: @rolandkuhn
> <http://twitter.com/#!/rolandkuhn>
>  
>
