Maybe that explains mine too. Thank you very much, Aaron! Best regards,
-chanwit

--
Chanwit Kaewkasi
linkedin.com/in/chanwit


On Wed, May 28, 2014 at 12:47 AM, Aaron Davidson <ilike...@gmail.com> wrote:
> Spark should effectively turn Akka's failure detector off, because we
> historically had problems with GCs and other issues causing disassociations.
> The only thing that should cause these messages nowadays is if the TCP
> connection (which Akka sustains between Actor Systems on different machines)
> actually drops. TCP connections are pretty resilient, so one common cause of
> this is actual Executor failure -- recently, I have experienced a
> similar-sounding problem due to my machine's OOM killer terminating my
> Executors, such that they didn't produce any error output.
>
>
> On Thu, May 22, 2014 at 9:19 AM, Chanwit Kaewkasi <chan...@gmail.com> wrote:
>>
>> Hi all,
>>
>> On an ARM cluster, I have been testing a wordcount program with JRE 7
>> and everything is OK. But when I change to the embedded version of
>> Java SE (Oracle's eJRE), the same program cannot complete all of its
>> computing stages.
>>
>> It fails with many Akka disassociations.
>>
>> - I've been trying to increase Akka's timeouts but am still stuck. I'm
>> not sure what the right way to do this is. (I suspect that GC pausing
>> the world is causing it.)
>>
>> - Another question: how can I properly turn on Akka's logging to see
>> the root cause of this disassociation problem? (In case my guess
>> about GC is wrong.)
>>
>> Best regards,
>>
>> -chanwit
>>
>> --
>> Chanwit Kaewkasi
>> linkedin.com/in/chanwit
>
>
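Chanwit's question about raising Akka's timeouts is usually handled through Spark's Akka-related properties rather than raw Akka config. A sketch of a `spark-defaults.conf` fragment, assuming Spark 1.x-era property names (the values are illustrative, not tuned; verify the names against your Spark version's configuration docs):

```properties
# Raise the overall Akka communication timeout (seconds)
spark.akka.timeout                     300
# Tolerate longer GC-induced heartbeat pauses before the failure
# detector considers a peer dead (values are illustrative)
spark.akka.heartbeat.interval          1000
spark.akka.heartbeat.pauses            6000
spark.akka.failure-detector.threshold  300.0
# Log Akka remoting lifecycle events (association/disassociation),
# which answers the "how do I see the root cause" question
spark.akka.logLifecycleEvents          true
```

If GC pauses really are the cause, these settings only buy headroom; tuning the JVM's heap and collector on the constrained eJRE is the longer-term fix.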
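Aaron's OOM-killer diagnosis can be checked directly on each worker node, since OOM-killed executors leave a trace in the kernel log even when they produce no error output of their own. A minimal sketch for a Linux worker; the sample log line below is fabricated for illustration, and on a live node you would grep `dmesg` output instead:

```shell
# Simulated kernel log entry in the format the OOM killer typically emits
echo "Out of memory: Killed process 12345 (java) total-vm:2097152kB" > /tmp/kern_sample.log

# Count OOM-killer hits; on a real worker: dmesg | grep -ci "killed process"
grep -ci "killed process" /tmp/kern_sample.log
# → 1
```

A non-zero count on a worker that lost an executor is strong evidence the disassociation was an OOM kill rather than a GC pause.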