You can set Flink's log level to DEBUG in the log4j.properties file. Furthermore, you can
activate logging of Akka's lifecycle events by setting akka.log.lifecycle.events: true in
flink-conf.yaml.

Cheers,
Till
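As a minimal sketch of that suggestion for a local environment like the one used later in
this thread: the Akka lifecycle-event switch can also be passed programmatically through
the Configuration object. The key name comes from the setting Till quotes above; the helper
object and everything else here is illustrative, not code from the thread.

import org.apache.flink.api.scala._
import org.apache.flink.configuration.Configuration

// Hypothetical helper: builds a local execution environment with Akka
// lifecycle-event logging enabled, mirroring the flink-conf.yaml entry above.
object DebugEnv {
  def create(): ExecutionEnvironment = {
    val conf = new Configuration()
    // Same key as in flink-conf.yaml: log Akka actor lifecycle events.
    conf.setBoolean("akka.log.lifecycle.events", true)
    // The DEBUG level itself still has to go into log4j.properties, e.g.:
    //   log4j.rootLogger=DEBUG, file
    ExecutionEnvironment.createLocalEnvironment(conf)
  }
}

With this, the JobManager/Client message exchange shows up in the local log instead of
having to be captured on a standalone cluster.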
On Fri, Jan 15, 2016 at 12:41 PM, Frederick Ayala <frederickay...@gmail.com> wrote:

> Hi Stephan,
>
> Other jobs run fine, but this one is not working on the machine that I was
> using previously (16GB RAM) [1]
>
> Is there a way to debug the Akka messages to understand what's happening
> between the JobManager and the Client? I can add logging and send it.
>
> Thanks!
>
> Fred
>
> [1] The failure started to happen when I added the flatMap transformation.
> Previously I was calling the collect function after the reduceGroup and
> then using Scala's flatten function. Since this was very slow and failed
> with the large data file, I used Flink to flatten the list of lists and
> now it is faster.
>
> On Jan 15, 2016 11:51, "Stephan Ewen" <se...@apache.org> wrote:
>
>> Hi!
>>
>> Do you get this problem with other jobs as well?
>>
>> The logs suggest that the JobManager receives the job and starts tasks,
>> but the Client thinks it lost the connection.
>>
>> Greetings,
>> Stephan
>>
>> On Fri, Jan 15, 2016 at 10:31 AM, Frederick Ayala <
>> frederickay...@gmail.com> wrote:
>>
>>> Hi Robert,
>>>
>>> Thanks for your reply.
>>>
>>> I set the akka.ask.timeout to 10k seconds just to see what happened. I
>>> tried different values but none did the trick.
>>>
>>> My problem was solved by using a machine with more RAM. However, it was
>>> not clear that the memory was the problem :)
>>>
>>> Attached are the log and the Scala code of the transformation that I
>>> was running.
>>>
>>> The data file I am processing is around 57M lines (~1.7GB).
>>>
>>> Let me know if you have any comments or suggestions.
>>>
>>> Thanks again,
>>>
>>> Frederick
>>>
>>> On Fri, Jan 15, 2016 at 10:01 AM, Robert Metzger <rmetz...@apache.org>
>>> wrote:
>>>
>>>> Hi Frederick,
>>>>
>>>> sorry for the delayed response.
>>>>
>>>> I have no idea what the problem could be.
>>>> Has the exception been thrown from the env.execute() call?
>>>> Why did you set the akka.ask.timeout to 10k seconds?
>>>>
>>>> On Wed, Jan 13, 2016 at 2:13 AM, Frederick Ayala <
>>>> frederickay...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am having an error while running some Flink transformations in a
>>>>> DataStream Scala API.
>>>>>
>>>>> The error I get is:
>>>>>
>>>>> Timeout while waiting for JobManager answer. Job time exceeded
>>>>> 21474835 seconds
>>>>> ...
>>>>>
>>>>> Caused by: akka.pattern.AskTimeoutException: Ask timed out on
>>>>> [Actor[akka://flink/user/$a#183984057]] after [21474835000 ms]
>>>>>
>>>>> This happens after a couple of minutes, not after 21474835 seconds...
>>>>>
>>>>> I tried different configurations but no result so far:
>>>>>
>>>>> val customConfiguration = new Configuration()
>>>>> customConfiguration.setInteger("parallelism", 8)
>>>>> customConfiguration.setInteger("jobmanager.heap.mb", 2560)
>>>>> customConfiguration.setInteger("taskmanager.heap.mb", 10240)
>>>>> customConfiguration.setInteger("taskmanager.numberOfTaskSlots", 8)
>>>>> customConfiguration.setInteger("taskmanager.network.numberOfBuffers", 16384)
>>>>> customConfiguration.setString("akka.ask.timeout", "10000 s")
>>>>> customConfiguration.setString("akka.lookup.timeout", "100 s")
>>>>> env = ExecutionEnvironment.createLocalEnvironment(customConfiguration)
>>>>>
>>>>> Any idea what the problem could be?
>>>>>
>>>>> Thanks!
>>>>>
>>>>> Frederick
>>>>
>>>
>>> --
>>> Frederick Ayala
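For reference on the change Frederick describes in [1], a minimal sketch of moving from
collect-then-flatten to a flatMap inside the Flink job. The element type List[String] and
the `grouped` placeholder for the reduceGroup output are assumptions; the real types are
not shown in the thread.

import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment

// Hypothetical stand-in for the result of the reduceGroup step.
val grouped: DataSet[List[String]] =
  env.fromElements(List("a", "b"), List("c"))

// Slow variant described in the thread: pull everything to the client,
// then flatten with Scala collections.
// val flattenedOnClient: Seq[String] = grouped.collect().flatten

// Faster variant: flatten inside the Flink job with flatMap, so the lists
// never have to be materialized on the client.
val flattened: DataSet[String] = grouped.flatMap(list => list)

The difference is that collect() ships the whole result back to the client program, while
the flatMap keeps the flattening distributed across the task managers.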