Hi Krzysztof,

I haven't seen that error before. It does sound like it could be a connection issue. Did you check that the YARN node has access to hdfs:///user/samza/deploy/event-log-etl-nested-0.1.0-dist.tar.gz?
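As a rough sketch of that check: the snippet below wraps `hdfs dfs -test -e`, which exits with status 0 when the path exists. It assumes the `hdfs` CLI is on the PATH of the YARN node and that you run it as the same user the containers run as; the helper names are made up for illustration and are not part of Samza.

```python
import shutil
import subprocess

# Package path taken from the question in this message.
PACKAGE_PATH = "hdfs:///user/samza/deploy/event-log-etl-nested-0.1.0-dist.tar.gz"


def build_check_command(path):
    """Return the `hdfs dfs -test -e` command that checks whether `path` exists."""
    return ["hdfs", "dfs", "-test", "-e", path]


def package_is_reachable(path):
    """Run the check; True means the file exists and this node could reach HDFS."""
    if shutil.which("hdfs") is None:
        # Assumption: the Hadoop client tools are installed on this node.
        raise RuntimeError("hdfs CLI not found on PATH")
    # `-test -e` exits with status 0 when the path exists, non-zero otherwise.
    result = subprocess.run(build_check_command(path))
    return result.returncode == 0
```

Calling `package_is_reachable(PACKAGE_PATH)` on the NodeManager host should return True if the package is visible from there. A False result only says the check failed, so the command's stderr is where to look for whether it is a missing file, a permissions problem, or a connection problem.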
One way to set the AM and containers to DEBUG is to include a log4j.xml file in the lib folder of your tar.gz. There is special logic in the start scripts (https://github.com/apache/samza/blob/master/samza-shell/src/main/bash/run-class.sh#L40) that checks for that path; it doesn't work with log4j.properties, for example.

Cheers,
Roger

On Fri, Jul 10, 2015 at 4:18 AM, Krzysztof Zarzycki <k.zarzy...@gmail.com> wrote:

> Hi there Samza developers,
>
> I have a problem that I cannot overcome with deploying Samza task on YARN.
> When I submitted the task, ApplicationMasters get created (2 of them), job
> is visible, but in state UNASSIGNED. After some time the job FAILED.
>
> application information on resource manager panel is :
> State: FAILED
> FinalStatus: FAILED
> Elapsed: 25mins, 2sec
> Diagnostics: Application application_1424354741837_0380 failed 2 times due
> to ApplicationMaster for attempt appattempt_1424354741837_0380_000002 timed
> out. Failing the application.
>
>
> When I look into the logs of ApplicationMaster I see no errors, no
> warnings, anything wrong: Please see the output of "yarn logs" command
> attached.
>
> My guess would be that connection failed between some components
> (container to ApplicationMaster? NodeManager? ).
> I suspect that when looking at jstack output in the AM:
>
> "main" #1 prio=5 os_prio=0 tid=0x00007f9338015000 nid=0x6f2f waiting on condition [0x00007f933de6e000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:154)
>         at com.sun.proxy.$Proxy18.registerApplicationMaster(Unknown Source)
>         at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:196)
>         at org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:138)
>         at org.apache.samza.job.yarn.SamzaAppMasterLifecycle.onInit(SamzaAppMasterLifecycle.scala:39)
>         at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:108)
>         at org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:108)
>         at scala.collection.immutable.List.foreach(List.scala:318)
>         at org.apache.samza.job.yarn.SamzaAppMaster$.run(SamzaAppMaster.scala:108)
>         at org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:95)
>         at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
>
>
> On the other hand I see in logs correct RM addresses:
> 15/07/10 12:17:30 INFO client.RMProxy: Connecting to ResourceManager at hdnn02.company.com/148.251.82.11:8030
> 15/07/10 12:17:31 INFO client.RMProxy: Connecting to ResourceManager at hdnn02.company.com/148.251.82.11:8050
> ...
> 2015-07-10 12:17:31,032 [main] INFO o.apache.samza.job.yarn.ClientHelper - trying to connect to RM hdnn02.company.com:8050
> ...
> 2015-07-10 12:17:31,680 [main] INFO o.a.s.job.yarn.SamzaAppMasterService - Webapp is started at (rpc http://78.46.56.88:43268/, tracking http://
>
>
> Does anyone know what could be wrong here?
> I'll be grateful for any help, also in just debugging the case.
> I start with a simple question: do you know how to set log4j for AM &
> containers to DEBUG?
>
> Thank you!
> Krzysztof