Hi Krzysztof,

I haven't seen that error before.  It does sound like it could be a
connection issue.  Did you check that the YARN node has access
to hdfs:///user/samza/deploy/event-log-etl-nested-0.1.0-dist.tar.gz?

One way to set the AM and containers to debug is to include a log4j.xml
file in the lib folder of your tar.gz.  There is special logic in the start
scripts (
https://github.com/apache/samza/blob/master/samza-shell/src/main/bash/run-class.sh#L40)
that checks for that exact path, so it doesn't work with
log4j.properties, for example.
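
As a rough sketch, assuming the job is on log4j 1.x, a minimal
lib/log4j.xml that sets the root logger to DEBUG might look like the
following.  The appender name and the conversion pattern are only
illustrative choices, not anything the start scripts require:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
<!-- Hypothetical minimal config; adjust appenders to your setup. -->
<log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/">
  <!-- Write everything to stdout so it ends up in the YARN container logs. -->
  <appender name="console" class="org.apache.log4j.ConsoleAppender">
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern"
             value="%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1} - %m%n"/>
    </layout>
  </appender>
  <!-- Root logger at DEBUG applies to both the AM and the containers. -->
  <root>
    <priority value="debug"/>
    <appender-ref ref="console"/>
  </root>
</log4j:configuration>
```

The file has to sit at lib/log4j.xml inside the tar.gz for the script
to pick it up.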

Cheers,

Roger



On Fri, Jul 10, 2015 at 4:18 AM, Krzysztof Zarzycki <k.zarzy...@gmail.com>
wrote:

> Hi there Samza developers,
>
> I have a problem deploying a Samza task on YARN that I cannot overcome.
> When I submit the task, ApplicationMasters get created (2 of them) and the
> job is visible, but in state UNASSIGNED. After some time the job is FAILED.
>
> The application information on the ResourceManager panel is:
> State: FAILED
> FinalStatus: FAILED
> Elapsed: 25mins, 2sec
> Diagnostics: Application application_1424354741837_0380 failed 2 times due
> to ApplicationMaster for attempt appattempt_1424354741837_0380_000002 timed
> out. Failing the application.
>
>
> When I look into the logs of the ApplicationMaster I see no errors, no
> warnings, nothing wrong. Please see the attached output of the "yarn logs"
> command.
>
> My guess is that a connection failed between some components
> (container to ApplicationMaster? NodeManager?).  I suspect this from
> looking at the jstack output of the AM:
>
> "main" #1 prio=5 os_prio=0 tid=0x00007f9338015000 nid=0x6f2f waiting on
> condition [0x00007f933de6e000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>   at java.lang.Thread.sleep(Native Method)
>   at
> org.apache.hadoop.util.ThreadUtil.sleepAtLeastIgnoreInterrupts(ThreadUtil.java:43)
>   at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:154)
>   at com.sun.proxy.$Proxy18.registerApplicationMaster(Unknown Source)
>   at
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:196)
>   at
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl.registerApplicationMaster(AMRMClientAsyncImpl.java:138)
>   at
> org.apache.samza.job.yarn.SamzaAppMasterLifecycle.onInit(SamzaAppMasterLifecycle.scala:39)
>   at
> org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:108)
>   at
> org.apache.samza.job.yarn.SamzaAppMaster$$anonfun$run$1.apply(SamzaAppMaster.scala:108)
>   at scala.collection.immutable.List.foreach(List.scala:318)
>   at
> org.apache.samza.job.yarn.SamzaAppMaster$.run(SamzaAppMaster.scala:108)
>   at
> org.apache.samza.job.yarn.SamzaAppMaster$.main(SamzaAppMaster.scala:95)
>   at org.apache.samza.job.yarn.SamzaAppMaster.main(SamzaAppMaster.scala)
>
>
> On the other hand, I see the correct RM addresses in the logs:
> 15/07/10 12:17:30 INFO client.RMProxy: Connecting to ResourceManager at
> hdnn02.company.com/148.251.82.11:8030
> 15/07/10 12:17:31 INFO client.RMProxy: Connecting to ResourceManager at
> hdnn02.company.com/148.251.82.11:8050
> ...
> 2015-07-10 12:17:31,032 [main] INFO  o.apache.samza.job.yarn.ClientHelper
> - trying to connect to RM hdnn02.company.com:8050
> ...
> 2015-07-10 12:17:31,680 [main] INFO  o.a.s.job.yarn.SamzaAppMasterService
> - Webapp is started at (rpc http://78.46.56.88:43268/, tracking http://
>
>
> Does anyone know what could be wrong here? I'll be grateful for any help,
> even with just debugging the case.
> I'll start with a simple question: do you know how to set log4j for the AM &
> containers to DEBUG?
>
> Thank you!
> Krzysztof
>
>
>
