Hi Douglas,

For the timeline service, please also set "tez.yarn.ats.enabled" to false
in tez-site.xml if the timeline service is not running. Would you mind
filing a jira for the errors that you saw when it was enabled?
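
For reference, a minimal sketch of what that tez-site.xml entry could look
like, assuming the usual Hadoop-style <property> wrapper inside the file's
<configuration> element (on an Ambari-managed cluster you would likely make
the change through Ambari rather than editing the file by hand):

```xml
<!-- Assumed layout: goes inside <configuration> in tez-site.xml -->
<property>
  <!-- Stops Tez from publishing history to the YARN timeline service (ATS) -->
  <name>tez.yarn.ats.enabled</name>
  <value>false</value>
</property>
```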

As for the hung job (assuming you have already killed it), can you
provide the application logs obtained via
"bin/yarn logs -applicationId <application ID>" and also the Hive query
explain plan? This should help us diagnose the potential problems. Apache
mailing lists do not support attachments, so feel free to file a jira and
attach the logs there.

thanks
— Hitesh



On Thu, May 29, 2014 at 8:50 AM, Douglas Moore <
[email protected]> wrote:

> I'm on an HDP 2.1 build, running a Hive job that has created 3 stages.
> The first stage has 1045 maps, the second has 2 reducers, and the third
> has 1 reducer.
> The job churns through the first stage and never starts the second.
>
> I can see from the log file syslog_dag_.... that the job releases the
> containers and gets down to heldContainers=3 (which makes sense to me).
>
> How can I diagnose this further?
> How can I run the job in a safer mode, low gear, something to get through
> this stall?
>
> I should note that I turned off the timeline service, because of numerous
> errors, by modifying yarn-site.xml via Ambari:
>
> <name>yarn.timeline-service.enabled</name>
> <value>false</value>
>
>
> Thanks in advance,
>
> Douglas
>
>
