RE: Origin of failed tasks

2016-10-12 Thread HuXi


> Subject: Re: Origin of failed tasks
> From: hit...@apache.org
> Date: Wed, 12 Oct 2016 08:57:12 -0700
> To: user@tez.apache.org
> 
> If you have the logs for the application master, you can try the following: 
> 
> grep -F "[HISTORY]" | grep "TASK_ATTEMPT_FINISHED"
> 
> This will give you info on any failed task attempts. 
> 
> The AM logs have history events published to them. You can do 
> grep -F "[HISTORY]" | grep "<entity type>_<event type>", where entity type is 
> one of DAG, VERTEX, TASK, TASK_ATTEMPT and event type is STARTED or FINISHED. 
> 
> The logs are also split into different files, e.g.: 
> The AM logs use a syslog_dag… format to split across DAGs. 
> Task/container logs use a syslog_attempt* format to split out logs for 
> different task attempts. 
> 
> If you have YARN timeline enabled, you can use the analyzers to do more 
> analysis on the DAG-specific data. These relate more to performance tuning 
> than to failure diagnostics, though.
> 
> thanks
> ― Hitesh
> 
> 
> > On Oct 11, 2016, at 5:09 PM, Allan Wilson <wilsoncr...@gmail.com> wrote:
> > 
> > Use the yarn logs command. That's your only chance without the Tez UI. I 
> > set up the Tez UI in our shop and it is really nice.
> > 
> > Allan
> > Sent from my iPhone
> > 
> >> On Oct 11, 2016, at 5:05 PM, Jan Morlock <jan.morl...@googlemail.com> 
> >> wrote:
> >> 
> >> Hi,
> >> 
> >> currently failed tasks occur during the execution of my Hive/Tez job.
> >> However in the end, the overall job succeeds. Is it possible to find out
> >> afterwards about the origin of those failed tasks (without using the Tez
> >> UI) just by analyzing the output log files?
> >> 
> >> Best regards
> >> Jan
> 
  

Re: Origin of failed tasks

2016-10-12 Thread Hitesh Shah
If you have the logs for the application master, you can try the following: 

grep -F "[HISTORY]" | grep "TASK_ATTEMPT_FINISHED"

This will give you info on any failed task attempts. 

The AM logs have history events published to them. You can do 
grep -F "[HISTORY]" | grep "<entity type>_<event type>", where entity type is 
one of DAG, VERTEX, TASK, TASK_ATTEMPT and event type is STARTED or FINISHED. 
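As a rough sketch of the filter described above, the two greps can be wrapped in a small shell function. The function name `history_events` and its argument order are made up for illustration and are not part of Tez. It assumes the AM log has already been saved to a local file.

```shell
# Print the [HISTORY] lines of one entity/event combination from an AM log.
#   $1 = path to an AM log file
#   $2 = entity type: DAG, VERTEX, TASK or TASK_ATTEMPT
#   $3 = event type:  STARTED or FINISHED
# (history_events is a hypothetical helper name, not a Tez tool.)
history_events() {
    case "$2" in
        DAG|VERTEX|TASK|TASK_ATTEMPT) ;;
        *) echo "unknown entity type: $2" >&2; return 2 ;;
    esac
    case "$3" in
        STARTED|FINISHED) ;;
        *) echo "unknown event type: $3" >&2; return 2 ;;
    esac
    # -F matches "[HISTORY]" literally; read as a regular expression, the
    # brackets would form a character class matching any one of those letters.
    grep -F '[HISTORY]' "$1" | grep -F "${2}_${3}"
}
```

For the failed task attempts asked about in this thread, the call would be `history_events <AM log file> TASK_ATTEMPT FINISHED`.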

The logs are also split into different files, e.g.: 
The AM logs use a syslog_dag… format to split across DAGs. 
Task/container logs use a syslog_attempt* format to split out logs for different 
task attempts. 
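Because the logs are split across files, it can help to see which file holds the events of interest. The sketch below counts matching history events per file. It assumes the log files have been copied into one local directory and that a file-name prefix is enough to select them. The helper name `count_events_per_file` is hypothetical.

```shell
# Count history events of one kind in every file whose name starts
# with a given prefix, printing "<count><TAB><file>" per file.
#   $1 = directory holding the log files
#   $2 = file-name prefix, e.g. syslog_dag
#   $3 = event name, e.g. TASK_ATTEMPT_FINISHED
# (count_events_per_file is a hypothetical helper name.)
count_events_per_file() {
    for f in "$1"/"$2"*; do
        # Skip the unexpanded pattern when nothing matches the prefix.
        [ -f "$f" ] || continue
        # grep -c prints 0 but exits non-zero when nothing matches,
        # so "|| true" keeps the loop going under "set -e".
        n=$(grep -F '[HISTORY]' "$f" | grep -F -c "$3" || true)
        printf '%s\t%s\n' "$n" "$f"
    done
}
```

For example, `count_events_per_file <log dir> syslog_dag TASK_ATTEMPT_FINISHED` lists each per-DAG AM log with its number of finished task attempts.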

If you have YARN timeline enabled, you can use the analyzers to do more 
analysis on the DAG-specific data. These relate more to performance tuning 
than to failure diagnostics, though.

thanks
— Hitesh


> On Oct 11, 2016, at 5:09 PM, Allan Wilson  wrote:
> 
> Use the yarn logs command. That's your only chance without the Tez UI. I 
> set up the Tez UI in our shop and it is really nice.
> 
> Allan
> Sent from my iPhone
> 
>> On Oct 11, 2016, at 5:05 PM, Jan Morlock  wrote:
>> 
>> Hi,
>> 
>> currently failed tasks occur during the execution of my Hive/Tez job.
>> However in the end, the overall job succeeds. Is it possible to find out
>> afterwards about the origin of those failed tasks (without using the Tez
>> UI) just by analyzing the output log files?
>> 
>> Best regards
>> Jan



Re: Origin of failed tasks

2016-10-11 Thread Allan Wilson
Use the yarn logs command. That's your only chance without the Tez UI. I 
set up the Tez UI in our shop and it is really nice.
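A minimal sketch of that approach, combined with the history-event grep from the reply elsewhere in this thread: it assumes the `yarn` client is on the PATH and that the application's logs can be fetched with `yarn logs` (which typically needs log aggregation enabled and the application to have finished). The function name `app_finished_attempts` is made up for illustration.

```shell
# Fetch the logs of one YARN application and keep only the
# TASK_ATTEMPT_FINISHED history events.
#   $1 = YARN application ID (substitute your own)
# (app_finished_attempts is a hypothetical helper name.)
app_finished_attempts() {
    yarn logs -applicationId "$1" \
        | grep -F '[HISTORY]' \
        | grep -F 'TASK_ATTEMPT_FINISHED'
}
```

Usage would be `app_finished_attempts <application ID>`. Redirecting the raw `yarn logs` output to a file first may be preferable when the logs are large and several searches are planned.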

Allan
Sent from my iPhone

> On Oct 11, 2016, at 5:05 PM, Jan Morlock  wrote:
> 
> Hi,
> 
> currently failed tasks occur during the execution of my Hive/Tez job.
> However in the end, the overall job succeeds. Is it possible to find out
> afterwards about the origin of those failed tasks (without using the Tez
> UI) just by analyzing the output log files?
> 
> Best regards
> Jan