[ 
https://issues.apache.org/jira/browse/HIVE-7236?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039730#comment-14039730
 ] 

Lefty Leverenz commented on HIVE-7236:
--------------------------------------

Right, the only place I see is the design doc:

* [Hive on Tez | https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez]

Ideally we'd create a user doc, but as a quick fix, how about adding an update 
note to the Job Monitoring or Job Diagnostics section?

* [Job Monitoring | 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez#HiveonTez-Jobmonitoring]
* [Job Diagnostics | 
https://cwiki.apache.org/confluence/display/Hive/Hive+on+Tez#HiveonTez-Jobdiagnostics]

However, if an Updates & Improvements section would be useful for other new 
information, it should go at the end of the design doc, and this note could go 
there with a link from Job Monitoring or Job Diagnostics.

Also, the user docs could have a stub pointing to the design doc.

> Tez progress monitor should indicate running/failed tasks
> ---------------------------------------------------------
>
>                 Key: HIVE-7236
>                 URL: https://issues.apache.org/jira/browse/HIVE-7236
>             Project: Hive
>          Issue Type: Bug
>          Components: Tez
>    Affects Versions: 0.14.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7236.1.patch
>
>
> Currently, TezJobMonitor logs only completed tasks. 
> This makes it hard to locate task stalls and task failures. Failure scenarios 
> are harder to debug, in particular when analyzing query runs on a cluster 
> with bad nodes.
> Change the job monitor to log running & failed tasks as follows.
> {code}
> Map 1: 0(+157,-1)/1755     Reducer 2: 0/1  
> Map 1: 0(+168,-1)/1755     Reducer 2: 0/1  
> Map 1: 0(+189,-1)/1755     Reducer 2: 0/1  
> Map 1: 0(+189,-1)/1755     Reducer 2: 0/1 
> {code}
> That is, the last line shows Map 1 with 189 tasks running, 1 failure and 0 
> complete, out of 1755 total.
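
For anyone scraping these log lines, the format in the quoted sample can be read 
mechanically. The following is a minimal sketch in Python, assuming each vertex 
is printed as `name: completed(+running,-failed)/total` with the parenthesized 
part omitted when nothing is running or failed, as in the sample. The names 
`VertexProgress` and `parse_progress` are hypothetical and are not part of Hive 
or Tez.

```python
import re
from typing import List, NamedTuple


class VertexProgress(NamedTuple):
    """Task counts for one vertex (hypothetical type, for illustration only)."""
    name: str
    completed: int
    running: int
    failed: int
    total: int


# Assumed layout, inferred from the sample output in the issue description:
#   "Map 1: 0(+189,-1)/1755"  or  "Reducer 2: 0/1"
_VERTEX_RE = re.compile(
    r"(?P<name>\w+ \d+): "
    r"(?P<completed>\d+)"
    r"(?:\(\+(?P<running>\d+)(?:,-(?P<failed>\d+))?\))?"
    r"/(?P<total>\d+)"
)


def parse_progress(line: str) -> List[VertexProgress]:
    """Extract per-vertex task counts from one progress line."""
    vertices = []
    for m in _VERTEX_RE.finditer(line):
        vertices.append(
            VertexProgress(
                name=m.group("name"),
                completed=int(m.group("completed")),
                # Missing running/failed counts are treated as zero.
                running=int(m.group("running") or 0),
                failed=int(m.group("failed") or 0),
                total=int(m.group("total")),
            )
        )
    return vertices


if __name__ == "__main__":
    for vertex in parse_progress("Map 1: 0(+189,-1)/1755     Reducer 2: 0/1"):
        print(vertex)
```

A nonzero `failed` count, or a `running` count that stops changing between 
successive lines, is the signal the issue is trying to surface.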



--
This message was sent by Atlassian JIRA
(v6.2#6252)
