[ 
https://issues.apache.org/jira/browse/HIVE-21785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oliver Draese reassigned HIVE-21785:
------------------------------------


> Add task queue/runtime stats per LLAP daemon to output
> ------------------------------------------------------
>
>                 Key: HIVE-21785
>                 URL: https://issues.apache.org/jira/browse/HIVE-21785
>             Project: Hive
>          Issue Type: Improvement
>          Components: llap
>    Affects Versions: 3.1.1
>            Reporter: Oliver Draese
>            Assignee: Oliver Draese
>            Priority: Major
>             Fix For: 3.1.1
>
>
> There are several scenarios where we want to investigate whether a particular 
> LLAP daemon is performing faster or slower than the others in the cluster. In 
> these scenarios, it is especially important to determine whether tasks spent 
> significant time waiting for an available executor (queued) or on the 
> execution itself. A skew in task-to-daemon assignment is also of interest.
> This patch adds these statistics to the TezCounters, and therefore to the job 
> output, on a per-LLAP-daemon basis. Here is an example:
> {{INFO : LlapTaskRuntimeAgg by daemon:}}
> {{INFO :    Count-host-1.example.com: 41}}
> {{INFO :    Count-host-2.example.com: 39}}
> {{INFO :    Count-host-3.example.com: 45}}
> {{INFO :    QueueTime-host-1.example.com: 51437776}}
> {{INFO :    QueueTime-host-2.example.com: 35758306}}
> {{INFO :    QueueTime-host-3.example.com: 47168327}}
> {{INFO :    RunTime-host-1.example.com: 165151539295}}
> {{INFO :    RunTime-host-2.example.com: 141729193528}}
> {{INFO :    RunTime-host-3.example.com: 166876988771}}
> The "Count-" values are simple task counts for the appended host name (LLAP daemon).
> The "QueueTime-" values indicate how long tasks waited in the 
> TaskExecutorService's queue before actually being executed.
> The "RunTime-" values cover the time from execution start to finish (where 
> finish can be either a successful execution or a killed/failed execution).
> For the new counters to appear in the output, both the preexisting 
> hive.tez.exec.print.summary and the new hive.llap.task.time.print.summary 
> have to be set to true.
>  
> {{<property>}}
> {{  <name>hive.tez.exec.print.summary</name>}}
> {{  <value>true</value>}}
> {{</property>}}
> {{<property>}}
> {{  <name>hive.llap.task.time.print.summary</name>}}
> {{  <value>true</value>}}
> {{</property>}}
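
To compare daemons using the counters described above, it can help to normalize the totals by task count. The following is a minimal sketch, not part of the patch: it assumes the summary lines look exactly like the sample output in the description, and the function names (parse_llap_summary, per_task_averages) are hypothetical. The issue does not state the time unit of the counters, so the averages are left in whatever unit Hive reports.

```python
import re
from collections import defaultdict

# Sample lines copied from the issue description (Jira {{...}} markup removed).
SAMPLE = """\
INFO : LlapTaskRuntimeAgg by daemon:
INFO :    Count-host-1.example.com: 41
INFO :    Count-host-2.example.com: 39
INFO :    Count-host-3.example.com: 45
INFO :    QueueTime-host-1.example.com: 51437776
INFO :    QueueTime-host-2.example.com: 35758306
INFO :    QueueTime-host-3.example.com: 47168327
INFO :    RunTime-host-1.example.com: 165151539295
INFO :    RunTime-host-2.example.com: 141729193528
INFO :    RunTime-host-3.example.com: 166876988771
"""

# Assumed line shape: "INFO : <Kind>-<host>: <integer>"
LINE = re.compile(r"INFO\s*:\s+(Count|QueueTime|RunTime)-(\S+):\s+(\d+)")


def parse_llap_summary(text):
    """Collect the Count/QueueTime/RunTime counters per daemon host."""
    stats = defaultdict(dict)
    for line in text.splitlines():
        match = LINE.search(line)
        if match:
            kind, host, value = match.groups()
            stats[host][kind] = int(value)
    return dict(stats)


def per_task_averages(stats):
    """Divide the totals by the task count so daemons can be compared."""
    result = {}
    for host, counters in stats.items():
        count = counters.get("Count", 0)
        if count == 0:
            continue  # no tasks ran on this daemon, nothing to average
        queue = counters.get("QueueTime", 0)
        run = counters.get("RunTime", 0)
        total = queue + run
        result[host] = {
            "avg_queue": queue / count,
            "avg_run": run / count,
            # Fraction of the measured time that tasks spent queued.
            "queue_share": queue / total if total else 0.0,
        }
    return result


if __name__ == "__main__":
    for host, avg in sorted(per_task_averages(parse_llap_summary(SAMPLE)).items()):
        print(host, avg)
```

A daemon whose avg_queue or queue_share is clearly higher than its peers is spending more time with tasks waiting for an executor, while a higher avg_run points at slower execution on that host.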



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
