[jira] [Commented] (TEZ-3957) Report TASK_DURATION_MILLIS as a Counter for completed tasks

2018-10-30 Thread Eric Wohlstadter (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16669397#comment-16669397
 ] 

Eric Wohlstadter commented on TEZ-3957:
---

lgtm (non-binding)

[~jeagles]

I'm pretty sure MR doesn't have this counter (at least as of 4 years ago).

> Report TASK_DURATION_MILLIS as a Counter for completed tasks
> 
>
> Key: TEZ-3957
> URL: https://issues.apache.org/jira/browse/TEZ-3957
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Eric Wohlstadter
> Assignee: Sergey Shelukhin
> Priority: Major
> Attachments: TEZ-3957.01.patch, TEZ-3957.patch
>
>
> timeTaken is already being reported by {{TaskAttemptFinishedEvent}}, but not 
> as a Counter.
> Combined with TEZ-3911, this provides min(timeTaken), max(timeTaken), 
> avg(timeTaken).
> The value will be: {{finishTime - launchTime}}
>  
>  
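
The arithmetic in the description can be sketched in a few lines of Java. This
is only an illustration under stated assumptions, not the attached patch:
TaskDurationSketch, durationMillis, and summarize are hypothetical names, no
Tez API is used, and the timestamps are made-up sample values. It computes
finishTime - launchTime per task and then derives the min, max, and average
that the description expects the counter aggregation to provide.

```java
import java.util.LongSummaryStatistics;
import java.util.stream.LongStream;

/** Hypothetical helper for illustration; not part of the Tez code base. */
final class TaskDurationSketch {

    /** Duration as described in the issue: finishTime - launchTime, in milliseconds. */
    static long durationMillis(long launchTime, long finishTime) {
        if (finishTime < launchTime) {
            throw new IllegalArgumentException("finishTime precedes launchTime");
        }
        return finishTime - launchTime;
    }

    /** Aggregates per-task durations so that min, max, and average can be read back. */
    static LongSummaryStatistics summarize(long[] durations) {
        return LongStream.of(durations).summaryStatistics();
    }

    public static void main(String[] args) {
        // Made-up launch/finish timestamps for three completed tasks.
        long[] durations = {
            durationMillis(1_000L, 4_500L),
            durationMillis(2_000L, 3_000L),
            durationMillis(2_500L, 7_000L)
        };
        LongSummaryStatistics stats = summarize(durations);
        System.out.printf("min=%d max=%d avg=%.1f%n",
            stats.getMin(), stats.getMax(), stats.getAverage());
    }
}
```

In a real deployment the per-task values would come from the counter itself
rather than from an in-memory array; the array only stands in for that here.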



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (TEZ-4015) Send killed diagnostics to the AM when ShuffleScheduler calls killSelf

2018-10-30 Thread Jaume M (JIRA)
Jaume M created TEZ-4015:


Summary: Send killed diagnostics to the AM when ShuffleScheduler calls killSelf
Key: TEZ-4015
URL: https://issues.apache.org/jira/browse/TEZ-4015
Project: Apache Tez
Issue Type: Improvement
Affects Versions: 0.9.1
Reporter: Jaume M


This can be useful for debugging. This is an example of the logs shown for a 
particular vertex when it fails:

{code}
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1540489363818_0021_2_03, diagnostics=[Task failed, taskId=task_1540489363818_0021_2_03_35, diagnostics=[TaskAttempt 0 killed, TaskAttempt 1 failed, info=[Error: Error while running task ( failure ) : org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher_O {Map_1} #6
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:305)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:287)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Map_1: Shuffle failed with too many fetch failures and insufficient progress!failureCounts=5, pendingInputs=286, fetcherHealthy=false, reducerProgressedEnough=false, reducerStalled=false
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.isShuffleHealthy(ShuffleScheduler.java:1047)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:788)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:379)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.copyFromHost(FetcherOrderedGrouped.java:261)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.fetchNext(FetcherOrderedGrouped.java:180)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:192)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.callInternal(FetcherOrderedGrouped.java:56)
    ... 7 more
, errorMessage=Shuffle Runner Failed:org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$ShuffleError: error in shuffle in Fetcher_O {Map_1} #6
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:305)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.Shuffle$RunShuffleCallable.callInternal(Shuffle.java:287)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
    at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
    at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: Map_1: Shuffle failed with too many fetch failures and insufficient progress!failureCounts=5, pendingInputs=286, fetcherHealthy=false, reducerProgressedEnough=false, reducerStalled=false
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.isShuffleHealthy(ShuffleScheduler.java:1047)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.ShuffleScheduler.copyFailed(ShuffleScheduler.java:788)
    at org.apache.tez.runtime.library.common.shuffle.orderedgrouped.FetcherOrderedGrouped.setupConnection(FetcherOrderedGrouped.java:379)
    at

[jira] [Updated] (TEZ-4004) Update jetty9 to align with Hadoop and Hive

2018-10-30 Thread Jonathan Eagles (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Eagles updated TEZ-4004:
-
Attachment: TEZ-4004.001-branch-0.9.patch

> Update jetty9 to align with Hadoop and Hive
> ---
>
> Key: TEZ-4004
> URL: https://issues.apache.org/jira/browse/TEZ-4004
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Major
> Fix For: 0.10.1
>
> Attachments: TEZ-4004.001-branch-0.9.patch, TEZ-4004.001.patch
>
>
> https://abi-laboratory.pro/index.php?view=timeline=java=jetty
> https://issues.apache.org/jira/browse/HADOOP-15815





[jira] [Updated] (TEZ-4004) Update jetty9 to align with Hadoop and Hive

2018-10-30 Thread Kuhu Shukla (JIRA)


 [ 
https://issues.apache.org/jira/browse/TEZ-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kuhu Shukla updated TEZ-4004:
-
Fix Version/s: 0.10.1

> Update jetty9 to align with Hadoop and Hive
> ---
>
> Key: TEZ-4004
> URL: https://issues.apache.org/jira/browse/TEZ-4004
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Major
> Fix For: 0.10.1
>
> Attachments: TEZ-4004.001.patch
>
>
> https://abi-laboratory.pro/index.php?view=timeline=java=jetty
> https://issues.apache.org/jira/browse/HADOOP-15815





[jira] [Commented] (TEZ-4004) Update jetty9 to align with Hadoop and Hive

2018-10-30 Thread Kuhu Shukla (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668759#comment-16668759
 ] 

Kuhu Shukla commented on TEZ-4004:
--

Committed to master. [~jeagles], can you provide a patch for 0.9? There is a 
conflict. Thank you!

> Update jetty9 to align with Hadoop and Hive
> ---
>
> Key: TEZ-4004
> URL: https://issues.apache.org/jira/browse/TEZ-4004
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Major
> Attachments: TEZ-4004.001.patch
>
>
> https://abi-laboratory.pro/index.php?view=timeline=java=jetty
> https://issues.apache.org/jira/browse/HADOOP-15815





[jira] [Commented] (TEZ-4004) Update jetty9 to align with Hadoop and Hive

2018-10-30 Thread Kuhu Shukla (JIRA)


[ 
https://issues.apache.org/jira/browse/TEZ-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16668728#comment-16668728
 ] 

Kuhu Shukla commented on TEZ-4004:
--

+1 for the patch. Committing this shortly.

> Update jetty9 to align with Hadoop and Hive
> ---
>
> Key: TEZ-4004
> URL: https://issues.apache.org/jira/browse/TEZ-4004
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Major
> Attachments: TEZ-4004.001.patch
>
>
> https://abi-laboratory.pro/index.php?view=timeline=java=jetty
> https://issues.apache.org/jira/browse/HADOOP-15815


