[jira] [Commented] (TEZ-2954) Container launch timeouts should count towards node blacklisting

2016-03-14 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15194355#comment-15194355
 ] 

Siddharth Seth commented on TEZ-2954:
-

[~ozawa] - the problem is highlighted in 
https://issues.apache.org/jira/browse/TEZ-925?focusedCommentId=13932292=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13932292

If we receive a container timeout, we would have received a task timeout as 
well, and that is already factored in. The problem is that a launch failure on 
the NM will be reported back via the RM. When that happens, we lose track of 
the fact that the launch failed. If there's a timeout while talking to the NM, 
that will register as a task failure.

The jira description should have been better.
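
As a rough illustration of the distinction drawn above, here is a minimal 
sketch of per-node failure accounting in which a launch failure reported back 
via the RM is counted alongside task failures. It is not taken from the 
attached patch or from the Tez code base: the class NodeFailureTracker, its 
method names, and the simple threshold rule are all hypothetical and assumed 
only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; these names are NOT Tez classes or APIs.
class NodeFailureTracker {
    // Assumed rule: a node is blacklisted once it reaches this many failures.
    private final int maxFailuresPerNode;
    private final Map<String, Integer> failuresByNode = new HashMap<>();

    NodeFailureTracker(int maxFailuresPerNode) {
        if (maxFailuresPerNode < 1) {
            throw new IllegalArgumentException("maxFailuresPerNode must be >= 1");
        }
        this.maxFailuresPerNode = maxFailuresPerNode;
    }

    // A task failure. Per the comment above, a timeout while talking to the
    // NM also surfaces this way, so it is already counted.
    void recordTaskFailure(String nodeId) {
        increment(nodeId);
    }

    // A launch failure on the NM that is reported back via the RM. This is
    // the case the comment says gets lost; the sketch counts it explicitly.
    void recordContainerLaunchFailure(String nodeId) {
        increment(nodeId);
    }

    boolean isBlacklisted(String nodeId) {
        return failuresByNode.getOrDefault(nodeId, 0) >= maxFailuresPerNode;
    }

    private void increment(String nodeId) {
        failuresByNode.merge(nodeId, 1, Integer::sum);
    }
}
```

With a threshold of 2, one task failure followed by one launch failure on the 
same node would blacklist it in this sketch, while other nodes stay 
unaffected. Whether launch failures should carry the same weight as task 
failures is a design choice this sketch does not settle.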

> Container launch timeouts should count towards node blacklisting
> 
>
> Key: TEZ-2954
> URL: https://issues.apache.org/jira/browse/TEZ-2954
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
> Assignee: Tsuyoshi Ozawa
> Attachments: TEZ-2954.001.patch
>
>
> Currently, only task failures count towards blacklisting. A container timing 
> out should do the same.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-2954) Container launch timeouts should count towards node blacklisting

2016-03-07 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183434#comment-15183434
 ] 

Siddharth Seth commented on TEZ-2954:
-

[~ozawa] - I'll try looking at the patch by the end of the week.






[jira] [Commented] (TEZ-2954) Container launch timeouts should count towards node blacklisting

2016-03-04 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179606#comment-15179606
 ] 

TezQA commented on TEZ-2954:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12791422/TEZ-2954.001.patch
  against master revision 91e24d7.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
  org.apache.tez.test.TestFaultTolerance
  org.apache.tez.dag.app.dag.impl.TestDAGImpl

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1542//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1542//artifact/patchprocess/newPatchFindbugsWarningstez-dag.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1542//console

This message is automatically generated.






[jira] [Commented] (TEZ-2954) Container launch timeouts should count towards node blacklisting

2016-03-03 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15179515#comment-15179515
 ] 

Tsuyoshi Ozawa commented on TEZ-2954:
-

I think the patch includes TEZ-925. 

[~sseth] could you take a look?






[jira] [Commented] (TEZ-2954) Container launch timeouts should count towards node blacklisting

2015-12-15 Thread Jeff Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15057551#comment-15057551
 ] 

Jeff Zhang commented on TEZ-2954:
-

Move to 0.7.2

> Container launch timeouts should count towards node blacklisting
> 
>
> Key: TEZ-2954
> URL: https://issues.apache.org/jira/browse/TEZ-2954
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Siddharth Seth
>
> Currently, only task failures count towards blacklisting. A container timing 
> out should do the same.


