[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-26 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169448#comment-15169448
 ] 

Siddharth Seth commented on TEZ-3128:
-

+1. Committing. Thanks [~ozawa]. [~jlowe] - should this be pulled into 0.7 as
well?

> Avoid stopping containers on the AM shutdown thread
> ---
>
> Key: TEZ-3128
> URL: https://issues.apache.org/jira/browse/TEZ-3128
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.8.0-alpha
> Reporter: Siddharth Seth
> Assignee: Tsuyoshi Ozawa
> Labels: newbie
> Attachments: TEZ-3128.001.patch, TEZ-3128.002.patch, 
> TEZ-3128.003.patch, TEZ-3128.004.patch, amJstack
>
>
> During an AM shutdown, the TaskCommunicator is also shut down, and it tries to 
> stop containers on the shutdown thread itself. This can cause the AM shutdown 
> to block if NMs are not available.
> This likely affects 0.7 as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-26 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169260#comment-15169260
 ] 

TezQA commented on TEZ-3128:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12790142/TEZ-3128.004.patch
  against master revision 15d7339.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 31 javac 
compiler warnings (more than the master's current 30 warnings).

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.test.TestFaultTolerance

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1518//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1518//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Javac warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1518//artifact/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1518//console

This message is automatically generated.



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-26 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169234#comment-15169234
 ] 

TezQA commented on TEZ-3128:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12790138/TEZ-3128.003.patch
  against master revision 15d7339.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:red}-1 findbugs{color}.  The patch appears to introduce 1 new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1517//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1517//artifact/patchprocess/newPatchFindbugsWarningstez-runtime-library.html
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1517//console

This message is automatically generated.



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-26 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169083#comment-15169083
 ] 

Tsuyoshi Ozawa commented on TEZ-3128:
-

Attached v03 patch to address the comment.



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-25 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168481#comment-15168481
 ] 

Siddharth Seth commented on TEZ-3128:
-

[~ozawa] - looking at the scheduler, we already release all held containers as 
part of the shutdown process (way before we unregister from the RM). Given 
that, avoiding the container stop completely would be a better option, and 
simpler patch.
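
A minimal sketch of the "skip the stop entirely" idea, not the committed patch: once shutdown has begun, the launcher drops stop requests instead of calling the NM, on the assumption that the scheduler has already released the containers through the RM. `NodeManagerClient` and `SketchContainerLauncher` are hypothetical names used only for illustration; they are not Tez or YARN classes.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for the client that talks to a NodeManager. */
interface NodeManagerClient {
  void stopContainer(String containerId);
}

/** Illustrative launcher; not the actual Tez container launcher. */
final class SketchContainerLauncher {
  private final NodeManagerClient nm;
  private final AtomicBoolean shuttingDown = new AtomicBoolean(false);
  private final AtomicInteger skippedStops = new AtomicInteger();

  SketchContainerLauncher(NodeManagerClient nm) {
    this.nm = nm;
  }

  /** Stops a container through the NM, unless shutdown has already begun. */
  void stopContainer(String containerId) {
    if (shuttingDown.get()) {
      // Assumption: the scheduler has already released this container via
      // the RM, so the potentially blocking NM call is skipped.
      skippedStops.incrementAndGet();
      return;
    }
    nm.stopContainer(containerId);
  }

  /** Marks the launcher as shutting down; makes no NM calls itself. */
  void shutdown() {
    shuttingDown.set(true);
  }

  /** Number of stop requests that were dropped after shutdown began. */
  int skippedStops() {
    return skippedStops.get();
  }
}
```

With this shape the shutdown thread never waits on an NM, which is the property the issue title asks for; whether dropping the stops is safe depends on the release having actually reached the RM first.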



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-25 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167944#comment-15167944
 ] 

Tsuyoshi Ozawa commented on TEZ-3128:
-

[~sseth] [~hitesh] could you check the patch?



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-24 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163235#comment-15163235
 ] 

TezQA commented on TEZ-3128:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12789566/TEZ-3128.002.patch
  against master revision 701e9aa.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.history.TestHistoryParser

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1507//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1507//console

This message is automatically generated.



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-23 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159263#comment-15159263
 ] 

Hitesh Shah commented on TEZ-3128:
--

We do need to release/stop them before shutdown, as there is no guarantee on 
when the AM will be killed after unregistering if the AM still has pending work 
(flushing events, etc.).

My point was whether we can get away with releasing running containers to YARN 
instead of calling stop on each of them via the NM proxy. If we cannot release 
them, then we need to reduce the timeout and use a new NM client proxy with the 
modified timeouts to stop the containers. 

  



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-22 Thread Tsuyoshi Ozawa (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15158004#comment-15158004
 ] 

Tsuyoshi Ozawa commented on TEZ-3128:
-

[~hitesh] [~sseth] Thank you for pointing that out.

{quote}
dagappmaster shuts down yarn scheduler service but it does not kill containers 
on shutdown - just releases them via amrmclient
TezTaskCommunicatorImpl on stop() does nothing to kill containers.
{quote}

Right, that's why I thought the place I fixed was the one you mentioned. Could 
you help me clarify where the fix should go?



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-22 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15157279#comment-15157279
 ] 

Hitesh Shah commented on TEZ-3128:
--

[~ozawa] I don't think the delayed container manager thread is the issue here. 

[~sseth] can you add more details/logs on this.


I see the following as per code: 
   - dagappmaster shuts down yarn scheduler service but it does not kill 
containers on shutdown - just releases them via amrmclient
   - TezTaskCommunicatorImpl on stop() does nothing to kill containers. 

It seems like the container launcher is the one trying to shut down containers 
for some reason. Maybe we should just release containers via the scheduler 
service instead of trying to stop them?




[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-21 Thread TezQA (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15156575#comment-15156575
 ] 

TezQA commented on TEZ-3128:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment
  http://issues.apache.org/jira/secure/attachment/12788960/TEZ-3128.001.patch
  against master revision 941d199.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 3.0.1) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in :
   org.apache.tez.dag.app.rm.TestContainerReuse

Test results: 
https://builds.apache.org/job/PreCommit-TEZ-Build/1493//testReport/
Console output: https://builds.apache.org/job/PreCommit-TEZ-Build/1493//console

This message is automatically generated.



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-19 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15155125#comment-15155125
 ] 

Hitesh Shah commented on TEZ-3128:
--

Could probably override the shutdown timeouts programmatically for this jira 
and do a follow-up for how to address yarn timeouts for container launches 
while an app is running. 
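
A rough sketch of what a programmatic override could look like, assuming the relevant knobs are the YARN client settings `yarn.client.nodemanager-connect.max-wait-ms` and `yarn.client.nodemanager-connect.retry-interval-ms` (verify both against the Hadoop version in use). A plain `Map` stands in for the Hadoop `Configuration` object so the example stays self-contained; `ShutdownTimeouts` and `forShutdown` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative helper; not part of Tez or YARN. */
final class ShutdownTimeouts {
  // Assumed YARN client keys for NM connection retries; check your Hadoop release.
  static final String NM_CONNECT_MAX_WAIT_MS =
      "yarn.client.nodemanager-connect.max-wait-ms";
  static final String NM_CONNECT_RETRY_INTERVAL_MS =
      "yarn.client.nodemanager-connect.retry-interval-ms";

  /**
   * Returns a copy of the base settings with shorter NM connection timeouts.
   * The base map is left untouched, so the running app keeps its own values.
   */
  static Map<String, String> forShutdown(
      Map<String, String> base, long maxWaitMs, long retryIntervalMs) {
    Map<String, String> copy = new HashMap<>(base);
    copy.put(NM_CONNECT_MAX_WAIT_MS, Long.toString(maxWaitMs));
    copy.put(NM_CONNECT_RETRY_INTERVAL_MS, Long.toString(retryIntervalMs));
    return copy;
  }
}
```

The copy would be used only to build the client that stops containers during shutdown, which keeps the follow-up question about timeouts while an app is running separate.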



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-19 Thread Siddharth Seth (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154982#comment-15154982
 ] 

Siddharth Seth commented on TEZ-3128:
-

Yep. That works. The timeouts while an app is running are also way too high.



[jira] [Commented] (TEZ-3128) Avoid stopping containers on the AM shutdown thread

2016-02-19 Thread Hitesh Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/TEZ-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154783#comment-15154783
 ] 

Hitesh Shah commented on TEZ-3128:
--

This is done to release containers faster, before other services such as ATS are 
shut down, which can take a long time. But yes, we need to figure out how to 
short-circuit the release if the NM cannot be reached. Does it make sense 
to override the timeouts just for this shutdown phase instead of trying to 
avoid stopping them?
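
One way to picture the short-circuit, as a sketch under assumptions rather than the approach the patch takes: issue the stop calls on daemon threads and let the shutdown thread wait for a bounded time only. `ContainerStopper` and `BoundedContainerStop` are hypothetical names, and the pool size of 4 is arbitrary.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for whatever issues the stop call to an NM. */
interface ContainerStopper {
  void stop(String containerId) throws Exception;
}

/** Illustrative helper; not part of Tez or YARN. */
final class BoundedContainerStop {
  /**
   * Submits one stop request per container and waits at most timeoutMs for
   * all of them. Returns true only if every request finished in time.
   */
  static boolean stopAll(
      List<String> containerIds, ContainerStopper stopper, long timeoutMs) {
    ExecutorService pool = Executors.newFixedThreadPool(4, r -> {
      Thread t = new Thread(r, "container-stopper");
      t.setDaemon(true); // a hung NM call must not keep the JVM alive
      return t;
    });
    for (String id : containerIds) {
      pool.execute(() -> {
        try {
          stopper.stop(id);
        } catch (Exception e) {
          // Best effort: a failed or interrupted stop is ignored here.
        }
      });
    }
    pool.shutdown(); // accept no new work; queued stops still run
    try {
      if (pool.awaitTermination(timeoutMs, TimeUnit.MILLISECONDS)) {
        return true;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
    }
    pool.shutdownNow(); // give up: interrupt any stop calls still in flight
    return false;
  }
}
```

The caller decides what a `false` result means; during AM shutdown it would most likely be logged and ignored so the remaining services can still be stopped.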
