[jira] [Updated] (YARN-2612) Some completed containers are not reported to NM

2014-09-26 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2612: --- Attachment: YARN-2612.2.patch Also change Capacity and FIFO Scheduler. Some completed containers are not

[jira] [Updated] (YARN-2612) Some completed containers are not reported to NM

2014-09-26 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2612: --- Attachment: (was: YARN-2612.2.patch) Some completed containers are not reported to NM

[jira] [Updated] (YARN-2612) Some completed containers are not reported to NM

2014-09-26 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2612: --- Attachment: (was: YARN-2612.patch) Some completed containers are not reported to NM

[jira] [Updated] (YARN-2612) Some completed containers are not reported to NM

2014-09-26 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2612: --- Description: We are testing RM work preserving restart and found the following logs when we ran a simple

[jira] [Created] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-28 Thread Jun Gong (JIRA)
Jun Gong created YARN-2617: -- Summary: NM does not need to send finished container whose APP is not running to RM Key: YARN-2617 URL: https://issues.apache.org/jira/browse/YARN-2617 Project: Hadoop YARN

[jira] [Updated] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-28 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2617: --- Attachment: YARN-2617.patch NM does not need to send finished container whose APP is not running to RM

[jira] [Updated] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-28 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2617: --- Description: We ([~chenchun]) are testing RM work preserving restart and found the following logs when we ran

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-29 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151466#comment-14151466 ] Jun Gong commented on YARN-2617: [~jianhe], thank you for the review! {quote} I think we

[jira] [Updated] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-29 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2617: --- Attachment: YARN-2617.2.patch NM does not need to send finished container whose APP is not running to RM

[jira] [Updated] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-30 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2617: --- Attachment: YARN-2617.3.patch Update the patch. Delete an unrelated line. NM does not need to send finished

[jira] [Updated] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-30 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2617: --- Attachment: YARN-2617.4.patch Update the patch. I am not sure whether I catch your point: send completed

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-09-30 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154337#comment-14154337 ] Jun Gong commented on YARN-2617: Get it. Thank you! NM does not need to send finished

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-01 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14155960#comment-14155960 ] Jun Gong commented on YARN-2617: It seems that there is something wrong with Jenkins.

[jira] [Created] (YARN-2640) TestDirectoryCollection.testCreateDirectories failed

2014-10-02 Thread Jun Gong (JIRA)
Jun Gong created YARN-2640: -- Summary: TestDirectoryCollection.testCreateDirectories failed Key: YARN-2640 URL: https://issues.apache.org/jira/browse/YARN-2640 Project: Hadoop YARN Issue Type: Bug

[jira] [Commented] (YARN-2617) NM does not need to send finished container whose APP is not running to RM

2014-10-02 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14156198#comment-14156198 ] Jun Gong commented on YARN-2617: I investigated why TestDirectoryCollection failed. And it

[jira] [Updated] (YARN-2640) TestDirectoryCollection.testCreateDirectories failed

2014-10-02 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2640: --- Attachment: YARN-2640.patch Patch submitted. TestDirectoryCollection.testCreateDirectories failed

[jira] [Updated] (YARN-2640) TestDirectoryCollection.testCreateDirectories failed

2014-10-02 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2640: --- Attachment: YARN-2640.2.patch TestDirectoryCollection.testCreateDirectories failed

[jira] [Commented] (YARN-2640) TestDirectoryCollection.testCreateDirectories failed

2014-10-02 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157570#comment-14157570 ] Jun Gong commented on YARN-2640: [~ozawa], thank you for telling me. Close it now.

[jira] [Resolved] (YARN-2612) Some completed containers are not reported to NM

2014-10-02 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong resolved YARN-2612. Resolution: Duplicate Some completed containers are not reported to NM

[jira] [Created] (YARN-2164) Add switch 'restart' for yarn-daemon.sh

2014-06-16 Thread Jun Gong (JIRA)
Jun Gong created YARN-2164: -- Summary: Add switch 'restart' for yarn-daemon.sh Key: YARN-2164 URL: https://issues.apache.org/jira/browse/YARN-2164 Project: Hadoop YARN Issue Type: Improvement

[jira] [Updated] (YARN-2164) Add switch 'restart' for yarn-daemon.sh

2014-06-16 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2164: --- Attachment: YARN-2164.patch Add switch 'restart' for yarn-daemon.sh

[jira] [Created] (YARN-2170) Fix components' version information in the web page 'About the Cluster'

2014-06-17 Thread Jun Gong (JIRA)
Jun Gong created YARN-2170: -- Summary: Fix components' version information in the web page 'About the Cluster' Key: YARN-2170 URL: https://issues.apache.org/jira/browse/YARN-2170 Project: Hadoop YARN

[jira] [Updated] (YARN-2170) Fix components' version information in the web page 'About the Cluster'

2014-06-17 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-2170: --- Attachment: YARN-2170.patch Fix components' version information in the web page 'About the Cluster'

[jira] [Updated] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-02-04 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3094: --- Attachment: YARN-3094.5.patch The failed test case seems unrelated. Re-submit the same patch. reset timer

[jira] [Updated] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-02-04 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3094: --- Attachment: YARN-3094.4.patch reset timer for liveness monitors after RM recovery

[jira] [Commented] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-02-04 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14305235#comment-14305235 ] Jun Gong commented on YARN-3094: Thanks [~jianhe] and [~rohithsharma] for review and

[jira] [Commented] (YARN-3057) Need update apps' runnability when reloading allocation files for FairScheduler

2015-01-21 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14285417#comment-14285417 ] Jun Gong commented on YARN-3057: The failed test cases seem unrelated to the patch (most

[jira] [Updated] (YARN-3057) Need update apps' runnability when reloading allocation files for FairScheduler

2015-01-20 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3057: --- Attachment: YARN-3057.patch Need update apps' runnability when reloading allocation files for FairScheduler

[jira] [Commented] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-28 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14295291#comment-14295291 ] Jun Gong commented on YARN-3094: Hi [~jianhe], could you please help review it? Thank you.

[jira] [Updated] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-28 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3094: --- Attachment: YARN-3094.3.patch reset timer for liveness monitors after RM recovery

[jira] [Commented] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-28 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296397#comment-14296397 ] Jun Gong commented on YARN-3094: [~adhoot] Thank you for the comment. It is really very

[jira] [Commented] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-25 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291478#comment-14291478 ] Jun Gong commented on YARN-3094: [~chenchun] Thanks for the suggestion. I think the time

[jira] [Updated] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-25 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3094: --- Attachment: YARN-3094.2.patch Add a test case. reset timer for liveness monitors after RM recovery

[jira] [Commented] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290377#comment-14290377 ] Jun Gong commented on YARN-3094: [~rohithsharma] Thanks for your review. I will add a test

[jira] [Updated] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3094: --- Attachment: YARN-3094.patch reset timer for liveness monitors after RM recovery

[jira] [Created] (YARN-3094) reset timer for liveness monitors after RM recovery

2015-01-23 Thread Jun Gong (JIRA)
Jun Gong created YARN-3094: -- Summary: reset timer for liveness monitors after RM recovery Key: YARN-3094 URL: https://issues.apache.org/jira/browse/YARN-3094 Project: Hadoop YARN Issue Type: Bug

[jira] [Updated] (YARN-3057) Need update apps' runnability when reloading allocation files for FairScheduler

2015-01-28 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3057: --- Attachment: YARN-3057.2.patch Re-submit the same patch. Need update apps' runnability when reloading

[jira] [Created] (YARN-3057) Need update apps' runnability when reloading allocation files for FairScheduler

2015-01-13 Thread Jun Gong (JIRA)
Jun Gong created YARN-3057: -- Summary: Need update apps' runnability when reloading allocation files for FairScheduler Key: YARN-3057 URL: https://issues.apache.org/jira/browse/YARN-3057 Project: Hadoop YARN

[jira] [Updated] (YARN-3161) Containers' information are lost in some cases when RM restart

2015-02-09 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3161: --- Description: When RM restarts, containers' information will be lost in the following scenarios: 1. NM

[jira] [Resolved] (YARN-3161) Containers' information are lost in some cases when RM restart

2015-02-09 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong resolved YARN-3161. Resolution: Duplicate Containers' information are lost in some cases when RM restart

[jira] [Commented] (YARN-3161) Containers' information are lost in some cases when RM restart

2015-02-09 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313343#comment-14313343 ] Jun Gong commented on YARN-3161: Thanks [~jianhe] and [~vinodkv] for the explanation. Then

[jira] [Commented] (YARN-3161) Containers' information are lost in some cases when RM restart

2015-02-09 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313384#comment-14313384 ] Jun Gong commented on YARN-3161: Thanks Jian. Closing it now. Containers' information are

[jira] [Created] (YARN-3161) Containers' information are lost in some cases when RM restart

2015-02-09 Thread Jun Gong (JIRA)
Jun Gong created YARN-3161: -- Summary: Containers' information are lost in some cases when RM restart Key: YARN-3161 URL: https://issues.apache.org/jira/browse/YARN-3161 Project: Hadoop YARN Issue

[jira] [Created] (YARN-3389) Two attempts might operate on same data structures concurrently

2015-03-23 Thread Jun Gong (JIRA)
Jun Gong created YARN-3389: -- Summary: Two attempts might operate on same data structures concurrently Key: YARN-3389 URL: https://issues.apache.org/jira/browse/YARN-3389 Project: Hadoop YARN Issue

[jira] [Updated] (YARN-3389) Two attempts might operate on same data structures concurrently

2015-03-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3389: --- Attachment: YARN-3389.01.patch Two attempts might operate on same data structures concurrently

[jira] [Commented] (YARN-3389) Avoid race conditions when attempts operate on shared states concurrently

2015-04-14 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14493716#comment-14493716 ] Jun Gong commented on YARN-3389: [~jianhe] Thank you for the explanation. I thought RM's

[jira] [Updated] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-04-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3480: --- Summary: Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable (was: Make AM max

[jira] [Updated] (YARN-3480) Make AM max attempts stored in RMStateStore to be configurable

2015-04-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3480: --- Attachment: YARN-3480.01.patch Attach an initial patch. I will add test cases later. Make AM max attempts

[jira] [Updated] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-04-24 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3480: --- Attachment: YARN-3480.02.patch Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

[jira] [Updated] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-02 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3480: --- Description: When RM HA is enabled and running containers are kept across attempts, apps are more likely to

[jira] [Updated] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-02 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3480: --- Attachment: YARN-3480.03.patch Update patch. Fix javac warning, checkstyle and test case errors. Make AM

[jira] [Commented] (YARN-3474) Add a way to let NM wait RM to come back, not kill running containers

2015-05-04 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526588#comment-14526588 ] Jun Gong commented on YARN-3474: [~vinodkv] Thank you for the explanation. Closing it now.

[jira] [Resolved] (YARN-3474) Add a way to let NM wait RM to come back, not kill running containers

2015-05-04 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong resolved YARN-3474. Resolution: Invalid Add a way to let NM wait RM to come back, not kill running containers

[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-03 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526195#comment-14526195 ] Jun Gong commented on YARN-3480: [~vinodkv] Thank you for the comments. {quote} No, as you

[jira] [Commented] (YARN-3366) Outbound network bandwidth : classify/shape traffic originating from YARN containers

2015-04-30 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14522639#comment-14522639 ] Jun Gong commented on YARN-3366: [~sidharta-s] Thank you for the explanation. Could we set

[jira] [Commented] (YARN-3366) Outbound network bandwidth : classify/shape traffic originating from YARN containers

2015-04-29 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520788#comment-14520788 ] Jun Gong commented on YARN-3366: When addYARNRootClass in

[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-04 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527847#comment-14527847 ] Jun Gong commented on YARN-3480: [~jianhe], sorry for not specifying our scenario: RM HA is

[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-07 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533731#comment-14533731 ] Jun Gong commented on YARN-3480: [~jianhe] just catch your option. Do you mean that the

[jira] [Commented] (YARN-3480) Make AM max attempts stored in RMAppImpl and RMStateStore to be configurable

2015-05-07 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14533705#comment-14533705 ] Jun Gong commented on YARN-3480: [~jianhe], thanks for your comments and suggestion. The

[jira] [Updated] (YARN-3480) Recovery may get very slow with lots of services with lots of app-attempts

2015-05-08 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3480: --- Attachment: YARN-3480.04.patch Recovery may get very slow with lots of services with lots of app-attempts

[jira] [Updated] (YARN-3474) Add a way to let NM wait RM to come back, not kill running containers

2015-04-14 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3474: --- Attachment: YARN-3474.01.patch Run 'bin/yarn waitrm', then just wait for RM to come back, and press 'Enter' to

[jira] [Created] (YARN-3480) Make AM max attempts stored in RMStateStore to be configurable

2015-04-13 Thread Jun Gong (JIRA)
Jun Gong created YARN-3480: -- Summary: Make AM max attempts stored in RMStateStore to be configurable Key: YARN-3480 URL: https://issues.apache.org/jira/browse/YARN-3480 Project: Hadoop YARN Issue

[jira] [Commented] (YARN-3389) Avoid race conditions when attempts operate on shared states concurrently

2015-04-13 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492374#comment-14492374 ] Jun Gong commented on YARN-3389: [~jianhe] Kindly review it please. Thank you. Avoid race

[jira] [Updated] (YARN-3389) Avoid race conditions when attempts operate on shared states concurrently

2015-04-11 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3389: --- Summary: Avoid race conditions when attempts operate on shared states concurrently (was: Two attempts might

[jira] [Updated] (YARN-3389) Avoid race conditions when attempts operate on shared states concurrently

2015-04-11 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3389: --- Description: In AttemptFailedTransition, new attempt will get states ('justFinishedContainers' and

[jira] [Commented] (YARN-3474) Add a way to let NM wait RM to come back, not kill running containers

2015-04-17 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14501037#comment-14501037 ] Jun Gong commented on YARN-3474: Any comments are appreciated. Add a way to let NM wait RM to

[jira] [Created] (YARN-3469) Do not set watch for most cases in ZKRMStateStore

2015-04-09 Thread Jun Gong (JIRA)
Jun Gong created YARN-3469: -- Summary: Do not set watch for most cases in ZKRMStateStore Key: YARN-3469 URL: https://issues.apache.org/jira/browse/YARN-3469 Project: Hadoop YARN Issue Type:

[jira] [Updated] (YARN-3469) Do not set watch for most cases in ZKRMStateStore

2015-04-09 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3469: --- Description: In ZKRMStateStore, most operations (e.g. getDataWithRetries, getDataWithRetries,

[jira] [Updated] (YARN-3469) Do not set watch for most cases in ZKRMStateStore

2015-04-09 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3469: --- Attachment: YARN-3469.01.patch Do not set watch for most cases in ZKRMStateStore

[jira] [Commented] (YARN-3480) Recovery may get very slow with lots of services with lots of app-attempts

2015-05-20 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14553451#comment-14553451 ] Jun Gong commented on YARN-3480: {quote} Without doing this, we will unnecessarily be

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597622#comment-14597622 ] Jun Gong commented on YARN-3809: Same as previous explanation, checkstyle and test case

[jira] [Resolved] (YARN-3831) Localization failed when a local disk turns from bad to good without NM initializes it

2015-06-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong resolved YARN-3831. Resolution: Not A Problem Localization failed when a local disk turns from bad to good without NM

[jira] [Commented] (YARN-3831) Localization failed when a local disk turns from bad to good without NM initializes it

2015-06-23 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3831?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14597616#comment-14597616 ] Jun Gong commented on YARN-3831: [~zxu], thank you for the reminder. Sorry for the late reply.

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-20 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594424#comment-14594424 ] Jun Gong commented on YARN-3809: Attach a new patch to address [~jlowe]'s suggestions.

[jira] [Updated] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-20 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3809: --- Attachment: YARN-3809.03.patch Failed to launch new attempts because ApplicationMasterLauncher's threads all

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-19 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594300#comment-14594300 ] Jun Gong commented on YARN-3809: [~jlowe] Thank you for the very detailed suggestions. It

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-24 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14600489#comment-14600489 ] Jun Gong commented on YARN-3809: Thanks [~rohithsharma], [~devaraj.k] and [~kasha] for

[jira] [Updated] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-19 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3809: --- Attachment: YARN-3809.02.patch Failed to launch new attempts because ApplicationMasterLauncher's threads all

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-19 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593452#comment-14593452 ] Jun Gong commented on YARN-3809: [~jlowe] Thanks for the explanation and suggestions. I

[jira] [Created] (YARN-3833) TestWorkPreservingRMRestart#testSchedulerRecovery fails in trunk

2015-06-19 Thread Jun Gong (JIRA)
Jun Gong created YARN-3833: -- Summary: TestWorkPreservingRMRestart#testSchedulerRecovery fails in trunk Key: YARN-3833 URL: https://issues.apache.org/jira/browse/YARN-3833 Project: Hadoop YARN

[jira] [Created] (YARN-3831) Localization failed when a local disk turns from bad to good without NM initializes it

2015-06-19 Thread Jun Gong (JIRA)
Jun Gong created YARN-3831: -- Summary: Localization failed when a local disk turns from bad to good without NM initializes it Key: YARN-3831 URL: https://issues.apache.org/jira/browse/YARN-3831 Project:

[jira] [Commented] (YARN-3833) TestWorkPreservingRMRestart#testSchedulerRecovery fails in trunk

2015-06-19 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14593476#comment-14593476 ] Jun Gong commented on YARN-3833: [~brahmareddy] Thank you. Closing it now.

[jira] [Resolved] (YARN-3833) TestWorkPreservingRMRestart#testSchedulerRecovery fails in trunk

2015-06-19 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong resolved YARN-3833. Resolution: Duplicate TestWorkPreservingRMRestart#testSchedulerRecovery fails in trunk

[jira] [Updated] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-16 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3809: --- Attachment: YARN-3809.01.patch Attach a patch. Make thread pool size configurable, and default size is 50.

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-16 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14589164#comment-14589164 ] Jun Gong commented on YARN-3809: The checkstyle error is: YarnConfiguration.java: File

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-17 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14591093#comment-14591093 ] Jun Gong commented on YARN-3809: [~devaraj.k] and [~kasha], thank you for the comments and

[jira] [Created] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-15 Thread Jun Gong (JIRA)
Jun Gong created YARN-3809: -- Summary: Failed to launch new attempts because ApplicationMasterLauncher's threads all hang Key: YARN-3809 URL: https://issues.apache.org/jira/browse/YARN-3809 Project: Hadoop

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-16 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587555#comment-14587555 ] Jun Gong commented on YARN-3809: The stack is as follows: {noformat} 2015-06-15

[jira] [Commented] (YARN-3809) Failed to launch new attempts because ApplicationMasterLauncher's threads all hang

2015-06-15 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587474#comment-14587474 ] Jun Gong commented on YARN-3809: How about setting thread pool size in

[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

2015-05-27 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560679#comment-14560679 ] Jun Gong commented on YARN-3712: [~vinodkv] Our case: NM receives an event SHUTDOWN, and

[jira] [Commented] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

2015-05-27 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560632#comment-14560632 ] Jun Gong commented on YARN-3712: [~sidharta-s] [~ashahab] Thanks for the suggestion. I am

[jira] [Updated] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

2015-05-27 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3712: --- Priority: Minor (was: Major) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

[jira] [Assigned] (YARN-3644) Node manager shuts down if unable to connect with RM

2015-05-27 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong reassigned YARN-3644: -- Assignee: Jun Gong (was: Raju Bairishetti) Node manager shuts down if unable to connect with RM

[jira] [Updated] (YARN-3644) Node manager shuts down if unable to connect with RM

2015-05-27 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3644: --- Assignee: Raju Bairishetti (was: Jun Gong) Node manager shuts down if unable to connect with RM

[jira] [Commented] (YARN-3644) Node manager shuts down if unable to connect with RM

2015-05-27 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562210#comment-14562210 ] Jun Gong commented on YARN-3644: Sorry, by mistake... Node manager shuts down if unable

[jira] [Commented] (YARN-3644) Node manager shuts down if unable to connect with RM

2015-05-27 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3644?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14562209#comment-14562209 ] Jun Gong commented on YARN-3644: Sorry, by mistake... Node manager shuts down if unable

[jira] [Updated] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

2015-05-25 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3712: --- Attachment: YARN-3712.01.patch ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

[jira] [Created] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

2015-05-25 Thread Jun Gong (JIRA)
Jun Gong created YARN-3712: -- Summary: ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously Key: YARN-3712 URL: https://issues.apache.org/jira/browse/YARN-3712 Project: Hadoop YARN

[jira] [Updated] (YARN-3712) ContainersLauncher: handle event CLEANUP_CONTAINER asynchronously

2015-05-26 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-3712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-3712: --- Attachment: YARN-3712.02.patch Fix checkstyle warnings. ContainersLauncher: handle event CLEANUP_CONTAINER

[jira] [Updated] (YARN-4057) If ContainersMonitor is not enabled, only print related log info one time

2015-08-17 Thread Jun Gong (JIRA)
[ https://issues.apache.org/jira/browse/YARN-4057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Gong updated YARN-4057: --- Attachment: YARN-4057.01.patch If ContainersMonitor is not enabled, only print related log info one time
