[jira] [Commented] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721422#comment-14721422 ]

Shiwei Guo commented on YARN-3933:
----------------------------------

I improved the patch and resubmitted it here; it seems OK to Hadoop QA. I deleted the old patch, which had a bad filename.

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>         Attachments: YARN-3933.001.patch
>
>
> In our cluster we are seeing available memory and cores being negative.
> Initial inspection:
> Scenario no. 1:
> In capacity scheduler the method allocateContainersToNode() checks if
> there are excess reservation of containers for an application, and they are
> no longer needed then it calls queue.completedContainer() which causes
> resources being negative. And they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess containers assignments ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
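The symptom described in the quoted issue (resources released for containers that "were never assigned in the first place") can be illustrated with a small sketch. This is not the attached YARN-3933.001.patch, whose contents are not shown in this thread. It is only one common way to make a completion call safe to repeat: release a container's resources only if that container is still recorded as live. The class name `ReleaseOnceSketch`, its methods, and the use of memory in MB as the only resource are all hypothetical, and the code assumes Java 8 or later.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical per-node accounting. It is NOT the real scheduler code;
 * it only shows a release that is applied at most once per container.
 */
final class ReleaseOnceSketch {
  // Containers currently counted against this node, with their memory in MB.
  private final Map<String, Integer> liveContainers = new HashMap<>();
  private int usedMemoryMb;

  /** Records an allocation; a repeated call for the same id is ignored. */
  synchronized void allocated(String containerId, int memoryMb) {
    if (liveContainers.putIfAbsent(containerId, memoryMb) == null) {
      usedMemoryMb += memoryMb;
    }
  }

  /**
   * Releases the container's memory only if it is still recorded as live.
   * Returns false for an unknown or already completed container, so a
   * second (racing) completion cannot push the counter below zero.
   */
  synchronized boolean completed(String containerId) {
    Integer memoryMb = liveContainers.remove(containerId);
    if (memoryMb == null) {
      return false;
    }
    usedMemoryMb -= memoryMb;
    return true;
  }

  synchronized int usedMemoryMb() {
    return usedMemoryMb;
  }
}
```

With this shape, completing the same container twice, or completing a reservation that was never turned into an allocation, leaves the counter unchanged instead of driving it negative.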
[jira] [Commented] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721418#comment-14721418 ]

Hadoop QA commented on YARN-3933:
---------------------------------

\\
\\
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 16m 21s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 2 new or modified test files. |
| {color:green}+1{color} | javac | 7m 45s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 48s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 50s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 0s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 27s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 36s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 28s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 53m 40s | Tests passed in hadoop-yarn-server-resourcemanager. |
| | | 92m 22s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12753166/YARN-3933.001.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 837fb75 |
| hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8944/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8944/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8944/console |

This message was automatically generated.

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>         Attachments: YARN-3933.001.patch
>
>
> In our cluster we are seeing available memory and cores being negative.
> Initial inspection:
> Scenario no. 1:
> In capacity scheduler the method allocateContainersToNode() checks if
> there are excess reservation of containers for an application, and they are
> no longer needed then it calls queue.completedContainer() which causes
> resources being negative. And they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess containers assignments ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shiwei Guo updated YARN-3933:
-----------------------------
    Attachment: YARN-3933.001.patch

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>         Attachments: YARN-3933.001.patch
>
>
> In our cluster we are seeing available memory and cores being negative.
> Initial inspection:
> Scenario no. 1:
> In capacity scheduler the method allocateContainersToNode() checks if
> there are excess reservation of containers for an application, and they are
> no longer needed then it calls queue.completedContainer() which causes
> resources being negative. And they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess containers assignments ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721290#comment-14721290 ]

Junping Du commented on YARN-3933:
----------------------------------

Sorry. I didn't see the patch you were mentioning. Did you delete it?

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.5.2
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>
> In our cluster we are seeing available memory and cores being negative.
> Initial inspection:
> Scenario no. 1:
> In capacity scheduler the method allocateContainersToNode() checks if
> there are excess reservation of containers for an application, and they are
> no longer needed then it calls queue.completedContainer() which causes
> resources being negative. And they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess containers assignments ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721163#comment-14721163 ]

Hadoop QA commented on YARN-4024:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 19m 7s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 41s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 45s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 2m 0s | The applied patch generated 1 new checkstyle issues (total was 211, now 211). |
| {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 30s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 33s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 4m 40s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 0m 23s | Tests passed in hadoop-yarn-api. |
| {color:green}+1{color} | yarn tests | 1m 58s | Tests passed in hadoop-yarn-common. |
| {color:green}+1{color} | yarn tests | 53m 33s | Tests passed in hadoop-yarn-server-resourcemanager. |
| | | 102m 19s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-yarn-server-resourcemanager |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12751913/YARN-4024-v7.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e2c9b28 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/8943/artifact/patchprocess/diffcheckstylehadoop-yarn-api.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-YARN-Build/8943/artifact/patchprocess/newPatchFindbugsWarningshadoop-yarn-server-resourcemanager.html |
| hadoop-yarn-api test log | https://builds.apache.org/job/PreCommit-YARN-Build/8943/artifact/patchprocess/testrun_hadoop-yarn-api.txt |
| hadoop-yarn-common test log | https://builds.apache.org/job/PreCommit-YARN-Build/8943/artifact/patchprocess/testrun_hadoop-yarn-common.txt |
| hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8943/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8943/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8943/console |

This message was automatically generated.

> YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
> ----------------------------------------------------------------------
>
>                 Key: YARN-4024
>                 URL: https://issues.apache.org/jira/browse/YARN-4024
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Hong Zhiguo
>         Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch,
> YARN-4024-draft.patch, YARN-4024-v4.patch, YARN-4024-v5.patch,
> YARN-4024-v6.patch, YARN-4024-v7.patch
>
>
> Currently, YARN RM NodesListManager will resolve IP address every time when
> node doing heartbeat. When DNS server becomes slow, NM heartbeat will be
> blocked and cannot make progress.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-4024) YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
[ https://issues.apache.org/jira/browse/YARN-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721144#comment-14721144 ]

Hong Zhiguo commented on YARN-4024:
-----------------------------------

Why doesn't Jenkins run against the latest patch?

> YARN RM should avoid unnecessary resolving IP when NMs doing heartbeat
> ----------------------------------------------------------------------
>
>                 Key: YARN-4024
>                 URL: https://issues.apache.org/jira/browse/YARN-4024
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Hong Zhiguo
>         Attachments: YARN-4024-draft-v2.patch, YARN-4024-draft-v3.patch,
> YARN-4024-draft.patch, YARN-4024-v4.patch, YARN-4024-v5.patch,
> YARN-4024-v6.patch, YARN-4024-v7.patch
>
>
> Currently, YARN RM NodesListManager will resolve IP address every time when
> node doing heartbeat. When DNS server becomes slow, NM heartbeat will be
> blocked and cannot make progress.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
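The problem statement quoted above (a DNS lookup on every NM heartbeat, which blocks when the DNS server is slow) can be illustrated with a small caching sketch. It is not taken from any of the YARN-4024 patches, whose contents are not shown in this thread. The class `CachingResolverSketch`, the TTL parameter, and the injected `lookup` and `clockMs` functions are all hypothetical; the lookup is injected so the example runs without a network. The code assumes Java 8 or later, although the QA runs above report Java 1.7.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical resolver that remembers host -> IP results for a fixed
 * time, so that repeated heartbeats from the same node do not each
 * trigger a (possibly slow) DNS lookup.
 */
final class CachingResolverSketch {
  private static final class Entry {
    final String ip;
    final long resolvedAtMs;

    Entry(String ip, long resolvedAtMs) {
      this.ip = ip;
      this.resolvedAtMs = resolvedAtMs;
    }
  }

  private final Map<String, Entry> cache = new ConcurrentHashMap<>();
  private final Function<String, String> lookup; // the real, slow resolution
  private final LongSupplier clockMs;            // injected clock for testing
  private final long ttlMs;

  CachingResolverSketch(Function<String, String> lookup,
                        LongSupplier clockMs, long ttlMs) {
    this.lookup = lookup;
    this.clockMs = clockMs;
    this.ttlMs = ttlMs;
  }

  /** Returns the cached IP while it is fresh; otherwise resolves again. */
  String resolve(String host) {
    long now = clockMs.getAsLong();
    Entry cached = cache.get(host);
    if (cached != null && now - cached.resolvedAtMs < ttlMs) {
      return cached.ip;
    }
    String ip = lookup.apply(host);
    cache.put(host, new Entry(ip, now));
    return ip;
  }
}
```

In this sketch an expired entry is still refreshed on the calling thread, so a slow DNS server can delay one heartbeat per TTL window rather than every heartbeat. Refreshing in a background thread would remove that remaining delay at the cost of more moving parts.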
[jira] [Updated] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shiwei Guo updated YARN-3933:
-----------------------------
    Attachment:     (was: YARN-3933.001.patch)

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.5.2
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>
> In our cluster we are seeing available memory and cores being negative.
> Initial inspection:
> Scenario no. 1:
> In capacity scheduler the method allocateContainersToNode() checks if
> there are excess reservation of containers for an application, and they are
> no longer needed then it calls queue.completedContainer() which causes
> resources being negative. And they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess containers assignments ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shiwei Guo updated YARN-3933:
-----------------------------
    Attachment: YARN-3933.001.patch

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.5.2
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>         Attachments: YARN-3933.001.patch
>
>
> In our cluster we are seeing available memory and cores being negative.
> Initial inspection:
> Scenario no. 1:
> In capacity scheduler the method allocateContainersToNode() checks if
> there are excess reservation of containers for an application, and they are
> no longer needed then it calls queue.completedContainer() which causes
> resources being negative. And they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess containers assignments ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.
[ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shiwei Guo updated YARN-3933:
-----------------------------
    Attachment:     (was: patch.BUGFIX-JIRA-YARN-3933.txt)

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.5.2
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>
> In our cluster we are seeing available memory and cores being negative.
> Initial inspection:
> Scenario no. 1:
> In capacity scheduler the method allocateContainersToNode() checks if
> there are excess reservation of containers for an application, and they are
> no longer needed then it calls queue.completedContainer() which causes
> resources being negative. And they were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate
> excess containers assignments ?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (YARN-3970) REST api support for Application Priority
[ https://issues.apache.org/jira/browse/YARN-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14721037#comment-14721037 ]

Hadoop QA commented on YARN-3970:
---------------------------------

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 15m 28s | Findbugs (version ) appears to be broken on trunk. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 1 new or modified test files. |
| {color:green}+1{color} | javac | 7m 42s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 9m 49s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 23s | The applied patch does not increase the total number of release audit warnings. |
| {color:green}+1{color} | checkstyle | 0m 29s | There were no new checkstyle issues. |
| {color:green}+1{color} | whitespace | 0m 2s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 1m 30s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | yarn tests | 54m 46s | Tests passed in hadoop-yarn-server-resourcemanager. |
| | | 92m 17s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12753127/YARN-3970.20150829-1.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / e2c9b28 |
| hadoop-yarn-server-resourcemanager test log | https://builds.apache.org/job/PreCommit-YARN-Build/8942/artifact/patchprocess/testrun_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/8942/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf905.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-YARN-Build/8942/console |

This message was automatically generated.

> REST api support for Application Priority
> -----------------------------------------
>
>                 Key: YARN-3970
>                 URL: https://issues.apache.org/jira/browse/YARN-3970
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: webapp
>    Affects Versions: 2.7.1
>            Reporter: Sunil G
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3970.20150828-1.patch, YARN-3970.20150829-1.patch
>
>
> REST api support for application priority.
> - get/set priority of an application
> - get default priority of a queue
> - get cluster max priority

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (YARN-3970) REST api support for Application Priority
[ https://issues.apache.org/jira/browse/YARN-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Naganarasimha G R updated YARN-3970:
------------------------------------
    Attachment: YARN-3970.20150829-1.patch

Thanks [~sunilg] for the review comments.

bq. priority.getPriority() != targetPriority.getPriority() We could use !priority.equals(targetPriority)

targetPriority is of type "org.apache.hadoop.yarn.server.resourcemanager.webapp.dao.AppPriority" and priority is of type "org.apache.hadoop.yarn.api.records.Priority", so they cannot be compared as you suggest.

bq. If app.getApplicationSubmissionContext().getPriority() is NULL, we will get an NPE here.

I went through the flow again, and I think I got the if clause wrong here. What I am trying to check is whether the target priority is the same as the current priority; if it is, I can just return success with the target priority. I also think I need to validate that the target priority is not null. With these corrections an NPE is not possible here, and I will add the other checks as follows:
{code}
if (targetPriority == null) {
  throw new YarnException("Target Priority cannot be null");
}
.
.
.
Priority priority = app.getApplicationSubmissionContext().getPriority();
if (priority == null
    || priority.getPriority() != targetPriority.getPriority()) {
  return modifyApplicationPriority(app, callerUGI,
      targetPriority.getPriority());
}
return Response.status(Status.OK).entity(targetPriority).build();
{code}

> REST api support for Application Priority
> -----------------------------------------
>
>                 Key: YARN-3970
>                 URL: https://issues.apache.org/jira/browse/YARN-3970
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: webapp
>    Affects Versions: 2.7.1
>            Reporter: Sunil G
>            Assignee: Naganarasimha G R
>         Attachments: YARN-3970.20150828-1.patch, YARN-3970.20150829-1.patch
>
>
> REST api support for application priority.
> - get/set priority of an application
> - get default priority of a queue
> - get cluster max priority

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
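The check discussed in the comment above can be tried out in isolation with a self-contained sketch. The two small classes below are hypothetical stand-ins for `org.apache.hadoop.yarn.api.records.Priority` and the webapp DAO `AppPriority`; they only carry an int so the example runs without Hadoop on the classpath. `PriorityUpdateCheck.needsUpdate` is also a made-up name, and it throws `IllegalArgumentException` where the comment's snippet throws `YarnException`. The sketch shows why the comparison is done on `getPriority()` values (the two types are unrelated, so `equals` between them would not compare the numbers) and how the null checks avoid the NPE.

```java
/** Hypothetical stand-in for org.apache.hadoop.yarn.api.records.Priority. */
final class PrioritySketch {
  private final int value;

  PrioritySketch(int value) {
    this.value = value;
  }

  int getPriority() {
    return value;
  }
}

/** Hypothetical stand-in for the webapp DAO type AppPriority. */
final class AppPrioritySketch {
  private final int value;

  AppPrioritySketch(int value) {
    this.value = value;
  }

  int getPriority() {
    return value;
  }
}

final class PriorityUpdateCheck {
  private PriorityUpdateCheck() {
  }

  /**
   * Returns true when the application's priority has to be modified:
   * either no priority is set yet, or it differs from the target.
   * A null target is rejected up front.
   */
  static boolean needsUpdate(PrioritySketch current, AppPrioritySketch target) {
    if (target == null) {
      throw new IllegalArgumentException("Target Priority cannot be null");
    }
    return current == null || current.getPriority() != target.getPriority();
  }
}
```

When `needsUpdate` returns false, the caller can answer the request with the target priority directly, which corresponds to the final `return Response.status(Status.OK)...` line in the snippet from the comment.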