[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472399#comment-16472399 ]

Szilard Nemeth commented on YARN-8248:
--------------------------------------

Thanks [~haibochen] for your answers.
It makes sense now why you wanted to remove the changes from Resources.
I also reduced the scope of the debug logs and kept only those that cover "edge" cases and come up rarely.
About the 3rd point: I checked the writeLock's scope, but it cannot be reduced, since the following 2 lines should be called while the writeLock is held:
{code:java}
RMApp rmApp = rmContext.getRMApps().get(applicationId);
FSLeafQueue queue = assignToQueue(rmApp, queueName, user);
{code}
Please check my updated patch!
Thanks!

> Job hangs when queue is specified and that queue has 0 capability of a resource
> --------------------------------------------------------------------------------
>
>                 Key: YARN-8248
>                 URL: https://issues.apache.org/jira/browse/YARN-8248
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: fairscheduler, yarn
>            Reporter: Szilard Nemeth
>            Assignee: Szilard Nemeth
>            Priority: Major
>         Attachments: YARN-8248-001.patch, YARN-8248-002.patch, YARN-8248-003.patch, YARN-8248-004.patch, YARN-8248-005.patch, YARN-8248-006.patch
>
>
> Job hangs when mapreduce.job.queuename is specified and the queue has 0 of any resource (vcores / memory / other)
> In this scenario, the job should be immediately rejected upon submission since the specified queue cannot serve the resource needs of the submitted job.
>
> Command to run:
> {code:java}
> bin/yarn jar "./share/hadoop/mapreduce/hadoop-mapreduce-examples-$MY_HADOOP_VERSION.jar" pi -Dmapreduce.job.queuename=sample_queue 1 1000;{code}
> fair-scheduler.xml queue config (excerpt):
>
> {code:java}
>
> 1 mb,0vcores
> 9 mb,0vcores
> 50
> -1.0f
> 2.0
> fair
>
> {code}
> Diagnostic message from the web UI:
> {code:java}
> Wed May 02 06:35:57 -0700 2018] Application is added to the scheduler and is not yet activated. (Resource request: exceeds current queue or its parents maximum resource allowed).{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
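The locking point in the comment above can be illustrated with a minimal sketch. It does not reproduce FairScheduler.addApplication(); the class QueueAssignmentSketch, its maps, and its method names are all hypothetical stand-ins. It only shows the general pattern of keeping a lookup and the mutation that depends on it inside one write-locked section, which is one plausible reason a lock scope cannot be narrowed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a scheduler; NOT the FairScheduler API. */
class QueueAssignmentSketch {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  // appId -> requested queue name (assumed to play the role of the RM app registry)
  private final Map<String, String> apps = new HashMap<>();
  // queue name -> number of applications assigned to it
  private final Map<String, Integer> queueSizes = new HashMap<>();

  void registerApp(String appId, String queueName) {
    lock.writeLock().lock();
    try {
      apps.put(appId, queueName);
    } finally {
      lock.writeLock().unlock();
    }
  }

  /** Returns the queue the app was assigned to, or null if the app is unknown. */
  String addApplication(String appId) {
    lock.writeLock().lock();
    try {
      // The lookup and the mutation that depends on it share one critical
      // section, so no other thread can change 'apps' in between.
      String queue = apps.get(appId);
      if (queue == null) {
        return null;
      }
      queueSizes.merge(queue, 1, Integer::sum);
      return queue;
    } finally {
      lock.writeLock().unlock();
    }
  }

  int appsInQueue(String queueName) {
    lock.readLock().lock();
    try {
      return queueSizes.getOrDefault(queueName, 0);
    } finally {
      lock.readLock().unlock();
    }
  }
}
```

If the lookup were moved outside the locked section, another thread could remove or change the entry before the assignment runs; whether that is the actual concern in the patch is an assumption here.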
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472335#comment-16472335 ]

Haibo Chen commented on YARN-8248:
----------------------------------

Thanks [~snemeth] for the clarification.
I think in general it's okay to fix some minor code/style issues around the same area while working on a patch. It becomes overhead, however, if the minor changes cause confusion or are too remote from your core change. Folks reading the commit history would also have questions unless they dig into the Jira discussion. Hence, I'd prefer, in this case, to leave Resources as is. If you find many cleanup issues, a separate patch is justifiable.
I agree with you that debug logs help debugging, especially on the unhappy/abnormal code path. However, if we add too much logging on the happy, hot code path, the logs will be flooded.
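The logging trade-off described above can be sketched as follows. This is only an illustration under stated assumptions: it uses java.util.logging so that it is self-contained (not the logging facade the project itself uses), and the class LoggingScopeSketch and the method tryAssign are hypothetical names that do not come from the patch.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical example; the names are not taken from the YARN-8248 patch. */
class LoggingScopeSketch {
  private static final Logger LOG =
      Logger.getLogger(LoggingScopeSketch.class.getName());

  /** Returns true if the request fits under the queue maximum. */
  static boolean tryAssign(int requestedVcores, int queueMaxVcores) {
    if (requestedVcores <= queueMaxVcores) {
      // Happy, hot path: runs for almost every request, so it stays silent.
      return true;
    }
    // Abnormal path: rare, and exactly where a debug message saves time.
    // The level check avoids building the string when debug output is off.
    if (LOG.isLoggable(Level.FINE)) {
      LOG.fine("Request of " + requestedVcores
          + " vcores exceeds queue maximum of " + queueMaxVcores);
    }
    return false;
  }
}
```

Keeping the message on the rejection branch means log volume scales with the number of abnormal requests rather than with total traffic.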
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16472258#comment-16472258 ]

Szilard Nemeth commented on YARN-8248:
--------------------------------------

Thanks [~haibochen] for your comments!
# I added some code to Resources in one of the earlier versions of this patch. When I later removed that code because it turned out to be unnecessary, I noticed the IDE flagged the continue statements as not required, so it's a simple code cleanup. If it is discouraged in general to touch anything other than the bare minimum needed to fix the issue, I will remove those changes. But I'm still curious what the correct way of working is if I spot a minor code fix like that later on. Creating a separate Jira task for it seems like overkill.
# Those debug logs are still not strictly necessary, but they helped me understand why the FS hangs, so in the end I kept them because I think they could save a lot of time for anyone who runs into an edge case like the one I fixed. If you don't agree with this, I can remove the logs.
# Good catch, I will fix this.

Please check 1 and 2 and decide how to go forward with those; the 3rd is rather trivial to fix.
Thanks!
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471309#comment-16471309 ]

Haibo Chen commented on YARN-8248:
----------------------------------

Thanks [~snemeth] for the patch. I have some questions.

1) Why the change in Resources.java? I don't see how it helps resolve the issue targeted in this jira.

2) There are many debug messages added to this patch. Again, are they necessary to solve this issue?

3) In FairScheduler.addApplication(), we are adding more code to the write lock. I think it is safe to reduce the write lock scope to just the mutation part.
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470210#comment-16470210 ]

Gergo Repas commented on YARN-8248:
-----------------------------------

[~snemeth] Thanks for the updated patch, LGTM. +1 (non-binding)
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16470013#comment-16470013 ]

genericqa commented on YARN-8248:
---------------------------------

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 24s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 23m 44s | trunk passed |
| +1 | compile | 7m 49s | trunk passed |
| +1 | checkstyle | 1m 5s | trunk passed |
| +1 | mvnsite | 1m 19s | trunk passed |
| +1 | shadedclient | 11m 10s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 19s | trunk passed |
| +1 | javadoc | 1m 5s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 10s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 13s | the patch passed |
| +1 | compile | 6m 45s | the patch passed |
| +1 | javac | 6m 45s | the patch passed |
| +1 | checkstyle | 1m 8s | the patch passed |
| +1 | mvnsite | 1m 18s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedclient | 9m 51s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 37s | the patch passed |
| +1 | javadoc | 1m 1s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 3m 7s | hadoop-yarn-common in the patch passed. |
| +1 | unit | 68m 29s | hadoop-yarn-server-resourcemanager in the patch passed. |
| +1 | asflicense | 0m 28s | The patch does not generate ASF License warnings. |
| | | 144m 29s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8248 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922783/YARN-8248-005.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 6567c5ba196a 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / cc0310a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20673/testReport/ |
| Max. process+thread count | 844 (vs. ulimit of 1
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469925#comment-16469925 ]

Szilard Nemeth commented on YARN-8248:
--------------------------------------

Patch 005 addresses the unit test failure and the last whitespace style check; at least, this patch should fix those issues.
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469549#comment-16469549 ]

genericqa commented on YARN-8248:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 13s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for branch |
| +1 | mvninstall | 25m 51s | trunk passed |
| +1 | compile | 8m 15s | trunk passed |
| +1 | checkstyle | 1m 25s | trunk passed |
| +1 | mvnsite | 1m 47s | trunk passed |
| +1 | shadedclient | 13m 57s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 3s | trunk passed |
| +1 | javadoc | 1m 28s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 29s | the patch passed |
| +1 | compile | 7m 16s | the patch passed |
| +1 | javac | 7m 16s | the patch passed |
| +1 | checkstyle | 1m 24s | the patch passed |
| +1 | mvnsite | 1m 38s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 11m 14s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 3m 2s | the patch passed |
| +1 | javadoc | 1m 21s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 3m 14s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 68m 28s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 40s | The patch does not generate ASF License warnings. |
| | | 154m 57s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8248 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922693/YARN-8248-004.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux acfac7dd87bc 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / af4fc2e |
| maven | version: Apache Maven 3.3.9 |
| Default
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469291#comment-16469291 ]

Szilard Nemeth commented on YARN-8248:
--------------------------------------

Hey [~grepas]!
Both are valid points. I was thinking about the pros and cons of logging at error level, but basically I agree: from the scheduler's point of view it's not an error, so I changed the level back to info.
I also fixed the remaining duplicated logging.
Please check the updated patch!
Thanks!
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469175#comment-16469175 ]

genericqa commented on YARN-8248:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 25s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || trunk Compile Tests ||
| 0 | mvndep | 1m 9s | Maven dependency ordering for branch |
| +1 | mvninstall | 25m 28s | trunk passed |
| +1 | compile | 8m 22s | trunk passed |
| +1 | checkstyle | 1m 23s | trunk passed |
| +1 | mvnsite | 1m 35s | trunk passed |
| +1 | shadedclient | 13m 12s | branch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 38s | trunk passed |
| +1 | javadoc | 1m 18s | trunk passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 19s | the patch passed |
| +1 | compile | 7m 15s | the patch passed |
| +1 | javac | 7m 15s | the patch passed |
| +1 | checkstyle | 1m 20s | the patch passed |
| +1 | mvnsite | 1m 31s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | shadedclient | 10m 56s | patch has no errors when building and testing our client artifacts. |
| +1 | findbugs | 2m 47s | the patch passed |
| +1 | javadoc | 1m 16s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 3m 7s | hadoop-yarn-common in the patch passed. |
| -1 | unit | 67m 20s | hadoop-yarn-server-resourcemanager in the patch failed. |
| +1 | asflicense | 0m 36s | The patch does not generate ASF License warnings. |
| | | 152m 3s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8248 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922643/YARN-8248-003.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux ad0fcc00d334 3.13.0-137-generic #186-Ubuntu SMP Mon Dec 4 19:09:19 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 343b51d |
| maven | version: Apache Maven 3.3.9 |
| Default
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468961#comment-16468961 ]

Gergo Repas commented on YARN-8248:

[~snemeth] Thanks for the new patch! I can see you introduced FairScheduler.rejectApplicationWithMessage(), which logs the rejection message at error level. However, I don't think rejections should be logged at error level, since a rejection is normal behaviour and doesn't indicate an error in the scheduler's operation. Also, in some cases you removed the logging before the rejectApplicationWithMessage() call and in others you kept it; I think it should be consistent.

> Job hangs when queue is specified and that queue has 0 capability of a resource
> ---
>
> Key: YARN-8248
> URL: https://issues.apache.org/jira/browse/YARN-8248
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, yarn
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-8248-001.patch, YARN-8248-002.patch, YARN-8248-003.patch
>
> Job hangs when mapreduce.job.queuename is specified and the queue has 0 of any resource (vcores / memory / other)
> In this scenario, the job should be immediately rejected upon submission since the specified queue cannot serve the resource needs of the submitted job.
>
> Command to run:
> {code:java}
> bin/yarn jar "./share/hadoop/mapreduce/hadoop-mapreduce-examples-$MY_HADOOP_VERSION.jar" pi -Dmapreduce.job.queuename=sample_queue 1 1000;{code}
> fair-scheduler.xml queue config (excerpt):
>
> {code:java}
>
> 1 mb,0vcores
> 9 mb,0vcores
> 50
> -1.0f
> 2.0
> fair
>
> {code}
> Diagnostic message from the web UI:
> {code:java}
> Wed May 02 06:35:57 -0700 2018] Application is added to the scheduler and is not yet activated. (Resource request: exceeds current queue or its parents maximum resource allowed).{code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
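The point about log levels in the comment above can be illustrated with a small sketch. This is not the code from the patch: the class name, the helper's signature, and the message format are hypothetical, and java.util.logging stands in for whatever logging API the scheduler actually uses. The only idea it shows is that an expected rejection is logged in one place, at an informational level rather than an error level.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Illustrative stand-in only: this is not the FairScheduler code from the patch.
final class RejectionLoggingSketch {
  private static final Logger LOG =
      Logger.getLogger(RejectionLoggingSketch.class.getName());

  /** Level used for expected rejections: INFO rather than SEVERE. */
  static Level rejectionLevel() {
    return Level.INFO;
  }

  /**
   * Hypothetical helper modelled on the rejectApplicationWithMessage() idea.
   * It is the single place where a rejection is logged, so callers do not
   * need their own log statement before invoking it.
   */
  static String rejectApplicationWithMessage(String applicationId, String reason) {
    String message = "Reject application " + applicationId + ": " + reason;
    LOG.log(rejectionLevel(), message);
    return message;
  }

  public static void main(String[] args) {
    rejectApplicationWithMessage("application_0001",
        "queue sample_queue has 0 vcores configured");
  }
}
```

Keeping the log call inside the helper also addresses the consistency remark: call sites either all log before calling it or none do.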
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16468911#comment-16468911 ]

Szilard Nemeth commented on YARN-8248:

Hi [~grepas]! Thanks for your comments. You have raised a very good point. I fixed the code accordingly and removed the method I had newly introduced in the Resources class. Please check the updated patch! Thanks!

> Job hangs when queue is specified and that queue has 0 capability of a resource
> ---
>
> Key: YARN-8248
> URL: https://issues.apache.org/jira/browse/YARN-8248
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, yarn
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-8248-001.patch, YARN-8248-002.patch, YARN-8248-003.patch
>
> Job hangs when mapreduce.job.queuename is specified and the queue has 0 of any resource (vcores / memory / other)
> In this scenario, the job should be immediately rejected upon submission since the specified queue cannot serve the resource needs of the submitted job.
>
> Command to run:
> {code:java}
> bin/yarn jar "./share/hadoop/mapreduce/hadoop-mapreduce-examples-$MY_HADOOP_VERSION.jar" pi -Dmapreduce.job.queuename=sample_queue 1 1000;{code}
> fair-scheduler.xml queue config (excerpt):
>
> {code:java}
>
> 1 mb,0vcores
> 9 mb,0vcores
> 50
> -1.0f
> 2.0
> fair
>
> {code}
> Diagnostic message from the web UI:
> {code:java}
> Wed May 02 06:35:57 -0700 2018] Application is added to the scheduler and is not yet activated. (Resource request: exceeds current queue or its parents maximum resource allowed).{code}
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16467091#comment-16467091 ]

Gergo Repas commented on YARN-8248:

[~snemeth] Thanks for working on this. In the Resources class the patch introduces a new method ({{hasAnyZeroRequestedResource()}}), which seems to be very specific to this use case. It may be worth checking whether you can achieve the same logic by using existing methods of this class (e.g. {{fitsIn()}}, {{isAnyMajorResourceZero()}}).

> Job hangs when queue is specified and that queue has 0 capability of a resource
> ---
>
> Key: YARN-8248
> URL: https://issues.apache.org/jira/browse/YARN-8248
> Project: Hadoop YARN
> Issue Type: Bug
> Components: fairscheduler, yarn
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-8248-001.patch, YARN-8248-002.patch
>
> Job hangs when mapreduce.job.queuename is specified and the queue has 0 of any resource (vcores / memory / other)
> In this scenario, the job should be immediately rejected upon submission since the specified queue cannot serve the resource needs of the submitted job.
>
> Command to run:
> {code:java}
> bin/yarn jar "./share/hadoop/mapreduce/hadoop-mapreduce-examples-$MY_HADOOP_VERSION.jar" pi -Dmapreduce.job.queuename=sample_queue 1 1000;{code}
> fair-scheduler.xml queue config (excerpt):
>
> {code:java}
>
> 1 mb,0vcores
> 9 mb,0vcores
> 50
> -1.0f
> 2.0
> fair
>
> {code}
> Diagnostic message from the web UI:
> {code:java}
> Wed May 02 06:35:57 -0700 2018] Application is added to the scheduler and is not yet activated. (Resource request: exceeds current queue or its parents maximum resource allowed).{code}
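To make the condition in the issue description concrete, here is a minimal plain-Java model of the check being discussed: reject a submission when it asks for a positive amount of a resource for which the target queue's maximum is zero. It deliberately does not use Hadoop's Resource or Resources classes; the class name, method name, map-based representation, and the request values are assumptions for illustration. Treating a resource that is absent from the queue's maximum as zero is also an assumption. Only the queue maximum of 9 mb and 0 vcores comes from the configuration excerpt.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java model of the check; it does not use Hadoop's Resource/Resources API.
final class ZeroCapacityCheckSketch {

  /**
   * Returns true when the request asks for a positive amount of some resource
   * whose maximum in the queue is zero. A resource missing from queueMax is
   * treated as zero (an assumption of this sketch).
   */
  static boolean requestsResourceQueueLacks(Map<String, Long> requested,
                                            Map<String, Long> queueMax) {
    for (Map.Entry<String, Long> entry : requested.entrySet()) {
      long max = queueMax.getOrDefault(entry.getKey(), 0L);
      if (entry.getValue() > 0 && max == 0) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    // Queue maximum taken from the excerpt: 9 mb, 0 vcores.
    Map<String, Long> queueMax = new LinkedHashMap<>();
    queueMax.put("memory-mb", 9L);
    queueMax.put("vcores", 0L);

    // Hypothetical request: any positive vcores demand can never be satisfied.
    Map<String, Long> request = new LinkedHashMap<>();
    request.put("memory-mb", 1L);
    request.put("vcores", 1L);

    System.out.println(requestsResourceQueueLacks(request, queueMax));
  }
}
```

Whether the same condition can be expressed with the existing methods named in the comment, instead of a dedicated helper, is exactly the question raised in the review.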
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465305#comment-16465305 ]

genericqa commented on YARN-8248:

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 32s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 28s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 54s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 11s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 11s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 67m 3s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 36s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}152m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8248 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922195/YARN-8248-002.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 210707dc5da0 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e9159db |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC1 |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/20610/testReport/ |
| Max. process+thread count | 882 (vs. ulimit of 100
[jira] [Commented] (YARN-8248) Job hangs when queue is specified and that queue has 0 capability of a resource
[ https://issues.apache.org/jira/browse/YARN-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465264#comment-16465264 ]

genericqa commented on YARN-8248:

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 25m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 8m 32s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 42s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 24s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 7m 15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange} 1m 23s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn: The patch generated 2 new + 251 unchanged - 0 fixed = 253 total (was 251) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s{color} | {color:red} The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 11m 23s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 20s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 3m 11s{color} | {color:green} hadoop-yarn-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 66m 29s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 35s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}152m 19s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:abb62dd |
| JIRA Issue | YARN-8248 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12922078/YARN-8248-001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 519eddf3b63c 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / e9159db |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0