[jira] [Commented] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

2019-12-05 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16988937#comment-16988937
 ] 

Eric Payne commented on YARN-9992:
--

Thanks [~jhung] , I verified that backporting YARN-9205 fixed the problem.

> Max allocation per queue is zero for custom resource types on RM startup
> 
>
> Key: YARN-9992
> URL: https://issues.apache.org/jira/browse/YARN-9992
> Project: Hadoop YARN
>  Issue Type: Bug
>Reporter: Jonathan Hung
>Assignee: Jonathan Hung
>Priority: Major
> Attachments: YARN-9992.001.patch
>
>
> Found an issue where GPU requests on a newly booted RM cannot be 
> scheduled. The RM throws the exception in 
> SchedulerUtils#throwInvalidResourceException:
> {noformat}
> throw new InvalidResourceRequestException(
> "Invalid resource request, requested resource type=[" + reqResourceName
> + "] < 0 or greater than maximum allowed allocation. Requested "
> + "resource=" + reqResource + ", maximum allowed allocation="
> + availableResource
> + ", please note that maximum allowed allocation is calculated "
> + "by scheduler based on maximum resource of registered "
> + "NodeManagers, which might be less than configured "
> + "maximum allocation="
> + ResourceUtils.getResourceTypesMaximumAllocation());{noformat}
> Upon refreshing scheduler (e.g. via refreshQueues), GPU scheduling works 
> again.
> I think the root cause is that upon scheduler refresh, resource-types.xml is loaded 
> in CapacitySchedulerConfiguration (as part of YARN-7738), so when we call 
> ResourceUtils#fetchMaximumAllocationFromConfig in 
> CapacitySchedulerConfiguration#getMaximumAllocationPerQueue, it's able to 
> fetch the {{yarn.resource-types}} config. But resource-types.xml is not 
> loaded into the conf in CapacityScheduler#initScheduler, so it doesn't find 
> the custom resource when computing max allocations, and the custom resource 
> max allocation is 0.
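
For readers following the root-cause description above, here is a minimal sketch of the failure mode. It is a simplified stand-in model, not Hadoop code: the class, the method names, and the "sketch.resource-types..." property pattern are all hypothetical, and a plain Map stands in for the scheduler configuration. Only the {{yarn.resource-types}} key comes from the description. It shows how a conf that never had the resource types loaded yields an effective maximum of 0 for the custom resource, so any positive request is rejected until a refresh loads them.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Simplified model of the bug; none of these names are real Hadoop APIs. */
final class MaxAllocationSketch {

  // Hypothetical property pattern, used only by this sketch.
  static String maxAllocationKey(String resource) {
    return "sketch.resource-types." + resource + ".maximum-allocation";
  }

  /**
   * Builds the per-resource maximum allocation from the resource types the
   * given conf declares. A resource the conf does not declare gets no entry.
   */
  static Map<String, Long> fetchMaximumAllocation(Map<String, String> conf) {
    Map<String, Long> max = new LinkedHashMap<>();
    String types = conf.getOrDefault("yarn.resource-types", "");
    for (String type : types.split(",")) {
      String name = type.trim();
      if (name.isEmpty()) {
        continue; // nothing declared
      }
      String value = conf.get(maxAllocationKey(name));
      // Assumption of this sketch: no explicit limit means "unbounded".
      max.put(name, value == null ? Long.MAX_VALUE : Long.parseLong(value));
    }
    return max;
  }

  /** Mirrors the check in the error message: reject if < 0 or above the max. */
  static boolean isValidRequest(
      Map<String, Long> maxAllocation, String resource, long requested) {
    // An unknown resource has an effective maximum of 0.
    long allowed = maxAllocation.getOrDefault(resource, 0L);
    return requested >= 0 && requested <= allowed;
  }

  public static void main(String[] args) {
    String gpu = "yarn.io/gpu";

    // Init path: resource types were never loaded into this conf.
    Map<String, String> initConf = new HashMap<>();

    // Refresh path: resource types are present in the conf.
    Map<String, String> refreshedConf = new HashMap<>();
    refreshedConf.put("yarn.resource-types", gpu);
    refreshedConf.put(maxAllocationKey(gpu), "4");

    System.out.println("on startup:    "
        + isValidRequest(fetchMaximumAllocation(initConf), gpu, 1));
    System.out.println("after refresh: "
        + isValidRequest(fetchMaximumAllocation(refreshedConf), gpu, 1));
  }
}
```

In this model the first check fails and the second passes, which matches the reported behavior: scheduling works only after the refresh path has seen the resource types.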



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org



[jira] [Commented] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

2019-12-03 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987270#comment-16987270
 ] 

Jonathan Hung commented on YARN-9992:
-

Hmm, not sure how I missed this before. I think it's related to YARN-9205; 
let me try porting that.




[jira] [Commented] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

2019-12-03 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987105#comment-16987105
 ] 

Eric Payne commented on YARN-9992:
--

The code changes look fine, but I'm still trying to understand what is 
different between trunk and branch-2. These code changes are not in trunk, but 
something is picking up the resource-types.xml in the CS init path.




[jira] [Commented] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

2019-12-02 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986454#comment-16986454
 ] 

Eric Payne commented on YARN-9992:
--

[~jhung], it looks like this is only a problem on branch-2 and branch-2.10. Is 
that your analysis as well?




[jira] [Commented] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

2019-12-02 Thread Eric Payne (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986312#comment-16986312
 ] 

Eric Payne commented on YARN-9992:
--

Thanks [~jhung] for reporting this issue and putting up a patch. I encountered 
this problem as well. I'll take a look at the patch soon.




[jira] [Commented] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

2019-11-26 Thread Hadoop QA (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983125#comment-16983125
 ] 

Hadoop QA commented on YARN-9992:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 47s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 14m 23s{color} | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 30s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  0m 28s{color} | {color:orange} hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager: The patch generated 1 new + 102 unchanged - 0 fixed = 103 total (was 102) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 13m 42s{color} | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 85m 29s{color} | {color:green} hadoop-yarn-server-resourcemanager in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 24s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}142m  3s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=19.03.5 Server=19.03.5 Image:yetus/hadoop:104ccca9169 |
| JIRA Issue | YARN-9992 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12986851/YARN-9992.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux dae86372c6c0 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ef950b0 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_222 |
| findbugs | v3.1.0-RC1 |
| checkstyle | https://builds.apache.org/job/PreCommit-YARN-Build/25225/artifact/out/diff-checkstyle-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt |
| Test Results | https://builds.apache.org/job/PreCommit-YARN-Build/25225/testReport/ |
| Max. process+th

[jira] [Commented] (YARN-9992) Max allocation per queue is zero for custom resource types on RM startup

2019-11-26 Thread Jonathan Hung (Jira)


[ 
https://issues.apache.org/jira/browse/YARN-9992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16983036#comment-16983036
 ] 

Jonathan Hung commented on YARN-9992:
-

Attached a simple one-liner: [^YARN-9992.001.patch]
