[jira] [Updated] (YARN-10168) FS-CS Converter: tool doesn't handle min/max resource conversion correctly
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10168:
--------------------------------
    Attachment: YARN-10168-002.patch

> FS-CS Converter: tool doesn't handle min/max resource conversion correctly
> --------------------------------------------------------------------------
>
>                 Key: YARN-10168
>                 URL: https://issues.apache.org/jira/browse/YARN-10168
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Peter Bacsko
>            Priority: Blocker
>              Labels: fs2cs
>         Attachments: YARN-10168-001.patch, YARN-10168-002.patch
>
>
> Trying to understand logics of convert min and max resource from FS to CS,
> and found some issues:
> 1)
> In FSQueueConverter#emitMaximumCapacity
> Existing logic in FS is to either specify a maximum percentage for queues
> against cluster resources. Or, specify an absolute valued maximum resource.
> In the existing FS2CS converter, when a percentage-based maximum resource is
> specified, the converter takes a global resource from fs2cs CLI, and applies
> percentages to that. It is not correct since the percentage-based value will
> get lost, and in the future when cluster resources go up and down, the
> maximum resource cannot be changed.
> 2)
> The logic to deal with min/weight resource is also questionable:
> The existing fs2cs tool, it takes precedence of percentage over
> absoluteResource, and could set both to a queue config. See
> FSQueueConverter.Capacity#toString
> However, in CS, comparing to FS, the weights/min resource is quite different:
> CS use the same queue.capacity to specify both percentage-based or
> absolute-resource-based configs (Similar to how FS deal with maximum
> Resource).
> The capacity defines guaranteed resource, which also impact fairshare of the
> queue. (The more guaranteed resource a queue has, the larger "pie" the queue
> can get if there's any additional available resource).
> In FS, minResource defined the guaranteed resource, and weight defined how
> much the pie can grow to.
> So to me, in FS, we should pick and choose either weight or minResource to
> generate CS.
> 3)
> In FS, mix-use of absolute-resource configs (like min/maxResource), and
> percentage-based (like weight) is allowed. But in CS, it is not allowed. The
> reason is discussed on YARN-5881, and find [a]Should we support specifying a
> mix of percentage ...
> The existing fs2cs doesn't handle the issue, which could set mixed absolute
> resource and percentage-based resources.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
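Point 1) of the issue description quoted above can be illustrated with a small sketch. The function names and the cluster sizes below are hypothetical and are not taken from the fs2cs code; the sketch only shows why resolving a percentage against a one-time cluster size freezes the limit, while a percentage kept as a percentage follows the cluster.

```python
# Hypothetical illustration only, not the fs2cs implementation.

def resolve_once(percent, cluster_mb):
    """Turn a percentage into an absolute value at conversion time."""
    return cluster_mb * percent / 100.0


def limit_from_percentage(percent, current_cluster_mb):
    """A limit kept as a percentage is re-evaluated against the current cluster."""
    return current_cluster_mb * percent / 100.0


percent = 50.0
at_conversion_mb = 100_000   # assumed cluster memory given on the command line
after_scale_up_mb = 200_000  # assumed cluster memory some time later

# The absolute value computed during conversion no longer changes.
frozen = resolve_once(percent, at_conversion_mb)
# The percentage-based limit grows together with the cluster.
tracking = limit_from_percentage(percent, after_scale_up_mb)

print(frozen, tracking)
```

After the assumed scale-up, the frozen value still reflects the old cluster size, which is the information loss the description refers to.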
[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17053319#comment-17053319 ]

Peter Bacsko commented on YARN-9879:
------------------------------------

[~prabhujoseph] could you try it with this mapping rule?

{{u:%user:root.batch.%user}}

That is, you give the full path, not just the leaf queue name. Although I believe {{QueueManager.get()}} should be able to retrieve both.

> Allow multiple leaf queues with the same name in CS
> ---------------------------------------------------
>
>                 Key: YARN-9879
>                 URL: https://issues.apache.org/jira/browse/YARN-9879
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Gergely Pollak
>            Assignee: Gergely Pollak
>            Priority: Major
>              Labels: fs2cs
>         Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf,
> YARN-9879.POC001.patch, YARN-9879.POC002.patch, YARN-9879.POC003.patch,
> YARN-9879.POC004.patch, YARN-9879.POC005.patch, YARN-9879.POC006.patch,
> YARN-9879.POC007.patch, YARN-9879.POC008.patch, YARN-9879.POC009.patch,
> YARN-9879.POC010.patch, YARN-9879.POC011.patch
>
>
> Currently the leaf queue's name must be unique regardless of its position in
> the queue hierarchy.
> Design doc and first proposal is being made, I'll attach it as soon as it's
> done.
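To make the difference between a leaf name and a full path in the mapping rule above concrete, here is a rough sketch of how a rule of the form `u:<source>:<queue>` could be expanded for one user. `expand_user_mapping` is a hypothetical helper written for this illustration; it is not the Capacity Scheduler placement code and ignores every other rule type.

```python
# Hypothetical helper for illustration; not the CS queue placement logic.

def expand_user_mapping(rule, user):
    """Expand a 'u:<source>:<queue>' rule for the given user.

    Returns the target queue, or None when the rule does not apply
    to this user.
    """
    kind, source, queue = rule.split(":", 2)
    if kind != "u":
        raise ValueError("only user mappings (u:...) are handled in this sketch")
    if source not in ("%user", user):
        return None
    # %user in the target is replaced by the submitting user's name.
    return queue.replace("%user", user)


# Leaf name only versus the full path suggested in the comment.
print(expand_user_mapping("u:%user:%user", "alice"))
print(expand_user_mapping("u:%user:root.batch.%user", "alice"))
```

With the full path, the target stays unambiguous even when two leaf queues in different parts of the hierarchy share a name.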
[jira] [Updated] (YARN-10168) FS-CS Converter: tool doesn't handle min/max resource conversion correctly
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10168: Labels: fs2cs (was: )
[jira] [Updated] (YARN-10168) FS-CS Converter: tool doesn't handle min/max resource conversion correctly
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10168: Attachment: YARN-10168-001.patch
[jira] [Updated] (YARN-10168) FS-CS Converter: tool doesn't handle min/max resource conversion correctly
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10168: Summary: FS-CS Converter: tool doesn't handle min/max resource conversion correctly (was: FS-CS Convert: Converter tool doesn't handle min/max resource conversion correct)
[jira] [Commented] (YARN-10168) FS-CS Convert: Converter tool doesn't handle min/max resource conversion correct
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052333#comment-17052333 ]

Peter Bacsko commented on YARN-10168:
-------------------------------------

After offline discussion with [~leftnoteasy], I arrived at the following conclusion:
* drop support for {{}} because it might be defined in absolute resources and it cannot be mixed with percentages
* drop support for {{}} for the very same reason
* enhance the converter to emit a WARNING when these settings are defined
* always set {{maximum-capacity = 100}} for every queue

[~leftnoteasy] can you give the green light for the changes above?
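The conclusion in the comment above (stop converting certain settings, warn when they are present, and always emit a maximum capacity of 100) could look roughly like the sketch below. `emit_max_capacity` and the logger name are hypothetical, and the names of the dropped settings are passed in as a parameter because they are not legible in the message; only the `yarn.scheduler.capacity.<queue-path>.maximum-capacity` property name is the standard CS one.

```python
import logging

# Hypothetical sketch of the proposed behaviour, not the actual patch.
log = logging.getLogger("fs2cs-sketch")

PREFIX = "yarn.scheduler.capacity."


def emit_max_capacity(queue_path, fs_settings, unsupported):
    """Return the CS properties for one queue.

    queue_path  -- full queue path, e.g. "root.batch"
    fs_settings -- dict of FS settings defined on the queue
    unsupported -- names of FS settings that are no longer converted
    """
    for name in sorted(unsupported):
        if name in fs_settings:
            # The setting is dropped, so the user is told about it.
            log.warning("Queue %s: setting %s is not converted",
                        queue_path, name)
    # Every queue gets the same maximum capacity.
    return {PREFIX + queue_path + ".maximum-capacity": "100"}


# "some-fs-setting" is a placeholder name used only for this example.
props = emit_max_capacity("root.batch",
                          {"some-fs-setting": "4096 mb, 4 vcores"},
                          {"some-fs-setting"})
print(props)
```

Emitting a fixed percentage avoids the mix of absolute and percentage-based values that point 3) of the description warns about.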
[jira] [Comment Edited] (YARN-10168) FS-CS Convert: Converter tool doesn't handle min/max resource conversion correct
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17052333#comment-17052333 ]

Peter Bacsko edited comment on YARN-10168 at 3/5/20, 4:57 PM:
--------------------------------------------------------------

After offline discussion with [~leftnoteasy], I arrived at the following conclusion:
* drop support for {{}} because it might be defined in absolute resources and it cannot be mixed with percentages in CS
* drop support for {{}} for the very same reason
* enhance the converter to emit a WARNING when these settings are defined
* always set {{maximum-capacity = 100}} for every queue

[~leftnoteasy] can you give the green light for the changes above?


was (Author: pbacsko):
After offline discussion with [~leftnoteasy], I arrived at the following conclusion:
* drop support for {{}} because it might be defined in absolute resources and it cannot be mixed with percentages
* drop support for {{}} for the very same reason
* enhance the converter to emit a WARNING when these settings are defined
* always set {{maximum-capacity = 100}} for every queue

[~leftnoteasy] can you give the green light for the changes above?
[jira] [Assigned] (YARN-10168) FS-CS Convert: Converter tool doesn't handle min/max resource conversion correct
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko reassigned YARN-10168: Assignee: Peter Bacsko
[jira] [Commented] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17051313#comment-17051313 ]

Peter Bacsko commented on YARN-10167:
-------------------------------------

[~snemeth] could you review the latest v5 patch?

> FS-CS Converter: Need validate c-s.xml after converting
> -------------------------------------------------------
>
>                 Key: YARN-10167
>                 URL: https://issues.apache.org/jira/browse/YARN-10167
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Peter Bacsko
>            Priority: Major
>              Labels: fs2cs, newbie
>         Attachments: YARN-10167-001.patch, YARN-10167-002.patch,
> YARN-10167-003.patch, YARN-10167-004.patch, YARN-10167-005.patch
>
>
> Currently we just generated c-s.xml, but we didn't validate that. To make
> sure the c-s.xml is correct after conversion, it's better to initialize the
> CS scheduler using configs.
> Also, in the test, we should try to leverage MockRM to validate generated
> configs as much as we could.
[jira] [Updated] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10167: Attachment: YARN-10167-005.patch
[jira] [Updated] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10167: Attachment: YARN-10167-004.patch
[jira] [Updated] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10167: Attachment: YARN-10167-003.patch
[jira] [Updated] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10167: Attachment: YARN-10167-002.patch
[jira] [Updated] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10167: Attachment: YARN-10167-001.patch
[jira] [Assigned] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko reassigned YARN-10167: Assignee: Peter Bacsko
[jira] [Updated] (YARN-10175) FS-CS converter: only convert placement rules if a cmd line switch is defined
[ https://issues.apache.org/jira/browse/YARN-10175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10175: Attachment: YARN-10175-001.patch
[jira] [Created] (YARN-10175) FS-CS converter: only convert placement rules if a cmd line switch is defined
Peter Bacsko created YARN-10175:
-----------------------------------

             Summary: FS-CS converter: only convert placement rules if a cmd line switch is defined
                 Key: YARN-10175
                 URL: https://issues.apache.org/jira/browse/YARN-10175
             Project: Hadoop YARN
          Issue Type: Sub-task
            Reporter: Peter Bacsko
            Assignee: Peter Bacsko


In the current form, the conversion of FS placement rules to CS mapping rules has a lot of feature gaps and doesn't work properly.

The output is good as a starting point but sometimes it causes CS to throw an exception.

Until a proper resolution is implemented, it's better to disable this by default and introduce a command line switch.
[jira] [Comment Edited] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046466#comment-17046466 ]

Peter Bacsko edited comment on YARN-10167 at 2/27/20 11:01 AM:
---------------------------------------------------------------

[~sunilg] it's way too complicated. IMO we don't need to contact the RM for validation. As [~leftnoteasy] said, the cluster might be down.

Note that the converter itself already starts an FS instance inside to parse and load the allocation file. We can do the same thing with CS. Just load the converted config along with the delta {{yarn-site.xml}} (which essentially means that we merge the original site + the delta) and let's see if it can start. If not, we might have a problem and the configuration needs adjustments. Otherwise it's good (at least from a syntactic perspective).


was (Author: pbacsko):
[~sunilg] I think it's way too complicated. I don't think that we need to contact the RM for validation. As [~leftnoteasy] said, the cluster might be down.

Note that the converter itself already starts an FS instance inside to parse and load the allocation file. We can do the same thing with CS. Just load the converted config along with the delta {{yarn-site.xml}} (which essentially means that we merge the original site + the delta) and let's see if it can start. If not, we might have a problem and the configuration needs adjustments. Otherwise it's good (at least from a syntactic perspective).
[jira] [Comment Edited] (YARN-10168) FS-CS Convert: Converter tool doesn't handle min/max resource conversion correct
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046405#comment-17046405 ] Peter Bacsko edited comment on YARN-10168 at 2/27/20 9:33 AM: -- [~leftnoteasy] we really have to talk about this. _"In the existing FS2CS converter, when a percentage-based maximum resource is specified, the converter takes a global resource from fs2cs CLI, and applies percentages to that. It is not correct since the percentage-based value will get lost, and in the future when cluster resources go up and down, the maximum resource cannot be changed."_ That's true, but you can't define a vector of percentages to CS at the moment. That's why YARN-9936 was created, but unfortunately it hasn't been finished yet. So you can have only a single percentage. How do you deal with that if the input mem/vcore percentages are different? _"In FS, minResource defined the guaranteed resource, and weight defined how much the pie can grow to._ _So to me, in FS, we should pick and choose either weight or minResource to generate CS."_ {{}} is optional in FS. You don't always have it. The only thing that is mandatory is the weight. That's why weight was used as a starting point. _"In FS, mix-use of absolute-resource configs (like min/maxResource), and percentage-based (like weight) is allowed. But in CS, it is not allowed."_ That's weird. I was under the impression that for static queue configs, you can mix capacity and absolute resource. In this case, the verification of sum(caps) == 100.0 is skipped. So is this assumption false? was (Author: pbacsko): [~leftnoteasy] we really have to talk about this. _"In the existing FS2CS converter, when a percentage-based maximum resource is specified, the converter takes a global resource from fs2cs CLI, and applies percentages to that. 
It is not correct since the percentage-based value will get lost, and in the future when cluster resources go up and down, the maximum resource cannot be changed."_ That's true, but you can't define a vector of percentages to CS at the moment. That's why YARN-9936 was created, but unfortunately it hasn't been finished yet. So you can have only a single percentage. How do you deal with that if the input mem/vcore percentages are different? _"In FS, minResource defined the guaranteed resource, and weight defined how much the pie can grow to._ _So to me, in FS, we should pick and choose either weight or minResource to generate CS."_ {{}} is optional in FS. You don't always have it. The only thing that is mandatory is the weight. That's why weight was used as a starting point. _"In FS, mix-use of absolute-resource configs (like min/maxResource), and percentage-based (like weight) is allowed. But in CS, it is not allowed."_ That's weird. I was under the impression that for static queue configs, you can mix capacity and absolute resource. In this case, the verification of sum(caps) == 100.0 is skipped. So is this assumption false? > FS-CS Convert: Converter tool doesn't handle min/max resource conversion > correct > > > Key: YARN-10168 > URL: https://issues.apache.org/jira/browse/YARN-10168 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Blocker > > Trying to understand logics of convert min and max resource from FS to CS, > and found some issues: > 1) > In FSQueueConverter#emitMaximumCapacity > Existing logic in FS is to either specify a maximum percentage for queues > against cluster resources. Or, specify an absolute valued maximum resource. > In the existing FS2CS converter, when a percentage-based maximum resource is > specified, the converter takes a global resource from fs2cs CLI, and applies > percentages to that. 
It is not correct since the percentage-based value will > get lost, and in the future when cluster resources go up and down, the > maximum resource cannot be changed. > 2) > The logic to deal with min/weight resource is also questionable: > The existing fs2cs tool, it takes precedence of percentage over > absoluteResource, and could set both to a queue config. See > FSQueueConverter.Capacity#toString > However, in CS, comparing to FS, the weights/min resource is quite different: > CS use the same queue.capacity to specify both percentage-based or > absolute-resource-based configs (Similar to how FS deal with maximum > Resource). > The capacity defines guaranteed resource, which also impact fairshare of the > queue. (The more guaranteed resource a queue has, the larger "pie" the queue > can get if there's any additional available resource). > In FS, minResource defined the guaranteed resource, and weight defined how > much the pie can grow to. > So to me, in FS,
[jira] [Commented] (YARN-10168) FS-CS Convert: Converter tool doesn't handle min/max resource conversion correct
[ https://issues.apache.org/jira/browse/YARN-10168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046406#comment-17046406 ] Peter Bacsko commented on YARN-10168: - BTW is this really a blocker for the converter? > FS-CS Convert: Converter tool doesn't handle min/max resource conversion > correct > > > Key: YARN-10168 > URL: https://issues.apache.org/jira/browse/YARN-10168 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wangda Tan >Priority: Blocker > > Trying to understand logics of convert min and max resource from FS to CS, > and found some issues: > 1) > In FSQueueConverter#emitMaximumCapacity > Existing logic in FS is to either specify a maximum percentage for queues > against cluster resources. Or, specify an absolute valued maximum resource. > In the existing FS2CS converter, when a percentage-based maximum resource is > specified, the converter takes a global resource from fs2cs CLI, and applies > percentages to that. It is not correct since the percentage-based value will > get lost, and in the future when cluster resources go up and down, the > maximum resource cannot be changed. > 2) > The logic to deal with min/weight resource is also questionable: > The existing fs2cs tool, it takes precedence of percentage over > absoluteResource, and could set both to a queue config. See > FSQueueConverter.Capacity#toString > However, in CS, comparing to FS, the weights/min resource is quite different: > CS use the same queue.capacity to specify both percentage-based or > absolute-resource-based configs (Similar to how FS deal with maximum > Resource). > The capacity defines guaranteed resource, which also impact fairshare of the > queue. (The more guaranteed resource a queue has, the larger "pie" the queue > can get if there's any additional available resource). > In FS, minResource defined the guaranteed resource, and weight defined how > much the pie can grow to. 
> So to me, in FS, we should pick and choose either weight or minResource to > generate CS. > 3) > In FS, mix-use of absolute-resource configs (like min/maxResource), and > percentage-based (like weight) is allowed. But in CS, it is not allowed. The > reason is discussed on YARN-5881, and find [a]Should we support specifying a > mix of percentage ... > The existing fs2cs doesn't handle the issue, which could set mixed absolute > resource and percentage-based resources.
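Point 1 of the description can be illustrated with a minimal Java sketch. This is not the converter's code: the class `PercentageLoss`, the helper `toAbsolute` and the cluster sizes are assumptions made up for the example. It applies a percentage to a cluster size that is supplied once at conversion time, then shows that the resulting absolute value does not follow the cluster when it grows.

```java
public class PercentageLoss {
    // Hypothetical helper: turns a percentage-based maximum into an absolute
    // value against a cluster size that is only known at conversion time.
    public static long toAbsolute(double percent, long clusterTotal) {
        return Math.round(clusterTotal * percent / 100.0);
    }

    public static void main(String[] args) {
        // Assumed cluster memory (MB) handed to the tool when converting.
        long memAtConversion = 100_000L;
        long converted = toAbsolute(50.0, memAtConversion);

        // Assumed cluster memory after nodes were added later on.
        long memLater = 200_000L;
        long intended = toAbsolute(50.0, memLater);

        // The converted value is frozen; the intended 50% has moved on.
        System.out.println("converted max = " + converted);
        System.out.println("intended max  = " + intended);
    }
}
```

The gap between the two printed values is the information the description says "will get lost" once the percentage is replaced by an absolute resource.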
[jira] [Commented] (YARN-10167) FS-CS Converter: Need validate c-s.xml after converting
[ https://issues.apache.org/jira/browse/YARN-10167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17046312#comment-17046312 ] Peter Bacsko commented on YARN-10167: - I think the command can be part of {{yarn fs2cs}} tool, with a switch, like {{--validate-cs-config}} or whatever.
[jira] [Updated] (YARN-9893) Capacity scheduler: enhance leaf-queue-template capacity / maximum-capacity setting
[ https://issues.apache.org/jira/browse/YARN-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9893: --- Description: Capacity Scheduler does not support two percentage values for leaf queue capacity and maximum-capacity settings. So, you can't do something like this: {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=50.0%, vcores=50.0%}} Only a single percentage value is accepted. This makes it nearly impossible to properly convert a similar setting from Fair Scheduler, where such a configuration is valid and accepted ({{}}). Note: using absolute resources ({{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=16384, vcores=8}}) is addressed in YARN-10154. was: Capacity Scheduler does not support two percentage values for leaf queue capacity and maximum-capacity settings. So, you can't do something like this: {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=50.0%, vcores=50.0%}} On top of that, it's not even possible to define absolute resources: {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=16384, vcores=8}} Only a single percentage value is accepted. This makes it nearly impossible to properly convert a similar setting from Fair Scheduler, where such a configuration is valid and accepted ({{}}). > Capacity scheduler: enhance leaf-queue-template capacity / maximum-capacity > setting > --- > > Key: YARN-9893 > URL: https://issues.apache.org/jira/browse/YARN-9893 > Project: Hadoop YARN > Issue Type: Sub-task > Components: capacity scheduler >Reporter: Peter Bacsko >Assignee: Manikandan R >Priority: Major > > Capacity Scheduler does not support two percentage values for leaf queue > capacity and maximum-capacity settings. 
So, you can't do something like this: > {{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=50.0%, > vcores=50.0%}} > Only a single percentage value is accepted. > This makes it nearly impossible to properly convert a similar setting from > Fair Scheduler, where such a configuration is valid and accepted > ({{}}). > Note: using absolute resources > ({{yarn.scheduler.capacity.root.users.john.leaf-queue-template.capacity=memory-mb=16384, > vcores=8}}) is addressed in YARN-10154.
[jira] [Commented] (YARN-9893) Capacity scheduler: enhance leaf-queue-template capacity / maximum-capacity setting
[ https://issues.apache.org/jira/browse/YARN-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045721#comment-17045721 ] Peter Bacsko commented on YARN-9893: [~maniraj...@gmail.com] I'll re-phrase this JIRA and remove the absolute resource part.
[jira] [Created] (YARN-10158) FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms
Peter Bacsko created YARN-10158: --- Summary: FS-CS converter: convert property yarn.scheduler.fair.update-interval-ms Key: YARN-10158 URL: https://issues.apache.org/jira/browse/YARN-10158 Project: Hadoop YARN Issue Type: Sub-task Reporter: Peter Bacsko Assignee: Peter Bacsko
[jira] [Updated] (YARN-10135) FS-CS converter tool: issue warning on dynamic auto-create mapping rules
[ https://issues.apache.org/jira/browse/YARN-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10135:
    Attachment: YARN-10135-002.patch

> FS-CS converter tool: issue warning on dynamic auto-create mapping rules
>
> Key: YARN-10135
> URL: https://issues.apache.org/jira/browse/YARN-10135
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Labels: fs2cs
> Attachments: YARN-10135-001.patch, YARN-10135-002.patch
>
> The converter tool should issue a warning whenever the conversion results in
> mapping rules similar to these:
> {{u:%user:[managedParentQueueName].[queueName]}}
> {{u:%user:[managedParentQueueName].%user}}
> {{u:%user:[managedParentQueueName].%primary_group}}
> {{u:%user:[managedParentQueueName].%secondary_group}}
> {{u:%user:%primary_group.%user}}
> {{u:%user:%secondary_group.%user}}
> {{u:%user:[managedParentQueuePath].%user}}
>
> The reason is that right now it's not fully clear how we'll handle a case like
> "u:%user:%primary_group.%user", where "%primary_group.%user" might result in
> something like "users.john".
> In the case of "u:%user:[managedParentQueuePath].%user", the
> [managedParentQueuePath] is the result of a full path from Fair Scheduler.
> Therefore it's not going to be a leaf queue.
> The user might be required to do some fine tuning and adjust the property
> "auto-create-child-queues". We should display a warning about these
> additional steps.
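As a rough illustration of the check described in YARN-10135, the following Java sketch flags user mapping rules whose target has a "parent.leaf" shape, as in the examples listed there. It is only a coarse heuristic under stated assumptions: the class `MappingRuleWarnings`, its methods and the rule format "u:<source>:<target>" are hypothetical and are not taken from the attached patches.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class MappingRuleWarnings {
    // Assumption: a user mapping rule looks like "u:<source>:<target>", and a
    // target containing a dot ("<parent>.<leaf>") is what needs manual review.
    public static boolean needsWarning(String rule) {
        String[] parts = rule.split(":", 3);
        if (parts.length != 3 || !"u".equals(parts[0])) {
            return false;
        }
        return parts[2].contains(".");
    }

    // Returns only the rules for which a warning should be shown.
    public static List<String> rulesToWarnAbout(List<String> rules) {
        return rules.stream()
            .filter(MappingRuleWarnings::needsWarning)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> converted = Arrays.asList(
            "u:%user:%user",
            "u:%user:%primary_group.%user",
            "u:%user:root.users.%user");
        for (String rule : rulesToWarnAbout(converted)) {
            System.out.println("WARNING: review auto-create settings for " + rule);
        }
    }
}
```

A real implementation would probably look at the placement rule objects rather than the emitted strings; the string form is used here only to keep the sketch self-contained.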
[jira] [Updated] (YARN-10135) FS-CS converter tool: issue warning on dynamic auto-create mapping rules
[ https://issues.apache.org/jira/browse/YARN-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10135: Attachment: (was: YARN-10135-001.patch)
[jira] [Updated] (YARN-10135) FS-CS converter tool: issue warning on dynamic auto-create mapping rules
[ https://issues.apache.org/jira/browse/YARN-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10135: Attachment: YARN-10135-001.patch
[jira] [Updated] (YARN-10157) FS-CS converter: initPropertyActions() is not called without rules file
[ https://issues.apache.org/jira/browse/YARN-10157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10157: Attachment: YARN-10157-001.patch > FS-CS converter: initPropertyActions() is not called without rules file > --- > > Key: YARN-10157 > URL: https://issues.apache.org/jira/browse/YARN-10157 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10157-001.patch > > > The method {{FSConfigToCSConfigRuleHandler.initPropertyActions()}} should be > invoked even if we don't use the rule file. Otherwise the rule handler will > not initialize actions to WARNING.
[jira] [Created] (YARN-10157) FS-CS converter: initPropertyActions() is not called without rules file
Peter Bacsko created YARN-10157: --- Summary: FS-CS converter: initPropertyActions() is not called without rules file Key: YARN-10157 URL: https://issues.apache.org/jira/browse/YARN-10157 Project: Hadoop YARN Issue Type: Sub-task Reporter: Peter Bacsko Assignee: Peter Bacsko The method {{FSConfigToCSConfigRuleHandler.initPropertyActions()}} should be invoked even if we don't use the rule file. Otherwise the rule handler will not initialize actions to WARNING.
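The behaviour asked for in YARN-10157 can be sketched in Java as follows. This is not the code of `FSConfigToCSConfigRuleHandler`: the class `RuleHandlerSketch`, the `Action` values and the property names are assumptions chosen for the example. The point is only the ordering: defaults are set to WARNING unconditionally, and an optional rules file may override them afterwards.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

public class RuleHandlerSketch {
    // Assumed set of actions; only WARNING is named in the issue text.
    public enum Action { WARNING, ABORT }

    private final Map<String, Action> actions = new HashMap<>();

    // rules may be null, which models "no rules file was given".
    public RuleHandlerSketch(Properties rules, String... knownProperties) {
        // Always initialize defaults first, with or without a rules file.
        for (String property : knownProperties) {
            actions.put(property, Action.WARNING);
        }
        // Overrides from the rules file, if there is one.
        if (rules != null) {
            for (String name : rules.stringPropertyNames()) {
                String value = rules.getProperty(name).trim().toUpperCase(Locale.ROOT);
                actions.put(name, Action.valueOf(value));
            }
        }
    }

    public Action actionFor(String property) {
        return actions.get(property);
    }

    public static void main(String[] args) {
        // "example.property" is a made-up property name.
        RuleHandlerSketch handler = new RuleHandlerSketch(null, "example.property");
        System.out.println(handler.actionFor("example.property"));
    }
}
```

If the defaults were set only inside the `rules != null` branch, `actionFor` would return `null` when no file is supplied, which is the kind of uninitialized state the issue describes.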
[jira] [Updated] (YARN-10135) FS-CS converter tool: issue warning on dynamic auto-create mapping rules
[ https://issues.apache.org/jira/browse/YARN-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10135: Attachment: YARN-10135-001.patch
[jira] [Updated] (YARN-10147) FPGA plugin can't find the localized aocx file
[ https://issues.apache.org/jira/browse/YARN-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10147:
    Attachment: YARN-10147-003.patch

> FPGA plugin can't find the localized aocx file
>
> Key: YARN-10147
> URL: https://issues.apache.org/jira/browse/YARN-10147
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Attachments: YARN-10147-001.patch, YARN-10147-002.patch,
> YARN-10147-003.patch
>
> There's a bug in the FPGA plugin code that is intended to find the localized
> "aocx" file:
> {noformat}
> ...
>     if (localizedResources != null) {
>       Optional<Path> aocxPath = localizedResources
>           .keySet()
>           .stream()
>           .filter(path -> matchesIpid(path, id))
>           .findFirst();
>       if (aocxPath.isPresent()) {
>         ipFilePath = aocxPath.get().toUri().toString();
>         LOG.debug("Found: " + ipFilePath);
>       }
>     } else {
>       LOG.warn("Localized resource is null!");
>     }
>     return ipFilePath;
>   }
>
>   private boolean matchesIpid(Path p, String id) {
>     return p.getName().toLowerCase().equals(id.toLowerCase())
>         && p.getName().endsWith(".aocx");
>   }
> {noformat}
> The method {{matchesIpid()}} works incorrectly: the {{id}} argument is the
> expected filename, but without the extension. Therefore the {{equals()}}
> comparison will always be false.
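One possible way to correct the comparison is sketched below in Java. It is not necessarily what the attached patches do. To keep the example self-contained it takes the file name as a plain `String` (what `p.getName()` returns in the snippet above) instead of a Hadoop `Path`, and the class name `AocxMatcher` is an assumption. The idea is to strip the ".aocx" extension first and only then compare the remainder with the id.

```java
public class AocxMatcher {
    private static final String EXTENSION = ".aocx";

    // fileName: last component of a localized resource path.
    // id: expected IP id, i.e. the file name without the ".aocx" extension.
    public static boolean matchesIpid(String fileName, String id) {
        if (!fileName.endsWith(EXTENSION)) {
            return false;
        }
        // Compare only the part before the extension, ignoring case as the
        // original snippet does with toLowerCase().
        String baseName = fileName.substring(0, fileName.length() - EXTENSION.length());
        return baseName.equalsIgnoreCase(id);
    }

    public static void main(String[] args) {
        System.out.println(matchesIpid("MyKernel.aocx", "mykernel"));
        System.out.println(matchesIpid("mykernel.txt", "mykernel"));
    }
}
```

With the original logic the first call could never succeed, because "mykernel.aocx" is compared against "mykernel" as a whole.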
[jira] [Resolved] (YARN-10142) Distributed shell: add support for localization visibility
[ https://issues.apache.org/jira/browse/YARN-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko resolved YARN-10142.
    Resolution: Duplicate

> Distributed shell: add support for localization visibility
>
> Key: YARN-10142
> URL: https://issues.apache.org/jira/browse/YARN-10142
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: distributed-shell
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
>
> The localization is now hard coded in DistributedShell:
> {noformat}
> FileStatus scFileStatus = fs.getFileStatus(dst);
> LocalResource scRsrc =
>     LocalResource.newInstance(
>         URL.fromURI(dst.toUri()),
>         LocalResourceType.FILE, LocalResourceVisibility.APPLICATION,
>         scFileStatus.getLen(), scFileStatus.getModificationTime());
> localResources.put(fileDstPath, scRsrc);
> {noformat}
> However, sometimes it's useful if you have the possibility to change this to
> PRIVATE/PUBLIC for testing purposes.
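The request in YARN-10142 amounts to replacing the hard-coded `LocalResourceVisibility.APPLICATION` with a value chosen by the user. A minimal Java sketch of that idea follows; it is an illustration only. The class `VisibilityOption` and the way the value reaches `parse` are assumptions, and the nested enum is a local stand-in for the `LocalResourceVisibility` values named in the issue so that the example compiles without Hadoop on the classpath.

```java
import java.util.Locale;

public class VisibilityOption {
    // Local stand-in for the three visibility values mentioned in the issue.
    public enum Visibility { PUBLIC, PRIVATE, APPLICATION }

    // Parses a user-supplied value; falls back to APPLICATION (the currently
    // hard-coded behaviour) when nothing was given.
    public static Visibility parse(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Visibility.APPLICATION;
        }
        // Throws IllegalArgumentException for unknown values.
        return Visibility.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String requested = args.length > 0 ? args[0] : null;
        System.out.println("visibility = " + parse(requested));
    }
}
```

Keeping APPLICATION as the default means existing invocations behave as before, while tests can ask for PRIVATE or PUBLIC explicitly.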
[jira] [Commented] (YARN-10142) Distributed shell: add support for localization visibility
[ https://issues.apache.org/jira/browse/YARN-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039736#comment-17039736 ] Peter Bacsko commented on YARN-10142: - Yes, indeed. Closing this as duplicate.
[jira] [Commented] (YARN-10147) FPGA plugin can't find the localized aocx file
[ https://issues.apache.org/jira/browse/YARN-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17039194#comment-17039194 ] Peter Bacsko commented on YARN-10147: - [~snemeth] please review & commit this JIRA. > FPGA plugin can't find the localized aocx file > -- > > Key: YARN-10147 > URL: https://issues.apache.org/jira/browse/YARN-10147 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10147-001.patch, YARN-10147-002.patch > > > There's a bug in the FPGA plugin code which is intended to find the localized > "aocx" file: > {noformat} > ... > if (localizedResources != null) { > Optional<Path> aocxPath = localizedResources > .keySet() > .stream() > .filter(path -> matchesIpid(path, id)) > .findFirst(); > if (aocxPath.isPresent()) { > ipFilePath = aocxPath.get().toUri().toString(); > LOG.debug("Found: " + ipFilePath); > } > } else { > LOG.warn("Localized resource is null!"); > } > return ipFilePath; > } > private boolean matchesIpid(Path p, String id) { > return p.getName().toLowerCase().equals(id.toLowerCase()) > && p.getName().endsWith(".aocx"); > } > {noformat} > The method {{matchesIpid()}} works incorrectly: the {{id}} argument is the > expected filename, but without the extension. Therefore the {{equals()}} > comparison will always be false.
[jira] [Updated] (YARN-10147) FPGA plugin can't find the localized aocx file
[ https://issues.apache.org/jira/browse/YARN-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10147: Attachment: YARN-10147-002.patch > FPGA plugin can't find the localized aocx file > -- > > Key: YARN-10147 > URL: https://issues.apache.org/jira/browse/YARN-10147 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10147-001.patch, YARN-10147-002.patch > > > There's a bug in the FPGA plugin code which is intended to find the localized > "aocx" file: > {noformat} > ... > if (localizedResources != null) { > Optional<Path> aocxPath = localizedResources > .keySet() > .stream() > .filter(path -> matchesIpid(path, id)) > .findFirst(); > if (aocxPath.isPresent()) { > ipFilePath = aocxPath.get().toUri().toString(); > LOG.debug("Found: " + ipFilePath); > } > } else { > LOG.warn("Localized resource is null!"); > } > return ipFilePath; > } > private boolean matchesIpid(Path p, String id) { > return p.getName().toLowerCase().equals(id.toLowerCase()) > && p.getName().endsWith(".aocx"); > } > {noformat} > The method {{matchesIpid()}} works incorrectly: the {{id}} argument is the > expected filename, but without the extension. Therefore the {{equals()}} > comparison will always be false.
[jira] [Updated] (YARN-10147) FPGA plugin can't find the localized aocx file
[ https://issues.apache.org/jira/browse/YARN-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10147: Attachment: YARN-10147-001.patch > FPGA plugin can't find the localized aocx file > -- > > Key: YARN-10147 > URL: https://issues.apache.org/jira/browse/YARN-10147 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10147-001.patch > > > There's a bug in the FPGA plugin code which is intended to find the localized > "aocx" file: > {noformat} > ... > if (localizedResources != null) { > Optional<Path> aocxPath = localizedResources > .keySet() > .stream() > .filter(path -> matchesIpid(path, id)) > .findFirst(); > if (aocxPath.isPresent()) { > ipFilePath = aocxPath.get().toUri().toString(); > LOG.debug("Found: " + ipFilePath); > } > } else { > LOG.warn("Localized resource is null!"); > } > return ipFilePath; > } > private boolean matchesIpid(Path p, String id) { > return p.getName().toLowerCase().equals(id.toLowerCase()) > && p.getName().endsWith(".aocx"); > } > {noformat} > The method {{matchesIpid()}} works incorrectly: the {{id}} argument is the > expected filename, but without the extension. Therefore the {{equals()}} > comparison will always be false.
[jira] [Created] (YARN-10147) FPGA plugin can't find the localized aocx file
Peter Bacsko created YARN-10147: --- Summary: FPGA plugin can't find the localized aocx file Key: YARN-10147 URL: https://issues.apache.org/jira/browse/YARN-10147 Project: Hadoop YARN Issue Type: Bug Components: nodemanager Reporter: Peter Bacsko Assignee: Peter Bacsko There's a bug in the FPGA plugin code which is intended to find the localized "aocx" file: {noformat} ... if (localizedResources != null) { Optional<Path> aocxPath = localizedResources .keySet() .stream() .filter(path -> matchesIpid(path, id)) .findFirst(); if (aocxPath.isPresent()) { ipFilePath = aocxPath.get().toUri().toString(); LOG.debug("Found: " + ipFilePath); } } else { LOG.warn("Localized resource is null!"); } return ipFilePath; } private boolean matchesIpid(Path p, String id) { return p.getName().toLowerCase().equals(id.toLowerCase()) && p.getName().endsWith(".aocx"); } {noformat} The method {{matchesIpid()}} works incorrectly: the {{id}} argument is the expected filename, but without the extension. Therefore the {{equals()}} comparison will always be false.
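A minimal sketch of one possible correction to the comparison described in YARN-10147, assuming the intended behaviour is to append the extension to the id before comparing. It works on plain file-name strings instead of Hadoop's Path so that it stays self-contained; the class name AocxMatcher and the method matchesIpidOriginal are hypothetical, and this is not necessarily what the attached patches do. Unlike the quoted code, it uses Locale.ROOT and treats the extension case-insensitively as well.

```java
import java.util.Locale;

final class AocxMatcher {
  private static final String EXTENSION = ".aocx";

  // The id is the expected file name WITHOUT the extension, so the
  // extension is appended before the case-insensitive comparison.
  static boolean matchesIpid(String fileName, String id) {
    String expected = id.toLowerCase(Locale.ROOT) + EXTENSION;
    return fileName.toLowerCase(Locale.ROOT).equals(expected);
  }

  // The comparison as quoted in the issue description, kept for contrast:
  // a name that equals the bare id cannot also end with ".aocx".
  static boolean matchesIpidOriginal(String fileName, String id) {
    return fileName.toLowerCase(Locale.ROOT).equals(id.toLowerCase(Locale.ROOT))
        && fileName.endsWith(EXTENSION);
  }
}
```

With this sketch, a localized file named "kernel.aocx" matches the id "kernel", while the quoted comparison rejects it.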
[jira] [Created] (YARN-10142) Distributed shell: add support for localization visibility
Peter Bacsko created YARN-10142: --- Summary: Distributed shell: add support for localization visibility Key: YARN-10142 URL: https://issues.apache.org/jira/browse/YARN-10142 Project: Hadoop YARN Issue Type: Improvement Reporter: Peter Bacsko Assignee: Peter Bacsko The localization is now hard coded in DistributedShell: {noformat} FileStatus scFileStatus = fs.getFileStatus(dst); LocalResource scRsrc = LocalResource.newInstance( URL.fromURI(dst.toUri()), LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, scFileStatus.getLen(), scFileStatus.getModificationTime()); localResources.put(fileDstPath, scRsrc); {noformat} However, sometimes it's useful if you have the possibility to change this to PRIVATE/PUBLIC for testing purposes.
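A rough sketch of how a user-supplied visibility value could be turned into a setting, with APPLICATION kept as the default when nothing is given. The local enum Visibility only mirrors the three names mentioned in YARN-10142 so that the example is self-contained; real code would presumably use Hadoop's LocalResourceVisibility. The class name VisibilityOption is hypothetical, and no particular command-line option name is assumed.

```java
import java.util.Locale;

final class VisibilityOption {
  // Stand-in for the PUBLIC/PRIVATE/APPLICATION values named in the issue.
  enum Visibility { PUBLIC, PRIVATE, APPLICATION }

  // Returns APPLICATION (the currently hard-coded value) when no value is
  // supplied; otherwise parses the value case-insensitively.
  static Visibility parse(String value) {
    if (value == null || value.trim().isEmpty()) {
      return Visibility.APPLICATION;
    }
    try {
      return Visibility.valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException(
          "Unknown localization visibility: " + value, e);
    }
  }
}
```

Failing fast on an unknown value seems preferable for a testing tool, since silently falling back to APPLICATION could hide a typo.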
[jira] [Updated] (YARN-10142) Distributed shell: add support for localization visibility
[ https://issues.apache.org/jira/browse/YARN-10142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10142: Component/s: distributed-shell > Distributed shell: add support for localization visibility > -- > > Key: YARN-10142 > URL: https://issues.apache.org/jira/browse/YARN-10142 > Project: Hadoop YARN > Issue Type: Improvement > Components: distributed-shell >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > > The localization is now hard coded in DistributedShell: > {noformat} > FileStatus scFileStatus = fs.getFileStatus(dst); > LocalResource scRsrc = > LocalResource.newInstance( > URL.fromURI(dst.toUri()), > LocalResourceType.FILE, LocalResourceVisibility.APPLICATION, > scFileStatus.getLen(), scFileStatus.getModificationTime()); > localResources.put(fileDstPath, scRsrc); > {noformat} > However, sometimes it's useful if you have the possibility to change this to > PRIVATE/PUBLIC for testing purposes.
[jira] [Updated] (YARN-10135) FS-CS converter tool: issue warning on dynamic auto-create mapping rules
[ https://issues.apache.org/jira/browse/YARN-10135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10135: Labels: fs2cs (was: ) > FS-CS converter tool: issue warning on dynamic auto-create mapping rules > > > Key: YARN-10135 > URL: https://issues.apache.org/jira/browse/YARN-10135 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > > The converter tool should issue a warning whenever the conversion results in > mapping rules similar to these: > {{u:%user:[managedParentQueueName].[queueName]}} > {{u:%user:[managedParentQueueName].%user}} > {{u:%user:[managedParentQueueName].%primary_group}} > {{u:%user:[managedParentQueueName].%secondary_group}} > {{u:%user:%primary_group.%user}} > {{u:%user:%secondary_group.%user}} > {{u:%user:[managedParentQueuePath].%user}} > > The reason is that right now it's not fully clear how we'll handle a case like > "u:%user:%primary_group.%user", where "%primary_group.%user" might result in > something like "users.john". > In the case of "u:%user:[managedParentQueuePath].%user", the > [managedParentQueuePath] is the result of a full path from Fair Scheduler. > Therefore it's not going to be a leaf queue. > The user might be required to do some fine-tuning and adjust the property > "auto-create-child-queues". We should display a warning about these > additional steps.
[jira] [Created] (YARN-10135) FS-CS converter tool: issue warning on dynamic auto-create mapping rules
Peter Bacsko created YARN-10135: --- Summary: FS-CS converter tool: issue warning on dynamic auto-create mapping rules Key: YARN-10135 URL: https://issues.apache.org/jira/browse/YARN-10135 Project: Hadoop YARN Issue Type: Sub-task Reporter: Peter Bacsko Assignee: Peter Bacsko The converter tool should issue a warning whenever the conversion results in mapping rules similar to these: {{u:%user:[managedParentQueueName].[queueName]}} {{u:%user:[managedParentQueueName].%user}} {{u:%user:[managedParentQueueName].%primary_group}} {{u:%user:[managedParentQueueName].%secondary_group}} {{u:%user:%primary_group.%user}} {{u:%user:%secondary_group.%user}} {{u:%user:[managedParentQueuePath].%user}} The reason is that right now it's not fully clear how we'll handle a case like "u:%user:%primary_group.%user", where "%primary_group.%user" might result in something like "users.john". In the case of "u:%user:[managedParentQueuePath].%user", the [managedParentQueuePath] is the result of a full path from Fair Scheduler. Therefore it's not going to be a leaf queue. The user might be required to do some fine-tuning and adjust the property "auto-create-child-queues". We should display a warning about these additional steps.
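The check requested in YARN-10135 could be sketched roughly as follows. This is a simplified illustration, not the converter's code: it flags a user mapping rule only when its target has a parent.leaf form and either part is a placeholder such as %user or %primary_group. The fully static case ({{u:%user:[managedParentQueueName].[queueName]}}) is not covered, because detecting it would need to know which queues are managed parents. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class MappingRuleWarnings {
  // True when the rule looks like u:<source>:<parent>.<leaf> and the parent
  // or the leaf is a placeholder (starts with '%').
  static boolean needsWarning(String rule) {
    String[] parts = rule.split(":", 3);
    if (parts.length != 3 || !parts[0].equals("u")) {
      return false;
    }
    String queue = parts[2];
    int lastDot = queue.lastIndexOf('.');
    if (lastDot < 0) {
      return false; // flat target such as %user, no parent involved
    }
    String parent = queue.substring(0, lastDot);
    String leaf = queue.substring(lastDot + 1);
    return parent.startsWith("%") || leaf.startsWith("%");
  }

  // Splits a semicolon-separated mapping list and collects the rules
  // that would deserve a warning.
  static List<String> rulesNeedingWarning(String queueMappings) {
    List<String> flagged = new ArrayList<>();
    for (String rule : queueMappings.split(";")) {
      String trimmed = rule.trim();
      if (needsWarning(trimmed)) {
        flagged.add(trimmed);
      }
    }
    return flagged;
  }
}
```

A converter could log one warning per flagged rule and point the user at the auto-create settings mentioned in the description.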
[jira] [Commented] (YARN-10130) Do not allow output dir to be the same as input dir
[ https://issues.apache.org/jira/browse/YARN-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17035399#comment-17035399 ] Peter Bacsko commented on YARN-10130: - [~adam.antal] I'd also check the existence of {{yarn-site.xml}} and {{capacity-scheduler.xml}} and try to create them exclusively. > Do not allow output dir to be the same as input dir > --- > > Key: YARN-10130 > URL: https://issues.apache.org/jira/browse/YARN-10130 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Adam Antal >Priority: Major > Attachments: YARN-10130.001.patch > > > If the input dir where fair-scheduler.xml / yarn-site.xml sit is the same as > the output dir (defined by the -o switch), the fs2cs tool overwrites the > source config files, i.e. yarn-site.xml. > Reproducing this is easy: just run the fs2cs tool with this command: > {code:java} > /bin/yarn fs2cs --cluster-resource memory-mb=18044928,vcores=16 > --no-terminal-rule-check -y yarn-site.xml -f fair-scheduler.xml -o . > {code} > The following (or similar) is emitted by the tool: > {code:java} > WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of > YARN_OPTS.WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. 
Using value of > YARN_OPTS.20/02/10 12:51:42 INFO converter.FSConfigToCSConfigConverter: > Output directory for yarn-site.xml and capacity-scheduler.xml is: .20/02/10 > 12:51:42 INFO converter.FSConfigToCSConfigConverter: Conversion rules file is > not defined, using default conversion config!20/02/10 12:51:42 ERROR > conf.Configuration: error parsing conf > yarn-site.xmlcom.ctc.wstx.exc.WstxEOFException: Unexpected EOF in prolog at > [row,col,system-id]: [1,0,"yarn-site.xml"] at > com.ctc.wstx.sr.StreamScanner.throwUnexpectedEOF(StreamScanner.java:687) at > com.ctc.wstx.sr.BasicStreamReader.handleEOF(BasicStreamReader.java:2220) at > com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2126) > at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1181) at > org.apache.hadoop.conf.Configuration$Parser.parseNext(Configuration.java:3343) > at > org.apache.hadoop.conf.Configuration$Parser.parse(Configuration.java:3137) at > org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3030) at > org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2996) > at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2871) at > org.apache.hadoop.conf.Configuration.set(Configuration.java:1389) at > org.apache.hadoop.conf.Configuration.set(Configuration.java:1361) at > org.apache.hadoop.conf.Configuration.setBoolean(Configuration.java:1702) at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigConverter.createConfiguration(FSConfigToCSConfigConverter.java:166) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigConverter.convert(FSConfigToCSConfigConverter.java:98) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigArgumentHandler.parseAndConvert(FSConfigToCSConfigArgumentHandler.java:137) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigConverterMain.main(FSConfigToCSConfigConverterMain.java:40)20/02/10 > 12:51:42 ERROR converter.FSConfigToCSConfigConverterMain: Error while > starting FS configuration conversion! > {code}
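The suggestion in the comment on YARN-10130 (create the output files exclusively) can be sketched with the standard java.nio.file API. This is only an illustration under assumptions: the class name ExclusiveOutput and its methods are hypothetical, and the real converter may write its output through a different API. StandardOpenOption.CREATE_NEW makes the open fail with FileAlreadyExistsException if the file already exists, so an existing yarn-site.xml would not be truncated.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ExclusiveOutput {
  // Opens outputDir/fileName for writing only if it does not exist yet.
  // Throws java.nio.file.FileAlreadyExistsException otherwise.
  static OutputStream openExclusively(Path outputDir, String fileName)
      throws IOException {
    Path target = outputDir.resolve(fileName);
    return Files.newOutputStream(
        target, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
  }

  // True when both paths refer to the same existing directory, even if
  // they are spelled differently (for example "." versus an absolute path).
  static boolean sameDirectory(Path inputDir, Path outputDir)
      throws IOException {
    return Files.isSameFile(inputDir, outputDir);
  }
}
```

Checking sameDirectory up front gives a clear error message, while the exclusive create acts as a second line of defence if the check is bypassed.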
[jira] [Commented] (YARN-10127) FSQueueConverter should not set App Ordering Policy to Parent Queue
[ https://issues.apache.org/jira/browse/YARN-10127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034467#comment-17034467 ] Peter Bacsko commented on YARN-10127: - Thanks for the explanation [~prabhujoseph] - looks like the solution is to only emit this property for leaf queues. Otherwise skip it. Uploaded patch v1. > FSQueueConverter should not set App Ordering Policy to Parent Queue > --- > > Key: YARN-10127 > URL: https://issues.apache.org/jira/browse/YARN-10127 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10127-001.patch > > > FSQueueConverter should not set App Ordering Policy (fair, fifo) to Parent > Queue. RM will fail to start if Parent Queue is set with App Ordering Policy. > {code} > Error starting ResourceManager > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Unable to construct > queue ordering policy=fair queue=root > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueOrderingPolicy(CapacitySchedulerConfiguration.java:1584) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.setupQueueConfigs(ParentQueue.java:145) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.(ParentQueue.java:112) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractManagedParentQueue.(AbstractManagedParentQueue.java:51) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ManagedParentQueue.(ManagedParentQueue.java:56) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.parseQueue(CapacitySchedulerQueueManager.java:272) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.initializeQueues(CapacitySchedulerQueueManager.java:158) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:751) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534) > > {code} > Input fair-scheduler.xml: > {code} > [yarn@mradha-s1-1 /]$ cat /tmp/fair-scheduler.xml > > > > fair > > fair > > > fair > > > > > > > > > > > {code} > Command Used: > {code} > yarn fs2cs -t -f /tmp/fair-scheduler.xml -y > /var/run/cloudera-scm-agent/process/11-yarn-RESOURCEMANAGER/yarn-site.xml -o > /tmp/CS > {code} > Output capacity-scheduler.xml > {code} > > yarn.scheduler.capacity.root.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.root.users.capacity50.000falseprogrammatically > yarn.scheduler.capacity.root.queuesdefault,usersfalseprogrammatically > yarn.scheduler.capacity.queue-mappings-override.enablefalsefalseprogrammatically > yarn.scheduler.capacity.root.default.capacity50.000falseprogrammatically > 
yarn.scheduler.capacity.root.default.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.maximum-am-resource-percent0.5falseprogrammatically > yarn.scheduler.capacity.root.users.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.root.default.ordering-policyfairfalseprogrammatically > yarn.scheduler.capacity.queue-mappingsu:%user:%user;u:%user:root.users.%user;u:%user:root.defaultfalseprogrammatically > yarn.scheduler.capacity.root.users.ordering-policyfairfalseprogrammatically >
[jira] [Updated] (YARN-10127) FSQueueConverter should not set App Ordering Policy to Parent Queue
[ https://issues.apache.org/jira/browse/YARN-10127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10127: Attachment: YARN-10127-001.patch > FSQueueConverter should not set App Ordering Policy to Parent Queue > --- > > Key: YARN-10127 > URL: https://issues.apache.org/jira/browse/YARN-10127 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10127-001.patch > > > FSQueueConverter should not set App Ordering Policy (fair, fifo) to Parent > Queue. RM will fail to start if Parent Queue is set with App Ordering Policy. > {code} > Error starting ResourceManager > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Unable to construct > queue ordering policy=fair queue=root > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueOrderingPolicy(CapacitySchedulerConfiguration.java:1584) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.setupQueueConfigs(ParentQueue.java:145) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.(ParentQueue.java:112) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractManagedParentQueue.(AbstractManagedParentQueue.java:51) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ManagedParentQueue.(ManagedParentQueue.java:56) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.parseQueue(CapacitySchedulerQueueManager.java:272) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.initializeQueues(CapacitySchedulerQueueManager.java:158) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:751) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534) > > {code} > Input fair-scheduler.xml: > {code} > [yarn@mradha-s1-1 /]$ cat /tmp/fair-scheduler.xml > > > > fair > > fair > > > fair > > > > > > > > > > > {code} > Command Used: > {code} > yarn fs2cs -t -f /tmp/fair-scheduler.xml -y > /var/run/cloudera-scm-agent/process/11-yarn-RESOURCEMANAGER/yarn-site.xml -o > /tmp/CS > {code} > Output capacity-scheduler.xml > {code} > > yarn.scheduler.capacity.root.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.root.users.capacity50.000falseprogrammatically > yarn.scheduler.capacity.root.queuesdefault,usersfalseprogrammatically > yarn.scheduler.capacity.queue-mappings-override.enablefalsefalseprogrammatically > yarn.scheduler.capacity.root.default.capacity50.000falseprogrammatically > yarn.scheduler.capacity.root.default.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.maximum-am-resource-percent0.5falseprogrammatically > 
yarn.scheduler.capacity.root.users.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.root.default.ordering-policyfairfalseprogrammatically > yarn.scheduler.capacity.queue-mappingsu:%user:%user;u:%user:root.users.%user;u:%user:root.defaultfalseprogrammatically > yarn.scheduler.capacity.root.users.ordering-policyfairfalseprogrammatically > yarn.scheduler.capacity.root.ordering-policyfairfalseprogrammatically > > {code} > Root Queue is set with App Ordering Policy fair which is wrong > {code} > yarn.scheduler.capacity.root.ordering-policyfair > {code}
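The approach mentioned in the comment on YARN-10127 (emit the ordering policy only for leaf queues) could look roughly like the sketch below. The Queue class and OrderingPolicyEmitter are hypothetical stand-ins, not the FSQueueConverter code; here a queue counts as a leaf simply when it has no children. The property name pattern yarn.scheduler.capacity.<queue path>.ordering-policy is taken from the converter output quoted above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OrderingPolicyEmitter {
  static final String PREFIX = "yarn.scheduler.capacity.";

  // Hypothetical, minimal queue model: full path, policy name, children.
  static final class Queue {
    final String path;
    final String policy;
    final List<Queue> children = new ArrayList<>();

    Queue(String path, String policy) {
      this.path = path;
      this.policy = policy;
    }

    Queue add(Queue child) {
      children.add(child);
      return this;
    }

    boolean isLeaf() {
      return children.isEmpty();
    }
  }

  // Walks the tree and writes ordering-policy only for leaf queues;
  // parent queues are visited but produce no property.
  static void emit(Queue queue, Map<String, String> out) {
    if (queue.isLeaf()) {
      out.put(PREFIX + queue.path + ".ordering-policy", queue.policy);
    }
    for (Queue child : queue.children) {
      emit(child, out);
    }
  }

  static Map<String, String> convert(Queue root) {
    Map<String, String> out = new LinkedHashMap<>();
    emit(root, out);
    return out;
  }
}
```

For a root queue with two childless children, root.default and root.users, this yields ordering-policy entries for the two children and none for root.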
[jira] [Assigned] (YARN-10127) FSQueueConverter should not set App Ordering Policy to Parent Queue
[ https://issues.apache.org/jira/browse/YARN-10127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko reassigned YARN-10127: --- Assignee: Peter Bacsko (was: Prabhu Joseph) > FSQueueConverter should not set App Ordering Policy to Parent Queue > --- > > Key: YARN-10127 > URL: https://issues.apache.org/jira/browse/YARN-10127 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Peter Bacsko >Priority: Major > > FSQueueConverter should not set App Ordering Policy (fair, fifo) to Parent > Queue. RM will fail to start if Parent Queue is set with App Ordering Policy. > {code} > Error starting ResourceManager > org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Unable to construct > queue ordering policy=fair queue=root > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueOrderingPolicy(CapacitySchedulerConfiguration.java:1584) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.setupQueueConfigs(ParentQueue.java:145) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue.(ParentQueue.java:112) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.AbstractManagedParentQueue.(AbstractManagedParentQueue.java:51) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ManagedParentQueue.(ManagedParentQueue.java:56) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.parseQueue(CapacitySchedulerQueueManager.java:272) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerQueueManager.initializeQueues(CapacitySchedulerQueueManager.java:158) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:751) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534) > > {code} > Input fair-scheduler.xml: > {code} > [yarn@mradha-s1-1 /]$ cat /tmp/fair-scheduler.xml > > > > fair > > fair > > > fair > > > > > > > > > > > {code} > Command Used: > {code} > yarn fs2cs -t -f /tmp/fair-scheduler.xml -y > /var/run/cloudera-scm-agent/process/11-yarn-RESOURCEMANAGER/yarn-site.xml -o > /tmp/CS > {code} > Output capacity-scheduler.xml > {code} > > yarn.scheduler.capacity.root.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.root.users.capacity50.000falseprogrammatically > yarn.scheduler.capacity.root.queuesdefault,usersfalseprogrammatically > yarn.scheduler.capacity.queue-mappings-override.enablefalsefalseprogrammatically > yarn.scheduler.capacity.root.default.capacity50.000falseprogrammatically > yarn.scheduler.capacity.root.default.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.maximum-am-resource-percent0.5falseprogrammatically > 
yarn.scheduler.capacity.root.users.auto-create-child-queue.enabledtruefalseprogrammatically > yarn.scheduler.capacity.root.default.ordering-policyfairfalseprogrammatically > yarn.scheduler.capacity.queue-mappingsu:%user:%user;u:%user:root.users.%user;u:%user:root.defaultfalseprogrammatically > yarn.scheduler.capacity.root.users.ordering-policyfairfalseprogrammatically > yarn.scheduler.capacity.root.ordering-policyfairfalseprogrammatically > > {code} > Root Queue is set with App Ordering Policy fair which is wrong > {code} > yarn.scheduler.capacity.root.ordering-policyfair > {code} -- This message was
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17034339#comment-17034339 ] Peter Bacsko commented on YARN-10043: - +1 (non-binding). [~snemeth] could you check this? > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context.
[jira] [Commented] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17033720#comment-17033720 ] Peter Bacsko commented on YARN-10043: - [~maniraj...@gmail.com] yes, I'll try to find some spare cycles tomorrow and review the latest. > FairOrderingPolicy Improvements > --- > > Key: YARN-10043 > URL: https://issues.apache.org/jira/browse/YARN-10043 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Manikandan R >Assignee: Manikandan R >Priority: Major > Attachments: YARN-10043.001.patch, YARN-10043.002.patch, > YARN-10043.003.patch > > > FairOrderingPolicy can be improved by using some of the approaches (only > relevant) implemented in FairSharePolicy of FS. This improvement has > significance in FS to CS migration context.
[jira] [Commented] (YARN-9011) Race condition during decommissioning
[ https://issues.apache.org/jira/browse/YARN-9011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17030163#comment-17030163 ] Peter Bacsko commented on YARN-9011: Thanks [~mkonst] for sharing this. Yes, I was thinking about what happens if refresh is called quickly multiple times. The reason I ignored it is that this is initiated from the command line, and by the time you do it again, chances are that the node state has already been in DECOMMISSIONING for a (relatively) long time. So it's not a realistic use case, but more like a "sabotage". Correct me if this reasoning is wrong. Having said that, if you think it's something worth improving, I suggest that you create a follow-up JIRA and link this one. If you already have a patch or some code that you can share, I'm happy to review it. > Race condition during decommissioning > - > > Key: YARN-9011 > URL: https://issues.apache.org/jira/browse/YARN-9011 > Project: Hadoop YARN > Issue Type: Bug > Components: nodemanager >Affects Versions: 3.1.1 >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Fix For: 3.3.0, 3.2.2, 3.1.4 > > Attachments: YARN-9011-001.patch, YARN-9011-002.patch, > YARN-9011-003.patch, YARN-9011-004.patch, YARN-9011-005.patch, > YARN-9011-006.patch, YARN-9011-007.patch, YARN-9011-008.patch, > YARN-9011-009.patch, YARN-9011-branch-3.1.001.patch, > YARN-9011-branch-3.2.001.patch > > > During internal testing, we found a nasty race condition which occurs during > decommissioning. > Node manager, incorrect behaviour: > {noformat} > 2018-06-18 21:00:17,634 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received > SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting > down. 
> 2018-06-18 21:00:17,634 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from > ResourceManager: Disallowed NodeManager nodeId: node-6.hostname.com:8041 > hostname:node-6.hostname.com > {noformat} > Node manager, expected behaviour: > {noformat} > 2018-06-18 21:07:37,377 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Received > SHUTDOWN signal from Resourcemanager as part of heartbeat, hence shutting > down. > 2018-06-18 21:07:37,377 WARN > org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Message from > ResourceManager: DECOMMISSIONING node-6.hostname.com:8041 is ready to be > decommissioned > {noformat} > Note the two different messages from the RM ("Disallowed NodeManager" vs > "DECOMMISSIONING"). The problem is that {{ResourceTrackerService}} can see an > inconsistent state of nodes while they're being updated: > {noformat} > 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: hostsReader > include:{172.26.12.198,node-7.hostname.com,node-2.hostname.com,node-5.hostname.com,172.26.8.205,node-8.hostname.com,172.26.23.76,172.26.22.223,node-6.hostname.com,172.26.9.218,node-4.hostname.com,node-3.hostname.com,172.26.13.167,node-9.hostname.com,172.26.21.221,172.26.10.219} > exclude:{node-6.hostname.com} > 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.NodesListManager: Gracefully > decommission node node-6.hostname.com:8041 with state RUNNING > 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService: > Disallowed NodeManager nodeId: node-6.hostname.com:8041 node: > node-6.hostname.com > 2018-06-18 21:00:17,576 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Put Node > node-6.hostname.com:8041 in DECOMMISSIONING. 
> 2018-06-18 21:00:17,575 INFO > org.apache.hadoop.yarn.server.resourcemanager.RMAuditLogger: USER=yarn > IP=172.26.22.115 OPERATION=refreshNodes TARGET=AdminService > RESULT=SUCCESS > 2018-06-18 21:00:17,577 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Preserve > original total capability: > 2018-06-18 21:00:17,577 INFO > org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: > node-6.hostname.com:8041 Node Transitioned from RUNNING to DECOMMISSIONING > {noformat} > When the decommissioning succeeds, there is no output logged from > {{ResourceTrackerService}}.
[jira] [Commented] (YARN-9879) Allow multiple leaf queues with the same name in CS
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028892#comment-17028892 ] Peter Bacsko commented on YARN-9879: The latest build picked up {{CSQueue.getQueueUsage.txt}}. [~shuzirra] you might want to re-upload the patch again. > Allow multiple leaf queues with the same name in CS > --- > > Key: YARN-9879 > URL: https://issues.apache.org/jira/browse/YARN-9879 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: fs2cs > Attachments: CSQueue.getQueueUsage.txt, DesignDoc_v1.pdf, > YARN-9879.POC001.patch, YARN-9879.POC002.patch > > > Currently the leaf queue's name must be unique regardless of its position in > the queue hierarchy. > Design doc and first proposal is being made, I'll attach it as soon as it's > done. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (YARN-10101) Support listing of aggregated logs for containers belonging to an application attempt
[ https://issues.apache.org/jira/browse/YARN-10101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17027407#comment-17027407 ]

Peter Bacsko edited comment on YARN-10101 at 1/31/20 11:12 AM:
---

[~adam.antal] I quickly went through the patch. I haven't seen anything that stands out, but I suggest some enhancements.

1. One part of the code is a bit hard to read: {{validateUserInput()}}. It has a lot of nested ifs, which I think can be simplified. For example:
{noformat}
if (applicationAttemptId != null) {
  if (!applicationAttemptId.equals(containerId
      .getApplicationAttemptId())) {
  ...
{noformat}
I think these two ifs can be merged into a single condition:
{{if (applicationAttemptId != null && !applicationAttemptId.equals(containerId.getApplicationAttemptId()))}}

This one, too:
{noformat}
if (applicationAttemptId != null) {
  if (applicationId != null) {
    if (!applicationId.equals(applicationAttemptId.getApplicationId())) {
{noformat}
--> {{if (applicationAttemptId != null && applicationId != null && !applicationId.equals(applicationAttemptId.getApplicationId()))}}

Before this condition, I would add a single-line comment like "// We have no containerId" to emphasize when that branch is taken.

2. This pattern:
{noformat}
} catch (Exception ignore) {
}
{noformat}
I'm assuming that you're ignoring the exception because certain input data (appIdStr / appAttemptIdStr / containerIdStr) can be null. I'd rather you checked for null explicitly, then parsed the value if it is non-null:
{noformat}
ApplicationId appId = null;
if (appIdStr != null) {
  try {
    appId = ApplicationId.fromString(appIdStr);
  } catch (Exception e) {
    throw new WebApplicationException("Illegal Application ID string", e);
  }
}
{noformat}

3. Do we need {{@InterfaceAudience}} on {{getLogServlet()}} and {{setLogServlet()}}? They are not really part of an interface that is used either internally or externally, as {{@VisibleForTesting}} clearly states.
> Support listing of aggregated logs for containers belonging to an application > attempt > - > > Key: YARN-10101 > URL: https://issues.apache.org/jira/browse/YARN-10101 > Project: Hadoop YARN > Issue Type: Sub-task > Components: log-aggregation, yarn >Affects Versions: 3.3.0 >Reporter: Adam Antal >Assignee: Adam Antal >Priority: Major > Attachments: YARN-10101.001.patch, YARN-10101.002.patch, > YARN-10101.003.patch, YARN-10101.004.patch, YARN-10101.005.patch > > > To display logs without access to the timeline server, we need an interface > where we can query the list of containers with aggregated logs belonging to > an application attempt. > We should add support for this. -- This message was sent by Atlassian Jira (v8.3.4#803005)
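Review points 1 and 2 above can be illustrated with a rough, self-contained sketch. `AppId`, `AttemptId` and `ContainerRef` are hypothetical stand-ins invented for this example (the patch itself works with Hadoop's `ApplicationId`, `ApplicationAttemptId` and `ContainerId`), and `IllegalArgumentException` is used here in place of `WebApplicationException`. The method bodies are assumptions about the intent of the review, not the actual patch code.

```java
public class ValidateInputSketch {
  /** Hypothetical stand-in for Hadoop's ApplicationId; not the real class. */
  public record AppId(String value) {
    public static AppId fromString(String s) {
      if (!s.startsWith("application_")) {
        throw new IllegalArgumentException("Invalid ApplicationId: " + s);
      }
      return new AppId(s);
    }
  }

  /** Hypothetical stand-ins for ApplicationAttemptId and ContainerId. */
  public record AttemptId(AppId appId, int attempt) {}
  public record ContainerRef(AttemptId attemptId, long id) {}

  /** Point 2: check for null explicitly, and parse only non-null input. */
  public static AppId parseAppId(String appIdStr) {
    if (appIdStr == null) {
      return null;
    }
    try {
      return AppId.fromString(appIdStr);
    } catch (IllegalArgumentException e) {
      // Surface the parse failure instead of silently ignoring it.
      throw new IllegalArgumentException("Illegal Application ID string", e);
    }
  }

  /** Point 1: nested ifs flattened into single conditions. */
  public static boolean isConsistent(AppId appId, AttemptId attemptId,
      ContainerRef container) {
    if (container != null) {
      if (attemptId != null && !attemptId.equals(container.attemptId())) {
        return false;
      }
      return appId == null || appId.equals(container.attemptId().appId());
    }
    // We have no containerId
    if (attemptId != null && appId != null
        && !appId.equals(attemptId.appId())) {
      return false;
    }
    return true;
  }

  public static void main(String[] args) {
    AppId app = parseAppId("application_1580000000000_0001");
    AttemptId attempt = new AttemptId(app, 1);
    ContainerRef container = new ContainerRef(attempt, 2L);
    System.out.println(isConsistent(app, attempt, container)); // true
    AppId other = parseAppId("application_1580000000000_0002");
    System.out.println(isConsistent(other, attempt, null));    // false
    System.out.println(parseAppId(null));                      // null
  }
}
```

The explicit null check keeps "value not supplied" separate from "value supplied but malformed", which an empty catch block would conflate.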
[jira] [Updated] (YARN-10067) Add dry-run feature to FS-CS converter tool
[ https://issues.apache.org/jira/browse/YARN-10067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10067: Labels: fs2cs (was: ) > Add dry-run feature to FS-CS converter tool > --- > > Key: YARN-10067 > URL: https://issues.apache.org/jira/browse/YARN-10067 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Fix For: 3.3.0 > > Attachments: YARN-10067-001.patch, YARN-10067-002.patch, > YARN-10067-003.patch, YARN-10067-004.patch, YARN-10067-005.patch, > YARN-10067-006.patch, YARN-10067-007.patch, YARN-10067-007.patch > > > Add a "d" / "-dry-run" switch to the tool. The purpose of this would be to > inform the user whether a conversion is possible and if it is, are there any > warnings. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9879) Allow multiple leaf queues with the same name in CS
[ https://issues.apache.org/jira/browse/YARN-9879?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9879: --- Labels: fs2cs (was: ) > Allow multiple leaf queues with the same name in CS > --- > > Key: YARN-9879 > URL: https://issues.apache.org/jira/browse/YARN-9879 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Gergely Pollak >Assignee: Gergely Pollak >Priority: Major > Labels: fs2cs > Attachments: DesignDoc_v1.pdf, YARN-9879.POC001.patch > > > Currently the leaf queue's name must be unique regardless of its position in > the queue hierarchy. > Design doc and first proposal is being made, I'll attach it as soon as it's > done. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-9899) Migration tool that help to generate CS config based on FS config [Phase 2]
[ https://issues.apache.org/jira/browse/YARN-9899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9899: --- Labels: fs2cs (was: ) > Migration tool that help to generate CS config based on FS config [Phase 2] > > > Key: YARN-9899 > URL: https://issues.apache.org/jira/browse/YARN-9899 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Szilard Nemeth >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Fix For: 3.3.0 > > Attachments: YARN-9899-001.patch, YARN-9899-002.patch, > YARN-9899-003.patch, YARN-9899-004.patch, YARN-9899-005.patch, > YARN-9899-006.patch, YARN-9899-007.patch > > > YARN-9699 laid the groundwork for a converter from FS to CS config. > During the development of the converter, we came up with the following things > to fix. > 1. If we don't specify a mandatory option, we get a stack trace, for > example: > > {code:java} > org.apache.commons.cli.MissingOptionException: Missing required option: o > at org.apache.commons.cli.Parser.checkRequiredOptions(Parser.java:299) > at org.apache.commons.cli.Parser.parse(Parser.java:231) > at org.apache.commons.cli.Parser.parse(Parser.java:85) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.converter.FSConfigToCSConfigArgumentHandler.parseAndConvert(FSConfigToCSConfigArgumentHandler.java:100) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1572){code} > > We should provide a more concise and meaningful error message (without a > stack trace on the CLI, but we should log the exception with its stack trace to the > RM log). > An explanation of the missing option is also required. > 2. We may think about how to handle exceptions from commons CLI: > MissingArgumentException vs. MissingOptionException > 3. We need to provide a -h / --help option for the CLI that prints all the > possible options / arguments. > 4. Last but not least: We should move the CLI command to a more reasonable > place: > As YARN-9699 implemented it, the command can be invoked like: > {code:java} > /opt/hadoop/bin/yarn resourcemanager -convert-fs-configuration -y > /opt/hadoop/etc/hadoop/yarn-site.xml -f > /opt/hadoop/etc/hadoop/fair-scheduler.xml -r > ~systest/sample-rules-config.properties -o /tmp/fs-cs-output > {code} > This is problematic: if the YARN RM is already running, we need to stop it in > order to start the RM again with the conversion switch. > 5. Add unit test coverage for {{QueuePlacementConverter}} > 6. Close some feature gaps. >
[jira] [Updated] (YARN-9699) Migration tool that help to generate CS config based on FS config [Phase 1]
[ https://issues.apache.org/jira/browse/YARN-9699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-9699: --- Labels: fs2cs (was: ) > Migration tool that help to generate CS config based on FS config [Phase 1] > > > Key: YARN-9699 > URL: https://issues.apache.org/jira/browse/YARN-9699 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Wanqiang Ji >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Fix For: 3.3.0 > > Attachments: FS_to_CS_migration_POC.patch, YARN-9699-003.patch, > YARN-9699-004.patch, YARN-9699-005.patch, YARN-9699-006.patch, > YARN-9699-007.patch, YARN-9699-008.patch, YARN-9699-009.patch, > YARN-9699-010.patch, YARN-9699-011.patch, YARN-9699-012.patch, > YARN-9699-013.patch, YARN-9699-014.patch, YARN-9699-015.patch, > YARN-9699-016.patch, YARN-9699-017.patch, YARN-9699.001.patch, > YARN-9699.002.patch > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10108) FS-CS converter: nestedUserQueue with default rule results in invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10108: Labels: fs2cs (was: ) > FS-CS converter: nestedUserQueue with default rule results in invalid queue > mapping > --- > > Key: YARN-10108 > URL: https://issues.apache.org/jira/browse/YARN-10108 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > > FS Queue Placement Policy > {code:java} > > > > > > {code} > gets mapped to an invalid CS queue mapping "u:%user:root.users.%user" > RM fails to start with above queue mapping in CS > {code:java} > 2020-01-28 00:19:12,889 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: mapping > contains invalid or non-leaf queue [%user] and invalid parent queue > [root.users] > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534) > Caused by: java.io.IOException: mapping contains invalid or non-leaf queue > [%user] and invalid parent queue [root.users] > 
at > org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(QueuePlacementRuleUtils.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.validateAndGetAutoCreatedQueueMapping(UserGroupMappingPlacementRule.java:363) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:300) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 7 more > {code} > QueuePlacementConverter#handleNestedRule has to be fixed. > {code:java} > else if (pr instanceof DefaultPlacementRule) { > DefaultPlacementRule defaultRule = (DefaultPlacementRule) pr; > mapping.append("u:" + USER + ":") > .append(defaultRule.defaultQueueName) > .append("." + USER); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
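To make the report above concrete, here is a minimal sketch that repeats the string building from the quoted `handleNestedRule` branch in isolation. It assumes that the `USER` constant holds `%user` and that the default rule's queue name is `root.users`; both values are inferred from the mapping quoted in the report, not taken from the converter source. `NestedRuleMappingSketch` and `buildMapping` are names made up for this example.

```java
public class NestedRuleMappingSketch {
  // Assumed value of the converter's USER constant (inferred from the report).
  static final String USER = "%user";

  /** Repeats the append chain from the quoted DefaultPlacementRule branch. */
  public static String buildMapping(String defaultQueueName) {
    StringBuilder mapping = new StringBuilder();
    mapping.append("u:" + USER + ":")
        .append(defaultQueueName)
        .append("." + USER);
    return mapping.toString();
  }

  public static void main(String[] args) {
    // Prints the mapping that the RM rejects in the stack trace above.
    System.out.println(buildMapping("root.users")); // u:%user:root.users.%user
  }
}
```

The output matches the mapping that the report calls invalid, so the problem lies in what this branch emits rather than in how the result is later serialized.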
[jira] [Updated] (YARN-10108) FS-CS converter: nestedUserQueue with default rule results in invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10108: Summary: FS-CS converter: nestedUserQueue with default rule results in invalid queue mapping (was: FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping)
[jira] [Commented] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix misc issues
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026667#comment-17026667 ] Peter Bacsko commented on YARN-10099: - [~snemeth] patch v10 should be good for a review. > FS-CS converter: handle allow-undeclared-pools and user-as-default-queue > properly and fix misc issues > - > > Key: YARN-10099 > URL: https://issues.apache.org/jira/browse/YARN-10099 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Attachments: YARN-10099-001.patch, YARN-10099-002.patch, > YARN-10099-003.patch, YARN-10099-004.patch, YARN-10099-005.patch, > YARN-10099-006.patch, YARN-10099-007.patch, YARN-10099-008.patch, > YARN-10099-010.patch > > > This ticket is intended to fix three issues: > 1. Based on the latest documentation, there are two important properties that > are ignored if we have placement rules: > ||Property||Explanation|| > |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can > be created at application submission time, whether because they are specified > as the application’s queue by the submitter or because they are placed there > by the user-as-default-queue property. If this is false, any time an app > would be placed in a queue that is not specified in the allocations file, it > is placed in the “default” queue instead. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > |yarn.scheduler.fair.user-as-default-queue|Whether to use the username > associated with the allocation as the default queue name, in the event that a > queue name is not specified. If this is set to “false” or unset, all jobs > have a shared default queue, named “default”. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > Right now these settings affect the conversion regardless of the placement > rules. > 2.
A converted configuration throws this error: > {noformat} > 2020-01-27 03:35:35,007 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned > to standby state > 2020-01-27 03:35:35,008 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.lang.IllegalArgumentException: Illegal queue mapping > u:%user:%user;u:%user:root.users.%user;u:%user:root.default > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > {noformat} > Mapping rules should be separated by a "," character, not by a semicolon. > 3. When initializing FS for conversion, we add the current {{yarn-site.xml}} > as resource. This is not necessary. This can cause problems like: > {noformat} > [...] 
> 20/01/29 02:45:38 ERROR config.RangerConfiguration: > addResourceIfReadable(ranger-yarn-audit.xml): couldn't find resource file > location > 20/01/29 02:45:38 ERROR config.RangerConfiguration: > addResourceIfReadable(ranger-yarn-security.xml): couldn't find resource file > location > 20/01/29 02:45:38 ERROR config.RangerConfiguration: > addResourceIfReadable(ranger-yarn-policymgr-ssl.xml): couldn't find resource > file location > 20/01/29 02:45:38 ERROR conf.Configuration: error parsing conf > file:/etc/hadoop/conf.cloudera.YARN-1/xasecure-audit.xml > java.io.FileNotFoundException: > /etc/hadoop/conf.cloudera.YARN-1/xasecure-audit.xml (No such file or > directory) > at java.base/java.io.FileInputStream.open0(Native Method) > at
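The separator problem described in item 2 of this ticket can be sketched as follows. This is a simplified, hypothetical check written for illustration only: it assumes that each rule has the form `[u|g]:source:queue` and that rules are joined with a comma, as the ticket states. It is not the actual `CapacitySchedulerConfiguration.getQueueMappings` code, and `QueueMappingSeparatorSketch` and `parseMappings` are made-up names.

```java
import java.util.ArrayList;
import java.util.List;

public class QueueMappingSeparatorSketch {
  /**
   * Simplified, hypothetical validation: rules are comma-separated and each
   * rule must have exactly three ':'-separated fields starting with u or g.
   */
  public static List<String> parseMappings(String value) {
    List<String> rules = new ArrayList<>();
    for (String rule : value.split(",")) {
      String trimmed = rule.trim();
      String[] parts = trimmed.split(":");
      boolean knownType = parts[0].equals("u") || parts[0].equals("g");
      if (parts.length != 3 || !knownType) {
        throw new IllegalArgumentException("Illegal queue mapping " + value);
      }
      rules.add(trimmed);
    }
    return rules;
  }

  public static void main(String[] args) {
    // Joining with ',' yields two well-formed rules.
    String good = String.join(",",
        List.of("u:%user:%user", "u:%user:root.default"));
    System.out.println(parseMappings(good));

    // The semicolon-joined value from the log is seen as one malformed rule.
    try {
      parseMappings(
          "u:%user:%user;u:%user:root.users.%user;u:%user:root.default");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

With a semicolon, splitting on commas leaves the whole string as a single rule with seven colon-separated fields, which is why the entire value appears in the error message.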
[jira] [Commented] (YARN-10102) Capacity scheduler: add support for %specified mapping
[ https://issues.apache.org/jira/browse/YARN-10102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026622#comment-17026622 ] Peter Bacsko commented on YARN-10102: - [~maniraj...@gmail.com] yes, that property does the trick. However, it doesn't solve everything if the {{}} rule is not the first in the list, so I guess it'll override all other rules. > Capacity scheduler: add support for %specified mapping > -- > > Key: YARN-10102 > URL: https://issues.apache.org/jira/browse/YARN-10102 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Priority: Major > > To reduce the gap between Fair Scheduler and Capacity Scheduler, it's > reasonable to have a {{%specified}} mapping. This would be equivalent to the > {{}} placement rule in FS, that is, use the queue that comes in > with the application submission context.
[jira] [Commented] (YARN-10103) Capacity scheduler: add support for create=true/false per mapping rule
[ https://issues.apache.org/jira/browse/YARN-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17026601#comment-17026601 ] Peter Bacsko commented on YARN-10103: - [~maniraj...@gmail.com] I'm afraid that's not enough. Consider a nested Primary Group placement rule in FS with create=true. This results in the following mapping in CS: {{u:%user:%primary_group.%user}} Now we know that this works. But what to do with create=true? This means that we have to know all primary groups in advance and generate the {{yarn.scheduler.capacity.root.[groupName].auto-create-child-queue.enabled=true}} properties, which doesn't seem all that feasible. Same goes for secondary groups. Also, if someone uses LDAP group mapping (which is pretty common in an enterprise environment), then it makes things even more complicated. Not to mention that these properties should be updated whenever there is a change in the groups. So I think it's absolutely necessary to have this feature sooner or later. > Capacity scheduler: add support for create=true/false per mapping rule > -- > > Key: YARN-10103 > URL: https://issues.apache.org/jira/browse/YARN-10103 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Priority: Major > Labels: fs2cs > > You can't ask Capacity Scheduler for a mapping to create a queue if it > doesn't exist. > For example, this mapping would use the first rule if the queue exists. If it > doesn't, then it proceeds to the next rule: > {{u:%user:%primary_group.%user:create=false;u:%user:root.default}} > Let's say user "alice" belongs to the "admins" group. It would first try to > map {{root.admins.alice}}. But, if the queue doesn't exist, then it places > the application into {{root.default}}.
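The fall-through behaviour proposed in this issue can be sketched roughly as below. Everything here is hypothetical: Capacity Scheduler had no per-rule `create` flag at the time of this message, and `Rule`, `place` and the `root.` prefixing are assumptions made only to illustrate the "alice" example from the description.

```java
import java.util.List;
import java.util.Set;

public class CreateFlagMappingSketch {
  /** Hypothetical mapping rule: a queue template plus the proposed flag. */
  public record Rule(String queueTemplate, boolean create) {}

  /**
   * Returns the first queue that either exists already or may be created;
   * returns null if no rule applies.
   */
  public static String place(String user, String primaryGroup,
      List<Rule> rules, Set<String> existingQueues) {
    for (Rule rule : rules) {
      String queue = rule.queueTemplate()
          .replace("%primary_group", primaryGroup)
          .replace("%user", user);
      if (!queue.startsWith("root.")) {
        queue = "root." + queue; // assumption: short names live under root
      }
      if (existingQueues.contains(queue) || rule.create()) {
        return queue;
      }
      // create=false and the queue is missing: try the next rule.
    }
    return null;
  }

  public static void main(String[] args) {
    List<Rule> rules = List.of(
        new Rule("%primary_group.%user", false),
        new Rule("root.default", false));
    // Only root.default exists, so alice falls through to it.
    System.out.println(
        place("alice", "admins", rules, Set.of("root.default")));
    // Once root.admins.alice exists, the first rule matches.
    System.out.println(place("alice", "admins", rules,
        Set.of("root.default", "root.admins.alice")));
  }
}
```

A per-rule flag keeps the decision inside the mapping itself, so no `auto-create-child-queue.enabled` property has to be generated for every group in advance.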
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix misc issues
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10099:
--------------------------------
    Attachment: YARN-10099-010.patch

> FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix misc issues
> -----------------------------------------------------------------------------------------------------
>
> Key: YARN-10099
> URL: https://issues.apache.org/jira/browse/YARN-10099
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Labels: fs2cs
> Attachments: YARN-10099-001.patch, YARN-10099-002.patch,
> YARN-10099-003.patch, YARN-10099-004.patch, YARN-10099-005.patch,
> YARN-10099-006.patch, YARN-10099-007.patch, YARN-10099-008.patch,
> YARN-10099-010.patch
>
>
> This ticket is intended to fix three issues:
> 1. Based on the latest documentation, there are two important properties that
> are ignored if we have placement rules:
> ||Property||Explanation||
> |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can be created at application submission time, whether because they are specified as the application’s queue by the submitter or because they are placed there by the user-as-default-queue property. If this is false, any time an app would be placed in a queue that is not specified in the allocations file, it is placed in the “default” queue instead. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*|
> |yarn.scheduler.fair.user-as-default-queue|Whether to use the username associated with the allocation as the default queue name, in the event that a queue name is not specified. If this is set to “false” or unset, all jobs have a shared default queue, named “default”. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*|
> Right now these settings affect the conversion regardless of the placement
> rules.
> 2. A converted configuration throws this error:
> {noformat}
> 2020-01-27 03:35:35,007 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state
> 2020-01-27 03:35:35,008 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
> java.lang.IllegalArgumentException: Illegal queue mapping u:%user:%user;u:%user:root.users.%user;u:%user:root.default
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
> {noformat}
> Mapping rules should be separated by a "," character, not by a semicolon.
> 3. When initializing FS for conversion, we add the current {{yarn-site.xml}}
> as a resource. This is not necessary. This can cause problems like:
> {noformat}
> [...]
> 20/01/29 02:45:38 ERROR config.RangerConfiguration: addResourceIfReadable(ranger-yarn-audit.xml): couldn't find resource file location
> 20/01/29 02:45:38 ERROR config.RangerConfiguration: addResourceIfReadable(ranger-yarn-security.xml): couldn't find resource file location
> 20/01/29 02:45:38 ERROR config.RangerConfiguration: addResourceIfReadable(ranger-yarn-policymgr-ssl.xml): couldn't find resource file location
> 20/01/29 02:45:38 ERROR conf.Configuration: error parsing conf file:/etc/hadoop/conf.cloudera.YARN-1/xasecure-audit.xml
> java.io.FileNotFoundException: /etc/hadoop/conf.cloudera.YARN-1/xasecure-audit.xml (No such file or directory)
>     at java.base/java.io.FileInputStream.open0(Native Method)
>     at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
>     at
[jira] [Comment Edited] (YARN-10043) FairOrderingPolicy Improvements
[ https://issues.apache.org/jira/browse/YARN-10043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025899#comment-17025899 ]

Peter Bacsko edited comment on YARN-10043 at 1/29/20 2:06 PM:
--------------------------------------------------------------

Some comments from me:

1. Nit: Pay attention to missing white spaces, like {{if(res) == 0}} (after "if")

2. {{compareDemand()}} can be simplified:
{noformat}
return (int) Math.signum(demand2 - demand1);
{noformat}

3. {{testOrderingUsingAppSubmitTime()}} has multiple asserts. I'd prefer having separate test cases for better readability. Examples:
* testOrderingWithoutUsedAndPendingResources
* testOrderingWithUsedAndPendingResources
* testOrderingWithSubmissionTime

4. Same applies to {{testOrderingUsingAppDemand()}}. Could be split up like:
* testOrderingWithZeroDemand
* testOrderingWithSameStartTimeDifferentDemand
* Also, "//No changes, equal" part is the same as in {{testOrderingUsingAppSubmitTime()}}


was (Author: pbacsko):
Some comments from me:

1. Nit: Pay attention to missing white spaces, like {{if(res) == 0}} (after "if")

2. {{compareDemand()}} can be simplified:
{noformat}
return (int) Math.signum(demand2 - demand1);
{noformat}

3. {{testOrderingUsingAppSubmitTime()}} has multiple asserts. I'd prefer having separate test cases for better readability. Examples:
* testOrderingWithoutUsedAndPendingResources
* testOrderingWithUsedAndPendingResource
* testOrderingWithSubmissionTime

4. Same applies to {{testOrderingUsingAppDemand()}}. Could be split up like:
* testOrderingWithZeroDemand
* testOrderingWithSameStartTimeDifferentDemand
* Also, "//No changes, equal" part is the same as in {{testOrderingUsingAppSubmitTime()}}

> FairOrderingPolicy Improvements
> -------------------------------
>
> Key: YARN-10043
> URL: https://issues.apache.org/jira/browse/YARN-10043
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Manikandan R
> Assignee: Manikandan R
> Priority: Major
> Attachments: YARN-10043.001.patch, YARN-10043.002.patch
>
>
> FairOrderingPolicy can be improved by using some of the approaches (only
> relevant) implemented in FairSharePolicy of FS. This improvement has
> significance in FS to CS migration context.
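To make review item 2 concrete, here is a minimal Java sketch of the suggested one-line `compareDemand()`. It assumes the two demands are plain `long` values, which the comment does not state, and the class name and the `compareDemandNoOverflow` variant are hypothetical additions for illustration, not part of the patch under review. As written, the expression returns a positive value when `demand2` is larger than `demand1`.

```java
/** Hypothetical sketch; the real patch may use different types and names. */
final class DemandComparisonSketch {

  /**
   * The form suggested in the review: the sign of (demand2 - demand1),
   * narrowed to -1, 0 or 1. Assumes the subtraction does not overflow.
   */
  static int compareDemand(long demand1, long demand2) {
    return (int) Math.signum(demand2 - demand1);
  }

  /**
   * An alternative with the same sign for all inputs, including
   * values whose difference would overflow a long.
   */
  static int compareDemandNoOverflow(long demand1, long demand2) {
    return Long.compare(demand2, demand1);
  }

  public static void main(String[] args) {
    System.out.println(compareDemand(5L, 10L));  // demand2 is larger
    System.out.println(compareDemand(10L, 5L));  // demand1 is larger
    System.out.println(compareDemand(7L, 7L));   // equal demands
  }
}
```

The subtraction-based form is compact, but it relies on the difference fitting in the operand type; `Long.compare` avoids that caveat at the cost of departing from the exact wording of the suggestion.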
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix misc issues
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10099:
--------------------------------
    Description:
This ticket is intended to fix three issues:

1. Based on the latest documentation, there are two important properties that are ignored if we have placement rules:
||Property||Explanation||
|yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can be created at application submission time, whether because they are specified as the application’s queue by the submitter or because they are placed there by the user-as-default-queue property. If this is false, any time an app would be placed in a queue that is not specified in the allocations file, it is placed in the “default” queue instead. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*|
|yarn.scheduler.fair.user-as-default-queue|Whether to use the username associated with the allocation as the default queue name, in the event that a queue name is not specified. If this is set to “false” or unset, all jobs have a shared default queue, named “default”. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*|

Right now these settings affect the conversion regardless of the placement rules.

2. A converted configuration throws this error:
{noformat}
2020-01-27 03:35:35,007 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state
2020-01-27 03:35:35,008 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.IllegalArgumentException: Illegal queue mapping u:%user:%user;u:%user:root.users.%user;u:%user:root.default
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113)
    at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
{noformat}
Mapping rules should be separated by a "," character, not by a semicolon.

3. When initializing FS for conversion, we add the current {{yarn-site.xml}} as a resource. This is not necessary. This can cause problems like:
{noformat}
[...]
20/01/29 02:45:38 ERROR config.RangerConfiguration: addResourceIfReadable(ranger-yarn-audit.xml): couldn't find resource file location
20/01/29 02:45:38 ERROR config.RangerConfiguration: addResourceIfReadable(ranger-yarn-security.xml): couldn't find resource file location
20/01/29 02:45:38 ERROR config.RangerConfiguration: addResourceIfReadable(ranger-yarn-policymgr-ssl.xml): couldn't find resource file location
20/01/29 02:45:38 ERROR conf.Configuration: error parsing conf file:/etc/hadoop/conf.cloudera.YARN-1/xasecure-audit.xml
java.io.FileNotFoundException: /etc/hadoop/conf.cloudera.YARN-1/xasecure-audit.xml (No such file or directory)
    at java.base/java.io.FileInputStream.open0(Native Method)
    at java.base/java.io.FileInputStream.open(FileInputStream.java:219)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:157)
    at java.base/java.io.FileInputStream.<init>(FileInputStream.java:112)
    at java.base/sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:86)
    at java.base/sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:184)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2966)
    at org.apache.hadoop.conf.Configuration.getStreamReader(Configuration.java:3057)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:3018)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2996)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2871)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:1223)
    at
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix misc issues
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10099:
    Summary: FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix misc issues (was: FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix mapping rule separator)

> FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix misc issues
>
> Key: YARN-10099
> URL: https://issues.apache.org/jira/browse/YARN-10099
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Peter Bacsko
> Assignee: Peter Bacsko
> Priority: Major
> Labels: fs2cs
> Attachments: YARN-10099-001.patch, YARN-10099-002.patch, YARN-10099-003.patch, YARN-10099-004.patch, YARN-10099-005.patch
>
> This ticket is intended to fix two issues:
> 1. Based on the latest documentation, there are two important properties that are ignored if we have placement rules:
> ||Property||Explanation||
> |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can be created at application submission time, whether because they are specified as the application’s queue by the submitter or because they are placed there by the user-as-default-queue property. If this is false, any time an app would be placed in a queue that is not specified in the allocations file, it is placed in the “default” queue instead. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*|
> |yarn.scheduler.fair.user-as-default-queue|Whether to use the username associated with the allocation as the default queue name, in the event that a queue name is not specified. If this is set to “false” or unset, all jobs have a shared default queue, named “default”. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*|
> Right now these settings affect the conversion regardless of the placement rules.
> 2. A converted configuration throws this error:
> {noformat}
> 2020-01-27 03:35:35,007 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state
> 2020-01-27 03:35:35,008 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
> java.lang.IllegalArgumentException: Illegal queue mapping u:%user:%user;u:%user:root.users.%user;u:%user:root.default
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
> {noformat}
> Mapping rules should be separated by a "," character, not by a semicolon.
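
The separator point above can be illustrated with a small standalone sketch. It is not the converter's actual code: the class name {{QueueMappingJoiner}} and its {{join}} helper are hypothetical, and the three rules are simply the ones visible in the error message, joined with "," instead of ";".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the fs2cs converter.
public class QueueMappingJoiner {
  // Separator between mapping rules, as stated in the ticket description.
  static final String RULE_SEPARATOR = ",";

  // Joins individual mapping rules into one property value.
  static String join(List<String> rules) {
    return String.join(RULE_SEPARATOR, rules);
  }

  public static void main(String[] args) {
    // The three rules taken from the "Illegal queue mapping" message.
    List<String> rules = Arrays.asList(
        "u:%user:%user",
        "u:%user:root.users.%user",
        "u:%user:root.default");
    System.out.println(join(rules));
  }
}
```

Whether each individual rule is accepted by CS is a separate question (see YARN-10108); the sketch only covers the separator.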
-- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix mapping rule separator
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10099:
    Attachment: YARN-10099-005.patch
[jira] [Commented] (YARN-10108) FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025143#comment-17025143 ]

Peter Bacsko commented on YARN-10108:

Plus, this change is likely going to be necessary in {{getPlacementForUser()}}
{noformat}
[...]
    } else if (mapping.getParentQueue() != null
        && mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
      QueueMapping queueMapping = QueueMappingBuilder.create()
          .type(mapping.getType())
          .source(mapping.getSource())
          .queue(user)
          .parentQueue(mapping.getParentQueue())
          .build();
      return getPlacementContext(queueMapping, user);
    } else if (mapping.getQueue().equals(CURRENT_USER_MAPPING)) {
      return getPlacementContext(mapping, user);
    } else if (mapping.getQueue().equals(PRIMARY_GROUP_MAPPING)) {
[...]
{noformat}

> FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
>
> Key: YARN-10108
> URL: https://issues.apache.org/jira/browse/YARN-10108
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 3.3.0
> Reporter: Prabhu Joseph
> Assignee: Peter Bacsko
> Priority: Major
>
> FS Queue Placement Policy
> {code:java}
>
>
>
>
>
> {code}
> gets mapped to an invalid CS queue mapping "u:%user:root.users.%user"
> RM fails to start with above queue mapping in CS
> {code:java}
> 2020-01-28 00:19:12,889 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
> org.apache.hadoop.service.ServiceStateException: java.io.IOException: mapping contains invalid or non-leaf queue [%user] and invalid parent queue [root.users]
>     at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:173)
>     at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534)
> Caused by: java.io.IOException: mapping contains invalid or non-leaf queue [%user] and invalid parent queue [root.users]
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(QueuePlacementRuleUtils.java:48)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.validateAndGetAutoCreatedQueueMapping(UserGroupMappingPlacementRule.java:363)
>     at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:300)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361)
>     at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426)
>     at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
>     ... 7 more
> {code}
> QueuePlacementConverter#handleNestedRule has to be fixed.
> {code:java}
> else if (pr instanceof DefaultPlacementRule) {
>   DefaultPlacementRule defaultRule = (DefaultPlacementRule) pr;
>   mapping.append("u:" + USER + ":")
>     .append(defaultRule.defaultQueueName)
>     .append("." + USER);
> }
> {code}
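
To make the generated string concrete, here is a minimal standalone sketch of what the quoted {{handleNestedRule}} branch appends. The class name {{NestedDefaultRuleSketch}} and the method are hypothetical. The value of {{USER}} is assumed to be "%user" and the default queue name "root.users", both inferred from the mapping reported in the ticket rather than taken from the converter source.

```java
// Hypothetical stand-in for the quoted branch of handleNestedRule.
public class NestedDefaultRuleSketch {
  // Assumption: the converter's USER constant is the "%user" placeholder.
  static final String USER = "%user";

  // Mirrors the three append() calls from the quoted snippet.
  static String nestedDefaultMapping(String defaultQueueName) {
    StringBuilder mapping = new StringBuilder();
    mapping.append("u:" + USER + ":")
        .append(defaultQueueName)
        .append("." + USER);
    return mapping.toString();
  }

  public static void main(String[] args) {
    // Assumption: the default rule's queue name is "root.users".
    System.out.println(nestedDefaultMapping("root.users"));
    // prints u:%user:root.users.%user
  }
}
```

The result has the shape source, then a literal parent path, then the {{%user}} leaf, which is the form CS rejects at startup in the trace above.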
[jira] [Commented] (YARN-10108) FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025134#comment-17025134 ]

Peter Bacsko commented on YARN-10108:

cc [~snemeth]
[jira] [Comment Edited] (YARN-10108) FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025132#comment-17025132 ]

Peter Bacsko edited comment on YARN-10108 at 1/28/20 2:11 PM:

Ok, here is the problem in {{UserGroupMappingPlacementRule}}:
{noformat}
  private static QueueMapping validateAndGetAutoCreatedQueueMapping(
      CapacitySchedulerQueueManager queueManager, QueueMapping mapping,
      QueuePath queuePath) throws IOException {
    if (queuePath.hasParentQueue()
        && (queuePath.getParentQueue().equals(PRIMARY_GROUP_MAPPING)
        || queuePath.getParentQueue().equals(SECONDARY_GROUP_MAPPING))) {
      [...]
    } else if (queuePath.hasParentQueue()) {
      //if parent queue is specified,
      // then it should exist and be an instance of ManagedParentQueue
      QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(
          queueManager.getQueue(queuePath.getParentQueue()),   <--- queuePath.getParentQueue() is "root.users"
          queuePath.getParentQueue(), queuePath.getLeafQueue());
      return QueueMappingBuilder.create()
          .type(mapping.getType())
          .source(mapping.getSource())
          .queue(queuePath.getLeafQueue())
          .parentQueue(queuePath.getParentQueue())
          .build();
    }
{noformat}
The problem is that {{queueManager.getQueue()}} expects a leaf queue name, not a full path. Since [~shuzirra] is working on YARN-9879 which enhances {{getQueue()}}, let's wait until that JIRA is committed.

was (Author: pbacsko):
Ok, here is the problem in {{UserGroupMappingPlacementRule}}:
{noformat}
[...]
  private static QueueMapping validateAndGetAutoCreatedQueueMapping(
      CapacitySchedulerQueueManager queueManager, QueueMapping mapping,
      QueuePath queuePath) throws IOException {
    if (queuePath.hasParentQueue()
        && (queuePath.getParentQueue().equals(PRIMARY_GROUP_MAPPING)
        || queuePath.getParentQueue().equals(SECONDARY_GROUP_MAPPING))) {
      [...]
    } else if (queuePath.hasParentQueue()) {
      //if parent queue is specified,
      // then it should exist and be an instance of ManagedParentQueue
      QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(
          queueManager.getQueue(queuePath.getParentQueue()),   <--- queuePath.getParentQueue() is "root.users"
          queuePath.getParentQueue(), queuePath.getLeafQueue());
      return QueueMappingBuilder.create()
          .type(mapping.getType())
          .source(mapping.getSource())
          .queue(queuePath.getLeafQueue())
          .parentQueue(queuePath.getParentQueue())
          .build();
    }
[...]
{noformat}
The problem is that {{queueManager.getQueue()}} expects a leaf queue name, not a full path. Since [~shuzirra] is working on YARN-9879 which enhances {{getQueue()}}, let's wait until that JIRA is committed.
[jira] [Commented] (YARN-10108) FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025132#comment-17025132 ]

Peter Bacsko commented on YARN-10108:

Ok, here is the problem in {{UserGroupMappingPlacementRule}}:
{noformat}
[...]
  private static QueueMapping validateAndGetAutoCreatedQueueMapping(
      CapacitySchedulerQueueManager queueManager, QueueMapping mapping,
      QueuePath queuePath) throws IOException {
    if (queuePath.hasParentQueue()
        && (queuePath.getParentQueue().equals(PRIMARY_GROUP_MAPPING)
        || queuePath.getParentQueue().equals(SECONDARY_GROUP_MAPPING))) {
      [...]
    } else if (queuePath.hasParentQueue()) {
      //if parent queue is specified,
      // then it should exist and be an instance of ManagedParentQueue
      QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(
          queueManager.getQueue(queuePath.getParentQueue()),   <--- queuePath.getParentQueue() is "root.users"
          queuePath.getParentQueue(), queuePath.getLeafQueue());
      return QueueMappingBuilder.create()
          .type(mapping.getType())
          .source(mapping.getSource())
          .queue(queuePath.getLeafQueue())
          .parentQueue(queuePath.getParentQueue())
          .build();
    }
[...]
{noformat}
The problem is that {{queueManager.getQueue()}} expects a leaf queue name, not a full path. Since [~shuzirra] is working on YARN-9879 which enhances {{getQueue()}}, let's wait until that JIRA is committed.
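
The leaf-name versus full-path mismatch described above can be sketched in isolation. This is only an illustration under stated assumptions: {{QueuePathSketch}}, its {{leafName}} helper and the plain map standing in for the queue manager are all hypothetical, and this is neither the {{CapacitySchedulerQueueManager}} implementation nor the change planned in YARN-9879.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not the CapacitySchedulerQueueManager code.
public class QueuePathSketch {
  // Returns the last dot-separated segment of a queue path.
  static String leafName(String fullPath) {
    int lastDot = fullPath.lastIndexOf('.');
    return lastDot < 0 ? fullPath : fullPath.substring(lastDot + 1);
  }

  public static void main(String[] args) {
    // Assumption: queues are registered under their short (leaf) names only.
    Map<String, String> queuesByShortName = new HashMap<>();
    queuesByShortName.put("users", "parent queue root.users");

    // Looking up the full path misses, which mirrors the reported failure.
    System.out.println(queuesByShortName.get("root.users"));
    // Looking up the short name finds the queue.
    System.out.println(queuesByShortName.get(leafName("root.users")));
  }
}
```

Stripping the path to its last segment is shown only to make the mismatch visible; the thread itself defers the real fix to the {{getQueue()}} enhancement in YARN-9879.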
[jira] [Commented] (YARN-10107) Invoking NMWebServices#getNMResourceInfo tries to execute gpu discovery binary even if auto discovery is turned off
[ https://issues.apache.org/jira/browse/YARN-10107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025077#comment-17025077 ]

Peter Bacsko commented on YARN-10107:

[~snemeth] I just have one question. Now in the {{else}} branch, you set {{gpuDeviceInformation}} to {{null}} and it will be wrapped in the response, right? What's the net effect of this? Is it OK to return a {{null}} {{GpuDeviceInformation}}?

> Invoking NMWebServices#getNMResourceInfo tries to execute gpu discovery binary even if auto discovery is turned off
>
> Key: YARN-10107
> URL: https://issues.apache.org/jira/browse/YARN-10107
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Szilard Nemeth
> Assignee: Szilard Nemeth
> Priority: Major
> Attachments: YARN-10107.001.patch, nm-config-afterchange-gpu.xml, nm-config-beforechange-gpu.xml.xml, request-response-afterchange-with-autodiscovery.txt, request-response-afterchange.txt, request-response-beforechange.txt
>
> During internal end-to-end testing, I found the following issue:
> Configuration:
> - GPU is enabled
> - yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables is set to "/usr/bin/ls" - Any existing valid binary file
> - yarn.nodemanager.resource-plugins.gpu.allowed-gpu-devices is set to "0:0,1:1,2:2", so auto-discovery is turned off.
> If REST endpoint [http://quasar-tsjqpq-3.vpc.cloudera.com:8042/ws/v1/node/resources/yarn.io%2Fgpu] is called, the following exception is thrown in NM:
> {code:java}
> 2020-01-23 07:55:24,803 ERROR org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuResourcePlugin: Failed to find GPU discovery executable, please double check yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables setting.
> org.apache.hadoop.yarn.exceptions.YarnException: Failed to find GPU discovery executable, please double check yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables setting.
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.NvidiaBinaryHelper.getGpuDeviceInformation(NvidiaBinaryHelper.java:54)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDiscoverer.getGpuDeviceInformation(GpuDiscoverer.java:125)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuResourcePlugin.getNMResourceInfo(GpuResourcePlugin.java:104)
>     at org.apache.hadoop.yarn.server.nodemanager.webapp.NMWebServices.getNMResourceInfo(NMWebServices.java:515)
> {code}
> *Let's break this down:*
> 1. org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuResourcePlugin#getNMResourceInfo just calls:
> {code:java}
> gpuDeviceInformation = gpuDiscoverer.getGpuDeviceInformation();
> {code}
> 2. In org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDiscoverer#getGpuDeviceInformation, the following code calls NvidiaBinaryHelper.getGpuDeviceInformation:
> {code:java}
> try {
>   lastDiscoveredGpuInformation =
>       nvidiaBinaryHelper.getGpuDeviceInformation(pathOfGpuBinary);
> } catch (IOException e) {
> {code}
> 3. org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.NvidiaBinaryHelper#getGpuDeviceInformation finally throws the exception.
> This only happens when the parameter called "pathOfGpuBinary" is null.
> Since this method is only called from GpuDiscoverer#getGpuDeviceInformation, which passes its field called "pathOfGpuBinary" as the only parameter, we can be sure that the exception occurs whenever this field is null.
> 4. The only way the "pathOfGpuBinary" field can be set is through this call chain:
> {code:java}
> GpuDiscoverer.lookUpAutoDiscoveryBinary(Configuration) (org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu)
> GpuDiscoverer.initialize(Configuration, NvidiaBinaryHelper) (org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu)
> {code}
> 5. GpuDiscoverer#initialize contains this code:
> {code:java}
> if (isAutoDiscoveryEnabled()) {
>   numOfErrorExecutionSinceLastSucceed = 0;
>   lookUpAutoDiscoveryBinary(config);
>
> {code}
> , so org.apache.hadoop.yarn.server.nodemanager.containermanager.resourceplugin.gpu.GpuDiscoverer#pathOfGpuBinary is set ONLY IF auto discovery is enabled.
> Since our tests don't have auto discovery enabled, we have this exception.
> In this sense, the exception message is very misleading for me:
> {code:java}
> Failed to find GPU discovery executable, please double check
>
[jira] [Comment Edited] (YARN-10108) FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025045#comment-17025045 ] Peter Bacsko edited comment on YARN-10108 at 1/28/20 12:08 PM: --- Thanks [~prabhujoseph] for reporting this. -I believe the correct output would be just "root.users", right?- Edit: actually no, this should be accepted. Nested rules which are allowed so far: {{u:%user:%primary_group.%user}} {{u:%user:%secondary_group.%user}} In this case, we have an inline string instead of {{%primary_group}} or {{%secondary_group}}. Therefore the generated mapping rule is correct in theory, it's just not handled properly by CS. was (Author: pbacsko): Thanks [~prabhujoseph] for reporting this. I believe the correct output would be just "root.users", right? > FS-CS converter: nestedUserQueue with default rule maps to invalid queue > mapping > > > Key: YARN-10108 > URL: https://issues.apache.org/jira/browse/YARN-10108 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Peter Bacsko >Priority: Major > > FS Queue Placement Policy > {code:java} > > > > > > {code} > gets mapped to an invalid CS queue mapping "u:%user:root.users.%user" > RM fails to start with above queue mapping in CS > {code:java} > 2020-01-28 00:19:12,889 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: mapping > contains invalid or non-leaf queue [%user] and invalid parent queue > [root.users] > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829) > at > 
org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534) > Caused by: java.io.IOException: mapping contains invalid or non-leaf queue > [%user] and invalid parent queue [root.users] > at > org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(QueuePlacementRuleUtils.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.validateAndGetAutoCreatedQueueMapping(UserGroupMappingPlacementRule.java:363) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:300) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 7 more > {code} > QueuePlacementConverter#handleNestedRule has to be fixed. 
> {code:java} > else if (pr instanceof DefaultPlacementRule) { > DefaultPlacementRule defaultRule = (DefaultPlacementRule) pr; > mapping.append("u:" + USER + ":") > .append(defaultRule.defaultQueueName) > .append("." + USER); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
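To make the reported mapping concrete, here is a minimal sketch that mirrors the string concatenation in the quoted `handleNestedRule` fragment. `NestedDefaultRuleSketch` and `buildMapping` are hypothetical names, and `USER` is assumed to be the `%user` placeholder that appears in the generated mapping. It is not the converter's actual code.

```java
// Hypothetical stand-in for the quoted branch of
// QueuePlacementConverter#handleNestedRule; not the real Hadoop class.
class NestedDefaultRuleSketch {
    // Assumption: USER is the "%user" placeholder seen in the reported mapping.
    static final String USER = "%user";

    // Builds the mapping string the same way the quoted fragment does:
    // "u:" + USER + ":" + <default queue name> + "." + USER
    static String buildMapping(String defaultQueueName) {
        StringBuilder mapping = new StringBuilder();
        mapping.append("u:" + USER + ":")
            .append(defaultQueueName)
            .append("." + USER);
        return mapping.toString();
    }

    public static void main(String[] args) {
        // With "root.users" as the default queue, this prints the mapping
        // quoted in the issue description.
        System.out.println(buildMapping("root.users"));
    }
}
```

With `root.users` as the default queue name, the sketch yields the `u:%user:root.users.%user` string from the report, which is the mapping that CS rejects at startup.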
[jira] [Assigned] (YARN-10108) FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko reassigned YARN-10108: --- Assignee: Peter Bacsko > FS-CS converter: nestedUserQueue with default rule maps to invalid queue > mapping > > > Key: YARN-10108 > URL: https://issues.apache.org/jira/browse/YARN-10108 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Assignee: Peter Bacsko >Priority: Major > > FS Queue Placement Policy > {code:java} > > > > > > {code} > gets mapped to an invalid CS queue mapping "u:%user:root.users.%user" > RM fails to start with above queue mapping in CS > {code:java} > 2020-01-28 00:19:12,889 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: mapping > contains invalid or non-leaf queue [%user] and invalid parent queue > [root.users] > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534) > Caused by: java.io.IOException: mapping contains invalid or non-leaf queue > [%user] and invalid parent queue [root.users] > at > 
org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(QueuePlacementRuleUtils.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.validateAndGetAutoCreatedQueueMapping(UserGroupMappingPlacementRule.java:363) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:300) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 7 more > {code} > QueuePlacementConverter#handleNestedRule has to be fixed. > {code:java} > else if (pr instanceof DefaultPlacementRule) { > DefaultPlacementRule defaultRule = (DefaultPlacementRule) pr; > mapping.append("u:" + USER + ":") > .append(defaultRule.defaultQueueName) > .append("." + USER); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10108) FS-CS converter: nestedUserQueue with default rule maps to invalid queue mapping
[ https://issues.apache.org/jira/browse/YARN-10108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025045#comment-17025045 ] Peter Bacsko commented on YARN-10108: - Thanks [~prabhujoseph] for reporting this. I believe the correct output would be just "root.users", right? > FS-CS converter: nestedUserQueue with default rule maps to invalid queue > mapping > > > Key: YARN-10108 > URL: https://issues.apache.org/jira/browse/YARN-10108 > Project: Hadoop YARN > Issue Type: Sub-task >Affects Versions: 3.3.0 >Reporter: Prabhu Joseph >Priority: Major > > FS Queue Placement Policy > {code:java} > > > > > > {code} > gets mapped to an invalid CS queue mapping "u:%user:root.users.%user" > RM fails to start with above queue mapping in CS > {code:java} > 2020-01-28 00:19:12,889 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > org.apache.hadoop.service.ServiceStateException: java.io.IOException: mapping > contains invalid or non-leaf queue [%user] and invalid parent queue > [root.users] > at > org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:105) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:173) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$RMActiveServices.serviceInit(ResourceManager.java:829) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndInitActiveServices(ResourceManager.java:1247) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:324) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1534) > Caused by: java.io.IOException: mapping contains 
invalid or non-leaf queue > [%user] and invalid parent queue [root.users] > at > org.apache.hadoop.yarn.server.resourcemanager.placement.QueuePlacementRuleUtils.validateQueueMappingUnderParentQueue(QueuePlacementRuleUtils.java:48) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.validateAndGetAutoCreatedQueueMapping(UserGroupMappingPlacementRule.java:363) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:300) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > ... 7 more > {code} > QueuePlacementConverter#handleNestedRule has to be fixed. > {code:java} > else if (pr instanceof DefaultPlacementRule) { > DefaultPlacementRule defaultRule = (DefaultPlacementRule) pr; > mapping.append("u:" + USER + ":") > .append(defaultRule.defaultQueueName) > .append("." + USER); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix mapping rule separator
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10099: Attachment: YARN-10099-004.patch > FS-CS converter: handle allow-undeclared-pools and user-as-default-queue > properly and fix mapping rule separator > > > Key: YARN-10099 > URL: https://issues.apache.org/jira/browse/YARN-10099 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Attachments: YARN-10099-001.patch, YARN-10099-002.patch, > YARN-10099-003.patch, YARN-10099-004.patch > > > This ticket is intended to fix two issues: > 1. Based on the latest documentation, there are two important properties that > are ignored if we have placement rules: > ||Property||Explanation|| > |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can > be created at application submission time, whether because they are specified > as the application’s queue by the submitter or because they are placed there > by the user-as-default-queue property. If this is false, any time an app > would be placed in a queue that is not specified in the allocations file, it > is placed in the “default” queue instead. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > |yarn.scheduler.fair.user-as-default-queue|Whether to use the username > associated with the allocation as the default queue name, in the event that a > queue name is not specified. If this is set to “false” or unset, all jobs > have a shared default queue, named “default”. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > Right now these settings affects the conversion regardless of the placement > rules. > 2. 
A converted configuration throws this error: > {noformat} > 2020-01-27 03:35:35,007 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned > to standby state > 2020-01-27 03:35:35,008 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.lang.IllegalArgumentException: Illegal queue mapping > u:%user:%user;u:%user:root.users.%user;u:%user:root.default > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > {noformat} > Mapping rules should be separated by a "," character, not by a semicolon. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10105) FS-CS converter: separator between mapping rules should be comma
[ https://issues.apache.org/jira/browse/YARN-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17025008#comment-17025008 ] Peter Bacsko commented on YARN-10105: - Discussed w/ [~snemeth] offline, this patch will be merged into YARN-10099. Closing this as duplicate. > FS-CS converter: separator between mapping rules should be comma > > > Key: YARN-10105 > URL: https://issues.apache.org/jira/browse/YARN-10105 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10105-001.patch > > > A converted configuration throws this error: > {noformat} > 2020-01-27 03:35:35,007 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned > to standby state > 2020-01-27 03:35:35,008 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.lang.IllegalArgumentException: Illegal queue mapping > u:%user:%user;u:%user:root.users.%user;u:%user:root.default > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > {noformat} > Mapping rules should be separated by a "," character, not by a semicolon. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
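The separator fix described above can be sketched as follows. `MappingSeparatorSketch`, `RULE_SEPARATOR` and `joinRules` are hypothetical names chosen for illustration; the rule strings are taken from the error message. The sketch shows only the separator and does not claim that CS accepts each individual rule.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of joining converted mapping rules;
// not the converter's actual code.
class MappingSeparatorSketch {
    // Per the issue text, rules are separated by "," and not by ";".
    static final String RULE_SEPARATOR = ",";

    static String joinRules(List<String> rules) {
        return String.join(RULE_SEPARATOR, rules);
    }

    public static void main(String[] args) {
        List<String> rules = Arrays.asList(
            "u:%user:%user",
            "u:%user:root.users.%user",
            "u:%user:root.default");
        System.out.println(joinRules(rules));
    }
}
```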
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix mapping rule separator
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10099: Description: This ticket is intended to fix two issues: 1. Based on the latest documentation, there are two important properties that are ignored if we have placement rules: ||Property||Explanation|| |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can be created at application submission time, whether because they are specified as the application’s queue by the submitter or because they are placed there by the user-as-default-queue property. If this is false, any time an app would be placed in a queue that is not specified in the allocations file, it is placed in the “default” queue instead. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*| |yarn.scheduler.fair.user-as-default-queue|Whether to use the username associated with the allocation as the default queue name, in the event that a queue name is not specified. If this is set to “false” or unset, all jobs have a shared default queue, named “default”. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*| Right now these settings affects the conversion regardless of the placement rules. 2. 
A converted configuration throws this error: {noformat} 2020-01-27 03:35:35,007 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state 2020-01-27 03:35:35,008 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager java.lang.IllegalArgumentException: Illegal queue mapping u:%user:%user;u:%user:root.users.%user;u:%user:root.default at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) {noformat} Mapping rules should be separated by a "," character, not by a semicolon. 
was: Based on the latest documentation, there are two important properties that are ignored if we have placement rules: ||Property||Explanation|| |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can be created at application submission time, whether because they are specified as the application’s queue by the submitter or because they are placed there by the user-as-default-queue property. If this is false, any time an app would be placed in a queue that is not specified in the allocations file, it is placed in the “default” queue instead. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*| |yarn.scheduler.fair.user-as-default-queue|Whether to use the username associated with the allocation as the default queue name, in the event that a queue name is not specified. If this is set to “false” or unset, all jobs have a shared default queue, named “default”. Defaults to true. *If a queue placement policy is given in the allocations file, this property is ignored.*| Right now these settings affects the conversion regardless of the placement rules. > FS-CS converter: handle allow-undeclared-pools and user-as-default-queue > properly and fix mapping rule separator > > > Key: YARN-10099 > URL: https://issues.apache.org/jira/browse/YARN-10099 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Attachments: YARN-10099-001.patch, YARN-10099-002.patch, > YARN-10099-003.patch > > > This ticket is intended to fix two issues: > 1. Based on the latest
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix mapping rule separator
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10099:
    Summary: FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly and fix mapping rule separator (was: FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly)

> FS-CS converter: handle allow-undeclared-pools and user-as-default-queue
> properly and fix mapping rule separator
>
>
> Key: YARN-10099
> URL: https://issues.apache.org/jira/browse/YARN-10099
> Project: Hadoop YARN
> Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Labels: fs2cs
> Attachments: YARN-10099-001.patch, YARN-10099-002.patch,
> YARN-10099-003.patch
>
>
> Based on the latest documentation, there are two important properties that
> are ignored if we have placement rules:
> ||Property||Explanation||
> |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can
> be created at application submission time, whether because they are specified
> as the application’s queue by the submitter or because they are placed there
> by the user-as-default-queue property. If this is false, any time an app
> would be placed in a queue that is not specified in the allocations file, it
> is placed in the “default” queue instead. Defaults to true. *If a queue
> placement policy is given in the allocations file, this property is ignored.*|
> |yarn.scheduler.fair.user-as-default-queue|Whether to use the username
> associated with the allocation as the default queue name, in the event that a
> queue name is not specified. If this is set to “false” or unset, all jobs
> have a shared default queue, named “default”. Defaults to true. *If a queue
> placement policy is given in the allocations file, this property is ignored.*|
> Right now these settings affect the conversion regardless of the placement
> rules.
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peter Bacsko updated YARN-10099:
    Attachment: YARN-10099-003.patch

> FS-CS converter: handle allow-undeclared-pools and user-as-default-queue
> properly
> -
>
> Key: YARN-10099
> URL: https://issues.apache.org/jira/browse/YARN-10099
> Project: Hadoop YARN
> Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Major
> Labels: fs2cs
> Attachments: YARN-10099-001.patch, YARN-10099-002.patch,
> YARN-10099-003.patch
>
>
> Based on the latest documentation, there are two important properties that
> are ignored if we have placement rules:
> ||Property||Explanation||
> |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can
> be created at application submission time, whether because they are specified
> as the application’s queue by the submitter or because they are placed there
> by the user-as-default-queue property. If this is false, any time an app
> would be placed in a queue that is not specified in the allocations file, it
> is placed in the “default” queue instead. Defaults to true. *If a queue
> placement policy is given in the allocations file, this property is ignored.*|
> |yarn.scheduler.fair.user-as-default-queue|Whether to use the username
> associated with the allocation as the default queue name, in the event that a
> queue name is not specified. If this is set to “false” or unset, all jobs
> have a shared default queue, named “default”. Defaults to true. *If a queue
> placement policy is given in the allocations file, this property is ignored.*|
> Right now these settings affect the conversion regardless of the placement
> rules.
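The rule quoted from the documentation (both properties are ignored once a queue placement policy is present) can be sketched as a small guard. `IgnoredPropertySketch` and `effectiveSetting` are hypothetical names, and returning an empty `Optional` for "ignored" is an illustrative choice, not how the converter or the attached patches model it.

```java
import java.util.Optional;

// Hypothetical guard illustrating the documented rule; not converter code.
class IgnoredPropertySketch {
    // Returns the configured value of allow-undeclared-pools or
    // user-as-default-queue only when no placement policy is defined.
    // An empty Optional means "ignored during conversion".
    static Optional<Boolean> effectiveSetting(
            boolean placementPolicyDefined, boolean configuredValue) {
        if (placementPolicyDefined) {
            return Optional.empty();
        }
        return Optional.of(configuredValue);
    }

    public static void main(String[] args) {
        System.out.println(effectiveSetting(true, true).isPresent());
        System.out.println(effectiveSetting(false, true).isPresent());
    }
}
```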
[jira] [Comment Edited] (YARN-10105) FS-CS converter: separator between mapping rules should be comma
[ https://issues.apache.org/jira/browse/YARN-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024935#comment-17024935 ] Peter Bacsko edited comment on YARN-10105 at 1/28/20 8:10 AM: -- Patch YARN-10099 introduces tests that verify mapping. So if that change is pushed first, this one must be rebased. Alternatively, we can merge these changes into YARN-10099, reducing the number of commits. [~snemeth] thoughts? was (Author: pbacsko): Patch YARN-10099 introduces tests that verify mapping. So if that change is pushed first, this one must be rebased. > FS-CS converter: separator between mapping rules should be comma > > > Key: YARN-10105 > URL: https://issues.apache.org/jira/browse/YARN-10105 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10105-001.patch > > > A converted configuration throws this error: > {noformat} > 2020-01-27 03:35:35,007 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned > to standby state > 2020-01-27 03:35:35,008 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.lang.IllegalArgumentException: Illegal queue mapping > u:%user:%user;u:%user:root.users.%user;u:%user:root.default > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > {noformat} > Mapping rules should be separated by a "," character, not by a semicolon. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10105) FS-CS converter: separator between mapping rules should be comma
[ https://issues.apache.org/jira/browse/YARN-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024935#comment-17024935 ] Peter Bacsko commented on YARN-10105: - Patch YARN-10099 introduces tests that verify mapping. So if that change is pushed first, this one must be rebased. > FS-CS converter: separator between mapping rules should be comma > > > Key: YARN-10105 > URL: https://issues.apache.org/jira/browse/YARN-10105 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10105-001.patch > > > A converted configuration throws this error: > {noformat} > 2020-01-27 03:35:35,007 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned > to standby state > 2020-01-27 03:35:35,008 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.lang.IllegalArgumentException: Illegal queue mapping > u:%user:%user;u:%user:root.users.%user;u:%user:root.default > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > 
org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at > org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > {noformat} > Mapping rules should be separated by a "," character, not by a semicolon. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Commented] (YARN-10085) FS-CS converter: remove mixed ordering policy check
[ https://issues.apache.org/jira/browse/YARN-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024375#comment-17024375 ]

Peter Bacsko commented on YARN-10085:
-

[~snemeth] please review this change when you've got some time.

> FS-CS converter: remove mixed ordering policy check
> ---
>
> Key: YARN-10085
> URL: https://issues.apache.org/jira/browse/YARN-10085
> Project: Hadoop YARN
> Issue Type: Sub-task
>Reporter: Peter Bacsko
>Assignee: Peter Bacsko
>Priority: Critical
> Attachments: YARN-10085-001.patch, YARN-10085-002.patch,
> YARN-10085-003.patch, YARN-10085-004.patch, YARN-10085-004.patch,
> YARN-10085-005.patch, YARN-10085-006.patch
>
>
> In the converter, this part is very strict and probably unnecessary:
> {noformat}
> // Validate ordering policy
> if (queueConverter.isDrfPolicyUsedOnQueueLevel()) {
>   if (queueConverter.isFifoOrFairSharePolicyUsed()) {
>     throw new ConversionException(
>         "DRF ordering policy cannot be used together with fifo/fair");
>   } else {
>     capacitySchedulerConfig.set(
>         CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS,
>         DominantResourceCalculator.class.getCanonicalName());
>   }
> }
> {noformat}
> It's also misleading, because Fair policy can be used under DRF, so the error
> message is incorrect.
> Let's remove these checks and rewrite the converter in a way that it
> generates a valid config even if fair/drf is somehow mixed.
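One possible reading of "remove these checks" is sketched below: DRF on any queue still selects the dominant resource calculator, but a mix with fifo/fair no longer aborts the conversion. This is an assumption about the intent, not the content of the attached patches. `OrderingPolicySketch` is a hypothetical class, a plain `Map` stands in for the Capacity Scheduler configuration, and the two string constants are assumed values of the constant and class named in the quoted fragment.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, lenient variant of the quoted validation; not the real patch.
class OrderingPolicySketch {
    // Assumed value of CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS.
    static final String RESOURCE_CALCULATOR_CLASS =
        "yarn.scheduler.capacity.resource-calculator";
    // Assumed canonical name of DominantResourceCalculator.
    static final String DOMINANT_RESOURCE_CALCULATOR =
        "org.apache.hadoop.yarn.util.resource.DominantResourceCalculator";

    // fifoOrFairUsed is accepted but deliberately no longer causes a failure.
    static Map<String, String> convert(
            boolean drfUsedOnQueueLevel, boolean fifoOrFairUsed) {
        Map<String, String> config = new HashMap<>();
        if (drfUsedOnQueueLevel) {
            config.put(RESOURCE_CALCULATOR_CLASS, DOMINANT_RESOURCE_CALCULATOR);
        }
        return config;
    }

    public static void main(String[] args) {
        // A mixed DRF + fair setup now produces a config instead of throwing.
        System.out.println(convert(true, true));
    }
}
```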
[jira] [Updated] (YARN-10104) FS-CS converter: dry run should work without output defined
[ https://issues.apache.org/jira/browse/YARN-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10104: Attachment: YARN-10104-002.patch > FS-CS converter: dry run should work without output defined > --- > > Key: YARN-10104 > URL: https://issues.apache.org/jira/browse/YARN-10104 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Attachments: YARN-10104-001.patch, YARN-10104-002.patch > > > The "-d" switch doesn't work properly. > You still have to define either "-p" or "-o", which is not the way the tool > is supposed to work (i.e., it doesn't need to generate any output after the > conversion).
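The intended behaviour described in YARN-10104 can be sketched as an argument check in which the dry-run switch makes the output options optional. This is only an illustrative sketch: the class `DryRunArgsSketch`, the method `validate`, and the error text are hypothetical and are not taken from the converter's source or from the attached patches; only the switch names "-d", "-p" and "-o" come from the issue.

```java
final class DryRunArgsSketch {
  /**
   * Hypothetical validation: an output target ("-p" or "-o") is only
   * required when the tool is NOT running in dry-run mode ("-d").
   */
  static void validate(boolean dryRun, boolean print, String outputDir) {
    if (dryRun) {
      // Dry run: the conversion is only checked, nothing has to be emitted.
      return;
    }
    if (!print && outputDir == null) {
      throw new IllegalArgumentException(
          "Either -p or -o must be defined unless -d is used");
    }
  }

  public static void main(String[] args) {
    validate(true, false, null);       // dry run without any output: accepted
    validate(false, true, null);       // print to console: accepted
    validate(false, false, "/tmp/out"); // write to a directory: accepted
    System.out.println("all argument combinations above were accepted");
  }
}
```

With this ordering of checks, a plain "-d" invocation passes validation, while a normal run with neither "-p" nor "-o" is still rejected.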
[jira] [Updated] (YARN-10105) FS-CS converter: separator between mapping rules should be comma
[ https://issues.apache.org/jira/browse/YARN-10105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10105: Attachment: YARN-10105-001.patch > FS-CS converter: separator between mapping rules should be comma > > > Key: YARN-10105 > URL: https://issues.apache.org/jira/browse/YARN-10105 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10105-001.patch > > > A converted configuration throws this error: > {noformat} > 2020-01-27 03:35:35,007 INFO > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned > to standby state > 2020-01-27 03:35:35,008 FATAL > org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting > ResourceManager > java.lang.IllegalArgumentException: Illegal queue mapping > u:%user:%user;u:%user:root.users.%user;u:%user:root.default > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) > at > org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) > at > org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) > at > org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) > at 
> org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) > {noformat} > Mapping rules should be separated by a "," character, not by a semicolon.
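To illustrate why the semicolon-separated value in YARN-10105 is rejected, here is a minimal, simplified sketch of comma-based rule splitting. The class `MappingRuleSplitter` and its methods are hypothetical and do not reproduce `CapacitySchedulerConfiguration.getQueueMappings`; the sketch only assumes the `[u|g]:[name]:[queue]` rule shape seen in the log above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MappingRuleSplitter {
  /** Joins rules the way the converter should emit them: with a comma. */
  static String join(List<String> rules) {
    return String.join(",", rules);
  }

  /**
   * Simplified parser (an assumption, not Hadoop code): rules are separated
   * by "," and each rule must have exactly three ":"-separated fields.
   */
  static List<String> split(String value) {
    List<String> rules = new ArrayList<>();
    for (String rule : value.split(",")) {
      String trimmed = rule.trim();
      String[] parts = trimmed.split(":");
      boolean knownType = parts[0].equals("u") || parts[0].equals("g");
      if (parts.length != 3 || !knownType) {
        throw new IllegalArgumentException("Illegal queue mapping " + trimmed);
      }
      rules.add(trimmed);
    }
    return rules;
  }

  public static void main(String[] args) {
    String good = join(Arrays.asList(
        "u:%user:%user", "u:%user:root.users.%user", "u:%user:root.default"));
    System.out.println(split(good));
    try {
      // Semicolons leave a single "rule" with too many fields.
      split("u:%user:%user;u:%user:root.users.%user;u:%user:root.default");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Under these assumptions the semicolon-joined string is treated as one rule with seven fields, which is why it fails validation as a whole.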
[jira] [Updated] (YARN-10103) Capacity scheduler: add support for create=true/false per mapping rule
[ https://issues.apache.org/jira/browse/YARN-10103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10103: Labels: fs2cs (was: ) > Capacity scheduler: add support for create=true/false per mapping rule > -- > > Key: YARN-10103 > URL: https://issues.apache.org/jira/browse/YARN-10103 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Priority: Major > Labels: fs2cs > > You can't tell Capacity Scheduler, per mapping rule, whether to create a queue if it > doesn't exist. > For example, this mapping would use the first rule if the queue exists. If it > doesn't, then it proceeds to the next rule: > {{u:%user:%primary_group.%user:create=false;u:%user:root.default}} > Let's say user "alice" belongs to the "admins" group. It would first try to > map {{root.admins.alice}}. But, if the queue doesn't exist, then it places > the application into {{root.default}}.
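The fall-through behaviour proposed in YARN-10103 can be sketched as follows. Everything here is hypothetical: the `Rule` class, the `place` method, and the choice to prefix unqualified queue names with "root." are assumptions made for illustration, not Capacity Scheduler code. Only the `create=false` idea and the "alice"/"admins" example come from the issue.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CreateFlagSketch {
  /** Hypothetical rule: a queue template plus the proposed create flag. */
  static final class Rule {
    final String queueTemplate;
    final boolean create;

    Rule(String queueTemplate, boolean create) {
      this.queueTemplate = queueTemplate;
      this.create = create;
    }
  }

  /**
   * Returns the queue chosen by the first rule that either resolves to an
   * existing queue or is allowed to create it; null if no rule matches.
   */
  static String place(List<Rule> rules, String user, String primaryGroup,
      Set<String> existingQueues) {
    for (Rule rule : rules) {
      String queue = rule.queueTemplate
          .replace("%primary_group", primaryGroup)
          .replace("%user", user);
      if (!queue.startsWith("root.")) {
        queue = "root." + queue; // assumption: names are relative to root
      }
      if (existingQueues.contains(queue) || rule.create) {
        return queue;
      }
      // create=false and the queue is missing: fall through to the next rule.
    }
    return null;
  }

  public static void main(String[] args) {
    List<Rule> rules = Arrays.asList(
        new Rule("%primary_group.%user", false),
        new Rule("root.default", true));
    Set<String> queues = new HashSet<>(Arrays.asList("root.default"));
    System.out.println(place(rules, "alice", "admins", queues));
    queues.add("root.admins.alice");
    System.out.println(place(rules, "alice", "admins", queues));
  }
}
```

In this sketch the first call falls through to the default queue because `root.admins.alice` is missing, and the second call selects `root.admins.alice` once it exists.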
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10099: Labels: fs2cs (was: ) > FS-CS converter: handle allow-undeclared-pools and user-as-default-queue > properly > - > > Key: YARN-10099 > URL: https://issues.apache.org/jira/browse/YARN-10099 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Attachments: YARN-10099-001.patch, YARN-10099-002.patch > > > Based on the latest documentation, there are two important properties that > are ignored if we have placement rules: > ||Property||Explanation|| > |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can > be created at application submission time, whether because they are specified > as the application’s queue by the submitter or because they are placed there > by the user-as-default-queue property. If this is false, any time an app > would be placed in a queue that is not specified in the allocations file, it > is placed in the “default” queue instead. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > |yarn.scheduler.fair.user-as-default-queue|Whether to use the username > associated with the allocation as the default queue name, in the event that a > queue name is not specified. If this is set to “false” or unset, all jobs > have a shared default queue, named “default”. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > Right now these settings affect the conversion regardless of the placement > rules.
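The precedence described in YARN-10099 can be sketched as a small decision helper: when a queue placement policy is present, the two Fair Scheduler properties are not consulted at all. The class `PlacementPrecedenceSketch`, the method `placementInputs`, and the returned labels are hypothetical and only illustrate the rule quoted from the documentation; they are not the converter's implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class PlacementPrecedenceSketch {
  /**
   * Lists which inputs should drive the placement conversion.
   * If a placement policy exists, both properties are ignored.
   */
  static List<String> placementInputs(boolean hasPlacementPolicy,
      boolean allowUndeclaredPools, boolean userAsDefaultQueue) {
    List<String> inputs = new ArrayList<>();
    if (hasPlacementPolicy) {
      inputs.add("queuePlacementPolicy");
      return inputs;
    }
    inputs.add("allow-undeclared-pools=" + allowUndeclaredPools);
    inputs.add("user-as-default-queue=" + userAsDefaultQueue);
    return inputs;
  }

  public static void main(String[] args) {
    System.out.println(placementInputs(true, true, false));
    System.out.println(placementInputs(false, true, false));
  }
}
```

The point of the early return is that changing either property has no effect on the result once a placement policy is defined.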
[jira] [Updated] (YARN-10104) FS-CS converter: dry run should work without output defined
[ https://issues.apache.org/jira/browse/YARN-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10104: Labels: fs2cs (was: ) > FS-CS converter: dry run should work without output defined > --- > > Key: YARN-10104 > URL: https://issues.apache.org/jira/browse/YARN-10104 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Labels: fs2cs > Attachments: YARN-10104-001.patch > > > The "-d" switch doesn't work properly. > You still have to define either "-p" or "-o", which is not the way the tool > is supposed to work (i.e., it doesn't need to generate any output after the > conversion).
[jira] [Created] (YARN-10105) FS-CS converter: separator between mapping rules should be comma
Peter Bacsko created YARN-10105: --- Summary: FS-CS converter: separator between mapping rules should be comma Key: YARN-10105 URL: https://issues.apache.org/jira/browse/YARN-10105 Project: Hadoop YARN Issue Type: Sub-task Reporter: Peter Bacsko Assignee: Peter Bacsko A converted configuration throws this error: {noformat} 2020-01-27 03:35:35,007 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioned to standby state 2020-01-27 03:35:35,008 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager java.lang.IllegalArgumentException: Illegal queue mapping u:%user:%user;u:%user:root.users.%user;u:%user:root.default at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacitySchedulerConfiguration.getQueueMappings(CapacitySchedulerConfiguration.java:1113) at org.apache.hadoop.yarn.server.resourcemanager.placement.UserGroupMappingPlacementRule.initialize(UserGroupMappingPlacementRule.java:244) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.getUserGroupMappingPlacementRule(CapacityScheduler.java:671) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.updatePlacementRules(CapacityScheduler.java:712) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:753) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initScheduler(CapacityScheduler.java:361) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.serviceInit(CapacityScheduler.java:426) at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164) at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:108) {noformat} Mapping rules should be separated by a "," character, not by a semicolon. 
[jira] [Updated] (YARN-10099) FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly
[ https://issues.apache.org/jira/browse/YARN-10099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10099: Summary: FS-CS converter: handle allow-undeclared-pools and user-as-default-queue properly (was: FS-CS converter: handle allow-undeclared-pools and user-as-default queue properly) > FS-CS converter: handle allow-undeclared-pools and user-as-default-queue > properly > - > > Key: YARN-10099 > URL: https://issues.apache.org/jira/browse/YARN-10099 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10099-001.patch, YARN-10099-002.patch > > > Based on the latest documentation, there are two important properties that > are ignored if we have placement rules: > ||Property||Explanation|| > |yarn.scheduler.fair.allow-undeclared-pools|If this is true, new queues can > be created at application submission time, whether because they are specified > as the application’s queue by the submitter or because they are placed there > by the user-as-default-queue property. If this is false, any time an app > would be placed in a queue that is not specified in the allocations file, it > is placed in the “default” queue instead. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > |yarn.scheduler.fair.user-as-default-queue|Whether to use the username > associated with the allocation as the default queue name, in the event that a > queue name is not specified. If this is set to “false” or unset, all jobs > have a shared default queue, named “default”. Defaults to true. *If a queue > placement policy is given in the allocations file, this property is ignored.*| > Right now these settings affect the conversion regardless of the placement > rules.
[jira] [Commented] (YARN-10085) FS-CS converter: remove mixed ordering policy check
[ https://issues.apache.org/jira/browse/YARN-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17024292#comment-17024292 ] Peter Bacsko commented on YARN-10085: - Patch v6 changes: # Print warning in the unsupported case (calculator is DRC but queue policy is just "fair") # Extra unit tests > FS-CS converter: remove mixed ordering policy check > --- > > Key: YARN-10085 > URL: https://issues.apache.org/jira/browse/YARN-10085 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Critical > Attachments: YARN-10085-001.patch, YARN-10085-002.patch, > YARN-10085-003.patch, YARN-10085-004.patch, YARN-10085-004.patch, > YARN-10085-005.patch, YARN-10085-006.patch > > > In the converter, this part is very strict and probably unnecessary: > {noformat} > // Validate ordering policy > if (queueConverter.isDrfPolicyUsedOnQueueLevel()) { > if (queueConverter.isFifoOrFairSharePolicyUsed()) { > throw new ConversionException( > "DRF ordering policy cannot be used together with fifo/fair"); > } else { > capacitySchedulerConfig.set( > CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS, > DominantResourceCalculator.class.getCanonicalName()); > } > } > {noformat} > It's also misleading, because Fair policy can be used under DRF, so the error > message is incorrect. > Let's remove these checks and rewrite the converter in a way that it > generates a valid config even if fair/drf is somehow mixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10085) FS-CS converter: remove mixed ordering policy check
[ https://issues.apache.org/jira/browse/YARN-10085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10085: Attachment: YARN-10085-006.patch > FS-CS converter: remove mixed ordering policy check > --- > > Key: YARN-10085 > URL: https://issues.apache.org/jira/browse/YARN-10085 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Critical > Attachments: YARN-10085-001.patch, YARN-10085-002.patch, > YARN-10085-003.patch, YARN-10085-004.patch, YARN-10085-004.patch, > YARN-10085-005.patch, YARN-10085-006.patch > > > In the converter, this part is very strict and probably unnecessary: > {noformat} > // Validate ordering policy > if (queueConverter.isDrfPolicyUsedOnQueueLevel()) { > if (queueConverter.isFifoOrFairSharePolicyUsed()) { > throw new ConversionException( > "DRF ordering policy cannot be used together with fifo/fair"); > } else { > capacitySchedulerConfig.set( > CapacitySchedulerConfiguration.RESOURCE_CALCULATOR_CLASS, > DominantResourceCalculator.class.getCanonicalName()); > } > } > {noformat} > It's also misleading, because Fair policy can be used under DRF, so the error > message is incorrect. > Let's remove these checks and rewrite the converter in a way that it > generates a valid config even if fair/drf is somehow mixed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
[jira] [Updated] (YARN-10104) FS-CS converter: dry run should work without output defined
[ https://issues.apache.org/jira/browse/YARN-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10104: Attachment: YARN-10104-001.patch > FS-CS converter: dry run should work without output defined > --- > > Key: YARN-10104 > URL: https://issues.apache.org/jira/browse/YARN-10104 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > Attachments: YARN-10104-001.patch > > > The "-d" switch doesn't work properly. > You still have to define either "-p" or "-o", which is not the way the tool > is supposed to work (i.e., it doesn't need to generate any output after the > conversion).
[jira] [Updated] (YARN-10104) FS-CS converter: dry run should work without output defined
[ https://issues.apache.org/jira/browse/YARN-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10104: Summary: FS-CS converter: dry run should work without output defined (was: FS-CS converter: dryRun requires either -p or -o) > FS-CS converter: dry run should work without output defined > --- > > Key: YARN-10104 > URL: https://issues.apache.org/jira/browse/YARN-10104 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > > The "-d" switch doesn't work properly. > You still have to define either "-p" or "-o", which is not the way the tool > is supposed to work (i.e., it doesn't need to generate any output after the > conversion).
[jira] [Updated] (YARN-10104) FS-CS converter: dryRun requires either -p or -o
[ https://issues.apache.org/jira/browse/YARN-10104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Bacsko updated YARN-10104: Description: The "-d" switch doesn't work properly. You still have to define either "-p" or "-o", which is not the way the tool is supposed to work (i.e., it doesn't need to generate any output after the conversion). was: The "-d" / "--dry-run" switch doesn't work properly. You still have to define either "-p" or "-o", which is not the way the tool is supposed to work (i.e., it doesn't need to generate any output after the conversion). > FS-CS converter: dryRun requires either -p or -o > > > Key: YARN-10104 > URL: https://issues.apache.org/jira/browse/YARN-10104 > Project: Hadoop YARN > Issue Type: Sub-task >Reporter: Peter Bacsko >Assignee: Peter Bacsko >Priority: Major > > The "-d" switch doesn't work properly. > You still have to define either "-p" or "-o", which is not the way the tool > is supposed to work (i.e., it doesn't need to generate any output after the > conversion).