[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302978#comment-17302978 ]

Wilfred Spiegelenburg commented on YARN-10652:
----------------------------------------------

Thank you to [~sahuja] for the fix, and to all ([~snemeth], [~shuzirra], [~gandras] & [~pbacsko]) for the discussion and resolution around this jira.

I committed to trunk with a comment in the commit message:
{quote}This only fixes the user name resolution for weights in the queues. It does not add generic support for user names with dots in all use cases in the capacity scheduler.
{quote}

> Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-10652
>                 URL: https://issues.apache.org/jira/browse/YARN-10652
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>    Affects Versions: 3.3.0
>            Reporter: Siddharth Ahuja
>            Assignee: Siddharth Ahuja
>            Priority: Major
>         Attachments: Correct user weight of 0.76 picked up for the user with a dot after the patch.png, Incorrect default user weight of 1.0 being picked for the user with a dot before the patch.png, YARN-10652.001.patch
>
> AD usernames can have a "." (dot) in them i.e. they can be of the format -> {{firstname.lastname}}. However, if you specify a username with this format against the Capacity Scheduler setting -> {{yarn.scheduler.capacity.root.default.user-settings.firstname.lastname.weight}}, it fails to be applied and is instead assigned the default of 1.0f weight. This renders the user weight feature (being used as a means of setting user priorities for a queue) unusable for such users.
>
> This limitation comes from [1]. From [1], only word characters (A word character: [a-zA-Z_0-9]) (see [2]) are permissible at the moment which is no good for AD names that contain a "." (dot).
>
> Similar discussion has been had in a few HADOOP jiras e.g. HADOOP-7050 and HADOOP-15395 and the outcome was to use non-whitespace characters i.e. instead of {{\w+}}, use {{\S+}}.
>
> We could go down similar path and unblock this feature for the AD usernames with a "." (dot) in them.
>
> [1] https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/capacity/CapacitySchedulerConfiguration.java#L1953
> [2] https://docs.oracle.com/javase/tutorial/essential/regex/pre_char_classes.html

--
This message was sent by Atlassian Jira (v8.3.4#803005)

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
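Editorial illustration of the issue description above: the difference between {{\w+}} and {{\S+}} can be reproduced with a few lines of plain Java. This is only a sketch, not the code in CapacitySchedulerConfiguration; the class name, the constants and the {{extractUser}} helper are hypothetical, and the queue prefix is simply the one used in the description.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: not the actual CapacitySchedulerConfiguration code. */
final class UserWeightPatternDemo {
    // Prefix and suffix taken from the example property in the issue description.
    static final String PREFIX = "yarn.scheduler.capacity.root.default.user-settings.";
    static final String SUFFIX = ".weight";

    /**
     * Returns the user name captured from a property key, or null when the
     * whole key does not match PREFIX + (userClass) + SUFFIX.
     */
    static String extractUser(String key, String userClass) {
        Pattern p = Pattern.compile(
            Pattern.quote(PREFIX) + "(" + userClass + ")" + Pattern.quote(SUFFIX));
        Matcher m = p.matcher(key);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String key = PREFIX + "firstname.lastname" + SUFFIX;
        // "." is not a word character, so \w+ cannot span the dotted user name.
        System.out.println(extractUser(key, "\\w+"));
        // \S+ accepts any non-whitespace character, including the dot.
        System.out.println(extractUser(key, "\\S+"));
    }
}
```

With {{\w+}} the key is not recognised at all, which would be consistent with the reported fallback to the default weight; with {{\S+}} the full dotted name is captured.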
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17302845#comment-17302845 ]

Szilard Nemeth commented on YARN-10652:
---------------------------------------

Hi [~sahuja],

Answering your comment from [here|https://issues.apache.org/jira/browse/YARN-10652?focusedCommentId=17295634&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17295634].

1. This might be tough to implement, but [~pbacsko] and [~shuzirra] know the internals of the placement engine better than I do.
2. I think it's okay to have it documented, so I'd choose this from your suggestions. Could you please file a jira for this?
3. This is also a good idea.

Furthermore, can you file a follow-up jira (you can file more if necessary) as suggested by [Peter's comment|https://issues.apache.org/jira/browse/YARN-10652?focusedCommentId=17295964&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17295964] to cover the problematic cases we already discovered during code inspection and during our discussion here?

All in all, if you file follow-up jiras to make this use case more stable and consistent, I'm fine. So, I'm giving +1 (binding) for your patch.

[~wilfreds] I get your point with the last comment. Based on my comment above: as you wanted to commit this in the first place, please go ahead with committing.

Thanks.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17297089#comment-17297089 ]

Wilfred Spiegelenburg commented on YARN-10652:
----------------------------------------------

I completely agree with your assessment [~pbacsko]. This is nowhere near a full fix for the dot problem. That needs to be tackled one issue at a time; we should not do all of it now. We can take multiple jiras to fix these issues, so I second the case-by-case approach.

Fixing this one outside of placement rule changes is one step. Introducing a standard way to resolve all properties of this kind would be *nice* to have but, again, is not needed now. The property introduced for max apps resolves without an issue even with dots in the name.

Placement rules are complex, and I would not recommend that this Jira look at them at all.

[~snemeth] & [~shuzirra]: since we need to fix this irrespective of what is done in placement rules, I would like to proceed with the commit. The change allows the administrator to just use the existing user name in the configuration, similar to the "max-parallel-apps" setting.

If and when a solution is implemented for the placement rules to support dots in user and group names, which are part of the queue path, new fixes might be needed for this issue and YARN-9930. We might even leave these two as is. That is not a decision we need to make now.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295964#comment-17295964 ]

Peter Bacsko commented on YARN-10652:
-------------------------------------

Hi guys,

I think we can reach a compromise: let's think about scenarios where dotted usernames can be problematic and address them in a follow-up JIRA.

For example, we already know that placement rules involving the username (%user placeholder) will definitely exhibit unexpected behavior (interestingly enough, this has always been a problem but just hasn't been reported). So in this case we can go the FS way and just replace "." with "_dot_". FS does this to primary groups as well, which is another thing we need to fix. Maybe the cleanName() approach is just fine?

When it comes to configuration, {{getValByRegex()}} is only used for this property, so it's likely that we're already good; in other cases property names are concatenated and the dot isn't an issue at all. In YARN-9930, I added "yarn.scheduler.capacity.user.<username>.max-parallel-apps", making it a potential suspect, but I don't use a regex there, just string concatenation.

IMO we can handle these on a case-by-case basis.
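Editorial illustration of the point about concatenated property names: when the key is built by string concatenation and looked up exactly, the characters inside the user name never get interpreted. The following Java sketch uses a plain {{Map}} in place of the Hadoop configuration object; the class, the helper methods and the default value parameter are hypothetical, and only the property name shape comes from the comment above.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: a plain Map stands in for the scheduler configuration. */
final class ConcatLookupDemo {
    /** Builds the per-user key by concatenation; the user name is used verbatim. */
    static String maxParallelAppsKey(String user) {
        return "yarn.scheduler.capacity.user." + user + ".max-parallel-apps";
    }

    /** Exact-match lookup; defaultValue is a hypothetical fallback for this sketch. */
    static int maxParallelApps(Map<String, String> conf, String user, int defaultValue) {
        String value = conf.get(maxParallelAppsKey(user));
        return value == null ? defaultValue : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(maxParallelAppsKey("firstname.lastname"), "5");
        // The dotted user resolves because no pattern matching is involved.
        System.out.println(maxParallelApps(conf, "firstname.lastname", 10));
        // A user without an entry falls back to the supplied default.
        System.out.println(maxParallelApps(conf, "someone.else", 10));
    }
}
```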
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17295634#comment-17295634 ]

Siddharth Ahuja commented on YARN-10652:
----------------------------------------

Thank you very much for the review [~snemeth], appreciate it. Please find my comments below:

{quote}
As Gergo said, we need to keep consistency. It's one thing that usernames with dots are kind of supported, but is it really supported in all parts of the system? Definitely not for placement rules as the rule Gergo mentioned ("root.user.%user") could cause an issue easily. It's okay that some customers don't want to use placement rules and your change is not strictly related to placement rules. But if we are encouraging using usernames with dots across the codebase, we need to handle these usernames in all aspects of the system. What if some customers are using usernames with dots and placement rules? There we have a problem, we need a more complete solution.
{quote}

There is no encouragement from my side :) The customer raised the issue themselves when using this setting (which doesn't work), and hence this JIRA was raised. I understand your and everyone else's concern about consistency in using dots in usernames across YARN; it is perfectly valid.

However, there is nothing stopping customers today from using usernames with dots for queue placement, regardless of the fix in this JIRA. Our software doesn't prevent it, and the [Capacity Scheduler upstream documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html] makes no mention of the lack of this support or of the flakiness of this feature depending on where it is used. Meanwhile, the [Fair Scheduler's upstream documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html] (and software, see [1]) at least talks about conversion of usernames & groups with a "." to "_dot_". It doesn't clearly say that it "supports" usernames with dots/periods, but that is implicit.

There are two ways to "discourage" customers from using usernames/groups with dots in them:

1. Block all creation/use (including placement rules) of usernames/groups with dots until this feature is robust and fully available, and/or
2. Explicitly state in the upstream documentation that there are known issues with Capacity Scheduler around usernames & group names with dots/periods and that, as such, working with them is not recommended for the moment.

In the absence of 1 and/or 2, *+there is nothing stopping customers from using this feature today+*, which leads to JIRAs like this one. We should not leave customers confused or, worse, let them use this functionality at their own peril. As such, please let me know if there is an easy way to achieve 1 (the ideal solution), or at the very least let's go ahead with 2: I can raise an upstream JIRA and update the Capacity Scheduler documentation.

{quote}
Support for usernames with dots: Was this documented anywhere or can this fact only be dug up from the codebase?
{quote}

As mentioned above, Fair Scheduler already supports it; kindly see the [Fair Scheduler documentation|https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html] and the code at [1]. Usernames with dots are valid usernames in Linux and AD/LDAP.

3. Further, it would be good to document how customers should migrate users with dots in their names from FairScheduler to CapacityScheduler.

Please let me know what you think of 1, 2 and 3 [~snemeth].

[1] From YARN-2669, all user and group names are passed through cleanName() (https://github.com/apache/hadoop/blob/a89ca56a1b0eb949f56e7c6c5c25fdf87914a02f/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/placement/FairQueuePlacementUtils.java#L53), which replaces the "." with the string "_dot_".
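Editorial illustration of the Fair Scheduler behaviour referenced in [1]: the conversion amounts to substituting every "." in a user or group name with "_dot_". The Java sketch below shows only that substitution; it is not the FairQueuePlacementUtils implementation, and the class and method names are hypothetical.

```java
/** Illustrative only: mimics the "." to "_dot_" conversion described in the thread. */
final class DotReplacementDemo {
    static final String DOT_REPLACEMENT = "_dot_";

    /** Returns the name with every literal dot replaced; names without dots are unchanged. */
    static String replaceDots(String name) {
        // String.replace works on literal text, so "." is not treated as a regex wildcard.
        return name.replace(".", DOT_REPLACEMENT);
    }

    public static void main(String[] args) {
        System.out.println(replaceDots("firstname.lastname"));
        System.out.println(replaceDots("alice"));
    }
}
```

The converted name contains no dots, so it can serve as a single component of a dot-separated queue path.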
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17294833#comment-17294833 ]

Szilard Nemeth commented on YARN-10652:
---------------------------------------

Hi [~sahuja],

First of all, thanks for working on this. I can't add much, as [~pbacsko] and [~shuzirra] have pretty much summarized my concerns, but I would like to state my own opinion at least. There will be repetitions of previous comments, so bear with me.

1. As Gergo said, we need to keep consistency. It's one thing that usernames with dots are kind of supported, but is it really supported in all parts of the system? Definitely not for placement rules as the rule Gergo mentioned ("root.user.%user") could cause an issue easily. It's okay that some customers don't want to use placement rules and your change is not strictly related to placement rules. But if we are encouraging using usernames with dots across the codebase, we need to handle these usernames in all aspects of the system. What if some customers are using usernames with dots and placement rules? There we have a problem, we need a more complete solution.

2. Support for usernames with dots: Was this documented anywhere or can this fact only be dug up from the codebase?

3. We also understand that this is a setting of a queue, that usernames are stored in the config objects, and that you are just retrieving this with that regex. The problem here is that calling this "supported" is an overstatement, as ACL handling and placement rules could be problematic areas.

4.
{quote}
But we are supporting usernames with dots today. Users with dots in their usernames can submit jobs to the cluster having CS with no issues today (again, I am not talking about queue placement with queues with dots here). There are no errors reported when users with dots are supplied against "yarn.scheduler.capacity.<queue-path>.user-settings.<user-name>.weight setting" and in fact, there should NOT be any errors when it is done so. These are real-world usernames and we will have to accept them from any interface, whether it be UI or CLI or anything.
{quote}
My answer to this is given in 3.

5. [~wilfreds] I don't agree with this:
{quote}
If you want to solve the generic dot issue for user based placement then that is outside of this change.
{quote}
Why would we allow usernames with dots in more and more places in the code and forget about the generic solution? It doesn't make sense to me and just leads to developer and user confusion, IMHO. TBH, it's too confusing as it is now. As [~sahuja] said, users can submit jobs without a problem. Then someone defines a simple username-based placement rule and things stop working? That's just ridiculous.
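Editorial illustration of the concern about the "root.user.%user" rule: substituting a dotted username into a dot-separated queue path yields more path components than the rule author intended. The Java sketch below demonstrates only this string-level ambiguity; it does not reflect how the Capacity Scheduler placement engine is implemented, and the class and method names are hypothetical.

```java
/** Illustrative only: shows the path ambiguity, not the real placement engine. */
final class PlacementPathDemo {
    /** Substitutes the user name for the %user placeholder. */
    static String expand(String rule, String user) {
        return rule.replace("%user", user);
    }

    /** Splits a queue path on literal dots. */
    static String[] components(String queuePath) {
        return queuePath.split("\\.");
    }

    public static void main(String[] args) {
        String rule = "root.user.%user";
        // Three components: root, user, alice.
        System.out.println(components(expand(rule, "alice")).length);
        // Four components: the dotted user name is read as two queue levels.
        System.out.println(components(expand(rule, "firstname.lastname")).length);
    }
}
```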
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17292620#comment-17292620 ]

Wilfred Spiegelenburg commented on YARN-10652:
----------------------------------------------

I do not see the relation with placement rules or the FS for this fix at all. The weight can be used with or without a placement rule. It is a setting on a queue. Having this setting on a user-based queue also does not really make sense: it gives more resources to one user over others in the queue. So I can see this being used on a parent queue or a group queue, but not on an individual's leaf queue.

On top of that, the queue path of the configuration setting is not affected by this change. We're talking about this setting:
{code:java}
yarn.scheduler.capacity.<queue-path>.user-settings.<user-name>.weight{code}
The weight is retrieved for a specific queue as defined in the _<queue-path>._ That part is already resolved and is not changed. The only resolution that is changed is the _<user-name>_ part between the words _user-settings._ and _.weight._ The queue path could be anything and is not in play here. It could even be a fixed configured queue or one mapped on a group name.

The administrator should know as little as possible, preferably nothing, about the internals for storing users in the CS. If the queue mapping rule for the user changes the dots to make it a single part of the queue path, then that is independent of this change. It still does not change the way the user is stored in the CS. It changes the way you map a user to a queue in the placement rules.

On the FS side we thought about standardising dot usage. We considered both using and not using _dot_ in user names in the config files. When I looked at it I was not sure which was the correct solution. It could lead to strange behaviour and extra administrative work: the admin forgets to remove the dot and all of a sudden the config does not apply. That is why it never went further than just the Jira YARN-5674.

With this change as proposed you will support weights for all users with a dot except for the user called: something_.weights_

That will be the only set of users that breaks, which is far less than breaking all users with a dot in the username. I do not see any other bound properties in the configuration at the moment.

If you want to solve the generic dot issue for user based placement then that is outside of this change.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17291657#comment-17291657 ]

Andras Gyori commented on YARN-10652:
-------------------------------------

When I first saw this issue, I implicitly thought about the same things [~shuzirra] mentioned. Currently, there is no consensus on how configuration parsing is implemented. I agree that users should not have to be aware of this, but they might be affected nonetheless, because:
* With this fix, the weight setting works with usernames that include a dot
* Other properties might not

Currently, a somewhat centralised access point is getUserPrefix, which supports usernames with dots, but as this case shows, there might be hidden edge cases where it still does not work. One problematic point that comes to mind is the getPropsWithPrefix method, which is sometimes used in the Configuration class. With it, you might get properties for user "a" when you set a property for user "a.b".

That being said, this might take a lot of time to investigate, and it is better to fix an actual discovered case and be vigilant about this scenario in the future.
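Editorial illustration of the prefix-matching concern: a lookup that selects every property starting with a per-user prefix cannot tell user "a" apart from the first half of user "a.b". The Java sketch below reimplements such a prefix filter over a plain {{Map}}; it is an assumption-laden stand-in rather than the Hadoop Configuration code, and the class name, method name and the shortened "q.user-settings." key are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: a simplified prefix filter over a plain Map. */
final class PrefixLookupDemo {
    /** Returns all entries whose key starts with prefix, with the prefix stripped from the key. */
    static Map<String, String> propsWithPrefix(Map<String, String> conf, String prefix) {
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                result.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new TreeMap<>();
        // Only user "a.b" has a weight configured.
        conf.put("q.user-settings.a.b.weight", "0.76");
        // Asking for user "a" still returns an entry, keyed "b.weight".
        System.out.println(propsWithPrefix(conf, "q.user-settings.a."));
    }
}
```

A caller that expects the stripped key to be exactly "weight" would ignore this entry, but one that merely checks for a non-empty result would wrongly conclude that user "a" has settings.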
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291628#comment-17291628 ] Siddharth Ahuja commented on YARN-10652:

Thanks [~pbacsko] for confirming that this issue has no direct relation to placement. Thanks [~shuzirra] for your insights. I see what you are trying to say; however, I don't believe we need to wait for a CS-wide solution on how usernames are internally stored. My arguments are below.

In regards to: {quote} If we don't handle user names with dots properly, then we cannot say we support user names with dots. {quote}

But we are supporting usernames with dots today. Users with dots in their usernames can submit jobs to a cluster running CS with no issues today (again, I am not talking about queue placement with queues with dots here). There are no errors reported when users with dots are supplied against the "yarn.scheduler.capacity..user-settings..weight" setting and, in fact, there should NOT be any errors when that is done. These are real-world usernames and we will have to accept them from any interface, whether it be UI or CLI or anything else. There is no need to wait to open this up on the front-end as they are already being stored as a String in our code, please see https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L3902 & https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L3903.

In regards to mapping rules, users will again specify them with real-world usernames. If a real-world username has a dot in it, then that's how it should be accepted. How we store it internally should not matter to the user when they are specifying these settings. Opening up this setting ("yarn.scheduler.capacity..user-settings..weight") should have no bearing on how this is actually stored internally in YARN CS, whether now or in future. My solution caters for an issue that only surfaces on the front-end, not the back-end, so I still don't see how this needs to wait for any future refactoring on the implementation side of things.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291614#comment-17291614 ] Gergely Pollak commented on YARN-10652:

The problem here is consistency. There are multiple places where user names are used in queue path parts; mapping rules are one example. If we don't handle user names with dots properly, then we cannot say we support user names with dots. We need to handle them in each and every case, which means we need a more centralized or at least consistent solution. Simply replacing the dots with underscores for this property won't solve the issue CS wide, and we need a CS wide solution, which might consist of multiple separate smaller solutions like this, but we need to centralize this effort.

For example, as soon as someone introduces a rule like root.user.%user, we already have an issue with user names with dots, and if we handle them there differently from your solution, we might get inconsistent configuration and behavior. We also need to check other places where usernames with dots (and group names) can cause issues.

Also, I prefer the FS solution of substituting '_dot_' rather than a simple '_', since there is a much smaller chance of user name collision this way.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291612#comment-17291612 ] Peter Bacsko commented on YARN-10652:

[~sahuja] you're right in saying that it has no direct relation to placement. In the first part of my comment, I was just thinking out loud that MAYBE using "_" instead of "." in the property is also a solution, but it comes with its own problems.

The placement stuff is different; it's something that we haven't considered so far. Currently, the new placement engine simply replaces placeholders, turning "root.users.%user" into "root.users.firstname.lastname", which is likely not what we want. It will not work in percentage mode, because "firstname" is a parent and you can't create parents under a ManagedParentQueue. In the new weight mode it can work, but again, the intention is to have something like "root.users.firstname_lastname", just a single leaf.
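The placeholder problem described in the comment above can be sketched in a few lines. This is only an illustration of why a dotted user name produces an extra queue level when "." is the path delimiter; the class and method names are hypothetical and this is not the placement engine's code.

```java
// Hypothetical sketch, not the Capacity Scheduler placement engine.
final class PlacementPathSketch {
    // Naive placeholder substitution, as described in the comment.
    static String expand(String rule, String user) {
        return rule.replace("%user", user);
    }

    // Splits a queue path into its parts, using "." as the delimiter.
    static String[] parts(String queuePath) {
        return queuePath.split("\\.");
    }

    public static void main(String[] args) {
        // "root.users.firstname.lastname" has four parts: "firstname" becomes a parent.
        System.out.println(parts(expand("root.users.%user", "firstname.lastname")).length);
        // "root.users.firstname_lastname" has three parts: a single leaf under "users".
        System.out.println(parts(expand("root.users.%user", "firstname_lastname")).length);
    }
}
```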
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291595#comment-17291595 ] Siddharth Ahuja commented on YARN-10652:

Also, the yarn.scheduler.capacity..user-settings..weight setting is a user-facing setting, not an internal implementation detail. That is to say, even if you want to manage usernames with "." as "_" internally, that should not force users who specify usernames containing a "." against this setting ("yarn.scheduler.capacity..user-settings..weight") to use "_" instead, just because our s/w wants to store "_" instead of ".". Substituting a "." with an "_" is an internal s/w implementation thing; however, we cannot prohibit users from supplying usernames with a "." against yarn.scheduler.capacity..user-settings..weight.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291583#comment-17291583 ] Siddharth Ahuja commented on YARN-10652:

Thanks [~rreti] & [~pbacsko] for your comments. Please correct me if I am wrong (and I must admit that I am primarily familiar with FairScheduler as well), but the {{yarn.scheduler.capacity..user-settings..weight}} setting has nothing to do with queue placement. We are not trying to have the user -> firstname.lastname placed into its own queue like root.firstname.lastname here.

What we are instead trying to solve is this: if a user -> firstname.lastname wants to submit a job to the root.default queue, as an example, then that user can certainly do that today (kindly see the attached screenshots; there is nothing stopping that today). Considering it can do that, the user weight setting that decides the user's share in the root.default (not root.firstname.lastname) queue should be able to cater for this user, that's all. Giving the firstname.lastname user its share through this setting does not trigger any placements and as such should not be any cause for concern. Again, please correct me if I am wrong [~pbacsko].
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291572#comment-17291572 ] Peter Bacsko commented on YARN-10652:

"Dot" in the username is clearly a problem. In FS, there is an approach in certain situations where dots are replaced with underscores ("_"). Quoting the upstream docs:

{noformat}
user: the app is placed into a queue with the name of the user who submitted it. Periods in the username will be replace with “_dot_”, i.e. the queue name for user “first.last” is “first_dot_last”.
primaryGroup: the app is placed into a queue with the name of the primary group of the user who submitted it. Periods in the group name will be replaced with “_dot_”, i.e. the queue name for group “one.two” is “one_dot_two”.
{noformat}

Obviously this is slightly different here, because in this case, you'd refer to the username as "firstname_lastname" in a static configuration, which could be confusing. Also, "firstname.lastname" and "firstname_lastname" would clash (unrealistic, but can happen in theory). But in the placement engine, we should definitely consider what FS does and replace "." with "_".
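The clash mentioned in the comment above, and why the "_dot_" form quoted from the FS docs makes it less likely, can be shown with a minimal sketch. The class and method names are hypothetical; this is not FairScheduler code, only the two substitutions applied as plain string replacements.

```java
// Hypothetical sketch of the two substitution schemes discussed in the thread.
final class DotSubstitutionSketch {
    // Replaces every "." with a plain underscore.
    static String withUnderscore(String name) {
        return name.replace(".", "_");
    }

    // Replaces every "." with the "_dot_" token quoted from the FS docs.
    static String withDotToken(String name) {
        return name.replace(".", "_dot_");
    }

    public static void main(String[] args) {
        // Both users map to "first_last": the two names collide.
        System.out.println(withUnderscore("first.last") + " / " + withUnderscore("first_last"));
        // "first_dot_last" vs "first_last": the two names stay distinct.
        System.out.println(withDotToken("first.last") + " / " + withDotToken("first_last"));
    }
}
```

A collision is still possible with "_dot_" (a user literally named "first_dot_last"), but it is much less likely than with a bare underscore.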
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291563#comment-17291563 ] Rudolf Reti commented on YARN-10652:

[~pbacsko] & [~shuzirra] Can you please check this from a Placement point of view? Won't the DOT in the username break that as DOT is used as a delimiter there?
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291377#comment-17291377 ] Hadoop QA commented on YARN-10652:

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Logfile || Comment ||
| 0 | reexec | 1m 19s | | Docker mode activated. |
|| || || || Prechecks || ||
| +1 | dupname | 0m 0s | | No case conflicting files found. |
| +1 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 | | 0m 0s | test4tests | The patch appears to include 1 new or modified test files. |
|| || || || trunk Compile Tests || ||
| +1 | mvninstall | 22m 5s | | trunk passed |
| +1 | compile | 0m 59s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | compile | 0m 50s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | checkstyle | 0m 53s | | trunk passed |
| +1 | mvnsite | 0m 55s | | trunk passed |
| +1 | shadedclient | 16m 48s | | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 41s | | trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 37s | | trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| 0 | spotbugs | 1m 50s | | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 1m 48s | | trunk passed |
|| || || || Patch Compile Tests || ||
| +1 | mvninstall | 0m 52s | | the patch passed |
| +1 | compile | 0m 55s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javac | 0m 55s | | the patch passed |
| +1 | compile | 0m 45s | | the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08 |
| +1 | javac | 0m 45s | | the patch passed |
| +1 | checkstyle | 0m 45s | | the patch passed |
| +1 | mvnsite | 0m 47s | | the patch passed |
| +1 | whitespace | 0m 0s | | The patch has no whitespace issues. |
| +1 | shadedclient | 14m 53s | | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 0m 38s | | the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 |
| +1 | javadoc | 0m 33s | | the
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291340#comment-17291340 ] Siddharth Ahuja commented on YARN-10652:

Thanks a lot for the review [~wilfreds], much appreciated! Sure, happy to wait for any comments.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291339#comment-17291339 ] Wilfred Spiegelenburg commented on YARN-10652:

Change looks good, +1 (binding). I'll let it sit for a day or so for other people to have a look at this too. I will commit if there are no comments in the next day or so.
[jira] [Commented] (YARN-10652) Capacity Scheduler fails to handle user weights for a user that has a "." (dot) in it
[ https://issues.apache.org/jira/browse/YARN-10652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17291336#comment-17291336 ] Siddharth Ahuja commented on YARN-10652:

+cc [~snemeth]