[
https://issues.apache.org/jira/browse/HADOOP-12505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974478#comment-14974478
]
Chris Nauroth commented on HADOOP-12505:
----------------------------------------
The Hadoop code itself does not consistently enforce POSIX compliance for user
names or group names. It's more a function of the hosting OS. For example,
Harsh has pointed out that for operators using the JNI-based implementation
instead of the shell-based one, Hadoop users can already show up with membership
in groups that don't have POSIX-compliant names. I've seen that for Windows
deployments, and it sounds like the use case here was a Linux deployment
connected to Active Directory.
The group mapping alone is not sufficient to enforce POSIX compliance either.
The mapping is only used to populate a user's set of groups. It does not control
other input paths for group names. For example, WebHDFS and Java
{{FileSystem#setOwner}} API calls can accept names with spaces. Here is an
example, tested on Mac.
{code}
> curl -X PUT 'http://127.0.0.1:50070/webhdfs/v1/hello2?op=SETOWNER&owner=Chris%20Nauroth&group=Domain%20Users&user.name=chris'
> hdfs dfs -ls /hello2
-rw-r--r-- 3 Chris Nauroth Domain Users 6 2014-07-10 11:24 /hello2
{code}
This kind of data of course complicates parsing of shell output, and I imagine
many operators would prefer to enforce POSIX-compliant names by policy.
However, I don't believe Hadoop has taken responsibility for that enforcement.
I don't think the existing implementation of {{ShellBasedUnixGroupsMapping}}
really provides any POSIX compliance benefits, at least not intentionally. In
the example given, it would split the group "Domain Users" on spaces and decide
to put the user into two groups: "Domain" and "Users". While those split names
don't have spaces, they're also still not POSIX compliant because of the
capital letters, and more importantly, they're completely erroneous. Hopefully
the split isn't putting anyone into a real group where they don't really belong.
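To make the failure mode concrete, here is a minimal, self-contained sketch of whitespace-based parsing applied to group output. It is not the actual {{ShellBasedUnixGroupsMapping}} code; the class name, method name, and sample output string are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; not part of Hadoop.
public class GroupSplitDemo {

    // Naive parsing: treat every whitespace-separated token as a group name.
    static List<String> splitOnWhitespace(String idOutput) {
        String trimmed = idOutput.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    public static void main(String[] args) {
        // Assumed sample output of `id -Gn` for a user who belongs to
        // the two groups "staff" and "Domain Users".
        String idOutput = "staff Domain Users";

        // Prints [staff, Domain, Users]: three groups instead of two.
        System.out.println(splitOnWhitespace(idOutput));
    }
}
```

The parser has no way to tell a separator from a space inside a name, so the user ends up in "Domain" and "Users" and never in "Domain Users".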
I see this as a bug rather than a POSIX compliance feature. I would prefer to
see the -1 lifted and have the bug fixed. That said, I also see it as low
priority, since the majority of deployments I see use the JNI-based
implementation now, which does not have the bug.
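One possible shape for a fix, following the numeric-ID idea in the issue description, is to split the output of "id -G" (group IDs never contain spaces) and then look up each ID in group database entries of the form "name:password:gid:members". The sketch below only illustrates that idea under stated assumptions. The class, its methods, and the sample entries are hypothetical, and it works on strings instead of invoking any shell command.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not Hadoop's implementation.
public class GidGroupResolver {

    // Parse group database lines ("name:password:gid:members") into a
    // gid -> name map. Lines with fewer than three fields are skipped.
    static Map<String, String> parseGroupEntries(List<String> lines) {
        Map<String, String> byGid = new LinkedHashMap<>();
        for (String line : lines) {
            String[] fields = line.split(":", -1);
            if (fields.length >= 3) {
                byGid.put(fields[2], fields[0]);
            }
        }
        return byGid;
    }

    // Split numeric group IDs on whitespace and map each one to its name.
    // IDs with no matching entry are skipped.
    static List<String> resolve(String idGOutput, Map<String, String> byGid) {
        List<String> names = new ArrayList<>();
        for (String gid : idGOutput.trim().split("\\s+")) {
            String name = byGid.get(gid);
            if (name != null) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Assumed sample data; the IDs and members are made up.
        List<String> entries = Arrays.asList(
            "staff:x:20:chris",
            "Domain Users:*:1190000513:");
        Map<String, String> byGid = parseGroupEntries(entries);

        // Prints [staff, Domain Users]
        System.out.println(resolve("20 1190000513", byGid));
    }
}
```

Because the split happens on numeric IDs, the space in "Domain Users" survives intact. The cost is an extra lookup per group, which is the efficiency concern raised in the description.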
> ShellBasedUnixGroupsMapping should support group names with space
> -----------------------------------------------------------------
>
> Key: HADOOP-12505
> URL: https://issues.apache.org/jira/browse/HADOOP-12505
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Wei-Chiu Chuang
> Assignee: Wei-Chiu Chuang
>
> In a typical configuration, group names are obtained from AD through SSSD/LDAP.
> AD permits group names with spaces (e.g. "Domain Users").
> Unfortunately, the present implementation of ShellBasedUnixGroupsMapping
> parses the output of the shell command "id -Gn" and assumes group names are
> separated by spaces.
> Supporting such names could be achieved by combining shell commands, for
> example,
> bash -c 'id -G weichiu | tr " " "\n" | xargs -I % getent group "%" | cut -d":" -f1'
> But I am still looking for a more compact, and potentially more efficient,
> form.