[ 
https://issues.apache.org/jira/browse/HADOOP-12505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974478#comment-14974478
 ] 

Chris Nauroth commented on HADOOP-12505:
----------------------------------------

The Hadoop code itself does not consistently enforce POSIX compliance for user 
names or group names.  It's more a function of the hosting OS.  For example, 
Harsh has pointed out that for operators using the JNI-based implementation 
instead of the shell-based one, Hadoop users can already show up with 
membership in groups that don't have a POSIX-compliant name.  I've seen that 
for Windows deployments, and it sounds like the use case here was a Linux 
deployment connected to Active Directory.

The group mapping alone is not sufficient to enforce POSIX compliance either.  
This is only used to populate the user's set of groups.  It does not control 
other input paths for group names.  For example, WebHDFS and Java 
{{FileSystem#setOwner}} API calls can accept names with spaces.  Here is an 
example, tested on Mac.

{code}
> curl -X PUT 'http://127.0.0.1:50070/webhdfs/v1/hello2?op=SETOWNER&owner=Chris%20Nauroth&group=Domain%20Users&user.name=chris'

> hdfs dfs -ls /hello2
-rw-r--r--   3 Chris Nauroth Domain Users          6 2014-07-10 11:24 /hello2
{code}

This kind of data of course complicates parsing of shell output, and I imagine 
many operators would prefer to enforce POSIX-compliant names by policy.  
However, I don't believe Hadoop has taken responsibility for that enforcement.

I don't think the existing implementation of {{ShellBasedUnixGroupsMapping}} 
really provides any POSIX compliance benefits, at least not intentionally.  In 
the example given, it would split the group "Domain Users" on spaces and decide 
to put the user into two groups: "Domain" and "Users".  While those split names 
don't have spaces, they're still not POSIX-compliant because of the capital 
letters, and more importantly, they're completely erroneous.  Hopefully the 
split isn't putting anyone into a real group where they don't belong.
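
To make the failure mode concrete, here is a minimal sketch.  It is not the 
actual Hadoop source; it only assumes, as described above, that one line of 
"id -Gn"-style output is tokenized on whitespace.  The class name, method name, 
and sample input are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration only; not the real ShellBasedUnixGroupsMapping code.
class NaiveGroupSplitDemo {

    // Splits one line of "id -Gn"-style output on whitespace,
    // which is the behavior described in the paragraph above.
    static List<String> splitOnWhitespace(String idOutput) {
        List<String> groups = new ArrayList<>();
        StringTokenizer tokenizer = new StringTokenizer(idOutput);
        while (tokenizer.hasMoreTokens()) {
            groups.add(tokenizer.nextToken());
        }
        return groups;
    }

    public static void main(String[] args) {
        // Assumed sample output for a user in "staff" and "Domain Users".
        System.out.println(splitOnWhitespace("staff Domain Users"));
    }
}
```

With that assumed input, the user ends up in three groups ("staff", "Domain", 
"Users") rather than the two they actually belong to.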

I see this as a bug rather than a POSIX compliance feature.  I would prefer to 
see the -1 lifted and have the bug fixed.  That said, I also see it as low 
priority, since the majority of deployments I see use the JNI-based 
implementation now, which does not have the bug.

> ShellBasedUnixGroupMapping should support group names with space
> ----------------------------------------------------------------
>
>                 Key: HADOOP-12505
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12505
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>
> In a typical configuration, group names are obtained from AD through SSSD/LDAP. 
> AD permits group names that contain spaces (e.g. "Domain Users").
> Unfortunately, the present implementation of ShellBasedUnixGroupsMapping 
> parses the output of the shell command "id -Gn" and assumes group names are 
> separated by spaces.
> Support for such names could be achieved with a combination of shell commands, 
> for example:
> bash -c 'id -G weichiu | tr " " "\n" | xargs -I % getent group "%" | cut -d":" -f1'
> But I am still looking for a more compact, and potentially more efficient, 
> form.
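
The pipeline quoted above lists numeric group IDs and looks each one up in the 
group database, so names are never split on spaces.  The same idea might look 
roughly like this in Java.  This is only a sketch under stated assumptions: the 
class and method names are hypothetical, and the inputs are hardcoded strings 
in the "name:password:gid:members" group database format rather than live 
command output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not a proposed patch for Hadoop.
class GidLookupSketch {

    // Builds a gid -> name map from group database lines of the form
    // "name:password:gid:members", as printed by "getent group".
    static Map<String, String> parseGroupEntries(List<String> entries) {
        Map<String, String> namesByGid = new HashMap<>();
        for (String entry : entries) {
            String[] fields = entry.split(":", -1);
            if (fields.length >= 3) {
                namesByGid.put(fields[2], fields[0]);
            }
        }
        return namesByGid;
    }

    // Numeric GIDs never contain spaces, so splitting "id -G"-style
    // output on whitespace is safe, unlike splitting group names.
    static List<String> resolveGroups(String idGOutput, Map<String, String> namesByGid) {
        List<String> groups = new ArrayList<>();
        for (String gid : idGOutput.trim().split("\\s+")) {
            String name = namesByGid.get(gid);
            if (name != null) {
                groups.add(name);
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        // Assumed sample data; GIDs and names are made up for illustration.
        Map<String, String> namesByGid = parseGroupEntries(
                Arrays.asList("staff:*:20:root", "Domain Users:*:1234:"));
        System.out.println(resolveGroups("20 1234", namesByGid));
    }
}
```

With the assumed sample data, "Domain Users" comes back as a single group name.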



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
