Harsh J created HADOOP-8845:
-------------------------------
Summary: When looking for parent paths info, globStatus must
filter out non-directory elements to avoid an AccessControlException
Key: HADOOP-8845
URL: https://issues.apache.org/jira/browse/HADOOP-8845
Project: Hadoop Common
Issue Type: Bug
Components: fs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
A brief description from my colleague Stephen Fritz who helped discover it:
{quote}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String">testfile <-- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir <-- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1 <-- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile <-- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile <-- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x - hdfs hadoop 0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
All files are where we expect them...OK, let's try reading
-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String <-- success!
Note that we used a '*' in the cat command, and it correctly found the
subdirectory '/tmp/testdir/1' and ignored the regular file
'/tmp/testdir/testfile'
-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser <-- let's try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x - hdfs hadoop 0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r-- 3 hdfs hadoop 15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String <-- good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String <-- so far so good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied:
user=testuser, access=EXECUTE,
inode="/tmp/testdir/testfile":hdfs:hadoop:-rw-r--r--
{quote}
Essentially, we hit an AccessControlException (ACE) with access=EXECUTE on the
file /tmp/testdir/testfile because the glob expansion tried to look up
/tmp/testdir/testfile/testfile as a path. This shouldn't happen: testfile is a
regular file, not a directory, so it should never be treated as a parent path
to look under.
Surprisingly, the superuser does not hit the error, because permission checks
are bypassed for it. Whether that behaviour is acceptable can be examined in a
separate JIRA.
This JIRA targets a client-side fix so that globStatus no longer issues such
/path/file/dir lookups.
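
To illustrate the idea, here is a minimal, self-contained sketch of the
filtering step. It is not the actual Hadoop code or the eventual patch: the
class GlobParentFilterSketch, its methods, and the in-memory namespace map are
all hypothetical stand-ins for the real FileSystem calls, and only '*' is
handled as a wildcard. The point is only that a match which is a regular file
is dropped before the next path component is looked up under it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Pattern;

class GlobParentFilterSketch {

    /** Hypothetical namespace mirroring the report: path -> true if directory. */
    static Map<String, Boolean> sampleNamespace() {
        Map<String, Boolean> ns = new TreeMap<>();
        ns.put("/tmp", true);
        ns.put("/tmp/testdir", true);
        ns.put("/tmp/testdir/1", true);
        ns.put("/tmp/testdir/1/testfile", false);
        ns.put("/tmp/testdir/testfile", false);
        return ns;
    }

    /** Direct children of parent whose name matches one glob component. */
    static List<String> matchChildren(Map<String, Boolean> ns, String parent,
                                      String component) {
        // Quote the component literally, then turn each '*' into "[^/]*".
        Pattern p = Pattern.compile(
            Pattern.quote(component).replace("*", "\\E[^/]*\\Q"));
        String prefix = parent.equals("/") ? "/" : parent + "/";
        List<String> out = new ArrayList<>();
        for (String path : ns.keySet()) {
            if (!path.startsWith(prefix)) {
                continue;
            }
            String rest = path.substring(prefix.length());
            if (!rest.isEmpty() && !rest.contains("/")
                    && p.matcher(rest).matches()) {
                out.add(path);
            }
        }
        return out;
    }

    /** Expands an absolute glob one path component at a time. */
    static List<String> expand(Map<String, Boolean> ns, String pattern) {
        List<String> candidates = new ArrayList<>();
        candidates.add("/");
        for (String component : pattern.substring(1).split("/")) {
            List<String> next = new ArrayList<>();
            for (String parent : candidates) {
                // The filter proposed here: a regular file can never be a
                // parent, so skip it instead of probing /path/file/child.
                boolean isDir = parent.equals("/")
                    || Boolean.TRUE.equals(ns.get(parent));
                if (!isDir) {
                    continue;
                }
                next.addAll(matchChildren(ns, parent, component));
            }
            candidates = next;
        }
        return candidates;
    }

    public static void main(String[] args) {
        System.out.println(
            expand(sampleNamespace(), "/tmp/testdir/*/testfile"));
    }
}
```

With the sample namespace, the '*' component matches both /tmp/testdir/1 and
/tmp/testdir/testfile, but only the directory is carried forward, so
/tmp/testdir/testfile/testfile is never looked up.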