[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13503750#comment-13503750 ] Harsh J commented on HADOOP-8845: - Hi Daryn, Indeed, the UGI test in my pTestClosure6 passes on trunk now, so HADOOP-8906 should have fixed this case completely. Thanks for the heads-up! Shall we retarget this JIRA to fix the same on FileContext (sort of a clone of HADOOP-8906 for the new FC)? Or would the UGI test be worth adding anyway? We couldn't catch this issue unless it ran as a non-superuser.

When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
Key: HADOOP-8845
URL: https://issues.apache.org/jira/browse/HADOOP-8845
Project: Hadoop Common
Issue Type: Bug
Components: fs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Labels: glob
Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch

A brief description from my colleague Stephen Fritz, who helped discover it:

{code}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo My Test String > testfile                    -- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir                    -- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1                  -- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile  -- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile    -- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile

All files are where we expect them... OK, let's try reading:

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String                                               -- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String                                               -- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String                                               -- success!

Note that we used a '*' in the cat command; it correctly found the subdirectory '/tmp/testdir/1' and ignored the regular file '/tmp/testdir/testfile'.

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser                                -- let's try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String                                               -- good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String                                               -- so far so good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode=/tmp/testdir/testfile:hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit an ACE with access=EXECUTE on the file /tmp/testdir/testfile because we tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen, as testfile is a file, not a parent path to be looked up under:

{code}
2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
{code}

Surprisingly, the superuser avoids hitting the error as a result of bypassing permissions, but whether that is fine to leave as-is can be examined in another JIRA. This JIRA targets a client-side fix to avoid such /path/file/dir or /path/file/file lookups.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
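The fix described above amounts to dropping non-directory matches before descending into the next glob component. A rough, self-contained sketch of the idea (the Entry class and keepDirectoriesOnly helper below are hypothetical stand-ins, not Hadoop's actual FileStatus/globber API):

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Hadoop's FileStatus: just a path and a directory flag.
class Entry {
    final String path;
    final boolean dir;
    Entry(String path, boolean dir) { this.path = path; this.dir = dir; }
    boolean isDirectory() { return dir; }
}

public class GlobFilterSketch {
    // When expanding an intermediate glob component (e.g. the '*' in
    // /tmp/testdir/*/testfile), only directories can be parents of the
    // remaining components, so non-directories are dropped before the
    // next listStatus/getFileInfo round trip.
    static List<Entry> keepDirectoriesOnly(List<Entry> candidates) {
        List<Entry> dirs = new ArrayList<>();
        for (Entry e : candidates) {
            if (e.isDirectory()) {
                dirs.add(e);
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        List<Entry> matches = new ArrayList<>();
        matches.add(new Entry("/tmp/testdir/1", true));         // subdirectory
        matches.add(new Entry("/tmp/testdir/testfile", false)); // plain file
        List<Entry> parents = keepDirectoriesOnly(matches);
        // Only the directory survives; the file is never probed as a parent,
        // so no EXECUTE check is ever triggered against it.
        System.out.println(parents.size());      // 1
        System.out.println(parents.get(0).path); // /tmp/testdir/1
    }
}
```

With this filter in place, the client never issues getFileInfo(/tmp/testdir/testfile/testfile), and the EXECUTE-on-a-file check never happens.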
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13501261#comment-13501261 ] Harsh J commented on HADOOP-8845: - I'll revise the patch here to include fixes for FileContext's globbing implementation as well, but for the reuse goal I have filed HADOOP-9068 for later.
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13501362#comment-13501362 ] Daryn Sharp commented on HADOOP-8845: - Are you sure this patch is still needed? This should have already been fixed by HADOOP-8906.
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13471267#comment-13471267 ] Harsh J commented on HADOOP-8845: - Thanks all. Am addressing all your comments. What ought to be the intended behavior when symlinks are thrown into the picture? To resolve or not to resolve? It is quite unfortunate that we have the globbing code copied instead of shared; the FileContext copy does not even have tests of its own! But I can't imagine sharing the code if we begin trying to filter or resolve symlinks, given two implementations where one handles them and the other doesn't.
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13466965#comment-13466965 ] Eli Collins commented on HADOOP-8845: - globPathsLevel is a generic method, and globStatus, which calls it, claims to return all matching path names, so why is it OK to unconditionally filter out all files from its results? Since * can match the empty string, in other contexts it could be appropriate to return /tmp/testdir/testfile for /tmp/testdir/*/testfile. I.e., is there a place where we know we should just be checking directory path elements? The comment in globStatusInternal (// list parent directories and then glob the results) by one of the cases indicates that is the intent, but it's valid to pass both files and directories to listStatus.
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13467110#comment-13467110 ] Harsh J commented on HADOOP-8845: -
bq. Since * can match the empty string, in other contexts it could be appropriate to return /tmp/testdir/testfile for /tmp/testdir/*/testfile.
Nice catch. I will add a test for this to see if we aren't handling it already.
bq. I.e., is there a place where we know we should just be checking directory path elements?
The parts I've changed here try to fetch parents, which can't mean anything but directories AFAICT.
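For what it's worth, java.nio's glob matcher (whose semantics may well differ from Hadoop's globber, so treat this only as an illustration of the question, not as evidence about Hadoop's behavior) does not let '*' match an empty path component:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;

public class EmptyStarProbe {
    public static void main(String[] args) {
        // In java.nio glob syntax, '*' matches zero or more characters of a
        // single name component. Matching /tmp/testdir/testfile against the
        // pattern would require an empty component between two slashes,
        // which a normalized path never contains.
        PathMatcher m = FileSystems.getDefault()
                .getPathMatcher("glob:/tmp/testdir/*/testfile");
        System.out.println(m.matches(Paths.get("/tmp/testdir/1/testfile")));
        System.out.println(m.matches(Paths.get("/tmp/testdir/testfile")));
    }
}
```

Under that interpretation, /tmp/testdir/*/testfile would not expand to /tmp/testdir/testfile, matching what the shell does as well.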
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13467162#comment-13467162 ] Robert Joseph Evans commented on HADOOP-8845: - I would argue that even if there is a specific need for non-standard globbing, we don't want to support it. POSIX compliance is what most people would expect from HDFS; when we deviate from it, users will get confused and angry. Especially if rm deletes more files than they want.
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13467229#comment-13467229 ] Andy Isaacson commented on HADOOP-8845: - (Sorry for the markup mess-up in my last comment.) The currently pending patch specifically checks in {{pTestClosure6}} that the case I mentioned is handled correctly, so I think we're all on the same page. :) Code-wise, one minor comment:
{code}
+  public boolean apply(FileStatus input) {
+    return input.isDirectory() ? true : false;
+  }
{code}
This is an anti-pattern; {{foo() ? true : false}} is the same as {{foo()}}. Other than that, LGTM at the code level. I haven't carefully read the GlobFilter implementation to see whether there's a cleaner or simpler way to implement this bugfix.
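Andy's nit, restated as standalone code: with a Guava-style Predicate (the interface is inlined here and a toy Status class stands in for FileStatus so the snippet compiles on its own), the ternary collapses to the bare boolean expression:

```java
// Minimal Guava-style predicate interface, inlined for self-containment.
interface Predicate<T> {
    boolean apply(T input);
}

// Toy stand-in for FileStatus with just the directory flag.
class Status {
    private final boolean directory;
    Status(boolean directory) { this.directory = directory; }
    boolean isDirectory() { return directory; }
}

public class PredicateSketch {
    // Before: return input.isDirectory() ? true : false;
    // After:  return input.isDirectory();
    // Both are equivalent; the ternary adds nothing.
    static final Predicate<Status> IS_DIRECTORY = new Predicate<Status>() {
        public boolean apply(Status input) {
            return input.isDirectory();
        }
    };

    public static void main(String[] args) {
        System.out.println(IS_DIRECTORY.apply(new Status(true)));  // true
        System.out.println(IS_DIRECTORY.apply(new Status(false))); // false
    }
}
```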
[jira] [Commented] (HADOOP-8845) When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13465519#comment-13465519 ] Harsh J commented on HADOOP-8845: - Ping? It's a trivial fix to not look up non-directories, and I have tests attached (similar to pClosure5, but that test mkdir-ed everything and so couldn't run into the EXECUTE-less issue described here).

When looking for parent paths info, globStatus must filter out non-directory elements to avoid an AccessControlException

Key: HADOOP-8845
URL: https://issues.apache.org/jira/browse/HADOOP-8845
Project: Hadoop Common
Issue Type: Bug
Components: fs
Affects Versions: 2.0.0-alpha
Reporter: Harsh J
Assignee: Harsh J
Labels: glob
Attachments: HADOOP-8845.patch, HADOOP-8845.patch, HADOOP-8845.patch

A brief description from my colleague Stephen Fritz, who helped discover it:

{code}
[root@node1 ~]# su - hdfs
-bash-4.1$ echo "My Test String" > testfile                   -- just a text file, for testing below
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir                     -- create a directory
-bash-4.1$ hadoop dfs -mkdir /tmp/testdir/1                   -- create a subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/1/testfile   -- put the test file in the subdirectory
-bash-4.1$ hadoop dfs -put testfile /tmp/testdir/testfile     -- put the test file in the directory
-bash-4.1$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile

All files are where we expect them... OK, let's try reading:

-bash-4.1$ hadoop dfs -cat /tmp/testdir/testfile
My Test String                                                -- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String                                                -- success!
-bash-4.1$ hadoop dfs -cat /tmp/testdir/*/testfile
My Test String                                                -- success!

Note that we used a '*' in the cat command, and it correctly found the subdirectory '/tmp/testdir/1' and ignored the regular file '/tmp/testdir/testfile'.

-bash-4.1$ exit
logout
[root@node1 ~]# su - testuser                                 -- let's try it as a different user:
[testuser@node1 ~]$ hadoop dfs -lsr /tmp/testdir
drwxr-xr-x   - hdfs hadoop          0 2012-09-25 06:52 /tmp/testdir/1
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/1/testfile
-rw-r--r--   3 hdfs hadoop         15 2012-09-25 06:52 /tmp/testdir/testfile
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/testfile
My Test String                                                -- good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/1/testfile
My Test String                                                -- so far so good
[testuser@node1 ~]$ hadoop dfs -cat /tmp/testdir/*/testfile
cat: org.apache.hadoop.security.AccessControlException: Permission denied: user=testuser, access=EXECUTE, inode=/tmp/testdir/testfile:hdfs:hadoop:-rw-r--r--
{code}

Essentially, we hit an AccessControlException with access=EXECUTE on the file /tmp/testdir/testfile, because the glob expansion tried to access /tmp/testdir/testfile/testfile as a path. This shouldn't happen: testfile is a file, not a parent path to be looked up under.

{code}
2012-09-25 07:24:27,406 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 8020, call getFileInfo(/tmp/testdir/testfile/testfile)
{code}

Surprisingly, the superuser avoids hitting this error, as a result of bypassing permission checks; whether that is acceptable behavior can be taken up in another JIRA. This JIRA targets a client-side fix to stop issuing such /path/file/dir or /path/file/file lookups.
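The client-side fix described above can be sketched as follows. This is a hypothetical illustration, not the attached patch: the `FileStatus` stand-in and the `directoriesOnly` helper are invented here to show the core idea — when an intermediate glob component matches several entries, keep only the directories before descending, so a path like /tmp/testdir/testfile/testfile is never looked up at all.

```java
import java.util.ArrayList;
import java.util.List;

public class GlobFilterSketch {

    // Minimal stand-in for org.apache.hadoop.fs.FileStatus; only the two
    // pieces of state this sketch needs.
    static class FileStatus {
        private final String path;
        private final boolean dir;

        FileStatus(String path, boolean dir) {
            this.path = path;
            this.dir = dir;
        }

        boolean isDirectory() { return dir; }
        String getPath() { return path; }
    }

    // Keep only directory entries from a glob match. A glob expander would
    // call this whenever more path components remain to be resolved, so it
    // never issues a getFileInfo() on a child of a regular file.
    static List<FileStatus> directoriesOnly(List<FileStatus> matches) {
        List<FileStatus> dirs = new ArrayList<>();
        for (FileStatus st : matches) {
            if (st.isDirectory()) {
                dirs.add(st);
            }
        }
        return dirs;
    }

    public static void main(String[] args) {
        // The two entries that /tmp/testdir/* matches in the transcript above.
        List<FileStatus> matches = new ArrayList<>();
        matches.add(new FileStatus("/tmp/testdir/1", true));
        matches.add(new FileStatus("/tmp/testdir/testfile", false));

        // Only the directory survives; the regular file is dropped, so no
        // lookup of /tmp/testdir/testfile/testfile (and no EXECUTE check on
        // a file inode) ever happens for a non-superuser.
        List<FileStatus> dirs = directoriesOnly(matches);
        System.out.println(dirs.size() + " " + dirs.get(0).getPath());
        // → 1 /tmp/testdir/1
    }
}
```

Under this scheme the superuser and a plain user take the same code path, which also sidesteps the permission-bypass asymmetry noted above.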
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13464889#comment-13464889 ] Harsh J commented on HADOOP-8845: -
{quote}
-1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.ha.TestZKFailoverController
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
org.apache.hadoop.hdfs.TestPersistBlocks
{quote}
These tests do not rely on globbing at all, so the failures are unrelated to this patch on the core side.
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463603#comment-13463603 ] Hadoop QA commented on HADOOP-8845: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546641/HADOOP-8845.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:
org.apache.hadoop.ha.TestZKFailoverController
org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics
org.apache.hadoop.hdfs.TestPersistBlocks
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1524//testReport/
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1524//console
[ https://issues.apache.org/jira/browse/HADOOP-8845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13463520#comment-13463520 ] Hadoop QA commented on HADOOP-8845: -
{color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546608/HADOOP-8845.patch against trunk revision .
{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.
Test results: https://builds.apache.org/job/PreCommit-HADOOP-Build/1522//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HADOOP-Build/1522//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-common.html
Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/1522//console