Sivaguru Sankaridurg created HADOOP-14845:
---------------------------------------------

             Summary: getFileStatus not making any auth checks
                 Key: HADOOP-14845
                 URL: https://issues.apache.org/jira/browse/HADOOP-14845
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/azure
            Reporter: Sivaguru Sankaridurg


The HDFS spec requires only traverse checks for any file accessed via 
getFileStatus. Since WASB does not support traverse checks, removing the 
performAuthCheck call from getFileStatus effectively removed all protections 
for that call. The reasoning at the time was that calling performAuthCheck was 
the wrong thing to do because it went against the spec, and that the correct 
fix for the getFileStatus issue was to implement traverse checks instead. The 
side effects of the change were not fully clear at the time, but the thinking 
was that it was safer to remain true to the spec as far as possible.
The reasoning remains correct even today. But in view of the security hole 
introduced by this change (anyone can load any other user's data in Hive), 
and keeping in mind that WASB does not intend to implement traverse checks, 
we propose a compromise.
We propose (re)introducing a read-access check in getFileStatus() that checks 
the existing ancestor for read access whenever it is invoked. Although not 
perfect (it is a departure from the spec), we believe it is a good compromise 
between having no checks at all and implementing full-blown traverse checks.
For scenarios that involve non-existent intermediate folders, such as mkdirs 
invoked from the shell, the call would check for read access against an 
existing ancestor. For example, {{ mkdirs /foo/bar }}, where only "/" exists, 
would result in read checks against "/" for "/", "/foo" and "/foo/bar". This 
can be thought of as a close-enough substitute for the traverse checks that 
HDFS does.
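The ancestor lookup in the mkdirs example can be sketched as follows. This is a minimal illustration, not the WASB implementation: the class name AncestorCheckSketch, its methods, and the use of a plain set of strings to represent existing folders are all hypothetical, and the sketch assumes normalized absolute paths, that "/" always exists, and that the nearest existing ancestor is the one checked.

```java
import java.util.Set;

// Hypothetical sketch: names and data structures are assumptions for illustration only.
final class AncestorCheckSketch {

    // Returns the parent of a normalized absolute path; the root is its own parent.
    static String parentOf(String path) {
        if (path.equals("/")) {
            return "/";
        }
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    // Walks up from the parent of 'path' until a folder in 'existing' is found.
    // Assumes "/" always exists, so the walk stops at the root at the latest.
    static String existingAncestor(String path, Set<String> existing) {
        String candidate = parentOf(path);
        while (!candidate.equals("/") && !existing.contains(candidate)) {
            candidate = parentOf(candidate);
        }
        return candidate;
    }
}
```

With only "/" present, existingAncestor resolves "/", "/foo" and "/foo/bar" all to "/", which matches the read checks described for mkdirs /foo/bar.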
For other scenarios that do not involve non-existent intermediate folders, 
such as read and delete, the check will happen against the parent. Once again, 
the read check against the parent can be thought of as a substitute for the 
traverse check, and it can be customized for various users with Ranger policies.
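The parent read check for existing files could look roughly like the sketch below. It is only an illustration under stated assumptions: ParentReadCheckSketch and mayGetFileStatus are hypothetical names, and the in-memory map from folder to permitted users is a stand-in for per-user policies, not the Ranger or WASB authorization API.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: the policy map stands in for per-user policies.
final class ParentReadCheckSketch {

    // Returns the parent of a normalized absolute path; the root is its own parent.
    static String parentOf(String path) {
        if (path.equals("/")) {
            return "/";
        }
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    // Allows getFileStatus on 'path' only if 'user' has read access on its parent.
    // 'readers' maps a folder to the set of users permitted to read it.
    static boolean mayGetFileStatus(String user, String path,
                                    Map<String, Set<String>> readers) {
        Set<String> allowed =
            readers.getOrDefault(parentOf(path), Collections.<String>emptySet());
        return allowed.contains(user);
    }
}
```

In this sketch, a policy granting only "alice" read access on "/data/alice" lets alice call getFileStatus on "/data/alice/t.csv" and denies the same call for any other user.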



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
