Vadim Spector created SENTRY-2014:
-------------------------------------

             Summary: Incorrect handling of HDFS paths with multiple slashes
                 Key: SENTRY-2014
                 URL: https://issues.apache.org/jira/browse/SENTRY-2014
             Project: Sentry
          Issue Type: Bug
            Reporter: Vadim Spector
            Assignee: Vadim Spector


There are at least two places in the code where HDFS paths may not be parsed 
correctly:

a) PathsUpdate.parsePath() does not collapse duplicate slashes in the path 
portion of the URI into a single slash. This method is used when getting path 
data from the HMS store. HDFS paths with duplicate slashes are perfectly legal, 
and the specs refer to UNIX guidelines saying that multiple slashes should be 
treated as a single slash. If we keep multiple slashes in the path, such a path 
may be incorrectly split into path entries, some of them empty, ultimately 
resulting in hard-to-troubleshoot ACL problems in the field. We should not 
assume that the URIs fed into parsePath() have already been normalized. It's 
easier to fix the code.
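
A minimal sketch of the kind of normalization meant in (a). This is not the 
actual PathsUpdate.parsePath() code: the class name, the method name and the 
sample URI are hypothetical and only illustrate collapsing runs of slashes 
before splitting the path into entries.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the Sentry implementation.
final class PathNormalizationSketch {

    // Returns the non-empty path entries of the given URI string,
    // treating any run of slashes in the path portion as one slash.
    static List<String> toPathEntries(String uriString) {
        URI uri = URI.create(uriString);
        String path = uri.getPath();
        List<String> entries = new ArrayList<>();
        if (path == null) {
            // Opaque URIs have no path component.
            return entries;
        }
        // Collapse duplicate slashes: "//user//hive" becomes "/user/hive".
        String collapsed = path.replaceAll("/+", "/");
        for (String entry : collapsed.split("/")) {
            // Drop the empty entry produced by the leading slash.
            if (!entry.isEmpty()) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) {
        // Sample URI is made up for the example.
        System.out.println(
            toPathEntries("hdfs://namenode:8020//user//hive///warehouse"));
    }
}
```

With this approach a path written with duplicate slashes yields the same 
entries as its single-slash form, so no empty entries reach the ACL logic.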

b) NotificationProcessor.splitPath() splits on the regex "/" instead of the 
correct "/+". While the inputs to this class _may_ be controlled by Sentry 
software, which _may_ normalize paths properly, it is better not to make such 
assumptions and just fix the code.
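
To show the difference between the two regexes in (b), here is a small 
standalone sketch. It does not reproduce NotificationProcessor.splitPath(); 
the class and method names and the sample path are hypothetical, and only 
String.split() from the JDK is used.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the Sentry implementation.
final class SplitRegexSketch {

    // Splits on every single slash, so adjacent slashes yield empty entries.
    static String[] splitOnSingleSlash(String path) {
        return path.split("/");
    }

    // Splits on runs of one or more slashes, so no empty entries appear
    // between path components.
    static String[] splitOnSlashRun(String path) {
        return path.split("/+");
    }

    public static void main(String[] args) {
        String path = "user//hive///warehouse"; // made-up sample path
        // prints [user, , hive, , , warehouse]
        System.out.println(Arrays.toString(splitOnSingleSlash(path)));
        // prints [user, hive, warehouse]
        System.out.println(Arrays.toString(splitOnSlashRun(path)));
    }
}
```

Note that with either regex a path starting with a slash still produces an 
empty first element, so a caller would need to handle that case separately.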



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
