[
https://issues.apache.org/jira/browse/SENTRY-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vadim Spector updated SENTRY-2014:
----------------------------------
Attachment: SENTRY-2014.01.patch
> Incorrect handling of HDFS paths with multiple slashes
> ------------------------------------------------------
>
> Key: SENTRY-2014
> URL: https://issues.apache.org/jira/browse/SENTRY-2014
> Project: Sentry
> Issue Type: Bug
> Reporter: Vadim Spector
> Assignee: Vadim Spector
> Attachments: SENTRY-2014.01.patch
>
>
> There are at least three places in the code where HDFS paths may not be
> parsed correctly:
> a) PathsUpdate.parsePath() does not collapse duplicate slashes in the
> path portion of the URI into one slash. This method is used when getting paths
> data from HMS store. HDFS paths with duplicate slashes are perfectly legal
> and the specs refer to UNIX guidelines saying that multiple slashes should be
> treated as single slashes. If we keep multiple slashes in the path, such a
> path may be incorrectly split into path entries with some entries being
> empty, ultimately resulting in hard-to-troubleshoot ACL problems in the
> field. We should not assume that the URIs fed into parsePath() have already
> been normalized. It's easier to fix the code.
> b) NotificationProcessor.splitPath() is using "/" regex instead of the
> correct "/+" one. While the inputs to this class _may_ be controlled by
> Sentry software, which _may_ normalize paths properly, it is better not to
> make such assumptions and just fix the code.
> c) SentryStore.retrieveFullPathsImageCore() splits paths retrieved from
> the database with path.split("/") instead of path.split("/+").
> This may result in HDFS sync failures.
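
To illustrate the difference described in (a) through (c), here is a minimal
standalone sketch. It is not taken from the attached patch or from the Sentry
code base: the class name SlashSplitDemo, its helper methods, and the sample
URI are hypothetical and exist only to show how String.split("/") and
String.split("/+") behave on a path with duplicate slashes.

```java
import java.net.URI;
import java.util.Arrays;

// Hypothetical demo class; not part of Sentry.
public class SlashSplitDemo {

    // Collapse every run of slashes in a path string into a single slash.
    static String collapseSlashes(String path) {
        return path.replaceAll("/+", "/");
    }

    // Split on each individual slash: duplicate slashes yield empty entries.
    static String[] splitNaive(String path) {
        return path.split("/");
    }

    // Split on runs of slashes: duplicate slashes act as one separator.
    static String[] splitTolerant(String path) {
        return path.split("/+");
    }

    public static void main(String[] args) {
        // Assumed sample URI; only the path portion is examined.
        String path = URI.create("hdfs://nn:8020/warehouse//db///tbl").getPath();

        System.out.println("path:      " + path);
        System.out.println("collapsed: " + collapseSlashes(path));
        System.out.println("split /:   " + Arrays.toString(splitNaive(path)));
        System.out.println("split /+:  " + Arrays.toString(splitTolerant(path)));
    }
}
```

With split("/") the interior entries for "//" and "///" come back as empty
strings, which is the kind of empty path entry described in (a). With
split("/+") only the leading empty entry remains (because the path starts
with a slash), so callers still need to skip or account for that first
element in either case.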
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)