[
https://issues.apache.org/jira/browse/SENTRY-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209896#comment-16209896
]
Vadim Spector edited comment on SENTRY-1993 at 10/18/17 7:46 PM:
-----------------------------------------------------------------
[[email protected]], there is a way.
For some weird reason, TestHMSPathsFullDump.java uses unofficial method
_addPathsToAuthzObject(String, List<String>), where each String in a List is a
path. It also happens to automatically "fix" double slashes in those paths, so
no wonder, your test passes even without your fix. "Testing" APIs that are
never called apart from testing is bogus and needs to be fixed, but that's a
different story ...
In reality on Sentry HMS plugin where full dump is actually generated, the
following method is called instead: addPathsToAuthzObject(String,
List<List<String>>), where each element in an outer List is a path parsed into
a List<String>. And parsed incorrectly, so each additional slash results in
additional empty path entry.
So, the right way to make the test fail would be to pass your
"after_double_slash" entry via official API as
_hmsPaths.addPathsToAuthzObject("db1.tbl11",
Arrays.asList(Arrays.asList("user", "hive", "warehouse", "db1", "tbl11",
"part_duplicate2", "", "after_double_slash")));_
Then the test fails with StringIndexOutOfBoundsException as expected. If you
introduce your fix, then the test still fails, because your assert statement
_Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")),
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1",
"tbl11", "part_duplicate2", "after_double_slash"}, false));_
should be replaced with (notice "" path entry):
_Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")),
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1",
"tbl11", "part_duplicate2", "", "after_double_slash"}, false));_
I verified it locally, so rest assured your patch works. The reason Im not sure
we should "fix" the test code this way is because it is weird and hdfs path
handling has to be fixed. So, once we fix it, your test would need to be fixed
as well (perhaps HMSPaths should ignore empty "" entry in List<String>). So, I
suggest to keep your patch as-is.
was (Author: [email protected]):
[[email protected]], there is a way.
For some weird reason, TestHMSPathsFullDump.java uses unofficial method
_addPathsToAuthzObject(String, List<String>), where each String in a List is a
path. It also happens to automatically "fix" double slashes in those paths, so
no wonder, your test passes even without your fix. "Testing" APIs that are
never called apart from testing is bogus and needs to be fixed, but that's a
different story ...
In reality on Sentry HMS plugin where full dump is actually generated, the
following method is called instead: addPathsToAuthzObject(String,
List<List<String>>), where each element in an outer List is a path parsed into
a List<String>. And parsed incorrectly, so each additional slash results in
additional empty path entry.
So, the right way to make the test fail would be to pass your
"after_double_slash" entry via official API as
_hmsPaths.addPathsToAuthzObject("db1.tbl11",
Arrays.asList(Arrays.asList("user", "hive", "warehouse", "db1", "tbl11",
"part_duplicate2", "", "after_double_slash")));_
Then the test fails with StringIndexOutOfBoundsException as expected. If you
introduce your fix, then the test still fails, because your assert statement
_Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")),
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1",
"tbl11", "part_duplicate2", "after_double_slash"}, false));_
should be replaced with (notice "" path entry):
_Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")),
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1",
"tbl11", "part_duplicate2", "", "after_double_slash"}, false));_
I verified it locally, so rest assured your patch works. The reason Im not sure
we should "fix" the test code this way is because it is weird and hdfs path
handling has to be fixed. So, once we fix it, your test would need to be fixed
as well (perhaps HMSPaths should ignore empty "" entry in List<String>). So, I
suggest to keep your patch as-is.
> StringIndexOutOfBoundsException in HMSPathsDumper.java
> ------------------------------------------------------
>
> Key: SENTRY-1993
> URL: https://issues.apache.org/jira/browse/SENTRY-1993
> Project: Sentry
> Issue Type: Bug
> Reporter: Misha Dmitriev
> Assignee: Misha Dmitriev
> Attachments: SENTRY-1993.01.patch
>
>
> The following line in HMSPathsDumper.java is causing
> StringIndexOutOfBoundsException:
> {code}
> if (tChildPathElement.charAt(0) == DupDetector.REPLACEMENT_STRING_PREFIX) {
> {code}
> It only happens when a path element is "", when someone mistakenly specifies
> hdfs path with two "/" in the path section, like
> hdfs://server//element1//element2, instead of
> hdfs://server/element1/element2. In principle, such paths are invalid, but
> this code should be made resistant to them anyway.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)