[ 
https://issues.apache.org/jira/browse/SENTRY-1993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16209896#comment-16209896
 ] 

Vadim Spector edited comment on SENTRY-1993 at 10/18/17 7:50 PM:
-----------------------------------------------------------------

[[email protected]], there is a way.

For some weird reason, TestHMSPathsFullDump.java uses unofficial method 
addPathsToAuthzObject(String, List<String>), where each String in a List is a 
path. It also happens to automatically "fix" double slashes in those paths, so 
no wonder, your test passes even without your fix. "Testing" APIs that are 
never called apart from testing is bogus and needs to be fixed, but that's a 
different story ...

In reality on Sentry HMS plugin where full dump is actually generated, the 
following method is called instead: addPathsToAuthzObject(String, 
List<List<String>>), where each element in an outer List is a path parsed into 
a List<String>. And parsed incorrectly, so each additional slash results in 
additional empty path entry.

So, the right way to make the test fail would be to pass your 
"after_double_slash" entry via official API as
{code}
hmsPaths.addPathsToAuthzObject("db1.tbl11", Arrays.asList(Arrays.asList("user", 
"hive", "warehouse", "db1", "tbl11", "part_duplicate2", "", 
"after_double_slash")));
{code}
Then the test fails with StringIndexOutOfBoundsException as expected. If you 
introduce your fix, then the test still fails, because your assert statement
{code}
Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")), 
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1", 
"tbl11", "part_duplicate2", "after_double_slash"}, false));
{code}
should be replaced with (notice "" path entry):
{code}
Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")), 
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1", 
"tbl11", "part_duplicate2", "", "after_double_slash"}, false));
{code}

I verified it locally, so rest assured your patch works. The reason Im not sure 
we should "fix" the test code this way is because it is weird and hdfs path 
handling has to be fixed. So, once we fix it, your test would need to be fixed 
as well (perhaps HMSPaths should ignore empty "" entry in List<String>). So, I 
suggest to keep your patch as-is.

[~spena] [~akolb]


was (Author: [email protected]):
[[email protected]], there is a way.

For some weird reason, TestHMSPathsFullDump.java uses unofficial method 
addPathsToAuthzObject(String, List<String>), where each String in a List is a 
path. It also happens to automatically "fix" double slashes in those paths, so 
no wonder, your test passes even without your fix. "Testing" APIs that are 
never called apart from testing is bogus and needs to be fixed, but that's a 
different story ...

In reality on Sentry HMS plugin where full dump is actually generated, the 
following method is called instead: addPathsToAuthzObject(String, 
List<List<String>>), where each element in an outer List is a path parsed into 
a List<String>. And parsed incorrectly, so each additional slash results in 
additional empty path entry.

So, the right way to make the test fail would be to pass your 
"after_double_slash" entry via official API as
{code}
hmsPaths.addPathsToAuthzObject("db1.tbl11", Arrays.asList(Arrays.asList("user", 
"hive", "warehouse", "db1", "tbl11", "part_duplicate2", "", 
"after_double_slash")));
{code}
Then the test fails with StringIndexOutOfBoundsException as expected. If you 
introduce your fix, then the test still fails, because your assert statement
{code}
Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")), 
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1", 
"tbl11", "part_duplicate2", "after_double_slash"}, false));
{code}
should be replaced with (notice "" path entry):
{code}
Assert.assertEquals(new HashSet<String>(Arrays.asList("db1.tbl11")), 
hmsPaths2.findAuthzObject(new String[]{"user", "hive", "warehouse", "db1", 
"tbl11", "part_duplicate2", "", "after_double_slash"}, false));
{code}

I verified it locally, so rest assured your patch works. The reason Im not sure 
we should "fix" the test code this way is because it is weird and hdfs path 
handling has to be fixed. So, once we fix it, your test would need to be fixed 
as well (perhaps HMSPaths should ignore empty "" entry in List<String>). So, I 
suggest to keep your patch as-is.

> StringIndexOutOfBoundsException in HMSPathsDumper.java
> ------------------------------------------------------
>
>                 Key: SENTRY-1993
>                 URL: https://issues.apache.org/jira/browse/SENTRY-1993
>             Project: Sentry
>          Issue Type: Bug
>            Reporter: Misha Dmitriev
>            Assignee: Misha Dmitriev
>         Attachments: SENTRY-1993.01.patch
>
>
> The following line in HMSPathsDumper.java is causing 
> StringIndexOutOfBoundsException:
> {code}
> if (tChildPathElement.charAt(0) == DupDetector.REPLACEMENT_STRING_PREFIX) {
> {code}
> It only happens when a path element is "", when someone mistakenly specifies 
> hdfs path with two "/" in the path section, like 
> hdfs://server//element1//element2, instead of 
> hdfs://server/element1/element2. In principle, such paths are invalid, but 
> this code should be made resistant to them anyway.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to