[ 
https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706269#comment-13706269
 ] 

Sushanth Sowmyan commented on HIVE-3756:
----------------------------------------

I have a few more thoughts on this. Let's walk through an example:

Let's say Parent Dir d1 has permission/group combination A.
Let's say directory d2 inside Parent Dir has permission/group combination B.

In the case of non-partitioned tables, d1 will be the database/warehouse dir, 
and d2 the table dir.
In the case of partitioned tables, d1 will be the table directory and d2 the 
appropriate partition directories.

If the flag to inherit permissions were off, then whatever data is loaded, be 
it files inside d2 (as during a load operation) or a replacement for d2 and 
everything in it (as during an insert overwrite operation), would get yet 
another permission/group combination C, which is a function of the user's 
current umask and the user's default group.
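
To make the permission half of combination C concrete, here is a minimal 
sketch of how a umask yields the mode of newly created entries. It assumes 
the usual POSIX-style rule (base mode 666 for files and 777 for directories, 
with the umask bits cleared); the function name is hypothetical and is not 
part of Hive or Hadoop.

```python
def mode_from_umask(umask: int, is_dir: bool) -> int:
    """Return the mode a new file or directory gets under the given umask.

    Assumption: base mode is 0o777 for directories and 0o666 for files,
    and every bit set in the umask is cleared from that base.
    """
    base = 0o777 if is_dir else 0o666
    return base & ~umask


# With the common umask 022, directories end up 755 and files 644.
print(oct(mode_from_umask(0o022, is_dir=True)))   # 0o755
print(oct(mode_from_umask(0o022, is_dir=False)))  # 0o644
```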

The purpose behind the subdir inherit permissions flag is to make this 
behaviour go away, and to be able to use the parent dir's permissions/group 
when possible. So far, so good.

Let's say, for purposes of this entire discussion from now onwards, the flag to 
inherit permissions is on.

Now, if we load data into d2 without using overwrite, files inside d2 get 
permissions/group combination B.
If we load data into d2 using overwrite, we overwrite d2 itself, and thus d2 
takes on d1's permissions, and so do the files inside, resulting in d2 and the 
files inside d2 having permissions/group combination A.
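
The two cases above can be sketched as a small model. This is only an 
illustration of the behaviour described here with the inherit flag on, not 
Hive's implementation; the Perm type and the after_load function are 
hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Perm:
    """A permissions/group combination, e.g. A for d1 and B for d2."""
    mode: int
    group: str


def after_load(d1: Perm, d2: Perm, overwrite: bool) -> Tuple[Perm, Perm]:
    """Return (combination on d2, combination on files in d2) after a load.

    Assumption: the inherit-permissions flag is on.
    """
    if overwrite:
        # d2 is replaced, so the new d2 and its files inherit from d1.
        return d1, d1
    # d2 is kept, so new files inherit from d2 itself.
    return d2, d2


A = Perm(mode=0o755, group="warehouse")  # hypothetical values for d1
B = Perm(mode=0o770, group="analysts")   # hypothetical values for d2
print(after_load(A, B, overwrite=False) == (B, B))
print(after_load(A, B, overwrite=True) == (A, A))
```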

--

While this behaviour is consistent, I find that from a user's perspective, if 
they create a table (say, unpartitioned), chmod/chgrp it to B, and then load 
data into it using an Insert-Overwrite, they still expect that they are only 
overwriting the data inside the table dir, and that the table dir still has 
permissions/group combination B. They don't want it to be replaced by "A", the 
parent db dir's permissions/group, and they don't want "C", the 
umask/current-user-default-group.

Now, whether this requires a new flag that overrides 
"hive.warehouse.subdir.inherit.perms", or whether 
"hive.warehouse.subdir.inherit.perms" itself should work this way, is still up 
for discussion, but there is now a need for an additional requirement:

"If the directory being moved in already exists, and will be deleted so that 
this can be placed, then instead of going with the parent permissions, it 
should go with the previous dir's permissions."
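
A minimal sketch of that rule, treating each permissions/group combination as 
an opaque label. The function and parameter names are hypothetical and do not 
correspond to any existing Hive API; this only illustrates the proposed 
decision, assuming the inherit flag is on.

```python
from typing import Optional


def perms_for_moved_dir(parent: str, existing: Optional[str]) -> str:
    """Pick the combination for a directory being moved into place.

    parent   -- combination on the parent dir (e.g. "A")
    existing -- combination on the dir being replaced (e.g. "B"),
                or None if no such dir exists yet
    """
    if existing is not None:
        # Proposed: keep what the replaced directory had.
        return existing
    # Nothing is being replaced, so inherit from the parent as before.
    return parent


print(perms_for_moved_dir(parent="A", existing="B"))   # B
print(perms_for_moved_dir(parent="A", existing=None))  # A
```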

Thoughts?

This can be a separate jira if people feel like it should be, but I think it's 
also a minor modification of this current jira.
                
> "LOAD DATA" does not honor permission inheritence
> -------------------------------------------------
>
>                 Key: HIVE-3756
>                 URL: https://issues.apache.org/jira/browse/HIVE-3756
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization, Security
>    Affects Versions: 0.9.0
>            Reporter: Johndee Burks
>            Assignee: Chaoyu Tang
>         Attachments: HIVE-3756_1.patch, HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the 
> table does not maintain permission inheritance. This remains true even with 
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data 
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse 
> directory and you will see that the inheritance of permissions worked for the 
> table directory but not for the data. 
