[ https://issues.apache.org/jira/browse/HIVE-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706269#comment-13706269 ]
Sushanth Sowmyan commented on HIVE-3756:
----------------------------------------

I have a few more thoughts on this. Let's walk through an example:

Let's say parent dir d1 has permission/group combination A, and directory d2 inside d1 has permission/group combination B. In the case of non-partitioned tables, d1 will be the database/warehouse dir and d2 the table dir. In the case of partitioned tables, d1 will be the table directory and d2 the appropriate partition directory.

If the flag to inherit permissions is off, then whatever data is loaded, be it files inside d2 (as during a load operation) or a replacement for d2 and everything in it (as during an insert overwrite operation), will have yet another permission/group combination C, which is a function of the user's current umask and the user's default group.

The purpose behind the subdir inherit permissions flag is to make this behaviour go away, and to use the parent dir's permissions/group when possible. So far, so good.

For the rest of this discussion, let's say the flag to inherit permissions is on.

Now, if we load data into d2 without using overwrite, files inside d2 get permission/group combination B. If we load data into d2 using overwrite, we overwrite d2 itself, and thus d2 takes on d1's permissions, and so do the files inside. The result is that d2 and the files inside d2 have permission/group combination A.

While this behaviour is consistent, I find that, from a user's perspective, if they create a table (say unpartitioned), chmod/chgrp it to B, and then load data into it using an insert overwrite, they still expect that they are only overwriting the data inside the table dir, and that the table dir still has permission/group combination B. They don't want it to be replaced by A, the parent db dir's permissions/group, and they don't want C, the umask/current-user-default-group combination.
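To make the cases above easier to compare, here is a small illustrative model in Python. It is only a sketch of the behaviour described in this comment, not Hive code: `Perms`, `perms_after_load`, and the `keep_existing` switch are hypothetical names, and the handling of a d2 that does not exist yet is an assumption on my part.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Perms:
    """A permission/group combination, e.g. Perms(0o770, "analysts")."""
    mode: int
    group: str


def perms_after_load(
    parent: Perms,             # combination A, on d1
    existing: Optional[Perms],  # combination B, on d2 (None if d2 does not exist)
    user_default: Perms,       # combination C, from umask + default group
    *,
    inherit: bool,             # models hive.warehouse.subdir.inherit.perms
    overwrite: bool,           # insert overwrite vs. plain load
    keep_existing: bool = False,  # hypothetical switch for the user's expectation
) -> Tuple[Perms, Perms]:
    """Return (perms of d2, perms of the files inside d2) after the load."""
    if existing is None or overwrite:
        # d2 is created or replaced, so d2 and its files share one combination.
        if inherit and keep_existing and existing is not None:
            new_dir = existing      # user's expectation: stay at B
        elif inherit:
            new_dir = parent        # behaviour described above: becomes A
        else:
            new_dir = user_default  # flag off: becomes C
        return new_dir, new_dir
    # Plain load into an existing d2: the directory itself is untouched.
    files = existing if inherit else user_default
    return existing, files


A = Perms(0o755, "hive")
B = Perms(0o770, "analysts")
C = Perms(0o750, "users")

print(perms_after_load(A, B, C, inherit=True, overwrite=False))
print(perms_after_load(A, B, C, inherit=True, overwrite=True))
print(perms_after_load(A, B, C, inherit=True, overwrite=True, keep_existing=True))
```

With the flag on, the second call models the case where d2 ends up with A after an overwrite, and the third models what the user expects, namely that d2 keeps B.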
Now, whether this requires a new flag that overrides "hive.warehouse.subdir.inherit.perms", or whether "hive.warehouse.subdir.inherit.perms" itself should work this way, is still up for discussion. Either way, there is now a need for an additional requirement:

"If the directory being moved in already exists, and will be deleted so that this can be placed, then instead of going with the parent permissions, it should go with the previous dir's permissions."

Thoughts? This can be a separate jira if people feel it should be, but I think it's also a minor modification of this current jira.

> "LOAD DATA" does not honor permission inheritence
> -------------------------------------------------
>
>                 Key: HIVE-3756
>                 URL: https://issues.apache.org/jira/browse/HIVE-3756
>             Project: Hive
>          Issue Type: Bug
>          Components: Authorization, Security
>    Affects Versions: 0.9.0
>            Reporter: Johndee Burks
>            Assignee: Chaoyu Tang
>         Attachments: HIVE-3756_1.patch, HIVE-3756.patch
>
>
> When a "LOAD DATA" operation is performed the resulting data in hdfs for the
> table does not maintain permission inheritance. This remains true even with
> the "hive.warehouse.subdir.inherit.perms" set to true.
> The issue is easily reproducible by creating a table and loading some data
> into it. After the load is complete just do a "dfs -ls -R" on the warehouse
> directory and you will see that the inheritance of permissions worked for the
> table directory but not for the data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira