[ https://issues.apache.org/jira/browse/HDFS-15638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xinli Shang updated HDFS-15638:
-------------------------------
Description:
Problem: Currently, when a user tries to access a file, he/she needs the
permissions of its parent and ancestors as well as the permission of that
file. This is generally correct, but for Hive table directories/files, all the
files under a partition or even a table usually have the same permissions for
the same set of ACL groups. Although the permissions and ACL groups are the
same, the writer still needs to call setfacl() for every file. This results in
a huge number of RPC calls to the NN. HDFS has default ACLs to solve that, but
they only apply to create and copy, not to rename. However, in Hive ETL,
rename is very common.
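To make the create-versus-rename gap concrete, here is a toy in-memory model.
It is only a sketch and does not use the HDFS API: the Dir class, the rename
helper, and the ACL string are hypothetical names chosen for illustration.

```python
# Toy model of default-ACL behaviour; every name here is hypothetical.
class Dir:
    def __init__(self, default_acl=None):
        self.default_acl = default_acl
        self.files = {}  # file name -> ACL string (None means no ACL set)

    def create(self, name):
        # A newly created child picks up the directory's default ACL.
        self.files[name] = self.default_acl


def rename(src, name, dst):
    # A renamed file keeps whatever ACL it already had.
    dst.files[name] = src.files.pop(name)


staging = Dir(default_acl=None)
table = Dir(default_acl="group:analysts:r--")

table.create("part-0")            # created in place: inherits the default ACL
staging.create("part-1")          # written to a staging directory first
rename(staging, "part-1", table)  # then moved into the table directory

# Files that would still need an explicit setfacl() call after the rename.
needs_setfacl = [n for n, acl in table.files.items() if acl != table.default_acl]
```

In this model only the renamed file ends up in needs_setfacl, which is the
per-file call the writer has to make today.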
Proposal: Add a 1-bit flag to directory inodes to indicate whether or not the
directory is a Hive table directory. If that flag is set, then all the
sub-directories and files under it will just use its permission and ACL group
settings. This way, Hive ETL doesn't need to set permissions at the file level.
If that flag is not set (the default), everything works as before. Setting or
unsetting that flag would require admin privilege.
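A minimal sketch of how such a flat check could behave, written as a toy model
rather than NameNode code. The Inode class, the flat field, and the readers set
(a simplified stand-in for permission bits plus ACL entries) are all
hypothetical. It also assumes that, if flagged directories are nested, the
outermost one governs, and it models only the final access check, not the
traversal checks on ancestors above the flagged directory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Inode:
    name: str
    readers: set                      # simplified stand-in for permissions + ACLs
    parent: Optional["Inode"] = None
    flat: bool = False                # hypothetical 1-bit flag on directories


def governing_inode(inode):
    """Return the inode whose settings decide access to `inode`."""
    governing = inode
    node = inode.parent
    while node is not None:
        if node.flat:
            governing = node  # keep walking so the outermost flagged dir wins
        node = node.parent
    return governing


def can_read(user, inode):
    return user in governing_inode(inode).readers


root = Inode("/", {"hive"})
table = Inode("sales", {"hive", "alice"}, parent=root, flat=True)
part = Inode("dt=1", {"hive"}, parent=table)
f = Inode("part-0", {"hive"}, parent=part)       # no per-file setfacl was done
other = Inode("tmp.txt", {"hive"}, parent=root)  # outside any flagged directory
```

Here can_read("alice", f) is true because the flagged table directory governs
the file, while can_read("alice", other) is false because the file's own
settings still apply outside a flagged directory.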
was:
Problem: Currently, when a user tries to access a file, he/she needs not only
the permission of that file but also the permissions of its parent and
ancestors. This is correct, but for Hive table directories/files, all the files
under a partition or even a table usually have the same permissions for the
same set of ACL groups. Although the permissions and ACL groups are the same,
the writer sometimes still needs to call setfacl() for every file. This results
in a huge number of RPC calls to the NN. HDFS has default ACLs to solve that,
but they only apply to create and copy, not to rename. However, in Hive ETL,
rename is very common.
Proposal: Add a 1-bit flag to directory inodes to indicate whether or not the
directory is a Hive table directory. If that flag is set, then all the
sub-directories and files under it will just use its permission and ACL group
settings. This way, Hive ETL doesn't need to set permissions at the file level.
If that flag is not set (the default), everything works as before. Setting or
unsetting that flag would require admin privilege.
> Make Hive tables directory permission check flat
> -------------------------------------------------
>
> Key: HDFS-15638
> URL: https://issues.apache.org/jira/browse/HDFS-15638
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs
> Reporter: Xinli Shang
> Priority: Major
>
> Problem: Currently, when a user tries to access a file, he/she needs the
> permissions of its parent and ancestors as well as the permission of that
> file. This is generally correct, but for Hive table directories/files, all
> the files under a partition or even a table usually have the same permissions
> for the same set of ACL groups. Although the permissions and ACL groups are
> the same, the writer still needs to call setfacl() for every file. This
> results in a huge number of RPC calls to the NN. HDFS has default ACLs to
> solve that, but they only apply to create and copy, not to rename. However,
> in Hive ETL, rename is very common.
> Proposal: Add a 1-bit flag to directory inodes to indicate whether or not the
> directory is a Hive table directory. If that flag is set, then all the
> sub-directories and files under it will just use its permission and ACL group
> settings. This way, Hive ETL doesn't need to set permissions at the file
> level. If that flag is not set (the default), everything works as before.
> Setting or unsetting that flag would require admin privilege.
>