Jakub Pastuszek created HIVE-11734:
--------------------------------------

             Summary: Hive Server2 not impersonating HDFS for CREATE TABLE/DATABASE with KERBEROS auth
                 Key: HIVE-11734
                 URL: https://issues.apache.org/jira/browse/HIVE-11734
             Project: Hive
          Issue Type: Bug
          Components: Authorization
    Affects Versions: 1.1.1
            Reporter: Jakub Pastuszek

My configuration is as follows:

{code}
hive-site.xml:
hive.server2.enable.doAs=true
hive.metastore.execute.setugi=true
hive.security.metastore.authorization.auth.reads=true
hive.metastore.sasl.enabled=true
hive.server2.authentication=KERBEROS
hive.server2.thrift.sasl.qop=auth-conf
hive.warehouse.subdir.inherit.perms=false
...

hdfs-site.xml:
dfs.block.access.token.enable=true
fs.permissions.umask-mode=027
...

core-site.xml:
hadoop.security.authentication=kerberos
hadoop.security.authorization=true
hadoop.proxyuser.hive.hosts=localhost,master
hadoop.proxyuser.hive.groups=*
...
{code}

When I create a database or a table as a Kerberos-authenticated (kinit'ed) user through beeline (shell), the HDFS directories that Hive creates are owned by the 'hive' user, and their group is the same as the parent directory's ('data' in my case). The 'hive' user does not belong to that group at all, although it is in the supergroup.

When I then load data (or run anything else that uses map-reduce), the table files end up owned by the kinit'ed user, and the user running the YARN container is also the kinit'ed user (not 'hive').

This causes permission problems when I run queries that use map-reduce, since I do not own the database and table directories. It also allows anybody to drop my database or table, since that operation is performed as the 'hive' user, which is in the supergroup.

What I want is for DDL queries to use the kinit'ed user when accessing HDFS, so that database and table directories end up owned by that user. Is this a bug or a configuration problem?

The group should also be the user's primary group (inherit.perms=false), not the group of the parent directory. That way I can use owner/group authorisation on HDFS to grant or restrict access by group.

As it stands, this is a serious security issue, and it renders the whole doAs/impersonation system useless for me.

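For anyone trying to confirm the same symptom, the ownership mismatch can be checked by comparing a `hdfs dfs -ls` listing against the expected owner. The following is only a minimal Python sketch, not part of the original report. It assumes the usual `-ls` column order (permissions, replication, owner, group, size, date, time, path). The user name `jakub`, the warehouse path `/user/hive/warehouse`, the database names and the timestamps in the sample listing are all made up for illustration.

```python
from typing import List, NamedTuple


class Entry(NamedTuple):
    permissions: str
    owner: str
    group: str
    path: str


def parse_ls_line(line: str) -> Entry:
    """Parse one entry line of `hdfs dfs -ls` output.

    Assumed column order: permissions, replication, owner, group,
    size, date, time, path.
    """
    # Split into at most 8 fields so a path containing spaces stays whole.
    parts = line.split(None, 7)
    if len(parts) < 8:
        raise ValueError(f"unexpected ls line: {line!r}")
    return Entry(parts[0], parts[2], parts[3], parts[7])


def wrongly_owned(listing: str, expected_owner: str) -> List[Entry]:
    """Return the entries whose owner differs from expected_owner."""
    mismatches = []
    for line in listing.splitlines():
        line = line.strip()
        # Skip blank lines and the "Found N items" summary line.
        if not line or line.startswith("Found "):
            continue
        entry = parse_ls_line(line)
        if entry.owner != expected_owner:
            mismatches.append(entry)
    return mismatches


# Hypothetical listing: user, paths and timestamps are illustrative only.
SAMPLE = """\
Found 2 items
drwxr-x---   - hive  data          0 2015-09-03 10:15 /user/hive/warehouse/mydb.db
drwxr-x---   - jakub data          0 2015-09-03 10:20 /user/hive/warehouse/other.db
"""

if __name__ == "__main__":
    for e in wrongly_owned(SAMPLE, "jakub"):
        print(f"{e.path}: owned by {e.owner}:{e.group}, expected owner jakub")
```

In practice the listing would come from running `hdfs dfs -ls` on the warehouse directory as the kinit'ed user. Any directory reported here was created as a different user, which is the behaviour described above.
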
Also see my question on Serverfault: http://serverfault.com/questions/717483/hive-server2-not-impersonating-hdfs

Versions:

{code}
hadoop-0.20-mapreduce-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-client-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-hdfs-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-hdfs-namenode-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-mapreduce-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-mapreduce-historyserver-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-yarn-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hadoop-yarn-resourcemanager-2.6.0+cdh5.4.4+597-1.cdh5.4.4.p0.6.el6.x86_64
hive-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
hive-jdbc-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
hive-metastore-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
hive-server2-1.1.0+cdh5.4.4+157-1.cdh5.4.4.p0.6.el6.noarch
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)