If you are using Hive Server 2 through JDBC:

- The most common way is to have the data accessible only to the 'hive' user. Since the users don't have access to the underlying HDFS files, Hive can enforce column/row permissions.
- The other option is to use doAs and run the query as the end user. That requires giving the 'hive' user proxy privileges.
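A minimal sketch of the doAs model described above. The property names are the standard ones (`hive.server2.enable.doAs` in hive-site.xml, `hadoop.proxyuser.*` in the namenode's core-site.xml); the host and group values are illustrative placeholders, not recommendations:

```xml
<!-- hive-site.xml: run queries as the connected end user -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>

<!-- core-site.xml on the namenode: grant the 'hive' user proxy
     privileges. Host/group values below are placeholders. -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>hs2-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
```

With doAs off, only the first model applies and the proxy-user grants are unnecessary.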
If you aren't using Hive Server 2, the user acquires the tokens before the query gets submitted to Yarn. There are trade-offs in each of the models.

.. Owen

On Fri, Sep 20, 2019 at 9:37 AM Julien Phalip <jpha...@gmail.com> wrote:

> Hi,
>
> My understanding is that the most common (perhaps the only?) way to let
> users run Hive queries on datasets stored in HDFS is to configure Hive as
> a proxy user in the namenode's config.
>
> I'm wondering if, instead of using proxy user privileges, a Hive client
> could be configured to first collect HDFS delegation tokens for the user
> and then pass those tokens to the Hive server. That way, the Hive server
> would use the tokens to authenticate with HDFS on behalf of the user.
>
> Spark offers something similar to that with the
> spark.yarn.access.hadoopFileSystems
> <https://spark.apache.org/docs/latest/running-on-yarn.html#kerberos>
> property. By chance, is there a way to achieve the same thing for Hive when
> using a client like Beeline?
>
> Thank you,
>
> Julien
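For comparison, a sketch of the delegation-token route referenced above, under the assumption of a standard Kerberized cluster: the Spark side is a submit-time conf, and the HDFS side of token acquisition is the `hdfs fetchdt` CLI. The namenode URI, jar name, and renewer are placeholders:

```shell
# Spark: list the filesystems Spark should fetch delegation tokens for
# up front (namenode URI and application jar are placeholders)
spark-submit \
  --conf spark.yarn.access.hadoopFileSystems=hdfs://nn.example.com:8020 \
  app.jar

# HDFS: a Kerberos-authenticated user can fetch a delegation token to a
# file ('hive' as the renewer is an assumption for this scenario)
hdfs fetchdt --renewer hive /tmp/my.token
```

Beeline has no equivalent conf, which is the gap the question is pointing at.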