If you are using Hive Server 2 through JDBC:

- The most common way is to have the data accessible only to the 'hive' user. Since the users don't have access to the underlying HDFS files, Hive can enforce column/row permissions.
- The other option is to use doAs and run the query as the end user. That requires giving the 'hive' user proxy privileges.
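A minimal sketch of the doAs model described above. The property names are the standard ones (`hive.server2.enable.doAs` in hive-site.xml, `hadoop.proxyuser.*` in the namenode's core-site.xml); the host and group values are illustrative placeholders, not recommendations:

```xml
<!-- hive-site.xml: run queries as the connected end user -->
<property>
  <name>hive.server2.enable.doAs</name>
  <value>true</value>
</property>

<!-- core-site.xml on the namenode: grant the 'hive' user proxy
     privileges. Host/group values below are placeholders. -->
<property>
  <name>hadoop.proxyuser.hive.hosts</name>
  <value>hs2-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.hive.groups</name>
  <value>*</value>
</property>
```

With doAs off, only the first model applies and the proxy-user grants are unnecessary.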
If you aren't using Hive Server 2, the user acquires the tokens before the query gets submitted to Yarn. There are trade-offs in each of the models.

.. Owen

On Fri, Sep 20, 2019 at 9:37 AM Julien Phalip <jpha...@gmail.com> wrote:

> Hi,
>
> My understanding is that the most common (perhaps the only?) way to let
> users run Hive queries on datasets stored in HDFS is to configure Hive as
> a proxy user in the namenode's config.
>
> I'm wondering if, instead of using proxy user privileges, a Hive client
> could be configured to first collect HDFS delegation tokens for the user
> and then pass those tokens to the Hive server. That way, the Hive server
> would use the tokens to authenticate with HDFS on behalf of the user.
>
> Spark offers something similar to that with the
> spark.yarn.access.hadoopFileSystems
> <https://spark.apache.org/docs/latest/running-on-yarn.html#kerberos>
> property. By chance, is there a way to achieve the same thing for Hive when
> using a client like Beeline?
>
> Thank you,
>
> Julien
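For comparison, a sketch of the delegation-token route referenced above, under the assumption of a standard Kerberized cluster: the Spark side is a submit-time conf, and the HDFS side of token acquisition is the `hdfs fetchdt` CLI. The namenode URI, jar name, and renewer are placeholders:

```shell
# Spark: list the filesystems Spark should fetch delegation tokens for
# up front (namenode URI and application jar are placeholders)
spark-submit \
  --conf spark.yarn.access.hadoopFileSystems=hdfs://nn.example.com:8020 \
  app.jar

# HDFS: a Kerberos-authenticated user can fetch a delegation token to a
# file ('hive' as the renewer is an assumption for this scenario)
hdfs fetchdt --renewer hive /tmp/my.token
```

Beeline has no equivalent conf, which is the gap the question is pointing at.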