Re: Concerns about Hive authz2 support

Sergio Pena Mon, 02 Oct 2017 10:34:27 -0700

Sure.

First, here's what the Hive wiki says about the limitations of authz1:

The default authorization in Hive
<https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Authorization#LanguageManualAuthorization-3DefaultHiveAuthorization(LegacyMode)>
 is *not designed* with the intent to protect against malicious users
accessing data they should not be accessing. It only helps in preventing
users from accidentally doing operations they are not supposed to do. It is
also incomplete because it does not have authorization checks for many
operations including the grant statement. The authorization checks happen
during Hive query compilation. But as the user is allowed to
execute dfs commands, user-defined functions and shell commands, it is
possible to bypass the client security checks.

See
https://cwiki.apache.org/confluence/display/Hive/SQL+Standard+Based+Hive+Authorization

These limitations are why Hive introduced a new authorization API, called
authz2. However, I saw that Sentry already handles some of them, such as
the GRANT privilege checks (done on the Sentry server side). Sentry also
provides the SentryGrantRevokeTask to handle GRANT/REVOKE execution instead
of using the authz1 API that Hive provides.

Authz1 uses the following configurations:

On Mon, Oct 2, 2017 at 9:56 AM, Colm O hEigeartaigh <cohei...@apache.org>
wrote:

> Hi Sergio,
>
> Could you give some background as to what the differences are between
> "authz1" and "authz2"? Sorry if this is an obvious question :-)
>
> For the 1.8.0 release, authz1 was supported with Hive 1 and authz2 with
> Hive 2, so I assumed the separate bindings were related to the Hive
> versions being supported. Obviously this is not the case if we are still
> talking about supporting authz1 with Hive 2.0.
>
> Colm.
>
> On Fri, Sep 29, 2017 at 8:59 PM, Sergio Pena <sergio.p...@cloudera.com>
> wrote:
>
> > Hi All,
> >
> > We are running into some problems with the support of Hive Authz V2,
> > especially related to the workaround that parses Hive command strings in
> > Sentry using regular expressions to get some info that Hive is not
> > sending through the authz2 API. Hive 2.0 made some changes to commands
> > that caused issues with Sentry. These are fixed, but the concern about
> > doing this SQL parsing remains. We asked the Hive community to give us
> > extra SQL information, but we cannot use it in Sentry until a Hive
> > release is done. There are some concerns about the quality of authz2
> > too, such as create/drop table and functions calling Sentry twice for
> > authorization, and the lack of testing being done on it.
> >
> > The original idea for Sentry 2.0 (future release) was to drop authz1
> > support and use authz2 as the default, but the work is getting delayed
> > until Hive releases something. Now that we bumped the Hive version to
> > 2.0, I was wondering if we should continue with authz1 and keep authz2
> > as experimental support until Hive releases something we can consume to
> > fix our issues. Then we can deprecate authz1 in a future 2.x release and
> > remove it in a major version.
> >
> > I was thinking we could remove any hive-authz2 profile and just add the
> > hive-authz2 classes to the current sentry-binding-hive module so that
> > users are allowed to switch to either v1 or v2 (for testing). Also, for
> > the tests, we could find a way to run all sentry-tests-hive with v1 and
> > v2 to validate the quality of it.
> >
> > What does the PMC community think? Is it a good or bad idea?
> >
> > - Sergio
> >
>
>
>
> --
> Colm O hEigeartaigh
>
> Talend Community Coder
> http://coders.talend.com
>
