Hello Kudu developers,
Now that a large majority of the Apache Ranger integration work has landed, I
would like to revisit this plan and potentially move forward with removing
Sentry and upgrading Hive.
Steps 1 through 3 from the plan have been completed. Below are some related
commits:
1. Commit the Hive 3 preparation patch to simplify upgrading in the
future
- https://github.com/apache/kudu/commit/76b80e
- https://github.com/apache/kudu/commit/ab69837
- https://github.com/apache/kudu/commit/e20487a
- https://github.com/apache/kudu/commit/d96f8fc
- https://github.com/apache/kudu/commit/8fb170b
2. Verify the feasibility of upgrading with the mentioned POC patches,
but do not commit them.
- https://gerrit.cloudera.org/#/c/14020/
- https://gerrit.cloudera.org/#/c/13256/
3. Start work on an Apache Ranger integration for Kudu.
- https://github.com/apache/kudu/commit/f392503
- https://github.com/apache/kudu/commit/2ea8478
- https://github.com/apache/kudu/commit/e13fd4a
- https://github.com/apache/kudu/commit/0d29977
Since the original email was written, Sentry has not added support for
Hive 3, so step 4 is not an option at this time. As a result, I propose
that we move forward with step 5: start removing Sentry support and
upgrade to Hive 3.
The planned steps are as follows:
1. Rebase and commit the patch to disable Sentry tests.
2. Rebase and commit the patch to upgrade to Hive 3.
3. Document the removal of Sentry support in the release notes.
4. Remove Sentry tests and code.
- This can be done after the next release.
- Even with the tests removed and the build on Hive 3, Kudu should still
technically work with Hive 2 and Sentry. We will simply no longer be
testing or validating that combination.
Please let me know if you have any thoughts or feedback on the above plan.
Thank you,
Grant
On Tue, Aug 13, 2019 at 3:06 AM Adar Lieber-Dembo <[email protected]>
wrote:
> +1, thanks for all of the details.
>
> On Fri, Aug 9, 2019 at 3:21 PM Grant Henke <[email protected]> wrote:
> >
> > Hello Kudu developers,
> >
> > Recently I have started work on upgrading Kudu to use Apache Hive 3.x.
> > Given this is a major upgrade it does come with some challenges. As of
> Kudu
> > 1.10.0 we use Hive in the HMS synchronization feature. This feature
> > includes a Kudu server side notification listener and HMS client. It also
> > includes a Java side HMS plugin to enforce Kudu imperatives within the
> HMS.
> > That feature is useful on its own in many ways, but is also required for
> > fine-grained authorization via Apache Sentry.
> >
> > The primary challenge is that Apache Sentry currently does not support
> Hive
> > 3 and it will likely take a large effort to enable support. It is also
> > unclear if there is anyone in the Sentry community that wants to
> > contribute and release such support.
> >
> > I have started preliminary efforts to support Hive 3 in Kudu and the HMS
> > synchronization feature. This includes 3 patches. The first patch
> > <https://gerrit.cloudera.org/#/c/14018/> contains changes that work in both
> Hive
> > 2 and Hive 3 that minimize the work needed when we upgrade in the future.
> > This can be committed to master when reviewed and ready. The second patch
> > <https://gerrit.cloudera.org/#/c/14006> disables the Sentry integration
> so
> > I can test the changes required to support HMS synchronization on its
> own.
> > Those changes and testing are the third patch
> > <https://gerrit.cloudera.org/#/c/13256/>.
> >
> > Given fine-grained authorization is a critical feature for many users, we
> > can't remove Sentry support without providing an alternative
> authorization
> > implementation. At the same time we have started work on authorization
> via
> > Apache Ranger. Once that implementation exists and has been
> > contributed/released we can make a decision about how to move forward.
> >
> > Given what we know today and the current situation, here is my suggested
> > plan:
> >
> > 1. Commit the Hive 3 preparation patch to simplify upgrading in the
> > future
> > 2. Verify the feasibility of upgrading with the mentioned POC patches,
> > but do not commit them.
> > - This means we will remain on Hive 2 until step 4 or 5 below.
> > 3. Start work on an Apache Ranger integration for Kudu.
> > 4. If Hive 3 support is added in Sentry, consider upgrading to Hive 3
> > then.
> > 5. When Ranger support is complete, consider removing Sentry support
> in
> > favor of Ranger and upgrade to Hive 3.
> > - This may require a migration path from Sentry to Ranger.
> >
> > Please let me know if you have any thoughts or feedback on the above
> plan.
> >
> > Thank you,
> > Grant
> > --
> > Grant Henke
> > Software Engineer | Cloudera
> > [email protected] | twitter.com/gchenke | linkedin.com/in/granthenke
>
--
Grant Henke
Software Engineer | Cloudera
[email protected] | twitter.com/gchenke | linkedin.com/in/granthenke