[jira] [Commented] (RANGER-2128) Implement SparkSQL plugin

Don Bosco Durai (JIRA) Mon, 25 Jun 2018 16:02:08 -0700


    [ https://issues.apache.org/jira/browse/RANGER-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16522919#comment-16522919 ]


Don Bosco Durai commented on RANGER-2128:
-----------------------------------------

{quote}It has exposed Parser/Analyzer/Optimizer/Planner, which is so great for 
all users. It also makes it easier for users to call our plug-in.

1. spark-authorizer is designed as an optimizer rule for Spark SQL and is executed 
after all other default rules, because rules such as column pruning and projection 
pushdown should run first.
{quote}
I was wondering if it would be difficult to migrate your extension to use the 
official hook provided by Spark? If we can do that, then it might be easy to 
add Ranger features like dynamic UDF and row level filtering.
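
To make the ordering argument in the quoted point concrete, here is a toy sketch in plain Python. It is not Spark code: `Scan`, `prune_columns`, `make_authorizer`, and the policy are all hypothetical stand-ins, and the "official hook" above is presumably Spark's `SparkSessionExtensions` API, which this sketch does not use. It shows only one plausible reason for running the authorization rule last: after pruning, the rule checks just the columns the query really reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scan:
    """Toy logical plan node (hypothetical, not a Spark class)."""
    table: str
    columns: tuple  # columns read from the table
    output: tuple   # columns the query actually returns


def prune_columns(plan):
    """Toy stand-in for column pruning: keep only the columns the query returns."""
    kept = tuple(c for c in plan.columns if c in plan.output)
    return Scan(plan.table, kept, plan.output)


def make_authorizer(allowed):
    """Build a rule that rejects a plan reading any (table, column) not in `allowed`."""
    def authorize(plan):
        denied = [c for c in plan.columns if (plan.table, c) not in allowed]
        if denied:
            raise PermissionError(f"no SELECT on {plan.table}: {denied}")
        return plan  # the check never changes the plan
    return authorize


def run_rules(plan, rules):
    """Apply each rule in order, feeding the result to the next rule."""
    for rule in rules:
        plan = rule(plan)
    return plan


def is_allowed(plan, rules):
    """Return True if the rule chain completes without a PermissionError."""
    try:
        run_rules(plan, rules)
        return True
    except PermissionError:
        return False


# Hypothetical policy: the user may read only `name` from `employees`.
authorize = make_authorizer({("employees", "name")})

# SELECT name FROM employees: before optimization the scan still lists every column.
plan = Scan("employees", ("name", "salary"), ("name",))

# Authorizer last: pruning has already dropped `salary`, so the check passes.
checked = run_rules(plan, [prune_columns, authorize])
```

With the order reversed, `is_allowed(plan, [authorize, prune_columns])` is false in this toy model, because the unpruned scan still lists `salary`.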
{quote}2. spark-authorizer has to access the Hive SessionState object, which is not 
accessible to the Spark context classloader because Spark uses an isolated 
classloader to load the Hive client jars.
2.1 spark-authorizer itself will rewrite the SessionState object the first time to 
do privilege checking.
{quote}
I checked that. It is a pretty good hack that works :) I had to update it to 
support custom authentication. The current Ranger Hive plugin uses Hadoop UGI, 
which only knows Kerberos and simple auth.
{quote}2.2 kyuubi hacks Spark and turns off that classloader.
{quote}
I went through your documentation; it seems you have added a lot of good 
features. Currently, kyuubi is a custom build. Is it possible to integrate your 
extensions as an add-on to an existing deployment? That way, users could deploy 
the default Thrift Server and enable your features through some properties or 
code injection. We might then be able to support Livy as well with the same 
code base.
{quote}3. spark-authorizer reuses the Ranger Hive plugin (0.5), which contains 
Jersey dependencies that are incompatible with Spark's.
{quote}
There are a few limitations with Ranger 0.5; most notably, it doesn't support 
tag-based policies. I was thinking we should just implement a first-class plugin 
for SparkSQL using Ranger 0.7 or 1.0. It could use the same Hive 
ServiceDef/policies, but with a native implementation for SparkSQL. That way, we 
don't have to depend on the Hive libraries and their limitations.

 
{quote}And what are the steps I should follow to contribute Ranger?
{quote}
I have added you as a contributor to Ranger. You should be able to assign Jira 
issues to yourself and create new ones. I was thinking of splitting the work 
among those interested. Since you are familiar with the Spark code, do you want 
to look into the new extensions and see how we can implement basic authorization 
and advanced features like dynamic masking/UDF and row-level filtering? I can 
look into tag-based policies and also see if I can extract your current Spark 
Authorizer feature into a native SparkSQL Ranger plugin.
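
As a rough illustration of what masking and row-level filtering mean for a query, here is a toy sketch in plain Python. Everything in it is an assumption made for illustration: `rewrite_select` is a hypothetical helper, and the `mask(ssn)` expression and `region = 'EU'` filter are made-up policy values. A real plugin would most likely transform Spark's logical plan rather than SQL text.

```python
def rewrite_select(table, columns, row_filter=None, masks=None):
    """Build SELECT text with optional column masks and a row filter applied.

    Hypothetical helper for illustration only; not a Ranger or Spark API.
    """
    masks = masks or {}
    # Replace each masked column with its masking expression, keeping the column name.
    select_list = ", ".join(
        f"{masks[c]} AS {c}" if c in masks else c for c in columns
    )
    sql = f"SELECT {select_list} FROM {table}"
    # Restrict the visible rows when the policy supplies a filter predicate.
    if row_filter:
        sql += f" WHERE {row_filter}"
    return sql


# Hypothetical policy for one user: mask `ssn`, show only rows for the EU region.
sql = rewrite_select(
    "employees",
    ["name", "ssn"],
    row_filter="region = 'EU'",
    masks={"ssn": "mask(ssn)"},
)
print(sql)
```

The point of the sketch is that both features change what the user's query returns instead of simply allowing or denying it, which is why they need a hook that can rewrite the plan.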

Give me your thoughts and suggestions.

Thanks

> Implement SparkSQL plugin
> -------------------------
>
>                 Key: RANGER-2128
>                 URL: https://issues.apache.org/jira/browse/RANGER-2128
>             Project: Ranger
>          Issue Type: New Feature
>          Components: plugins, Ranger
>    Affects Versions: 1.1.0
>            Reporter: t oo
>            Priority: Major
>             Fix For: 1.1.0
>
>
> Implement SparkSQL plugin



