[ 
https://issues.apache.org/jira/browse/SUBMARINE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adesh Kumar Rao updated SUBMARINE-834:
--------------------------------------
    Description: 
Currently, the spark-security module is tightly coupled with 
"HiveTableRelation" and "MetastoreRelation". 

This does not work for Hive transactional (ACID-enabled) tables. Hive stores
data and metadata for transactional tables differently from non-transactional
tables, so Spark cannot read Hive transactional tables directly.

So even if the security module can enforce security on such tables, Spark
cannot actually read anything.

 

Reading Hive's transactional tables in Spark requires
[Spark-Acid|https://github.com/qubole/spark-acid], which is implemented as a
Spark datasource. Because the security module is tightly coupled with
"HiveTableRelation" and "MetastoreRelation", it does not provide authorization
support for any datasource.

 

The idea is to support authorization for Spark datasources, which have
db/table/column/partition concepts analogous to Hive's.

 

We can create generic interfaces for datasources. Each datasource can
implement them and then be authorized using the existing codebase.
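
As a rough illustration of the generic-interface idea, here is a minimal Java
sketch. Every name in it (AuthorizableRelation, AcidRelationStub,
ColumnPolicyAuthorizer) is hypothetical and is not part of spark-security,
Spark-Acid, or Ranger. The in-memory grant map only stands in for the existing
authorization code, and partitions are left out for brevity.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical interface: the metadata a datasource relation would expose
// so that a generic authorizer does not need to know its concrete type.
interface AuthorizableRelation {
    String database();
    String table();
    List<String> columns();
}

// Hypothetical stand-in for a datasource relation (for example one backed by
// Spark-Acid). This is NOT the real Spark-Acid class.
final class AcidRelationStub implements AuthorizableRelation {
    private final String database;
    private final String table;
    private final List<String> columns;

    AcidRelationStub(String database, String table, List<String> columns) {
        this.database = database;
        this.table = table;
        this.columns = columns;
    }

    public String database() { return database; }
    public String table() { return table; }
    public List<String> columns() { return columns; }
}

// Toy authorizer: maps a user to the "db.table.column" resources the user may
// select. An assumption for illustration only; a real check would be
// delegated to the existing authorization code.
final class ColumnPolicyAuthorizer {
    private final Map<String, Set<String>> grants;

    ColumnPolicyAuthorizer(Map<String, Set<String>> grants) {
        this.grants = grants;
    }

    // Returns true only if the user holds a grant for every column of the
    // relation. A relation with no columns has nothing to check.
    boolean canSelect(String user, AuthorizableRelation relation) {
        Set<String> userGrants = grants.getOrDefault(user, Set.of());
        for (String column : relation.columns()) {
            String resource =
                relation.database() + "." + relation.table() + "." + column;
            if (!userGrants.contains(resource)) {
                return false;
            }
        }
        return true;
    }
}

final class DatasourceAuthSketch {
    public static void main(String[] args) {
        AuthorizableRelation orders =
            new AcidRelationStub("sales", "orders", List.of("id", "amount"));
        ColumnPolicyAuthorizer authorizer = new ColumnPolicyAuthorizer(Map.of(
            "alice", Set.of("sales.orders.id", "sales.orders.amount"),
            "bob", Set.of("sales.orders.id")));

        System.out.println("alice: " + authorizer.canSelect("alice", orders));
        System.out.println("bob: " + authorizer.canSelect("bob", orders));
    }
}
```

The point of the sketch is that the authorizer depends only on the interface,
so any datasource that can report its database, table, and columns could be
checked without new coupling to a specific relation class.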

 

  was:
Currently, the spark-security module is tightly coupled with 
"HiveTableRelation" and "MetastoreRelation". 

 This does not work for Hive transactional (ACID enabled) table. Since Hive has 
changed the way data/metadata is stored for transactional tables when compared 
to non-transactional tables. Therefore, Spark can not read Hive transactional 
tables directly. 

So even if security module can enforce security on such tables, spark can't 
actually read anything. 

 

Reading hive's transactional tables in spark, needs 
[Spark-Acid|https://github.com/qubole/spark-acid], implemented as Spark 
Datasource. Since security module is tightly coupled with "HiveTableRelation" 
and "MetastoreRelation", it does not provides authorization support for any 
datasource.

 

The idea is to support Spark datasource authorization (which has 
db/table/column/partitions analogous, similar to hive). 

 

We can create generic interfaces for datasource, which each datasource can 
implement and then it can be authorized using the existing codebase.

 


> Ranger support for Spark Datasources
> ------------------------------------
>
>                 Key: SUBMARINE-834
>                 URL: https://issues.apache.org/jira/browse/SUBMARINE-834
>             Project: Apache Submarine
>          Issue Type: New Feature
>          Components: Security
>            Reporter: Adesh Kumar Rao
>            Priority: Major
>
> Currently, the spark-security module is tightly coupled with 
> "HiveTableRelation" and "MetastoreRelation". 
> This does not work for Hive transactional (ACID-enabled) tables. Hive stores
> data and metadata for transactional tables differently from
> non-transactional tables, so Spark cannot read Hive transactional tables
> directly.
> So even if the security module can enforce security on such tables, Spark
> cannot actually read anything.
>  
> Reading Hive's transactional tables in Spark requires
> [Spark-Acid|https://github.com/qubole/spark-acid], which is implemented as a
> Spark datasource. Because the security module is tightly coupled with
> "HiveTableRelation" and "MetastoreRelation", it does not provide
> authorization support for any datasource.
>  
> The idea is to support authorization for Spark datasources, which have
> db/table/column/partition concepts analogous to Hive's.
>  
> We can create generic interfaces for datasources. Each datasource can
> implement them and then be authorized using the existing codebase.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
