[
https://issues.apache.org/jira/browse/SUBMARINE-834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Adesh Kumar Rao updated SUBMARINE-834:
--------------------------------------
Description:
Currently, the spark-security module is tightly coupled with
"HiveTableRelation" and "MetastoreRelation".
This does not work for Hive transactional (ACID-enabled) tables, because Hive
stores data and metadata for transactional tables differently from
non-transactional tables. As a result, Spark cannot read Hive transactional
tables directly.
So even if the security module could enforce security on such tables, Spark
cannot actually read anything from them.
Reading Hive's transactional tables in Spark requires
[Spark-Acid|https://github.com/qubole/spark-acid], which is implemented as a
Spark datasource. Since the security module is tightly coupled with
"HiveTableRelation" and "MetastoreRelation", it does not provide authorization
support for any datasource.
The idea is to support authorization for Spark datasources whose
db/table/column/partition concepts are analogous to Hive's.
We can create generic interfaces that each datasource can implement, so that
it can then be authorized using the existing codebase.
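A rough sketch of what such a generic interface could look like is given here, written in Python purely for illustration. Every name in it (ResourceRef, AuthorizableRelation, AcidRelation, PolicyChecker) is hypothetical and is not part of spark-security, Spark-Acid, or Ranger. The toy grant table only stands in for the existing authorization path, which would receive the same db/table/column/partition description regardless of which datasource produced it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class ResourceRef:
    """Hive-like coordinates of what a relation exposes (hypothetical type)."""
    database: str
    table: str
    columns: Tuple[str, ...] = ()
    partitions: Tuple[str, ...] = ()


class AuthorizableRelation(ABC):
    """Hypothetical generic interface that each datasource would implement."""

    @abstractmethod
    def resource(self) -> ResourceRef:
        """Describe the relation in db/table/column/partition terms."""


class AcidRelation(AuthorizableRelation):
    """Stand-in for a datasource-backed relation (not the real Spark-Acid class)."""

    def __init__(self, qualified_name: str, columns: List[str]):
        database, table = qualified_name.split(".", 1)
        self._ref = ResourceRef(database, table, tuple(columns))

    def resource(self) -> ResourceRef:
        return self._ref


class PolicyChecker:
    """Toy stand-in for the existing authorization code path.

    Grants map a user to the (database, table, column) triples they may read.
    """

    def __init__(self, grants: Dict[str, Set[Tuple[str, str, str]]]):
        self._grants = grants

    def denied_columns(self, user: str, relation: AuthorizableRelation) -> List[str]:
        ref = relation.resource()
        allowed = self._grants.get(user, set())
        return [c for c in ref.columns
                if (ref.database, ref.table, c) not in allowed]

    def is_authorized(self, user: str, relation: AuthorizableRelation) -> bool:
        ref = relation.resource()
        allowed = self._grants.get(user, set())
        # Require at least one grant on the table, then check every column.
        has_table_grant = any((db, tbl) == (ref.database, ref.table)
                              for db, tbl, _ in allowed)
        return has_table_grant and not self.denied_columns(user, relation)
```

The point of the design is that the checker depends only on the interface, so a new datasource would need to supply a resource description rather than changes to the authorization logic.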
> Ranger support for Spark Datasources
> ------------------------------------
>
> Key: SUBMARINE-834
> URL: https://issues.apache.org/jira/browse/SUBMARINE-834
> Project: Apache Submarine
> Issue Type: New Feature
> Components: Security
> Reporter: Adesh Kumar Rao
> Priority: Major