Bosco created RANGER-4910:
-----------------------------
Summary: Develop Apache Ranger Plugin for Polaris to Enhance
Access Control for Apache Iceberg
Key: RANGER-4910
URL: https://issues.apache.org/jira/browse/RANGER-4910
Project: Ranger
Issue Type: New Feature
Components: plugins
Reporter: Bosco
Polaris, recently open-sourced by Snowflake, provides comprehensive technical
metadata management for Apache Iceberg. Key features of Polaris include:
- *RBAC (Role-Based Access Control):* Polaris supports RBAC for table and
view-level operations. [See
Documentation](https://polaris.io/#tag/Access-Control)
- *Role Management:* Polaris allows the creation of Principals with roles like
Data Engineer, Data Scientist, etc.
- *Catalog Roles:* Specialized roles like Catalog Administrators, Catalog
Readers, and Catalog Contributors can be defined to manage access to different
parts of the data catalog.
- *Granular Privileges:* Polaris provides fine-grained privileges for
operations on Tables, Views, Namespaces, and Catalogs. Examples include
`TABLE_CREATE`, `TABLE_READ_DATA`, `TABLE_WRITE_DATA`, `VIEW_CREATE`,
`NAMESPACE_CREATE`, `CATALOG_MANAGE_CONTENT`, and more.
- *Credential Vending:* Polaris vends credentials based on the specific table
the user is trying to access.
- *API for Role Management:* Polaris offers an API to manage grants for roles,
allowing fine-tuned control over data access.
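The role and privilege model in the list above can be pictured with a small sketch. It is illustrative only and is not the Polaris API: the `CatalogRole` and `Principal` classes, the `grant` and `is_allowed` helpers, and the role and table names are all hypothetical. Only the privilege names come from the list above. As a simplification, this toy model attaches catalog roles directly to a principal.

```python
from dataclasses import dataclass, field

# Privilege names are taken from the issue text; everything else is hypothetical.
PRIVILEGES = {
    "TABLE_CREATE", "TABLE_READ_DATA", "TABLE_WRITE_DATA",
    "VIEW_CREATE", "NAMESPACE_CREATE", "CATALOG_MANAGE_CONTENT",
}


@dataclass
class CatalogRole:
    """A catalog-scoped role holding (privilege, securable) grants."""
    name: str
    grants: set = field(default_factory=set)

    def grant(self, privilege: str, securable: str) -> None:
        # Reject privilege names that are not in the known set.
        if privilege not in PRIVILEGES:
            raise ValueError(f"unknown privilege: {privilege}")
        self.grants.add((privilege, securable))


@dataclass
class Principal:
    """A user or service identity that is assigned roles."""
    name: str
    roles: list = field(default_factory=list)

    def is_allowed(self, privilege: str, securable: str) -> bool:
        # Allowed if any assigned role carries an exactly matching grant.
        return any((privilege, securable) in r.grants for r in self.roles)


# Hypothetical role, principal, and table names.
reader = CatalogRole("catalog_reader")
reader.grant("TABLE_READ_DATA", "sales.orders")

analyst = Principal("data_scientist_1", roles=[reader])
can_read = analyst.is_allowed("TABLE_READ_DATA", "sales.orders")
can_write = analyst.is_allowed("TABLE_WRITE_DATA", "sales.orders")
```

In this sketch, access is denied unless a role explicitly carries the requested privilege on the requested object, which is the default-deny behaviour one would expect a Ranger policy to mirror.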
*Objective:*
To enhance the usability and security of Polaris for Apache Iceberg users, the
request is to develop an Apache Ranger plugin that integrates Polaris' access
control features with Apache Ranger. This integration will allow for
centralized and consistent management of access policies, audit logging, and
fine-grained access control across different tools used with Apache Iceberg.
*Use Cases:*
1. *Centralized Access Policy Management:*
- Implement centralized and consistent management of access policies for data
stored using Apache Iceberg across multiple tools and environments.
2. *Access Control for Data Engineering Workloads:*
- Manage and control access to datasets used by Data Engineering workloads
(e.g., Apache Spark) using coarse-grained, table-level policies.
3. *Fine-Grained Access Control for Data Analysts:*
- Provide fine-grained access control for Data Analysts using compute engines
like Trino. This control can be enforced by leveraging the native Ranger Plugin
in Trino, allowing for more granular control over data access at the table,
view, or even column level.
4. *Centralized Access Auditing:*
- Enable centralized collection and analysis of access audit logs across all
tools used to access datasets in Iceberg, ensuring comprehensive auditing and
compliance.
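To make use cases 2 through 4 concrete, here is a minimal sketch of how one shared policy set could serve both table-level and column-level checks while recording every decision for auditing. This is a toy evaluator, not Ranger's policy engine or plugin API: the `Policy` class, the `check_access` function, the `audit_log` list, and all user, table, and column names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: who may do what on a table, optionally per column."""
    users: frozenset
    table: str
    access: str                          # e.g. "select" or "update"
    columns: Optional[frozenset] = None  # None means the whole table


audit_log = []  # stand-in for a central audit store


def check_access(policies, user, table, access, column=None, tool="unknown"):
    """Return True if any policy permits the request; record every decision."""
    allowed = any(
        user in p.users
        and p.table == table
        and p.access == access
        # A column-restricted policy only matches requests for a listed column.
        and (p.columns is None or (column is not None and column in p.columns))
        for p in policies
    )
    audit_log.append(
        {"user": user, "table": table, "column": column,
         "access": access, "tool": tool, "allowed": allowed}
    )
    return allowed


policies = [
    # Coarse-grained: the ETL job may read the whole table (use case 2).
    Policy(users=frozenset({"etl_job"}), table="sales.orders", access="select"),
    # Fine-grained: the analyst may read only two columns (use case 3).
    Policy(users=frozenset({"analyst"}), table="sales.orders", access="select",
           columns=frozenset({"order_id", "amount"})),
]

spark_ok = check_access(policies, "etl_job", "sales.orders", "select", tool="spark")
trino_ok = check_access(policies, "analyst", "sales.orders", "select",
                        column="amount", tool="trino")
trino_denied = check_access(policies, "analyst", "sales.orders", "select",
                            column="customer_email", tool="trino")
```

Because both allowed and denied requests are appended to the same log regardless of which tool issued them, the sketch also shows the shape of the centralized audit trail described in use case 4.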
*References:*
- [PolarisAuthorizer Class on
GitHub](https://github.com/polaris-catalog/polaris/blob/main/polaris-core/src/main/java/io/polaris/core/auth/PolarisAuthorizer.java):
The `PolarisAuthorizer` class provides the core authorization logic in
Polaris, which can be leveraged by the Apache Ranger plugin.
*Expected Deliverables:*
- A fully functional Apache Ranger plugin for Polaris that supports the
outlined use cases.
- Documentation on how to configure and deploy the plugin.
- Integration tests to ensure the plugin works as expected with Apache Iceberg
and other tools like Apache Spark and Trino.
- A detailed user guide explaining how to use the plugin for managing access
control in various scenarios.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)