This is an automated email from the ASF dual-hosted git repository.
yufei pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/polaris.git
The following commit(s) were added to refs/heads/main by this push:
new b7824e38 Remove unrelated doc (#883)
b7824e38 is described below
commit b7824e3836f6b47050bb5b0f4f8423fecb3fa712
Author: Yufei Gu <[email protected]>
AuthorDate: Mon Jan 27 08:59:38 2025 -0800
Remove unrelated doc (#883)
---
site/content/in-dev/unreleased/overview.md | 57 +++---------------------------
1 file changed, 4 insertions(+), 53 deletions(-)
diff --git a/site/content/in-dev/unreleased/overview.md b/site/content/in-dev/unreleased/overview.md
index 41f8daea..3c5713a1 100644
--- a/site/content/in-dev/unreleased/overview.md
+++ b/site/content/in-dev/unreleased/overview.md
@@ -59,7 +59,7 @@ A catalog can be one of the following two types:
 - Internal: The catalog is managed by Polaris. Tables from this catalog can be read and written in Polaris.
 - External: The catalog is externally managed by another Iceberg catalog provider (for example, Snowflake, Glue, Dremio Arctic). Tables from
- this catalog are synced to Polaris. These tables are read-only in Polaris. In the current release, only a Snowflake external catalog is provided.
+ this catalog are synced to Polaris. These tables are read-only in Polaris.
A catalog is configured with a storage configuration that can point to S3, Azure storage, or GCS.
@@ -68,16 +68,6 @@ A catalog is configured with a storage configuration that can point to S3, Azure
You create *namespaces* to logically group Iceberg tables within a catalog. A catalog can have multiple namespaces. You can also create
nested namespaces. Iceberg tables belong to namespaces.
-### Apache Iceberg™ tables and catalogs
-
-In an internal catalog, an Iceberg table is registered in Polaris, but read and written via query engines. The table data and
-metadata is stored in your external cloud storage. The table uses Polaris as the Iceberg catalog.
-
-If you have tables housed in another Iceberg catalog, you can sync these tables to an external catalog in Polaris.
-If you sync this catalog to Polaris, it appears as an external catalog in Polaris. Clients connecting to the external
-catalog can read from or write to these tables. However, clients connecting to Polaris will only be able to
-read from these tables.
-
> **Important**
>
> For the access privileges defined for a catalog to be enforced correctly,
> the following conditions must be met:
@@ -97,51 +87,14 @@ read from these tables.
> - /namespace1/namespace1a/customers/<files for the customers table *only*>
> - /namespace1/namespace1a/orders/<files for the orders table *only*>
-### Service principal
-
-A service principal is an entity that you create in Polaris. Each service principal encapsulates credentials that you use to connect
-to Polaris.
-
-Query engines use service principals to connect to catalogs.
-
-Polaris generates a Client ID and Client Secret pair for each service principal.
-
-The following table displays example service principals that you might create in Polaris:
-
- | Service connection name | Purpose |
- | --------------------------- | ----------- |
- | Flink ingestion | For Apache Flink® to ingest streaming data into Apache Iceberg™ tables. |
- | Spark ETL pipeline | For Apache Spark™ to run ETL pipeline jobs on Iceberg tables. |
- | Snowflake data pipelines | For Snowflake to run data pipelines for transforming data in Apache Iceberg™ tables. |
- | Trino BI dashboard | For Trino to run BI queries for powering a dashboard. |
- | Snowflake AI team | For Snowflake to run AI jobs on data in Apache Iceberg™ tables. |
-
-### Service connection
-
-A service connection represents a REST-compatible engine (such as Apache Spark™, Apache Flink®, or Trino) that can read from and write to Polaris
-Catalog. When creating a new service connection, the Polaris administrator grants the service principal that is created with the new service
-connection either a new or existing principal role. A principal role is a resource in Polaris that you can use to logically group Polaris
-service principals together and grant privileges on securable objects. For more information, see [Principal role]({{% ref "access-control#principal-role" %}}). Polaris uses a role-based access control (RBAC) model to grant service principals access to resources. For more information,
-see [Access control]({{% ref "access-control" %}}). For a diagram of this model, see [RBAC model]({{% ref "access-control#rbac-model" %}}).
-
-If the Polaris administrator grants the service principal for the new service connection a new principal role, the service principal
-doesn't have any privileges granted to it yet. When securing the catalog that the new service connection will connect to, the Polaris
-administrator grants privileges to catalog roles and then grants these catalog roles to the new principal role. As a result, the service
-principal for the new service connection has these privileges. For more information about catalog roles, see [Catalog role]({{% ref "access-control#catalog-role" %}}).
-
-If the Polaris administrator grants an existing principal role to the service principal for the new service connection, the service principal
-has the same privileges granted to the catalog roles that are granted to the existing principal role. If needed, the Polaris
-administrator can grant additional catalog roles to the existing principal role or remove catalog roles from it to adjust the privileges
-bestowed to the service principal. For an example of how RBAC works in Polaris, see [RBAC example]({{% ref "access-control#rbac-example" %}}).
-
### Storage configuration
-A storage configuration stores a generated identity and access management (IAM) entity for your external cloud storage and is created
+A storage configuration stores a generated identity and access management (IAM) entity for your cloud storage and is created
when you create a catalog. The storage configuration is used to set the values to connect Polaris to your cloud storage. During the
catalog creation process, an IAM entity is generated and used to create a trust relationship between the cloud storage provider and Polaris
Catalog.
-When you create a catalog, you supply the following information about your external cloud storage:
+When you create a catalog, you supply the following information about your cloud storage:
| Cloud storage provider | Information |
| -----------------------| ----------- |
@@ -172,12 +125,10 @@ In the following example workflow, Bob creates an Apache Iceberg™ table na
## Security and access control
-This section describes security and access control.
-
### Credential vending
To secure interactions with service connections, Polaris vends temporary storage credentials to the query engine during query
-execution. These credentials allow the query engine to run the query without requiring access to your external cloud storage for
+execution. These credentials allow the query engine to run the query without requiring access to your cloud storage for
Iceberg tables. This process is called credential vending.
As of now, the following limitation is known regarding Apache Iceberg support: