This is an automated email from the ASF dual-hosted git repository. yzheng pushed a commit to branch main in repository https://gitbox.apache.org/repos/asf/polaris.git
The following commit(s) were added to refs/heads/main by this push:
     new 30acee6a8 Fix hugo blockquote (#1967)
30acee6a8 is described below

commit 30acee6a867186c9bbd743cb01831635859f7eae
Author: Yong Zheng <yongzheng0...@gmail.com>
AuthorDate: Fri Jun 27 00:46:30 2025 -0500

    Fix hugo blockquote (#1967)

    * Fix hugo blockquote

    * Add license header
---
 site/content/in-dev/unreleased/_index.md           | 12 ++---
 site/content/in-dev/unreleased/access-control.md   | 10 ++--
 site/content/in-dev/unreleased/admin-tool.md       |  7 +--
 .../in-dev/unreleased/command-line-interface.md    | 14 +++---
 site/content/in-dev/unreleased/configuration.md    | 18 ++++----
 .../configuring-polaris-for-production.md          |  1 -
 site/content/in-dev/unreleased/entities.md         |  8 ++--
 site/content/in-dev/unreleased/evolution.md        | 14 +++---
 site/content/in-dev/unreleased/generic-table.md    | 12 ++---
 .../in-dev/unreleased/getting-started/_index.md    |  2 +-
 .../getting-started/deploying-polaris/_index.md    |  2 +-
 .../deploying-polaris/quickstart-deploy-aws.md     |  4 +-
 .../deploying-polaris/quickstart-deploy-azure.md   |  4 +-
 .../deploying-polaris/quickstart-deploy-gcp.md     |  2 +-
 site/content/in-dev/unreleased/metastores.md       | 13 ++++--
 .../in-dev/unreleased/polaris-spark-client.md      | 21 ++++-----
 site/content/in-dev/unreleased/policy.md           |  9 ++--
 site/content/in-dev/unreleased/realm.md            |  6 +--
 site/content/in-dev/unreleased/telemetry.md        | 18 ++++----
 site/layouts/_markup/render-blockquote.html        | 53 ++++++++++++++++++++++
 20 files changed, 142 insertions(+), 88 deletions(-)

diff --git a/site/content/in-dev/unreleased/_index.md b/site/content/in-dev/unreleased/_index.md
index 28ef3e255..023d3b0f7 100644
--- a/site/content/in-dev/unreleased/_index.md
+++ b/site/content/in-dev/unreleased/_index.md
@@ -31,11 +31,10 @@ cascade:
 # This file will NOT be copied into a new release's versioned docs folder.
--- -{{< alert title="Warning" color="warning" >}} -These pages refer to the current state of the main branch, which is still under active development. - -Functionalities can be changed, removed or added without prior notice. -{{< /alert >}} +> [!WARNING] +> These pages refer to the current state of the main branch, which is still under active development. +> +> Functionalities can be changed, removed or added without prior notice. Apache Polaris (Incubating) is a catalog implementation for Apache Iceberg™ tables and is built on the open source Apache Iceberg™ REST protocol. @@ -83,8 +82,7 @@ A catalog is configured with a storage configuration that can point to S3, Azure You create *namespaces* to logically group Iceberg tables within a catalog. A catalog can have multiple namespaces. You can also create nested namespaces. Iceberg tables belong to namespaces. -> **Important** -> +> [!Important] > For the access privileges defined for a catalog to be enforced correctly, > the following conditions must be met: > > - The directory only contains the data files that belong to a single table. diff --git a/site/content/in-dev/unreleased/access-control.md b/site/content/in-dev/unreleased/access-control.md index f8c21ab78..65d1e09ba 100644 --- a/site/content/in-dev/unreleased/access-control.md +++ b/site/content/in-dev/unreleased/access-control.md @@ -72,8 +72,7 @@ in the catalog, such as catalog namespaces or tables. You can create one or more You grant privileges to a catalog role and then grant the catalog role to a principal role to bestow the privileges to one or more service principals. -> **Note** -> +> [!NOTE] > If you update the privileges bestowed to a service principal, the updates > won't take effect for up to one hour. This means that if you > revoke or grant some privileges for a catalog, the updated privileges won't > take effect on any service principal with access to that catalog > for up to one hour. 
@@ -104,9 +103,8 @@ This section describes the privileges that are available in the Polaris access c roles are granted to principal roles, and principal roles are granted to service principals to specify the operations that service principals can perform on objects in Polaris. -> **Important** -> -> You can only grant privileges at the catalog level. Fine-grained access controls are not available. For example, you can grant read +> [!IMPORTANT] +> You can only grant privileges at the catalog level. Fine-grained access controls are not available. For example, you can grant read > privileges to all tables in a catalog but not to an individual table in the > catalog. To grant the full set of privileges (drop, list, read, write, etc.) on an object, you can use the *full privilege* option. @@ -184,7 +182,7 @@ includes the following users: create service principals. She can also create catalogs and namespaces and configure access control for Polaris resources. -- **Bob:** A data engineer who uses Apache Spark™ to +- **Bob:** A data engineer who uses Apache Spark™ to interact with Polaris. - Alice has created a service principal for Bob. It has been diff --git a/site/content/in-dev/unreleased/admin-tool.md b/site/content/in-dev/unreleased/admin-tool.md index 14f37b6f0..8af100127 100644 --- a/site/content/in-dev/unreleased/admin-tool.md +++ b/site/content/in-dev/unreleased/admin-tool.md @@ -117,8 +117,9 @@ java -jar runtime/admin/build/polaris-admin-*-runner.jar bootstrap -r realm1 -c The `purge` command is used to remove realms and principal credentials from the Polaris server. -> Warning: Running the `purge` command will remove all data associated with the specified realms! - This includes all entities (catalogs, namespaces, tables, views, roles), all principal +> [!WARNING] +> Running the `purge` command will remove all data associated with the specified realms! 
+ This includes all entities (catalogs, namespaces, tables, views, roles), all principal credentials, grants, and any other data associated with the realms. ```shell @@ -139,4 +140,4 @@ For example, to purge the `realm1` realm, you can run the following command: ```shell java -jar runtime/admin/build/polaris-admin-*-runner.jar purge -r realm1 -``` \ No newline at end of file +``` diff --git a/site/content/in-dev/unreleased/command-line-interface.md b/site/content/in-dev/unreleased/command-line-interface.md index 308601082..8b53166b0 100644 --- a/site/content/in-dev/unreleased/command-line-interface.md +++ b/site/content/in-dev/unreleased/command-line-interface.md @@ -48,7 +48,7 @@ options: 6. privileges 7. profiles -Each _command_ supports several _subcommands_, and some _subcommands_ have _actions_ that come after the subcommand in turn. Finally, _arguments_ follow to form a full invocation. Within a set of named arguments at the end of an invocation ordering is generally not important. Many invocations also have a required positional argument of the type that the _command_ refers to. Again, the ordering of this positional argument relative to named arguments is not important. +Each _command_ supports several _subcommands_, and some _subcommands_ have _actions_ that come after the subcommand in turn. Finally, _arguments_ follow to form a full invocation. Within a set of named arguments at the end of an invocation ordering is generally not important. Many invocations also have a required positional argument of the type that the _command_ refers to. Again, the ordering of this positional argument relative to named arguments is not important. 
Some example full invocations: @@ -159,7 +159,7 @@ polaris catalogs create \ --allowed-location s3://other-bucket/third_location \ --role-arn ${ROLE_ARN} \ my_other_catalog - + polaris catalogs create \ --storage-type file \ --default-base-location file:///example/tmp \ @@ -250,7 +250,7 @@ polaris catalogs update --default-base-location s3://new-bucket/my_data my_catal ### Principals -The `principals` command is used to manage principals within Polaris. +The `principals` command is used to manage principals within Polaris. `principals` supports the following subcommands: @@ -572,7 +572,7 @@ The catalog-roles command is used to create, discover, and manage catalog roles 4. list 5. update 6. grant -7. revoke +7. revoke #### create @@ -734,7 +734,7 @@ polaris catalog-roles revoke --catalog sales_data contains_cc_info_catalog_role ### Namespaces -The `namespaces` command is used to manage namespaces within Polaris. +The `namespaces` command is used to manage namespaces within Polaris. `namespaces` supports the following subcommands: @@ -786,7 +786,7 @@ options: ##### Examples ``` -polaris namespaces delete outer_namespace.inner_namespace --catalog my_catalog +polaris namespaces delete outer_namespace.inner_namespace --catalog my_catalog polaris namespaces delete --catalog my_catalog outer_namespace ``` @@ -1164,7 +1164,7 @@ polaris profiles update dev ## Examples -This section outlines example code for a few common operations as well as for some more complex ones. +This section outlines example code for a few common operations as well as for some more complex ones. For especially complex operations, you may wish to instead directly use the Python API. 
diff --git a/site/content/in-dev/unreleased/configuration.md b/site/content/in-dev/unreleased/configuration.md index 7ba1a97c1..fec8940d6 100644 --- a/site/content/in-dev/unreleased/configuration.md +++ b/site/content/in-dev/unreleased/configuration.md @@ -28,7 +28,8 @@ This page provides information on how to configure Apache Polaris (Incubating). otherwise, this information is valid both for Polaris Docker images (and Kubernetes deployments) as well as for Polaris binary distributions. -> Note: for Production tips and best practices, refer to [Configuring Polaris for Production]({{% ref "configuring-polaris-for-production.md" %}}). +> [!NOTE] +> For Production tips and best practices, refer to [Configuring Polaris for Production]({{% ref "configuring-polaris-for-production.md" %}}). First off, Polaris server runs on Quarkus, and uses its configuration mechanisms. Read Quarkus [configuration guide](https://quarkus.io/guides/config) to get familiar with the basics. @@ -48,7 +49,7 @@ The sources are listed below, from highest to lowest priority: When using environment variables, there are two naming conventions: 1. If possible, just use the property name as the environment variable name. This works fine in most - cases, e.g. in Kubernetes deployments. For example, `polaris.realm-context.realms` can be + cases, e.g. in Kubernetes deployments. For example, `polaris.realm-context.realms` can be included as is in a container YAML definition: ```yaml env: @@ -79,7 +80,7 @@ read-only mode, as Polaris only reads the configuration file once, at startup. 
| Configuration Property | Default Value | Description | |----------------------------------------------------------------------------------------|-----------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `polaris.persistence.type` | `relational-jdbc` | Define the persistence backend used by Polaris (`in-memory`, `relational-jdbc`, `eclipse-link` (deprecated)). See [Configuring Apache Polaris for Production)[{{% ref "configuring-polaris-for-production.md" %}}) | +| `polaris.persistence.type` | `relational-jdbc` | Define the persistence backend used by Polaris (`in-memory`, `relational-jdbc`, `eclipse-link` (deprecated)). See [Configuring Apache Polaris for Production)[{{% ref "configuring-polaris-for-production.md" %}}) | | `polaris.persistence.relational.jdbc.max-retries` | `1` | Total number of retries JDBC persistence will attempt on connection resets or serialization failures before giving up. | | `polaris.persistence.relational.jdbc.max_duaration_in_ms` | `5000 ms` | Max time interval (ms) since the start of a transaction when retries can be attempted. | | `polaris.persistence.relational.jdbc.initial_delay_in_ms` | `100 ms` | Initial delay before retrying. The delay is doubled after each retry. | @@ -124,7 +125,7 @@ There are non Polaris configuration properties that can be useful: |------------------------------------------------------|---------------------------------|-----------------------------------------------------------------------------| | `quarkus.log.level` | `INFO` | Define the root log level. | | `quarkus.log.category."org.apache.polaris".level` | | Define the log level for a specific category. 
| -| `quarkus.default-locale` | System locale | Force the use of a specific locale, for instance `en_US`. | +| `quarkus.default-locale` | System locale | Force the use of a specific locale, for instance `en_US`. | | `quarkus.http.port` | `8181` | Define the HTTP port number. | | `quarkus.http.auth.basic` | `false` | Enable the HTTP basic authentication. | | `quarkus.http.limits.max-body-size` | `10240K` | Define the HTTP max body size limit. | @@ -139,7 +140,8 @@ There are non Polaris configuration properties that can be useful: | `quarkus.management.root-path` | | Define the root path where `/metrics` and `/health` endpoints are based on. | | `quarkus.otel.sdk.disabled` | `true` | Enable the OpenTelemetry layer. | -> Note: This section is only relevant for Polaris Docker images and Kubernetes deployments. +> [!NOTE] +> This section is only relevant for Polaris Docker images and Kubernetes deployments. There are many other actionable environment variables available in the official Polaris Docker image; they come from the base image used by Polaris, [ubi9/openjdk-21-runtime]. They should be used @@ -169,8 +171,8 @@ Here are some examples: | Example | `docker run` option | |--------------------------------------------|---------------------------------------------------------------------------------------------------------------------| | Using another GC | `-e GC_CONTAINER_OPTIONS="-XX:+UseShenandoahGC"` lets Polaris use Shenandoah GC instead of the default parallel GC. | -| Set the Java heap size to a _fixed_ amount | `-e JAVA_OPTS_APPEND="-Xms8g -Xmx8g"` lets Polaris use a Java heap of 8g. | -| Set the maximum heap percentage | `-e JAVA_MAX_MEM_RATIO="70"` lets Polaris use 70% percent of the available memory. | +| Set the Java heap size to a _fixed_ amount | `-e JAVA_OPTS_APPEND="-Xms8g -Xmx8g"` lets Polaris use a Java heap of 8g. | +| Set the maximum heap percentage | `-e JAVA_MAX_MEM_RATIO="70"` lets Polaris use 70% percent of the available memory. 
| ## Troubleshooting Configuration Issues @@ -184,5 +186,5 @@ quarkus.log.console.level=DEBUG quarkus.log.category."io.smallrye.config".level=DEBUG ``` -> [!IMPORTANT] This will print out all configuration values, including sensitive ones like +> [!IMPORTANT] This will print out all configuration values, including sensitive ones like > passwords. Don't do this in production, and don't share this output with > anyone you don't trust! diff --git a/site/content/in-dev/unreleased/configuring-polaris-for-production.md b/site/content/in-dev/unreleased/configuring-polaris-for-production.md index 47d25a9c3..479bc7d58 100644 --- a/site/content/in-dev/unreleased/configuring-polaris-for-production.md +++ b/site/content/in-dev/unreleased/configuring-polaris-for-production.md @@ -218,4 +218,3 @@ Leave out `FILE` to prevent its use. Only include the storage types your setup n The [Polaris Evolution](../evolution) page discusses backward compatibility and upgrade concerns. - diff --git a/site/content/in-dev/unreleased/entities.md b/site/content/in-dev/unreleased/entities.md index 04d625bb9..4ae1ce918 100644 --- a/site/content/in-dev/unreleased/entities.md +++ b/site/content/in-dev/unreleased/entities.md @@ -72,19 +72,19 @@ For information on managing principal roles with the REST API or for more inform ## Catalog Role -Polaris catalog roles are labels that may be granted to [catalogs](#catalog). Each catalog may have one or more catalog roles, and the same catalog role may be granted to multiple catalogs. Catalog roles may be assigned based on the nature of data that will reside in a catalog, or by the groups of users and services that might need to access that data. +Polaris catalog roles are labels that may be granted to [catalogs](#catalog). Each catalog may have one or more catalog roles, and the same catalog role may be granted to multiple catalogs. 
Catalog roles may be assigned based on the nature of data that will reside in a catalog, or by the groups of users and services that might need to access that data. Each catalog role may have multiple [privileges](#privilege) granted to it, and each catalog role can be granted to one or more [principal roles](#principal-role). This is the mechanism by which principals are granted access to entities inside a catalog such as namespaces and tables. ## Policy -Polaris policy is a set of rules governing actions on specified resources under predefined conditions. Polaris support policy for Iceberg table compaction, snapshot expiry, row-level access control, and custom policy definitions. +Polaris policy is a set of rules governing actions on specified resources under predefined conditions. Polaris support policy for Iceberg table compaction, snapshot expiry, row-level access control, and custom policy definitions. -Policy can be applied at catalog level, namespace level, or table level. Policy inheritance can be achieved by attaching one to a higher-level scope, such as namespace or catalog. As a result, tables registered under those entities do not need to be declared individually for the same policy. If a table or a namespace requires a different policy, user can assign a different policy, hence overriding policy of the same type declared at the higher level entities. +Policy can be applied at catalog level, namespace level, or table level. Policy inheritance can be achieved by attaching one to a higher-level scope, such as namespace or catalog. As a result, tables registered under those entities do not need to be declared individually for the same policy. If a table or a namespace requires a different policy, user can assign a different policy, hence overriding policy of the same type declared at the higher level entities. 
## Privilege -Polaris privileges are granted to [catalog roles](#catalog-role) in order to grant principals with a given principal role some degree of access to catalogs with a given catalog role. When a privilege is granted to a catalog role, any principal roles granted that catalog role receive the privilege. In turn, any principals who are granted that principal role receive it. +Polaris privileges are granted to [catalog roles](#catalog-role) in order to grant principals with a given principal role some degree of access to catalogs with a given catalog role. When a privilege is granted to a catalog role, any principal roles granted that catalog role receive the privilege. In turn, any principals who are granted that principal role receive it. A privilege can be scoped to any entity inside a catalog, including the catalog itself. diff --git a/site/content/in-dev/unreleased/evolution.md b/site/content/in-dev/unreleased/evolution.md index ea29badc8..b3a57c752 100644 --- a/site/content/in-dev/unreleased/evolution.md +++ b/site/content/in-dev/unreleased/evolution.md @@ -26,7 +26,7 @@ This page discusses what can be expected from Apache Polaris as the project evol ## Using Polaris as a Catalog -Polaris is primarily intended to be used as a Catalog of Tables and Views. As such, +Polaris is primarily intended to be used as a Catalog of Tables and Views. As such, it implements the Iceberg REST Catalog API and its own REST APIs. Revisions of the Iceberg REST Catalog API are controlled by the [Apache Iceberg](https://iceberg.apache.org/) @@ -35,7 +35,7 @@ optional REST Catalog features may or may not be supported immediately. In gener there is no guarantee that Polaris releases always implement the latest version of the Iceberg REST Catalog API. -Any API under Polaris control that is not in an "experimental" or "beta" state +Any API under Polaris control that is not in an "experimental" or "beta" state (e.g. the Management API) is maintained as a versioned REST API. 
New releases of Polaris may include changes to the current version of the API. When that happens those changes are intended to be compatible with prior versions of Polaris clients. Certain endpoints @@ -43,13 +43,13 @@ and parameters may be deprecated. In case a major change is required to an API that cannot be implemented in a backward-compatible way, new endpoints (URI paths) may be introduced. New URI "roots" may -be introduced too (e.g. `api/catalog/v2`). +be introduced too (e.g. `api/catalog/v2`). Note that those "v1", "v2", etc. URI path segments are not meant to be 1:1 with Polaris releases or Polaris project version numbers (e.g. a "v2" path segment does not mean that it is added in Polaris 2.0). -Polaris servers will support deprecated API endpoints / parameters / versions / etc. +Polaris servers will support deprecated API endpoints / parameters / versions / etc. for some transition period to allow clients to migrate. ### Managing Polaris Database @@ -83,9 +83,9 @@ whether the class / method is `public` or not. This approach is not meant to discourage the use of Polaris code in downstream projects, but to allow more flexibility in evolving the codebase to support new catalog-level features -and improve code efficiency. Maintainers of downstream projects are encouraged to join Polaris +and improve code efficiency. Maintainers of downstream projects are encouraged to join Polaris mailing lists to monitor project changes, suggest improvements, and engage with the Polaris -community in case of specific compatibility concerns. +community in case of specific compatibility concerns. ## Semantic Versioning @@ -112,4 +112,4 @@ compatible way (e.g. removing or renaming a request parameter) is a major change * Dropping support for any previously defined [Policy](../policy/) type or property is a major change. * Upgrading Quarkus Runtime to its next major version is a major change (because -Quarkus-managed configuration may change). 
+Quarkus-managed configuration may change). diff --git a/site/content/in-dev/unreleased/generic-table.md b/site/content/in-dev/unreleased/generic-table.md index 2e0e3fe8e..63ef38a1d 100644 --- a/site/content/in-dev/unreleased/generic-table.md +++ b/site/content/in-dev/unreleased/generic-table.md @@ -24,7 +24,7 @@ weight: 435 The Generic Table in Apache Polaris is designed to provide support for non-Iceberg tables across different table formats includes delta, csv etc. It currently provides the following capabilities: - Create a generic table under a namespace -- Load a generic table +- Load a generic table - Drop a generic table - List all generic tables under a namespace @@ -85,7 +85,7 @@ request body looks like the following: } ``` -Here is an example to create a generic table with name `delta_table` and format as `delta` under a namespace `delta_ns` +Here is an example to create a generic table with name `delta_table` and format as `delta` under a namespace `delta_ns` for catalog `delta_catalog` using curl: ```shell @@ -125,7 +125,7 @@ And the response looks like the following: ``` ### List Generic Tables -The REST endpoint for listing the generic tables under a given +The REST endpoint for listing the generic tables under a given namespace is `GET /polaris/v1/{prefix}/namespaces/{namespace}/generic-tables/`. Following curl command lists all tables under namespace delta_namespace: @@ -160,10 +160,10 @@ For the complete and up-to-date API specification, see the [Catalog API Spec](ht ## Limitations Current limitations of Generic Table support: -1) Limited spec information. Currently, there is no spec for information like Schema, Partition etc. +1) Limited spec information. Currently, there is no spec for information like Schema, Partition etc. 2) No commit coordination or update capability provided at the catalog service level. -Therefore, the catalog itself is unaware of anything about the underlying table except some of the loosely defined metadata. 
+Therefore, the catalog itself is unaware of anything about the underlying table except some of the loosely defined metadata. It is the responsibility of the engine (and plugins used by the engine) to determine exactly how loading or commiting data -should look like based on the metadata. For example, with the delta support, th delta log serialization, deserialization +should look like based on the metadata. For example, with the delta support, th delta log serialization, deserialization and update all happens at client side. diff --git a/site/content/in-dev/unreleased/getting-started/_index.md b/site/content/in-dev/unreleased/getting-started/_index.md index d4f13e6f6..ee0f9b687 100644 --- a/site/content/in-dev/unreleased/getting-started/_index.md +++ b/site/content/in-dev/unreleased/getting-started/_index.md @@ -22,4 +22,4 @@ type: docs weight: 101 build: render: never ---- \ No newline at end of file +--- diff --git a/site/content/in-dev/unreleased/getting-started/deploying-polaris/_index.md b/site/content/in-dev/unreleased/getting-started/deploying-polaris/_index.md index 32fd5dafd..c6b293d29 100644 --- a/site/content/in-dev/unreleased/getting-started/deploying-polaris/_index.md +++ b/site/content/in-dev/unreleased/getting-started/deploying-polaris/_index.md @@ -24,4 +24,4 @@ weight: 300 We will now demonstrate how to deploy Polaris locally, as well as with all supported Cloud Providers: Amazon Web Services (AWS), Azure, and Google Cloud Platform (GCP). -Locally, Polaris can be deployed using both Docker and local build. On the cloud, this tutorial will deploy Polaris using Docker only - but local builds can also be executed. \ No newline at end of file +Locally, Polaris can be deployed using both Docker and local build. On the cloud, this tutorial will deploy Polaris using Docker only - but local builds can also be executed. 
diff --git a/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-aws.md b/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-aws.md index 8aa3b34a7..c83166ceb 100644 --- a/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-aws.md +++ b/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-aws.md @@ -26,7 +26,7 @@ Build and launch Polaris using the AWS Startup Script at the location provided i Additionally, Polaris will be bootstrapped to use this database and Docker containers will be spun up for Spark SQL and Trino. The requirements to run the script below are: -* There must be at least two subnets created in the VPC and region in which your EC2 instance reside. The span of subnets MUST include at least 2 availability zones (AZs) within the same region. +* There must be at least two subnets created in the VPC and region in which your EC2 instance reside. The span of subnets MUST include at least 2 availability zones (AZs) within the same region. * Your EC2 instance must be enabled with [IMDSv1 or IMDSv2 with 2+ hop limit](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-IMDS-new-instances.html#configure-IMDS-new-instances-instance-settings). * The AWS identity that you will use to run this script must have the following AWS permissions: * "ec2:DescribeInstances" @@ -54,4 +54,4 @@ export ASSETS_PATH=$(pwd)/getting-started/assets/ docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml down ``` -To deploy Polaris in a production setting, please review further recommendations at the [Configuring Polaris for Production]({{% relref "../../configuring-polaris-for-production" %}}) page. \ No newline at end of file +To deploy Polaris in a production setting, please review further recommendations at the [Configuring Polaris for Production]({{% relref "../../configuring-polaris-for-production" %}}) page. 
diff --git a/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-azure.md b/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-azure.md index ff1f2c647..d82387f51 100644 --- a/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-azure.md +++ b/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-azure.md @@ -28,7 +28,7 @@ Additionally, Polaris will be bootstrapped to use this database and Docker conta The requirements to run the script below are: * Install the AZ CLI, if it is not already installed on the Azure VM. Instructions to download the AZ CLI can be found [here](https://learn.microsoft.com/en-us/cli/azure/install-azure-cli). * You must be logged into the AZ CLI. Please run `az account show` to ensure that you are logged in prior to running this script. -* Assign a System-Assigned Managed Identity to the Azure VM. +* Assign a System-Assigned Managed Identity to the Azure VM. ```shell chmod +x getting-started/assets/cloud_providers/deploy-azure.sh @@ -49,4 +49,4 @@ export ASSETS_PATH=$(pwd)/getting-started/assets/ docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml down ``` -To deploy Polaris in a production setting, please review further recommendations at the [Configuring Polaris for Production]({{% relref "../../configuring-polaris-for-production" %}}) page. \ No newline at end of file +To deploy Polaris in a production setting, please review further recommendations at the [Configuring Polaris for Production]({{% relref "../../configuring-polaris-for-production" %}}) page. 
diff --git a/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-gcp.md b/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-gcp.md index cbf15a876..2f728c324 100644 --- a/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-gcp.md +++ b/site/content/in-dev/unreleased/getting-started/deploying-polaris/quickstart-deploy-gcp.md @@ -49,4 +49,4 @@ export ASSETS_PATH=$(pwd)/getting-started/assets/ docker compose -p polaris -f getting-started/eclipselink/docker-compose.yml down ``` -To deploy Polaris in a production setting, please review further recommendations at the [Configuring Polaris for Production]({{% relref "../../configuring-polaris-for-production" %}}) page. \ No newline at end of file +To deploy Polaris in a production setting, please review further recommendations at the [Configuring Polaris for Production]({{% relref "../../configuring-polaris-for-production" %}}) page. diff --git a/site/content/in-dev/unreleased/metastores.md b/site/content/in-dev/unreleased/metastores.md index 53d9ef144..224f35047 100644 --- a/site/content/in-dev/unreleased/metastores.md +++ b/site/content/in-dev/unreleased/metastores.md @@ -45,7 +45,8 @@ Please refer to the documentation here: Additionally the retries can be configured via `polaris.persistence.relational.jdbc.*` properties please ref [configuration](./configuration.md) ## EclipseLink (Deprecated) -> [!IMPORTANT] Eclipse link is deprecated, its recommend to use Relational JDBC as persistence instead. +> [!IMPORTANT] +> Eclipse link is deprecated, its recommend to use Relational JDBC as persistence instead. Polaris includes EclipseLink plugin by default with PostgresSQL driver. @@ -65,19 +66,22 @@ The `configuration-file` option must point to an [EclipseLink configuration file `persistence.xml`, is used to set up the database connection properties, which can differ depending on the type of database and its configuration. 
-> Note: You have to locate the `persistence.xml` at least two folders down to the root folder, e.g. `/deployments/config/persistence.xml` is OK, whereas `/deployments/persistence.xml` will cause an infinity loop. +> [!NOTE] +> You have to locate the `persistence.xml` at least two folders down to the root folder, e.g. `/deployments/config/persistence.xml` is OK, whereas `/deployments/persistence.xml` will cause an infinity loop. [Quarkus Configuration Reference]: https://quarkus.io/guides/config-reference [EclipseLink configuration file]: https://eclipse.dev/eclipselink/documentation/4.0/solutions/solutions.html#TESTINGJPA002 Polaris creates and connects to a separate database for each realm. Specifically, the `{realm}` placeholder in `jakarta.persistence.jdbc.url` is substituted with the actual realm name, allowing the Polaris server to connect to different databases based on the realm. -> Note: some database systems such as Postgres don't create databases automatically. Database admins need to create them manually before running Polaris server. +> [!NOTE] +> Some database systems such as Postgres don't create databases automatically. Database admins need to create them manually before running Polaris server. A single `persistence.xml` can describe multiple [persistence units](https://eclipse.dev/eclipselink/documentation/4.0/concepts/concepts.html#APPDEV001). For example, with both a `polaris-dev` and `polaris` persistence unit defined, you could use a single `persistence.xml` to easily switch between development and production databases. Use the `persistence-unit` option in the Polaris server configuration to easily switch between persistence units. ### Using H2 -> [!IMPORTANT] H2 is an in-memory database and is not suitable for production! +> [!IMPORTANT] +> H2 is an in-memory database and is not suitable for production! 
The default [persistence.xml] in Polaris is already configured for H2, but you can easily customize your H2 configuration using the persistence unit template below: @@ -147,4 +151,3 @@ The following shows a sample configuration for integrating Polaris with Postgres </properties> </persistence-unit> ``` - diff --git a/site/content/in-dev/unreleased/polaris-spark-client.md b/site/content/in-dev/unreleased/polaris-spark-client.md index 4ceb536a9..ffe75cb7b 100644 --- a/site/content/in-dev/unreleased/polaris-spark-client.md +++ b/site/content/in-dev/unreleased/polaris-spark-client.md @@ -22,10 +22,10 @@ type: docs weight: 650 --- -Apache Polaris now provides Catalog support for Generic Tables (non-Iceberg tables), please check out +Apache Polaris now provides Catalog support for Generic Tables (non-Iceberg tables), please check out the [Catalog API Spec]({{% ref "polaris-catalog-service" %}}) for Generic Table API specs. -Along with the Generic Table Catalog support, Polaris is also releasing a Spark client, which helps to +Along with the Generic Table Catalog support, Polaris is also releasing a Spark client, which helps to provide an end-to-end solution for Apache Spark to manage Delta tables using Polaris. Note the Polaris Spark client is able to handle both Iceberg and Delta tables, not just Delta. @@ -33,14 +33,13 @@ Note the Polaris Spark client is able to handle both Iceberg and Delta tables, n This page documents how to connect Spark with Polaris Service using the Polaris Spark client. 
## Quick Start with Local Polaris service -If you want to quickly try out the functionality with a local Polaris service, simply check out the Polaris repo -and follow the instructions in the Spark plugin getting-started +If you want to quickly try out the functionality with a local Polaris service, simply check out the Polaris repo +and follow the instructions in the Spark plugin getting-started [README](https://github.com/apache/polaris/blob/main/plugins/spark/v3.5/getting-started/README.md). Check out the Polaris repo: ```shell -cd ~ -git clone https://github.com/apache/polaris.git +git clone https://github.com/apache/polaris.git ~/polaris ``` ## Start Spark against a deployed Polaris service @@ -48,7 +47,7 @@ Before starting, ensure that the deployed Polaris service supports Generic Table Spark 3.5.5 is recommended, and you can follow the instructions below to get a Spark 3.5.5 distribution. ```shell cd ~ -wget https://archive.apache.org/dist/spark/spark-3.5.5/spark-3.5.5-bin-hadoop3.tgz +wget https://archive.apache.org/dist/spark/spark-3.5.5/spark-3.5.5-bin-hadoop3.tgz mkdir spark-3.5 tar xzvf spark-3.5.5-bin-hadoop3.tgz -C spark-3.5 --strip-components=1 cd spark-3.5 @@ -74,13 +73,13 @@ bin/spark-shell \ Assume the released Polaris Spark client you want to use is `org.apache.polaris:polaris-spark-3.5_2.12:1.0.0`, replace the `polaris-spark-client-package` field with the release. -The `spark-catalog-name` is the catalog name you will use with Spark, and `polaris-catalog-name` is the catalog name used -by Polaris service, for simplicity, you can use the same name. +The `spark-catalog-name` is the catalog name you will use with Spark, and `polaris-catalog-name` is the catalog name used +by Polaris service, for simplicity, you can use the same name. Replace the `polaris-service-uri` with the uri of the deployed Polaris service. For example, with a locally deployed Polaris service, the uri would be `http://localhost:8181/api/catalog`. 
-For `client-id` and `client-secret` values, you can refer to [Using Polaris]({{% ref "getting-started/using-polaris" %}}) +For `client-id` and `client-secret` values, you can refer to [Using Polaris]({{% ref "getting-started/using-polaris" %}}) for more details. You can also start the connection by programmatically initialize a SparkSession, following is an example with PySpark: @@ -91,7 +90,7 @@ spark = SparkSession.builder .config("spark.jars.packages", "<polaris-spark-client-package>,org.apache.iceberg:iceberg-aws-bundle:1.9.0,io.delta:delta-spark_2.12:3.3.1") .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog") .config("spark.sql.extensions", "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions,io.delta.sql.DeltaSparkSessionExtension") - .config("spark.sql.catalog.<spark-catalog-name>", "org.apache.polaris.spark.SparkCatalog") + .config("spark.sql.catalog.<spark-catalog-name>", "org.apache.polaris.spark.SparkCatalog") .config("spark.sql.catalog.<spark-catalog-name>.uri", <polaris-service-uri>) .config("spark.sql.catalog.<spark-catalog-name>.token-refresh-enabled", "true") .config("spark.sql.catalog.<spark-catalog-name>.credential", "<client-id>:<client_secret>") diff --git a/site/content/in-dev/unreleased/policy.md b/site/content/in-dev/unreleased/policy.md index 3f4935388..e96661f3f 100644 --- a/site/content/in-dev/unreleased/policy.md +++ b/site/content/in-dev/unreleased/policy.md @@ -19,10 +19,10 @@ # title: Policy type: docs -weight: 425 +weight: 425 --- -The Polaris Policy framework empowers organizations to centrally define, manage, and enforce fine-grained governance, lifecycle, and operational rules across all data resources in the catalog. +The Polaris Policy framework empowers organizations to centrally define, manage, and enforce fine-grained governance, lifecycle, and operational rules across all data resources in the catalog. 
With the policy API, you can: - Create and manage policies @@ -81,7 +81,8 @@ The inheritance follows an override mechanism: 1. Table-level policies override namespace and catalog policies 2. Namespace-level policies override parent namespace and catalog policies -> **Important:** Because an override completely replaces the same policy type at higher levels, +> [!IMPORTANT] +> Because an override completely replaces the same policy type at higher levels, > **only one instance of a given policy type can be attached to (and therefore > affect) a resource**. ## Working with Policies @@ -194,4 +195,4 @@ GET /polaris/v1/catalog/applicable-policies?namespace=finance%1Fquarterly&target ### API Reference -For the complete and up-to-date API specification, see the [policy-api.yaml](https://github.com/apache/polaris/blob/main/spec/polaris-catalog-apis/policy-apis.yaml). \ No newline at end of file +For the complete and up-to-date API specification, see the [policy-api.yaml](https://github.com/apache/polaris/blob/main/spec/polaris-catalog-apis/policy-apis.yaml). diff --git a/site/content/in-dev/unreleased/realm.md b/site/content/in-dev/unreleased/realm.md index 9da5e7e25..67465a4b7 100644 --- a/site/content/in-dev/unreleased/realm.md +++ b/site/content/in-dev/unreleased/realm.md @@ -38,8 +38,8 @@ A realm in Polaris serves as logical partitioning mechanism within the catalog s An example of this is: -`jdbc:postgresql://localhost:5432/{realm} -` +`jdbc:postgresql://localhost:5432/{realm}` + This ensures that each realm's data is stored separately. ### How is it used in the system? @@ -50,4 +50,4 @@ This ensures that each realm's data is stored separately. authorization. **Isolation:** In methods like `createEntityManagerFactory(@Nonnull RealmContext realmContext)` from `PolarisEclipseLinkPersistenceUnit` interface, the realm context influence how resources are created or managed based on the security policies of that realm. 
-An example of this is the way a realm name can be used to create a database connection url so that you have one database instance per realm, when applicable. Or it can be more granular and applied at primary key level (within the same database instance). \ No newline at end of file +An example of this is the way a realm name can be used to create a database connection url so that you have one database instance per realm, when applicable. Or it can be more granular and applied at primary key level (within the same database instance). diff --git a/site/content/in-dev/unreleased/telemetry.md b/site/content/in-dev/unreleased/telemetry.md index 8df97f505..5921586d5 100644 --- a/site/content/in-dev/unreleased/telemetry.md +++ b/site/content/in-dev/unreleased/telemetry.md @@ -49,7 +49,7 @@ setting the `polaris.metrics.tags.application=<new-value>` property. ### Realm ID Tag -Polaris can add the realm ID as a tag to all API and HTTP request metrics. This is disabled by +Polaris can add the realm ID as a tag to all API and HTTP request metrics. This is disabled by default to prevent high cardinality issues, but can be enabled by setting the following properties: ```properties @@ -61,14 +61,14 @@ You should be particularly careful when enabling the realm ID tag in HTTP reques metrics typically have a much higher cardinality than API request metrics. In order to prevent the number of tags from growing indefinitely and causing performance issues or -crashing the server, the number of unique realm IDs in HTTP request metrics is limited to 100 by +crashing the server, the number of unique realm IDs in HTTP request metrics is limited to 100 by default. If the number of unique realm IDs exceeds this value, a warning will be logged and no more -HTTP request metrics will be recorded. This threshold can be changed by setting the +HTTP request metrics will be recorded. This threshold can be changed by setting the `polaris.metrics.realm-id-tag.http-metrics-max-cardinality` property. 
## Traces -Traces are published using [OpenTelemetry]. +Traces are published using [OpenTelemetry]. [OpenTelemetry]: https://quarkus.io/guides/opentelemetry-tracing @@ -110,7 +110,7 @@ quarkus.otel.resource.attributes[0]=service.name=Polaris quarkus.otel.resource.attributes[1]=deployment.environment=dev ``` -Finally, two additional span attributes are added to all request parent spans: +Finally, two additional span attributes are added to all request parent spans: - `polaris.request.id`: The unique identifier of the request, if set by the caller through the `Polaris-Request-Id` header. @@ -122,7 +122,7 @@ Finally, two additional span attributes are added to all request parent spans: If the server is unable to publish traces, check first for a log warning message like the following: ``` -SEVERE [io.ope.exp.int.grp.OkHttpGrpcExporter] (OkHttp http://localhost:4317/...) Failed to export spans. +SEVERE [io.ope.exp.int.grp.OkHttpGrpcExporter] (OkHttp http://localhost:4317/...) Failed to export spans. The request could not be executed. Full error message: Failed to connect to localhost/0:0:0:0:0:0:0:1:4317 ``` @@ -131,7 +131,7 @@ running and that the URL is correct. ## Logging -Polaris relies on [Quarkus](https://quarkus.io/guides/logging) for logging. +Polaris relies on [Quarkus](https://quarkus.io/guides/logging) for logging. By default, logs are written to the console and to a file located in the `./logs` directory. The log file is rotated daily and compressed. The maximum size of the log file is 10MB, and the maximum @@ -141,7 +141,7 @@ JSON logging can be enabled by setting the `quarkus.log.console.json` and `quark properties to `true`. By default, JSON logging is disabled. The log level can be set for the entire application or for specific packages. The default log level -is `INFO`. To set the log level for the entire application, use the `quarkus.log.level` property. +is `INFO`. To set the log level for the entire application, use the `quarkus.log.level` property. 
To set the log level for a specific package, use the `quarkus.log.category."package-name".level`, where `package-name` is the name of the package. For example, the package `io.smallrye.config` has a @@ -189,4 +189,4 @@ polaris.log.mdc.environment=prod polaris.log.mdc.region=us-west-2 ``` -MDC context is propagated across threads, including in `TaskExecutor` threads. \ No newline at end of file +MDC context is propagated across threads, including in `TaskExecutor` threads. diff --git a/site/layouts/_markup/render-blockquote.html b/site/layouts/_markup/render-blockquote.html new file mode 100644 index 000000000..1430f356e --- /dev/null +++ b/site/layouts/_markup/render-blockquote.html @@ -0,0 +1,53 @@ +{{/* +Licensed to the Apache Software Foundation (ASF) under one +or more contributor license agreements. See the NOTICE file +distributed with this work for additional information +regarding copyright ownership. The ASF licenses this file +to you under the Apache License, Version 2.0 (the +"License"); you may not use this file except in compliance +with the License. You may obtain a copy of the License at + +http://www.apache.org/licenses/LICENSE-2.0 + +Unless required by applicable law or agreed to in writing, +software distributed under the License is distributed on an +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY +KIND, either express or implied. See the License for the +specific language governing permissions and limitations +under the License. 
+*/}}
+
+{{- /* Define emojis and handle the case where .AlertType is not found */ -}}
+{{- $emojis := dict
+  "danger" ":stop_sign:"
+  "error" ":x:"
+  "caution" ":warning:"
+  "warning" ":warning:"
+  "important" ":exclamation:"
+  "info" ":information_source:"
+  "note" ":memo:"
+  "tip" ":bulb:"
+  "success" ":white_check_mark:"
+  "primary" ":information_source:"
+-}}
+
+{{- if eq .Type "alert" -}}
+  <blockquote class="alert alert-{{ .AlertType }}">
+    {{- /* Check for emoji, with a sensible default */ -}}
+    {{- $emoji := index $emojis .AlertType | default ":information_source:" -}}
+    <p class="alert-heading">
+      {{- transform.Emojify $emoji -}}
+      {{- /* Check for custom title, otherwise use i18n or title case */ -}}
+      {{- with .AlertTitle -}}
+        {{- . -}}
+      {{- else -}}
+        {{- or (i18n .AlertType) (title .AlertType) -}}
+      {{- end -}}
+    </p>
+    {{- .Text | safeHTML -}}
+  </blockquote>
+{{- else -}}
+  <blockquote>
+    {{- .Text | safeHTML -}}
+  </blockquote>
+{{- end -}}
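
For readers skimming the new render hook, its heading fallback logic can be approximated outside Hugo. The sketch below is a rough Python analogue and is not part of the commit: `alert_heading` and its `translations` parameter are hypothetical names, the emoji table is copied from the template, and Hugo's `i18n` and `title` functions are only approximated by a dictionary lookup and `str.title()`.

```python
# Illustrative analogue of the heading logic in render-blockquote.html.
# Not part of the commit; function and parameter names are hypothetical.

# Emoji shortcodes copied from the template's `dict`.
EMOJIS = {
    "danger": ":stop_sign:",
    "error": ":x:",
    "caution": ":warning:",
    "warning": ":warning:",
    "important": ":exclamation:",
    "info": ":information_source:",
    "note": ":memo:",
    "tip": ":bulb:",
    "success": ":white_check_mark:",
    "primary": ":information_source:",
}

# Fallback used when the alert type has no entry in the table.
DEFAULT_EMOJI = ":information_source:"


def alert_heading(alert_type, alert_title=None, translations=None):
    """Return (emoji shortcode, heading text) for an alert blockquote."""
    emoji = EMOJIS.get(alert_type) or DEFAULT_EMOJI
    # A custom title wins, mirroring `with .AlertTitle`.
    if alert_title:
        return emoji, alert_title
    # Otherwise try a translation, then fall back to the title-cased type.
    translated = (translations or {}).get(alert_type)
    return emoji, translated or alert_type.title()


if __name__ == "__main__":
    print(alert_heading("note"))
    print(alert_heading("important", "Heads up"))
    print(alert_heading("unknown"))
```

In the docs themselves, the hook is triggered by the `> [!NOTE]` and `> [!IMPORTANT]` blockquote syntax that this commit introduces in the Markdown files; blockquotes without such a marker take the plain `<blockquote>` branch.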