RussellSpitzer commented on code in PR #4433:
URL: https://github.com/apache/polaris/pull/4433#discussion_r3244367546


##########
SECURITY-THREAT-MODEL.md:
##########
@@ -0,0 +1,730 @@
+<!--
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+   http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Apache Polaris Threat Model
+
+## Purpose
+
+This document defines the security boundaries for Apache Polaris. It is intended
+to guide maintainers, security reviewers, and automated analysis tools when
+evaluating potential vulnerabilities.
+
+Use this document to decide whether a finding affects a protected Polaris asset,
+crosses a Polaris trust boundary, violates a Polaris security invariant, and
+should be handled as a Polaris security issue.
+
+This document is guidance for analysis and triage. It does not make policy
+decisions, accept or reject vulnerability reports, assign ASF severity, allocate
+CVEs, or determine disclosure handling. Human project and ASF security review is
+required for those decisions.
+
+## Scope
+
+Apache Polaris is a catalog service for managing Apache Iceberg catalogs,
+namespaces, tables, views, principals, roles, policies, and related metadata.
+
+This threat model covers:
+
+- Polaris server and runtime components.
+- REST APIs exposed by Polaris.
+- Authentication and authorization behavior.
+- Catalog metadata access and mutation.
+- Persistence-layer interactions.
+- Integration points with storage, catalog, policy, and identity backends where
+  Polaris makes authentication, authorization, or trust decisions.
+
+This threat model does not cover:
+
+- Vulnerabilities in client applications using Polaris incorrectly.
+- Compromise of the underlying database, object store, identity provider,
+  container platform, host operating system, or deployment platform.
+- Denial of service caused solely by insufficient infrastructure sizing.
+- Bugs in third-party dependencies unless Polaris exposes them through unsafe
+  configuration or usage.
+
+## Component Families
+
+Polaris has several component families with different entry points, deployment
+models, and trust boundaries:
+
+| Component family | Representative entry points | Deployment model | Threat-model scope |
+| --- | --- | --- | --- |
+| Polaris server and runtime | Management APIs, catalog APIs, service runtime | Long-running service | In scope for authentication, authorization, metadata, persistence, storage, policy, and credential-handling decisions. |
+| Polaris admin tool | Administrative CLI commands and generated local profiles | Operator tool | In scope when handling credentials, configuration, administration, logs, or generated artifacts. |
+| Python CLI under `client/python/` | Client commands, local configuration, command output | User or operator CLI | In scope when handling credentials, tokens, catalog metadata, local profiles, logs, or generated artifacts. |
+| Release artifacts | Source release, runnable tarballs, container images, Helm charts | Distribution and deployment | In scope for packaged defaults, included modules, enabled features, generated configuration, and documented deployment paths. |
+| Reusable modules | `polaris-core`, `polaris-runtime-service`, extension modules, and selected Gradle project combinations | Embedded or customized downstream use | In scope when Polaris code makes security decisions; downstream-only integration code is evaluated separately. |
+| Optional integrations | Persistence backends, authentication modes, authorization services, storage providers, federation, and optional extensions | Runtime-selected or build-enabled features | In scope for coherent supported configurations; findings must identify the active variant and relevant configuration. |
+| Related Polaris tools | Tools published from `apache/polaris-tools` and referenced by the Polaris project or website | Developer, operator, migration, synchronization, MCP, benchmark, or web UI tools | Evaluated with tool-specific audience, deployment model, protected assets, and trust boundaries; not automatically Polaris server vulnerabilities. |
+
+## System Overview
+
+Polaris accepts requests from clients, authenticates callers through configured
+authentication mechanisms, authorizes operations against Polaris roles,
+privileges, policies, and catalog metadata, and persists metadata through the
+configured persistence backend.
+
+Polaris may integrate with external systems, including identity providers,
+object stores, policy decision points, catalog backends, and deployment
+infrastructure. These systems can affect Polaris security when Polaris relies on
+their output for authentication, authorization, routing, or persistence
+decisions.
+
+## Actors And Roles
+
+- Anonymous caller: A caller without valid authentication.
+- Authenticated principal: A valid user or service principal known to Polaris.
+- Catalog user: An authenticated principal with privileges on one or more
+  catalog resources.
+- Catalog administrator: A principal with administrative privileges over a
+  catalog.
+- Realm administrator: A principal with administrative privileges over the
+  Polaris realm.
+- Deployment operator: A person or system with access to deploy, configure, or
+  operate Polaris infrastructure.
+- External identity provider: A trusted authentication authority configured by
+  the deployment.
+- External policy decision point: A configured authorization service that may
+  participate in Polaris access decisions.
+- Persistence backend: The configured database or storage system used by
+  Polaris.
+
+## Protected Assets
+
+Polaris treats the following as security-sensitive:
+
+- Authentication credentials, bearer tokens, client secrets, signing keys, and
+  refresh tokens.
+- Principal, role, privilege, and policy metadata.
+- Principal, principal-role, and catalog-role names when a deployment treats
+  identity or role names as sensitive or personal data.
+- Catalog, namespace, table, and view metadata where access is restricted by
+  Polaris authorization.
+- Storage locations, table locations, metadata locations, manifest locations,
+  statistics locations, and other URI-bearing metadata that define where
+  catalog-managed data or metadata resides.
+- Temporary or delegated storage credentials, scoped storage policies,
+  credential access boundaries, session policies, and provider-specific policy
+  expressions.
+- Configuration values that affect authentication, authorization, token
+  validation, policy evaluation, credential vending, storage boundaries,
+  federation, or backend connectivity.
+- Audit-relevant request identity and authorization context.
+
+## Trust Boundaries
+
+Polaris assumes the following boundaries:
+
+- Network callers are untrusted until authenticated.
+- Authenticated principals are not inherently trusted to access resources; every
+  protected operation must pass authorization.
+- Request-provided identifiers, catalog names, namespace names, table names,
+  view names, role names, policy names, and principal names are untrusted input.
+- Identity-provider claims are trusted only after token validation according to
+  the configured authentication mechanism.
+- External policy decisions are trusted only according to the configured policy
+  integration and the request context Polaris supplies to that integration.
+- Management-plane APIs, catalog data-plane APIs, admin tools, and client tools
+  may expose different metadata and require separate authorization checks.
+- Catalog properties, entity properties, and configuration values may cross from
+  management-plane configuration into data-plane client-visible responses; they
+  must be classified by intended visibility before storing sensitive values.
+- Credential vending crosses from Polaris authorization into object-store or
+  external-system authorization; delegated credentials must be scoped to the
+  authorized actor, operation, and storage locations.
+- Storage locations and URI-bearing metadata cross from Polaris metadata into
+  object stores, metastores, filesystems, or external catalogs; caller-supplied
+  locations are untrusted until validated against the effective storage policy.
+- Provider-specific storage policy expressions, IAM policies, access boundaries,
+  and session policies are security boundaries and must safely encode
+  caller-controlled identifier or path material.
+- Externally configured endpoints, federation targets, identity-provider
+  metadata, object-store endpoints, and catalog backends are trusted only for
+  the configured purpose and must not silently redirect secrets or privileged
+  requests outside that trust relationship.
+- Persistence backends are trusted to store and return data, but Polaris must not
+  rely on callers to enforce authorization before persistence access.
+- Deployment operators are trusted with configuration and infrastructure-level
+  secrets.
+
+## Security Invariants
+
+The following properties must hold:
+
+- Protected API operations require authentication unless explicitly documented as
+  public.
+- Authorization checks must be performed before returning or mutating protected
+  catalog metadata.
+- A principal must not be able to grant itself privileges it does not already
+  have authority to grant.
+- Role, privilege, and policy changes must not bypass scope restrictions.
+- Realm, catalog, namespace, table, and view identifiers must not allow access
+  across authorization boundaries.
+- Token issuance, token exchange, credential reset, and credential rotation must
+  preserve the intended principal, realm, role scope, expiration, and revocation
+  semantics.
+- Tokens, credentials, and secrets must not be logged, returned in API
+  responses, or persisted in plaintext unless explicitly required and protected
+  by deployment controls.
+- Properties and configuration values that are returned to clients must be
+  treated as client-visible and must not be used as a secret store.
+- Storage locations and URI-bearing metadata must be validated against the
+  effective catalog, namespace, table, and storage-policy boundaries before they
+  are persisted, used to access storage, or used to mint delegated credentials.
+- Credential vending must not occur before the relevant location, authorization,
+  and scope checks have completed.
+- Temporary storage credentials must be scoped as narrowly as the configured
+  storage provider and documented mode allow. Provider-specific limitations that
+  broaden scope must be explicit to operators.
+- Reused, overlapping, or ambiguous storage locations must not allow a principal

Review Comment:
   I do wonder if this is too broad. It's probably good to keep it for now and
   change it if we get any reports of the form "User A can access B with B'
   access controls, but they are also allowed to access B."
   
   Probably fine for now though.
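
To make the scenario concrete, here is a minimal sketch in Python. It is not Polaris code: the entity names, the `locations_overlap` helper, and the `escalating_overlaps` function are all hypothetical, and treating "overlapping" as segment-wise path-prefix matching on the same scheme and bucket is an assumption. The idea is that an overlap between two locations only counts as a finding when it reaches an entity the principal is not already allowed to access.

```python
from urllib.parse import urlparse


def _normalize(location):
    """Split a storage URI into (scheme, authority, path segments)."""
    parsed = urlparse(location)
    segments = tuple(s for s in parsed.path.split("/") if s)
    return parsed.scheme.lower(), parsed.netloc, segments


def locations_overlap(a, b):
    """True if one location is equal to, or a path prefix of, the other.

    Comparison is per path segment, so ".../b" does not overlap ".../bb".
    """
    scheme_a, host_a, path_a = _normalize(a)
    scheme_b, host_b, path_b = _normalize(b)
    if (scheme_a, host_a) != (scheme_b, host_b):
        return False
    shorter = min(len(path_a), len(path_b))
    return path_a[:shorter] == path_b[:shorter]


def escalating_overlaps(locations, granted):
    """Return (granted_entity, reached_entity) pairs where an overlap
    reaches an entity the principal has no grant on.

    `locations` maps hypothetical entity names to storage URIs and
    `granted` is the set of entity names the principal may access.
    """
    findings = []
    for entity in sorted(granted):
        for other, other_location in sorted(locations.items()):
            if other == entity or other in granted:
                continue  # overlap with an already-granted entity is benign
            if locations_overlap(locations[entity], other_location):
                findings.append((entity, other))
    return findings


locations = {
    "B": "s3://bucket/warehouse/b",
    "B_prime": "s3://bucket/warehouse/b/sub",
    "C": "s3://bucket/warehouse/c",
}

# User A holds grants on both B and B': the overlap exists but is benign.
benign = escalating_overlaps(locations, {"B", "B_prime"})

# User A holds a grant on B' only: the overlap reaches B without a grant.
escalation = escalating_overlaps(locations, {"B_prime"})
```

Under these assumptions, the first call models the kind of report described above (access through B' by someone who may access B anyway), while the second is the case the invariant presumably targets.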



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

