Hi all,

Sorry for the late reply. I still have some concerns about Option 1's
implementation details, which IMO may render it unusable or functionally
handicapped - my comments are on the original design document. If we choose
Option 1 in the future, I think we will eventually need further scoping or
discussion on how APIs like CreateTable will work.

Could we potentially implement Option 2 in the short term using the
approach in #3409 <https://github.com/apache/polaris/pull/3409>? Maybe that
will help us keep more of the storage configs in alignment with each other
(addressing the con about reusability and some of the credential
rotation concerns as well).
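
To make the idea concrete, here is a rough Python sketch (not Polaris code)
of how an embedded, nullable storage field could point at a predefined named
storage definition in the style of #3409 and fall back to the parent entity
when it is unset. Every name in it (`NAMED_STORAGE`, `Entity`,
`storage_name`, `resolve_storage`) is hypothetical, and the property values
are placeholders; it only illustrates the lookup order, under the assumption
that entities store a name rather than the credentials themselves.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-in for storage definitions predefined in server config
# (in the spirit of polaris.storage.aws.<storage-name>.*). The values are
# placeholders, not real settings.
NAMED_STORAGE = {
    "default": {"region": "us-east-1"},
    "finance-secure-role": {"region": "eu-west-1"},
}


@dataclass
class Entity:
    """A catalog, namespace, or table with an optional storage override."""
    name: str
    parent: Optional["Entity"] = None
    storage_name: Optional[str] = None  # None means "inherit from parent"


def resolve_storage(entity: Entity) -> dict:
    """Walk up the hierarchy until an entity names a storage definition."""
    current: Optional[Entity] = entity
    while current is not None:
        if current.storage_name is not None:
            return NAMED_STORAGE[current.storage_name]
        current = current.parent
    raise LookupError(f"no storage configuration found for {entity.name}")


catalog = Entity("catalog", storage_name="default")
namespace = Entity("finance", parent=catalog, storage_name="finance-secure-role")
table_inherits = Entity("ledger", parent=namespace)               # uses namespace override
table_default = Entity("events", parent=Entity("web", parent=catalog))  # falls back to catalog

print(resolve_storage(table_inherits))
print(resolve_storage(table_default))
```

Because entities hold only a name in this sketch, changing a named
definition in one place would be picked up by every table that resolves to
it, which is the reusability and rotation benefit mentioned above.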

Best,
Adnan Hemani

On Thu, Feb 19, 2026 at 8:58 AM Sung Yun <[email protected]> wrote:

> Hi Srinivas,
>
> Thanks for the recap.
>
> I generally agree that Option 1 is the most semantically sound long term
> approach, assuming credentials themselves live in a secrets manager and the
> storage configuration only holds references. That feels like the most
> extensible direction as Polaris evolves.
>
> I also agree with Dmitri that there are really two different concerns
> here. One is how storage configuration is modeled and persisted in Polaris
> as an Entity. The other is how the effective configuration is resolved for
> a given table across catalog, namespace, and table boundaries. Those do not
> have to be solved by the same abstraction.
>
> From that perspective, Option 4 is appealing from an implementation
> standpoint, but I share the concern about semantic confusion. Reusing the
> resolution and inheritance logic that Policy already has makes sense, but
> using the Policy entity itself to represent storage connectivity feels
> unintuitive and potentially confusing for future users and developers.
>
> Option 1 is IMHO probably the most correct model, but it also requires the
> most upfront investment. Building on Yufei’s point, it would really help to
> ground this in concrete user workflows. I think finding out how common
> storage configuration reuse is across many tables, and how storage
> configurations are typically managed (at the namespace level or at the
> table level), would help us decide whether to invest in Option 1 now or
> phase toward it over time.
>
> Cheers,
> Sung
>
> On 2026/02/17 23:44:23 Srinivas Rishindra wrote:
> > I agree with Yufei. Until we identify more concrete use cases, the
> > *inline model* seems to be the best starting point. It is particularly
> > well-suited for sparse configurations, where only a few tables in a
> > namespace require overrides while the rest remain unchanged.
> >
> > *Next Steps:* Unless there are any objections, I will update the design
> > doc to reflect this approach. Once approved, I will proceed with
> > implementation.
> >
> > On Wed, Feb 11, 2026 at 3:49 PM Yufei Gu <[email protected]> wrote:
> >
> > > I’d suggest we start from concrete use cases.
> > >
> > > If the inline model (Option 2) works well for the primary scenarios,
> > > e.g., relatively sparse table-level storage overrides, we could adopt
> > > it as a first phase. It keeps the implementation simple and lets us
> > > validate real-world needs before introducing additional abstractions.
> > >
> > > However, if we anticipate frequent configuration rotation or strong
> > > reuse requirements across many tables, Option 1 is more compelling. In
> > > that case, I'd recommend reusing the existing policy framework where
> > > possible, since it already provides inheritance and attachment
> > > semantics. That could help us avoid introducing significant new
> > > complexity into Polaris while still supporting the richer model.
> > >
> > > Yufei
> > >
> > >
> > > On Wed, Feb 11, 2026 at 9:12 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > >
> > > > Hi Srinivas,
> > > >
> > > > Thanks for the discussion recap! It's very useful to keep the dev
> > > > thread and meetings aligned.
> > > >
> > > > Option 1:
> > > > Credential Rotation: Highly efficient. Because the configuration is
> > > > referenced by ID, rotating a cloud IAM role or secret requires
> > > > updating only the single StorageConfiguration entity. [...]
> > > >
> > > >
> > > > This seems to imply that credentials are stored as part of the
> > > > Storage Configuration Entity. If so, I do not think this approach is
> > > > ideal. I believe the secret data should ideally be accessed via the
> > > > Secrets Manager [1]. While that discussion is still in progress, I
> > > > believe it interconnects with this proposal.
> > > >
> > > > [...] All thousands of downstream tables referencing it would
> > > > immediately use the new credentials without metadata updates.
> > > >
> > > >
> > > > Immediacy is probably from the end-user's perspective. Internally,
> > > > different Polaris processes may switch to the updated config at
> > > > different moments in time... I do not think it is a problem in this
> > > > case, just wanted to highlight it to make sure distributed system
> > > > aspects are not left out :)
> > > >
> > > > Option 2:
> > > > Credential Rotation: Credential rotation is difficult [...]
> > > >
> > > >
> > > > Again, I believe actual credentials should be accessed via the
> > > > Secrets Manager [1] so some indirection will be present.
> > > >
> > > > Config updates will need to happen individually in each case, but
> > > > actual secrets could be shared and updated centrally via the Secrets
> > > > Manager.
> > > >
> > > > ATM, given the complexity points about option 1 that were brought up
> > > > in the community sync, I tend to favour this option for implementing
> > > > this proposal. However, this is not a strong requirement by any
> > > > means, just my personal opinion. Other opinions are welcome.
> > > >
> > > > Depending on how secret references are handled in code (needs a POC,
> > > > I guess), there could be some synergy with Tornike's approach from
> > > > [3699].
> > > >
> > > > Option 3: Named Catalog-Level Configurations (Hybrid) [...]
> > > >
> > > >
> > > > I would like to clarify the UX story in this case. Do we expect end
> > > > users to manage Storage Configuration in this case or the Polaris
> > > > owner?
> > > >
> > > > In the latter case, it seems similar to Tornike's proposal in [3699]
> > > > but generalized to all storage types. The Polaris Admin / Owner
> > > > could use a non-public API to work with this configuration (e.g.
> > > > plain Quarkus configuration or possibly Admin CLI).
> > > >
> > > > Option 4: Leverage Existing Policy Framework [...]
> > > >
> > > >
> > > > I tend to agree with the "semantic confusion" point.
> > > >
> > > > It should be fine to reuse policy-related code in the implementation
> > > > (if possible), but I believe Storage Configuration and related
> > > > credential management form a distinct use case / feature and deserve
> > > > dedicated handling in Polaris and at the API / UX level.
> > > >
> > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > >
> > > > [3699] https://github.com/apache/polaris/pull/3699
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Tue, Feb 10, 2026 at 10:19 PM Srinivas Rishindra <[email protected]> wrote:
> > > >
> > > > > Hi Everyone,
> > > > >
> > > > > We had an opportunity to discuss this feature and my recent
> > > > > proposal at the last community sync meeting. I would like to
> > > > > summarize our discussion and enumerate the various options we
> > > > > considered to help us reach a consensus.
> > > > >
> > > > > To recap, storage configuration is currently restricted to the
> > > > > catalog level. This limits flexibility for users who need to
> > > > > organize tables across different storage configurations or cloud
> > > > > providers within a single catalog. There appears to be general
> > > > > agreement on the utility of this feature; however, we still need to
> > > > > align on the specific implementation approach.
> > > > >
> > > > > Here are the various options that were considered.
> > > > >
> > > > > *Option 0: Make Credentials available as part of table properties.*
> > > > > (This was my original proposal, but it was abandoned after I became
> > > > > aware of the security implications.)
> > > > >
> > > > > *Option 1: First-Class Storage Configuration Entity*
> > > > >
> > > > > This approach proposes elevating StorageConfiguration to a
> > > > > standalone, top-level resource in the Polaris backend (similar to a
> > > > > Principal, Namespace or Table), independent of the Catalog or Table.
> > > > > This is the approach in my most recent proposal doc.
> > > > >
> > > > > - Data Model: A new StorageConfiguration entity is created with its
> > > > >   own unique identifier and lifecycle. Tables and Namespaces would
> > > > >   store a reference ID pointing to this entity rather than embedding
> > > > >   the credentials directly.
> > > > >
> > > > > - Security: This model offers the cleanest security boundary. We can
> > > > >   introduce a specific USAGE privilege on the configuration entity.
> > > > >   A user would need both CREATE_TABLE on the Namespace *and* USAGE
> > > > >   on the specific StorageConfiguration to link them.
> > > > >
> > > > > - Credential Rotation: Highly efficient. Because the configuration
> > > > >   is referenced by ID, rotating a cloud IAM role or secret requires
> > > > >   updating only the single StorageConfiguration entity. All
> > > > >   thousands of downstream tables referencing it would immediately
> > > > >   use the new credentials without metadata updates.
> > > > >
> > > > > - Inheritance: The reference could be set at the Catalog, Namespace,
> > > > >   or Table level. If a Table does not specify a reference, it would
> > > > >   inherit the reference from its parent Namespace (and so on),
> > > > >   preserving the current hierarchical behavior while adding
> > > > >   granularity.
> > > > >
> > > > > • Pros: Maximum flexibility and reusability (Many-to-Many). Updating
> > > > > one config object propagates to all associated tables.
> > > > >
> > > > > • Cons: Highest engineering cost. Requires new CRUD APIs, DB schema
> > > > > changes (mapping tables), and complex authorization logic (two-stage
> > > > > auth checks). Risk of accumulating "orphaned" configs.
> > > > >
> > > > > Option 2: The "Embedded Field" Model
> > > > >
> > > > > This approach extends the existing Table and Namespace entities to
> > > > > include a storageConfig field. The field can default to null, in
> > > > > which case the parent's storageConfig is used at runtime.
> > > > >
> > > > > *Data Model:* No new top-level entity is created. The storage
> > > > > details (e.g., roleArn) are stored directly in a new, dedicated
> > > > > column or structure within the existing Table/Namespace entity.
> > > > >
> > > > > Complexity: This could reduce the engineering overhead
> > > > > significantly. There are no new CRUD endpoints for configuration
> > > > > objects and no referential integrity checks (e.g., preventing the
> > > > > deletion of a config used by active tables).
> > > > >
> > > > > Credential Rotation: Credential rotation is difficult. If an IAM
> > > > > role changes, an administrator must identify and issue UPDATE
> > > > > operations for every individual table or namespace that uses that
> > > > > specific configuration, potentially affecting thousands of objects.
> > > > >
> > > > > • Pros: Lowest engineering cost. No new entities or complex mappings
> > > > > are required. Easy to reason about authorization (auth is tied
> > > > > strictly to the entity).
> > > > >
> > > > > • Cons: No reusability. Configs must be duplicated across tables;
> > > > > rotating credentials for 1,000 tables could require 1,000 update
> > > > > calls.
> > > > >
> > > > > Option 3: Named Catalog-Level Configurations (Hybrid)
> > > > >
> > > > > This can be a combination of Option 1 and Option 2. An admin can
> > > > > define a registry of "Named Storage Configurations" stored within
> > > > > the Catalog. Sub-entities (Namespaces/Tables) reference these
> > > > > configs by name (e.g., storage-config: "finance-secure-role").
> > > > >
> > > > > *Data Model:* No separate top-level entity is created. The Catalog
> > > > > entity potentially needs to be modified to accommodate named storage
> > > > > configurations.
> > > > >
> > > > > Credential Rotation: Credential rotation can be done at the catalog
> > > > > level for each named Storage Configuration.
> > > > >
> > > > > Inheritance: Works much the same as proposed in Option 1 and
> > > > > Option 2.
> > > > >
> > > > > Security: Not as secure as Option 1 but still useful. A principal
> > > > > with proper access can attach any named storage configuration
> > > > > defined at the catalog level to any arbitrary entity within the
> > > > > catalog.
> > > > >
> > > > > • Pros: Good balance of reusability and simplicity. Allows updating
> > > > > a config in one place (the Catalog definition) without needing a
> > > > > full-blown global entity system.
> > > > >
> > > > > • Cons: Scope is limited to the Catalog (cannot share configs across
> > > > > catalogs).
> > > > >
> > > > > Option 4: Leverage Existing Policy Framework
> > > > >
> > > > > This approach leverages the existing Apache Polaris Policy Framework
> > > > > (currently used for features like snapshot expiry) to manage storage
> > > > > settings.
> > > > >
> > > > > Data Model: Storage configurations are defined as "Policies" at the
> > > > > Catalog level. These Policies contain the credential details and can
> > > > > be attached to Namespaces or Tables using the existing policy
> > > > > attachment APIs.
> > > > >
> > > > > Inheritance: This aligns naturally with Polaris's existing
> > > > > architecture, where policies cascade from Catalog → Namespace →
> > > > > Table. The vending logic would simply resolve the "effective"
> > > > > storage policy for a table at query time.
> > > > >
> > > > > Security: This utilizes the existing Polaris privileges and
> > > > > attachment privileges. Administrators can define authorized storage
> > > > > policies centrally, and users can only select from these
> > > > > pre-approved policies, preventing them from inputting arbitrary or
> > > > > insecure role ARNs.
> > > > >
> > > > > • Pros:
> > > > >   . Zero New Infrastructure: Reuses the existing "Policy" entity,
> > > > >     persistence layer, and inheritance logic, significantly reducing
> > > > >     engineering effort
> > > > >   . Proven Inheritance: The logic for resolving policies from child
> > > > >     to parent is already implemented and tested
> > > > >
> > > > > • Cons:
> > > > >   . Semantic Confusion: Policies are typically used for "governance
> > > > >     rules" (e.g., snapshot expiry, compaction) rather than
> > > > >     "connectivity configuration." Using them for credentials might
> > > > >     be unintuitive
> > > > >   . Authorization Complexity: The authorizer would need to load and
> > > > >     evaluate policies to determine how to access data, potentially
> > > > >     coupling governance logic with data access paths
> > > > >
> > > > > We can potentially start with one of the options initially and, as
> > > > > the feature and user needs develop, migrate to other options as
> > > > > well. Please let me know your thoughts about the various options
> > > > > above, or anything that I might have missed, so that we can work
> > > > > towards a consensus on how to implement this feature.
> > > > >
> > > > >
> > > > > On Thu, Feb 5, 2026 at 8:08 AM Tornike Gurgenidze <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > To follow up on Dmitri's point about credentials, there's already
> > > > > > a PR <https://github.com/apache/polaris/pull/3409> up that is
> > > > > > going to allow predefining named storage credentials in polaris
> > > > > > config like the following:
> > > > > >
> > > > > >    - polaris.storage.aws.<storage-name>.access-key
> > > > > >    - polaris.storage.aws.<storage-name>.secret-key
> > > > > >
> > > > > > then storage configuration will simply refer to it by name and
> > > > > > inherit credentials.
> > > > > >
> > > > > > I think that can go hand in hand with table-level overrides.
> > > > > > Overriding each and every aws property for every table doesn't
> > > > > > sound ideal. Defining a storage configuration upfront and
> > > > > > referring to it by name should be a simpler solution. I can extend
> > > > > > the scope of the PR above to allow predefining other aws
> > > > > > properties as well like endpoint-url and region.
> > > > > >
> > > > > > Another point that came up in the discussion surrounding extra
> > > > > > credentials is how to make sure no one can just hijack
> > > > > > preconfigured credentials. The simplest solution I see there is to
> > > > > > ship off properties to OPA during catalog (and table) creation and
> > > > > > allow users to write policies based on them. If we want to enable
> > > > > > internal rbac to have a similar capability we can go further and
> > > > > > move from config based storage definition to a separate
> > > > > > `/storage-config` rest resource in management API that will come
> > > > > > with necessary grants and permissions.
> > > > > >
> > > > > > On Thu, Feb 5, 2026 at 5:43 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Srinivas,
> > > > > > >
> > > > > > > Thanks for the proposal. It looks good to me overall, a very
> > > > > > > timely feature to add to Polaris.
> > > > > > >
> > > > > > > I added some comments in the doc and I see this topic on the
> > > > > > > Community Sync agenda for Feb 5. Looking forward to discussing
> > > > > > > it online.
> > > > > > >
> > > > > > > I have three points to highlight:
> > > > > > >
> > > > > > > * Dealing with passwords probably connects to the Secrets
> > > > > > > Manager discussion [1]
> > > > > > >
> > > > > > > * Persistence needs to consider non-RDBMS backends. OSS code has
> > > > > > > both PostgreSQL and MongoDB, but private Persistence
> > > > > > > implementations are possible too. I believe we need a proper SPI
> > > > > > > for this, not just a relational schema example.
> > > > > > >
> > > > > > > * Associating entities (tables, namespaces) to Storage
> > > > > > > Configuration is likely a plugin point that downstream projects
> > > > > > > may want to customize. I'd propose making another SPI for this.
> > > > > > > This SPI is probably different from the new Persistence SPI
> > > > > > > mentioned above since the concern here is not persistence per
> > > > > > > se, but the logic of finding the right storage config.
> > > > > > >
> > > > > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Dmitri.
> > > > > > >
> > > > > > > On Mon, Feb 2, 2026 at 4:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > We had an opportunity to discuss the community sprint last
> > > > > > > > week. Based on that discussion, I have created a new design
> > > > > > > > doc which I am attaching here. Instead of passing credentials
> > > > > > > > via table properties, this design introduces Inheritable
> > > > > > > > Storage Configurations as a first-class feature. Please let me
> > > > > > > > know your thoughts on the document.
> > > > > > > >
> > > > > > > > https://docs.google.com/document/d/1hbDkE-w84Pn_112iW2vCnlDKPDtyg8flaYcFGjvD120/edit?usp=sharing
> > > > > > > >
> > > > > > > > On Mon, Jan 26, 2026 at 10:42 PM Yufei Gu <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Srinivas,
> > > > > > > > >
> > > > > > > > > Thanks for sharing this proposal. Persisting long-lived
> > > > > > > > > credentials such as an S3 secret access key directly in
> > > > > > > > > table properties raises significant security concerns. Here
> > > > > > > > > is an alternative approach previously discussed, which
> > > > > > > > > enables storage configuration at the table or namespace
> > > > > > > > > level, and it is probably a more secure and promising
> > > > > > > > > direction overall.
> > > > > > > > >
> > > > > > > > > Yufei
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Jan 26, 2026 at 8:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Dear All,
> > > > > > > > > >
> > > > > > > > > > I have developed a design proposal for Table-Level Storage
> > > > > > > > > > Credential Overrides in Apache Polaris.
> > > > > > > > > >
> > > > > > > > > > The core objective is to allow specific storage properties
> > > > > > > > > > to be defined at the table level rather than the catalog
> > > > > > > > > > level, enabling a single logical catalog to support tables
> > > > > > > > > > across disparate storage systems. Crucially, the
> > > > > > > > > > implementation ensures these overrides participate in the
> > > > > > > > > > credential vending process to maintain secure, scoped
> > > > > > > > > > access.
> > > > > > > > > >
> > > > > > > > > > I have also implemented a Proof of Concept (POC) pull
> > > > > > > > > > request to demonstrate the idea. While the current MVP
> > > > > > > > > > focuses on S3, I intend to expand scope to include Azure
> > > > > > > > > > and GCS pending community feedback.
> > > > > > > > > >
> > > > > > > > > > I look forward to your thoughts and suggestions on this
> > > > > > > > > > proposal.
> > > > > > > > > >
> > > > > > > > > > Links:
> > > > > > > > > >
> > > > > > > > > > - Design Doc: Table-Level Storage Credential Overrides
> > > > > > > > > >   (https://docs.google.com/document/d/1tf4N8GKeyAAYNoP0FQ1zT1Ba3P1nVGgdw3nmnhSm-u0/edit?usp=sharing)
> > > > > > > > > > - POC PR: https://github.com/apache/polaris/pull/3563
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > >
> > > > > > > > > > Srinivas Rishindra Pothireddi
