Hey, I just took a quick look. Sorry, since it's a large PR, I don't have
time to review it in depth at the moment, but based on the conversation in
this thread, I'm not sure we're completely on the same page.

I mentioned #3409 because it does not need any new CRUD endpoints (you also
noted this regarding Option 2 in your message on 2/10), so I'm not sure why
those are being introduced in this PR. Personally, I don't think the PR as
it stands accurately reflects the community consensus on the mailing list.

Best,
Adnan Hemani

On Mon, Mar 2, 2026 at 10:25 AM Srinivas Rishindra <[email protected]>
wrote:

> Hi All,
>
> I have created a draft pull request to share the progress on this feature
> and gather early feedback:
> https://github.com/apache/polaris/pull/3923/changes .
>
> Please note that this is still a work in progress; additional efforts are
> required for comprehensive testing and code cleanup. I am sharing this
> draft now to ensure the current implementation aligns with the community's
> expectations and the general direction.
>
> I look forward to your thoughts and suggestions.
>
> Best regards,
> Srinivas Rishindra
>
> On Mon, Feb 23, 2026 at 8:26 AM Srinivas Rishindra <[email protected]>
> wrote:
>
>>
>> Hi Sung and Adnan,
>>
>> Thank you for your comments.
>>
>> *To Sung:*
>>
>> While I don't have concrete production workflows available to me at the
>> moment, I can offer an illustrative use case to highlight the broader
>> vision. The general idea is to make the catalog abstraction much more of a
>> logical construct, rather than one that tightly couples to a physical
>> storage configuration or an IAM policy. Currently, a catalog is restricted
>> to a single cloud provider or IAM role, forcing users into
>> infrastructure-driven boundaries.
>>
>> Consider an organization with multiple departments like Sales, Marketing,
>> and Engineering, where each gets its own catalog. Within the Sales catalog,
>> data governance mandates that US data resides in AWS, European data in GCP,
>> and Chinese data in Alibaba Cloud. Currently, these differing storage
>> configurations would force the admin to artificially create separate
>> catalogs per region. By decoupling storage from the catalog level, a sales
>> associate can interact with their accounts as a unified logical unit (e.g.,
>> a namespace per associate, tables per account), while the admin handles the
>> underlying geographic storage complexity behind the scenes.
>>
>> *To Adnan:*
>>
>> I understand your concerns regarding the implementation complexity of
>> Option 1, particularly how it would impact APIs like CreateTable. I
>> agree that starting with Option 2 is a pragmatic first step to make
>> progress, and we can evaluate migrating to Option 1 in the future as user
>> needs evolve.
>>
>> I also reviewed PR #3409 <https://github.com/apache/polaris/pull/3409>
>> and its corresponding issue, #2970 (Support Per-Catalog AWS Credentials
>> in MinIO Deployments) <https://github.com/apache/polaris/issues/2970>.
>> The discussion in that issue correctly highlighted the security risks of
>> persisting raw secrets directly in the configuration object. By leveraging
>> the approach from PR #3409—where named storage credentials are predefined
>> in the server config and referenced by a storageName property—we can
>> cleanly implement Option 2. Embedding just the storageName reference at
>> the table or namespace level elegantly resolves the primary drawbacks I
>> initially listed for Option 2: it prevents duplicating sensitive
>> credentials, allows admins to rotate credentials centrally, and offers
>> reusability without requiring a new top-level entity.
>>
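>> To make that concrete, here is a rough sketch of the storageName
>> indirection (a hypothetical illustration only; the names, config shape,
>> and lookup are mine, not the actual PR #3409 config or Polaris API):
>>
```python
# Hypothetical sketch: named storage credentials are defined once in server
# config (in the spirit of PR #3409); entities persist only a storageName
# reference, never the credential itself.
SERVER_STORAGE_CONFIGS = {
    "finance-secure-role": {"role_arn": "arn:aws:iam::123456789012:role/finance"},
    "eu-gcp": {"service_account": "sales-eu@example-project.iam.gserviceaccount.com"},
}

def resolve_storage(entity_properties):
    """Resolve the effective storage config from a storageName reference."""
    name = entity_properties.get("storageName")
    if name is None:
        # No override on this entity: caller falls back to the parent's config.
        return None
    # Rotating credentials only requires updating the registry entry.
    return SERVER_STORAGE_CONFIGS[name]

table_props = {"storageName": "finance-secure-role"}
print(resolve_storage(table_props)["role_arn"])
```
>>
>> The key property is that the entity never stores a credential, only the
>> name, so rotation touches the server-side registry alone.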
>> Unless there are any objections, I will work on implementing option2 and
>> publish a PR. Please let me know if this sounds like a reasonable path
>> forward.
>>
>> Best regards,
>>
>> Srinivas
>>
>> On Fri, Feb 20, 2026 at 3:22 AM Adnan Hemani via dev <
>> [email protected]> wrote:
>>
>>> Hi all,
>>>
>>> Sorry for the late reply. I still have some concerns about Option 1's
>>> implementation details, which IMO may render it unusable or functionally
>>> handicapped - my comments are on the original design document. If we
>>> choose
>>> Option 1 in the future, I think we will eventually need further scoping
>>> or
>>> discussion on how APIs like CreateTable will work.
>>>
>>> Could we potentially implement Option 2 in the short-term using the
>>> approach in #3409 <https://github.com/apache/polaris/pull/3409>? Maybe
>>> that
>>> will help us keep more of the storage configs in alignment with each
>>> other
>>> (resolving the con about re-usability and solving some of the credential
>>> rotation concerns as well).
>>>
>>> Best,
>>> Adnan Hemani
>>>
>>> On Thu, Feb 19, 2026 at 8:58 AM Sung Yun <[email protected]> wrote:
>>>
>>> > Hi Srinivas,
>>> >
>>> > Thanks for the recap.
>>> >
>>> > I generally agree that Option 1 is the most semantically sound long
>>> term
>>> > approach, assuming credentials themselves live in a secrets manager
>>> and the
>>> > storage configuration only holds references. That feels like the most
>>> > extensible direction as Polaris evolves.
>>> >
>>> > I also agree with Dmitri that there are really two different concerns
>>> > here. One is how storage configuration is modeled and persisted in
>>> Polaris
>>> > as an Entity. The other is how the effective configuration is resolved
>>> for
>>> > a given table across catalog, namespace, and table boundaries. Those
>>> do not
>>> > have to be solved by the same abstraction.
>>> >
>>> > From that perspective, Option 4 is appealing from an implementation
>>> > standpoint, but I share the concern about semantic confusion. Reusing
>>> the
>>> > resolution and inheritance logic that Policy already has makes sense,
>>> but
>>> > using the Policy entity itself to represent storage connectivity feels
>>> > unintuitive and potentially confusing for future users and developers.
>>> >
>>> > Option 1 is IMHO probably the most correct model, but it also requires
>>> the
>>> > most upfront investment. Building on Yufei’s point, it would really
>>> help to
>>> > ground this in concrete user workflows. I think answering how common
>>> > storage configuration reuse is across many tables, and how such configs
>>> > are typically managed (at the namespace level, or at the table level),
>>> > would help us decide whether to invest in Option 1 now or phase toward
>>> > it over time.
>>> >
>>> > Cheers,
>>> > Sung
>>> >
>>> > On 2026/02/17 23:44:23 Srinivas Rishindra wrote:
>>> > > I agree with Yufei. Until we identify more concrete use cases, the
>>> > *inline
>>> > > model* seems to be the best starting point. It is particularly
>>> > well-suited
>>> > > for sparse configurations, where only a few tables in a namespace
>>> require
>>> > > overrides while the rest remain unchanged.
>>> > >
>>> > > *Next Steps:* Unless there are any objections, I will update the
>>> design
>>> > doc
>>> > > to reflect this approach. Once approved, I will proceed with
>>> > implementation.
>>> > >
>>> > > On Wed, Feb 11, 2026 at 3:49 PM Yufei Gu <[email protected]>
>>> wrote:
>>> > >
>>> > > > I’d suggest we start from concrete use cases.
>>> > > >
>>> > > > If the inline model (Option 2) works well for the primary scenarios,
>>> > e.g.,
>>> > > > relatively sparse table level storage overrides, we could adopt it
>>> as a
>>> > > > first phase. It keeps the implementation simple and lets us
>>> validate
>>> > real
>>> > > > world needs before introducing additional abstractions.
>>> > > >
>>> > > > However, if we anticipate frequent configuration rotation or strong
>>> > reuse
>>> > > > requirements across many tables, Option 1 is more compelling. In
>>> that
>>> > case,
>>> > > > I'd recommend reusing the existing policy framework where possible,
>>> > since
>>> > > > it already provides inheritance and attachment semantics. That
>>> could
>>> > help
>>> > > > us avoid introducing significant new complexity into Polaris while
>>> > still
>>> > > > supporting the richer model.
>>> > > > Yufei
>>> > > >
>>> > > >
>>> > > > On Wed, Feb 11, 2026 at 9:12 AM Dmitri Bourlatchkov <
>>> [email protected]>
>>> > > > wrote:
>>> > > >
>>> > > > > Hi Srinivas,
>>> > > > >
>>> > > > > Thanks for the discussion recap! It's very useful to keep the dev
>>> > thread
>>> > > > > and meetings aligned.
>>> > > > >
>>> > > > > Option 1:
>>> > > > > Credential Rotation: Highly efficient. Because the configuration
>>> is
>>> > > > > referenced by ID, rotating a cloud IAM role or secret requires
>>> > updating
>>> > > > > only the single StorageConfiguration entity. [...]
>>> > > > >
>>> > > > >
>>> > > > > This seems to imply that credentials are stored as part of the
>>> > Storage
>>> > > > > Configuration Entity. If so, I do not think this approach is
>>> ideal. I
>>> > > > > believe the secret data should ideally be accessed via the
>>> Secrets
>>> > > > Manager
>>> > > > > [1]. While that discussion is still in progress, I believe it
>>> > > > interconnects
>>> > > > > with this proposal.
>>> > > > >
>>> > > > > [...] All thousands of downstream
>>> > > > > tables referencing it would immediately use the new credentials
>>> > without
>>> > > > > metadata updates.
>>> > > > >
>>> > > > >
>>> > > > > Immediacy is probably from the end-user's perspective.
>>> Internally,
>>> > > > > different Polaris processes may switch to the updated config at
>>> > > > > different moments in time... I do not think it is a problem in
>>> this
>>> > case,
>>> > > > > just wanted to highlight it to make sure distributed system
>>> aspects
>>> > are
>>> > > > not
>>> > > > > left out :)
>>> > > > >
>>> > > > > Option 2:
>>> > > > > Credential Rotation: Credential rotation is difficult [...]
>>> > > > >
>>> > > > >
>>> > > > > Again, I believe actual credentials should be accessed via the
>>> > Secrets
>>> > > > > Manager [1] so some indirection will be present.
>>> > > > >
>>> > > > > Config updates will need to happen individually in each case, but
>>> > actual
>>> > > > > secrets could be shared and updated centrally via the Secrets
>>> > Manager.
>>> > > > >
>>> > > > > ATM, given the complexity points about option 1 that were
>>> brought up
>>> > in
>>> > > > the
>>> > > > > community sync, I tend to favour this option for implementing
>>> this
>>> > > > > proposal. However, this is not a strong requirement by any means,
>>> > just my
>>> > > > > personal opinion. Other opinions are welcome.
>>> > > > >
>>> > > > > Depending on how secret references are handled in code (needs a
>>> POC,
>>> > I
>>> > > > > guess), there could be some synergy with Tornike's approach from
>>> > [3699].
>>> > > > >
>>> > > > > Option 3: Named Catalog-Level Configurations (Hybrid) [...]
>>> > > > >
>>> > > > >
>>> > > > > I would like to clarify the UX story in this case. Do we expect
>>> end
>>> > users
>>> > > > > to manage Storage Configuration in this case or the Polaris
>>> owner?
>>> > > > >
>>> > > > > In the latter case, it seems similar to Tornike's proposal in
>>> [3699]
>>> > but
>>> > > > > generalized to all storage types. The Polaris Admin / Owner could
>>> > use a
>>> > > > > non-public API to work with this configuration (e.g. plain
>>> Quarkus
>>> > > > > configuration or possibly Admin CLI).
>>> > > > >
>>> > > > > Option 4: Leverage Existing Policy Framework [...]
>>> > > > >
>>> > > > >
>>> > > > > I tend to agree with the "semantic confusion" point.
>>> > > > >
>>> > > > > It should be fine to reuse policy-related code in the
>>> implementation
>>> > (if
>>> > > > > possible), but I believe Storage Configuration and related
>>> credential
>>> > > > > management form a distinct use case / feature and deserve
>>> dedicated
>>> > > > > handling in Polaris and the API / UX level.
>>> > > > >
>>> > > > > [1]
>>> https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
>>> > > > >
>>> > > > > [3699] https://github.com/apache/polaris/pull/3699
>>> > > > >
>>> > > > > Thanks,
>>> > > > > Dmitri.
>>> > > > >
>>> > > > > On Tue, Feb 10, 2026 at 10:19 PM Srinivas Rishindra <
>>> > > > > [email protected]>
>>> > > > > wrote:
>>> > > > >
>>> > > > > > Hi Everyone,
>>> > > > > >
>>> > > > > > We had an opportunity to discuss this feature and my recent
>>> > proposal at
>>> > > > > > the last community sync meeting. I would like to summarize our
>>> > > > > discussion
>>> > > > > > and enumerate the various options we considered to help us
>>> reach a
>>> > > > > > consensus.
>>> > > > > >
>>> > > > > > To recap, storage configuration is currently restricted at the
>>> > catalog
>>> > > > > > level. This limits flexibility for users who need to organize
>>> > tables
>>> > > > > across
>>> > > > > > different storage configurations or cloud providers within a
>>> single
>>> > > > > > catalog. There appears to be general agreement on the utility
>>> of
>>> > this
>>> > > > > > feature; however, we still need to align on the specific
>>> > implementation
>>> > > > > > approach.
>>> > > > > >
>>> > > > > > Here are the various options that were considered.
>>> > > > > > *Option 0: Make credentials available as part of table
>>> > > > > > properties.* (This was my original proposal, but it was abandoned
>>> > > > > > after becoming aware of the security implications.)
>>> > > > > >
>>> > > > > > *Option 1: First-Class Storage Configuration Entity *
>>> > > > > >
>>> > > > > > This approach proposes elevating StorageConfiguration to a
>>> > standalone,
>>> > > > > > top-level resource in the Polaris backend (similar to a
>>> Principal,
>>> > > > > > Namespace or Table), independent of the Catalog or Table. This
>>> is
>>> > the
>>> > > > > > approach in my most recent proposal doc.
>>> > > > > > -
>>> > > > > >
>>> > > > > > Data Model: A new StorageConfiguration entity is created with
>>> its
>>> > own
>>> > > > > > unique identifier and lifecycle. Tables and Namespaces would
>>> store
>>> > a
>>> > > > > > reference ID pointing to this entity rather than embedding the
>>> > > > > credentials
>>> > > > > > directly.
>>> > > > > > -
>>> > > > > >
>>> > > > > > Security: This model offers the cleanest security boundary. We
>>> can
>>> > > > > > introduce a specific USAGE privilege on the configuration
>>> entity. A
>>> > > > user
>>> > > > > > would need both CREATE_TABLE on the Namespace *and* USAGE on
>>> the
>>> > > > specific
>>> > > > > > StorageConfiguration to link them.
>>> > > > > > -
>>> > > > > >
>>> > > > > > Credential Rotation: Highly efficient. Because the
>>> configuration is
>>> > > > > > referenced by ID, rotating a cloud IAM role or secret requires
>>> > updating
>>> > > > > > only the single StorageConfiguration entity. All thousands of
>>> > > > downstream
>>> > > > > > tables referencing it would immediately use the new credentials
>>> > without
>>> > > > > > metadata updates.
>>> > > > > > -
>>> > > > > >
>>> > > > > > Inheritance: The reference could be set at the Catalog,
>>> Namespace,
>>> > or
>>> > > > > Table
>>> > > > > > level. If a Table does not specify a reference, it would
>>> inherit
>>> > the
>>> > > > > > reference from its parent Namespace (and so on), preserving the
>>> > current
>>> > > > > > hierarchical behavior while adding granularity.
>>> > > > > >
>>> > > > > > • Pros: Maximum flexibility and reusability (Many-to-Many).
>>> > Updating
>>> > > > one
>>> > > > > > config object propagates to all associated tables.
>>> > > > > >
>>> > > > > > • Cons: Highest engineering cost. Requires new CRUD APIs, DB
>>> schema
>>> > > > > changes
>>> > > > > > (mapping tables), and complex authorization logic (two-stage
>>> auth
>>> > > > > checks).
>>> > > > > > Risk of accumulating "orphaned" configs.
>>> > > > > >
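>>> > > > > > The inheritance walk above can be sketched as follows (an
>>> > > > > > illustrative sketch only; the entity layout and names are
>>> > > > > > hypothetical, not the actual Polaris data model):

```python
# Hypothetical sketch of Option 1 resolution: walk Table -> Namespace ->
# Catalog until an entity carries a StorageConfiguration reference.
ENTITIES = {
    "sales": {"parent": None, "storage_config_id": "cfg-aws-us"},
    "sales.eu": {"parent": "sales", "storage_config_id": "cfg-gcp-eu"},
    "sales.eu.orders": {"parent": "sales.eu", "storage_config_id": None},
    "sales.us_accounts": {"parent": "sales", "storage_config_id": None},
}

def effective_storage_config(path):
    """Return the nearest storage-config reference up the hierarchy."""
    while path is not None:
        entity = ENTITIES[path]
        if entity["storage_config_id"] is not None:
            return entity["storage_config_id"]
        path = entity["parent"]
    return None

# A table with no reference inherits from its namespace; a namespace with
# no reference falls back to the catalog.
print(effective_storage_config("sales.eu.orders"))    # cfg-gcp-eu
print(effective_storage_config("sales.us_accounts"))  # cfg-aws-us
```
>>> > > > > >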
>>> > > > > > Option 2: The "Embedded Field" Model
>>> > > > > >
>>> > > > > > This approach extends the existing Table and Namespace
>>> entities to
>>> > > > > include
>>> > > > > > a storageConfig field. The field can default to null, in which
>>> > > > > > case the parent's storageConfig is used at runtime.
>>> > > > > >
>>> > > > > > *Data Model:* No new top-level entity is created. The storage
>>> > details
>>> > > > > > (e.g., roleArn) are stored directly into a new, dedicated
>>> column or
>>> > > > > > structure within the existing Table/Namespace entity.
>>> > > > > >
>>> > > > > > Complexity: This could reduce the engineering overhead
>>> > significantly.
>>> > > > > There
>>> > > > > > are no new CRUD endpoints for configuration objects, no
>>> referential
>>> > > > > > integrity checks (e.g., preventing the deletion of a config
>>> used by
>>> > > > > active
>>> > > > > > tables).
>>> > > > > >
>>> > > > > > Credential Rotation: Credential rotation is difficult. If an
>>> IAM
>>> > role
>>> > > > > > changes, an administrator must identify and issue UPDATE
>>> > operations for
>>> > > > > > every individual table or namespace that uses that specific
>>> > > > > configuration,
>>> > > > > > potentially affecting thousands of objects.
>>> > > > > >
>>> > > > > > • Pros: Lowest engineering cost. No new entities or complex
>>> > mappings
>>> > > > are
>>> > > > > > required. Easy to reason about authorization (auth is tied
>>> > strictly to
>>> > > > > the
>>> > > > > > entity).
>>> > > > > >
>>> > > > > > • Cons: No reusability. Configs must be duplicated across
>>> tables;
>>> > > > > rotating
>>> > > > > > credentials for 1,000 tables could require 1,000 update calls.
>>> > > > > >
>>> > > > > > Option 3: Named Catalog-Level Configurations (Hybrid)
>>> > > > > >
>>> > > > > > This can be a combination of Option 1 and Option 2: an admin
>>> > > > > > defines a registry of "Named Storage Configurations" stored within
>>> > > > > > the Catalog, and sub-entities (Namespaces/Tables) reference these
>>> > > > > > configs by name (e.g., storage-config: "finance-secure-role").
>>> > > > > >
>>> > > > > > *Data Model:* No separate top level entity is created. The
>>> Catalog
>>> > > > Entity
>>> > > > > > potentially needs to be modified to accommodate named storage
>>> > > > > > configurations.
>>> > > > > >
>>> > > > > > Credential Rotation: Credential Rotation can be done at the
>>> catalog
>>> > > > level
>>> > > > > > for each named Storage Configuration.
>>> > > > > >
>>> > > > > > Inheritance: Works much the same as proposed in Options 1 and 2.
>>> > > > > >
>>> > > > > > Security: Not as secure as Option 1, but still useful. A
>>> principal
>>> > with
>>> > > > > > proper access can attach any named storage configuration
>>> defined
>>> > at the
>>> > > > > > catalog level to any arbitrary entity within the catalog.
>>> > > > > >
>>> > > > > > • Pros: Good balance of reusability and simplicity. Allows
>>> > updating a
>>> > > > > > config in one place (the Catalog definition) without needing a
>>> > > > full-blown
>>> > > > > > global entity system.
>>> > > > > >
>>> > > > > > • Cons: Scope is limited to the Catalog (cannot share configs
>>> > across
>>> > > > > > catalogs)
>>> > > > > > Option 4: Leverage Existing Policy Framework
>>> > > > > >
>>> > > > > > This approach leverages the existing Apache Polaris Policy
>>> > Framework
>>> > > > > > (currently used for features like snapshot expiry) to manage
>>> > storage
>>> > > > > > settings.
>>> > > > > >
>>> > > > > > Data Model: Storage configurations are defined as "Policies"
>>> at the
>>> > > > > Catalog
>>> > > > > > level. These Policies contain the credential details and can be
>>> > > > attached
>>> > > > > to
>>> > > > > > Namespaces or Tables using the existing policy attachment APIs.
>>> > > > > >
>>> > > > > > Inheritance: This aligns naturally with Polaris's existing
>>> > > > architecture,
>>> > > > > > where policies cascade from Catalog → Namespace → Table. The
>>> > vending
>>> > > > > logic
>>> > > > > > would simply resolve the "effective" storage policy for a
>>> table at
>>> > > > query
>>> > > > > > time.
>>> > > > > >
>>> > > > > > Security: This utilizes the existing Polaris Privileges and
>>> > attachment
>>> > > > > > privileges. Administrators can define authorized storage
>>> policies
>>> > > > > > centrally, and users can only select from these pre-approved
>>> > policies,
>>> > > > > > preventing them from inputting arbitrary or insecure role ARNs.
>>> > > > > >
>>> > > > > > • Pros:
>>> > > > > >   . Zero New Infrastructure: Reuses the existing "Policy"
>>> entity,
>>> > > > > > persistence layer, and inheritance logic, significantly
>>> reducing
>>> > > > > > engineering effort
>>> > > > > >   . Proven Inheritance: The logic for resolving policies from
>>> > child to
>>> > > > > > parent is already implemented and tested
>>> > > > > >
>>> > > > > > • Cons:
>>> > > > > >   . Semantic Confusion: Policies are typically used for
>>> "governance
>>> > > > > rules"
>>> > > > > > (e.g., snapshot expiry, compaction) rather than "connectivity
>>> > > > > > configuration." Using them for credentials might be unintuitive
>>> > > > > >   . Authorization Complexity: The authorizer would need to
>>> load and
>>> > > > > > evaluate policies to determine how to access data, potentially
>>> > coupling
>>> > > > > > governance logic with data access paths
>>> > > > > >
>>> > > > > > We can potentially start with one of the options initially and
>>> as
>>> > the
>>> > > > > > feature and user needs develop we can migrate to other options
>>> as
>>> > well.
>>> > > > > > Please let me know your thoughts on the various options above,
>>> > > > > > or point out anything that I might have missed, so that we can
>>> > > > > > work towards a consensus on how to implement this feature.
>>> > > > > >
>>> > > > > >
>>> > > > > > On Thu, Feb 5, 2026 at 8:08 AM Tornike Gurgenidze <
>>> > > > > [email protected]>
>>> > > > > > wrote:
>>> > > > > >
>>> > > > > > > Hi,
>>> > > > > > >
>>> > > > > > > To follow up on Dmitri's point about credentials, there's
>>> > already a
>>> > > > PR
>>> > > > > > > <https://github.com/apache/polaris/pull/3409> up that is
>>> going
>>> > to
>>> > > > > allow
>>> > > > > > > predefining named storage credentials in polaris config like
>>> the
>>> > > > > > following:
>>> > > > > > >
>>> > > > > > >    - polaris.storage.aws.<storage-name>.access-key
>>> > > > > > >    - polaris.storage.aws.<storage-name>.secret-key
>>> > > > > > >
>>> > > > > > > then storage configuration will simply refer to it by name
>>> and
>>> > > > > > > inherit credentials.
>>> > > > > > >
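>>> > > > > > > Grouping those flat keys into named credential sets could look
>>> > > > > > > roughly like this (only the key layout above is from the PR;
>>> > > > > > > the parsing code below is an illustrative sketch):

```python
# Sketch: group flat polaris.storage.aws.<storage-name>.<prop> keys into
# per-name credential maps that a storage config could then reference.
from collections import defaultdict

def named_storage_credentials(config):
    prefix = "polaris.storage.aws."
    grouped = defaultdict(dict)
    for key, value in config.items():
        if not key.startswith(prefix):
            continue  # unrelated server config key
        name, prop = key[len(prefix):].split(".", 1)
        grouped[name][prop] = value
    return dict(grouped)

flat_config = {
    "polaris.storage.aws.minio-a.access-key": "AKIA-EXAMPLE-A",
    "polaris.storage.aws.minio-a.secret-key": "secret-a",
    "polaris.storage.aws.minio-b.access-key": "AKIA-EXAMPLE-B",
    "polaris.storage.aws.minio-b.secret-key": "secret-b",
    "some.other.property": "ignored",
}

print(sorted(named_storage_credentials(flat_config)))  # ['minio-a', 'minio-b']
```
>>> > > > > > >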
>>> > > > > > > I think that can go hand in hand with table-level overrides.
>>> > > > Overriding
>>> > > > > > > each and every aws property for every table doesn't sound
>>> ideal.
>>> > > > > > Defining a
>>> > > > > > > storage configuration upfront and referring to it by name
>>> should
>>> > be a
>>> > > > > > > simpler solution. I can extend the scope of the PR above to
>>> allow
>>> > > > > > > predefining other aws properties as well like endpoint-url
>>> and
>>> > > > region.
>>> > > > > > >
>>> > > > > > > Another point that came up in the discussion surrounding
>>> extra
>>> > > > > > credentials
>>> > > > > > > is how to make sure no one can just hijack preconfigured
>>> > > > > > > credentials.
>>> > > > > > > The simplest solution I see there is to ship off properties
>>> to
>>> > OPA
>>> > > > > during
>>> > > > > > > catalog (and table) creation and allow users to write
>>> policies
>>> > based
>>> > > > on
>>> > > > > > > them. If we want to enable internal RBAC to have a similar
>>> > capability
>>> > > > > we
>>> > > > > > > can go further and move from config based storage definition
>>> to a
>>> > > > > > separate
>>> > > > > > > `/storage-config` rest resource in management API that will
>>> come
>>> > with
>>> > > > > > > necessary grants and permissions.
>>> > > > > > >
>>> > > > > > > On Thu, Feb 5, 2026 at 5:43 AM Dmitri Bourlatchkov <
>>> > [email protected]
>>> > > > >
>>> > > > > > > wrote:
>>> > > > > > >
>>> > > > > > > > Hi Srinivas,
>>> > > > > > > >
>>> > > > > > > > Thanks for the proposal. It looks good to me overall, a
>>> very
>>> > timely
>>> > > > > > > feature
>>> > > > > > > > to add to Polaris.
>>> > > > > > > >
>>> > > > > > > > I added some comments in the doc and I see this topic on
>>> the
>>> > > > > Community
>>> > > > > > > Sync
>>> > > > > > > > agenda for Feb 5. Looking forward to discussing it online.
>>> > > > > > > >
>>> > > > > > > > I have three points to highlight:
>>> > > > > > > >
>>> > > > > > > > * Dealing with passwords probably connects to the Secrets
>>> > Manager
>>> > > > > > > > discussion [1]
>>> > > > > > > >
>>> > > > > > > > * Persistence needs to consider non-RDBMS backends. OSS
>>> code
>>> > has
>>> > > > both
>>> > > > > > > > PostgreSQL and MongoDB, but private Persistence
>>> > implementations are
>>> > > > > > > > possible too. I believe we need a proper SPI for this, not
>>> > just a
>>> > > > > > > > relational schema example.
>>> > > > > > > >
>>> > > > > > > > * Associating entities (tables, namespaces) to Storage
>>> > > > Configuration
>>> > > > > is
>>> > > > > > > > likely a plugin point that downstream projects may want to
>>> > > > customize.
>>> > > > > > I'd
>>> > > > > > > > propose making another SPI for this. This SPI is probably
>>> > different
>>> > > > > > from
>>> > > > > > > > the new Persistence SPI mentioned above since the concern
>>> here
>>> > is
>>> > > > not
>>> > > > > > > > persistence per se, but the logic of finding the right
>>> storage
>>> > > > > config.
>>> > > > > > > >
>>> > > > > > > > [1]
>>> > > > https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
>>> > > > > > > >
>>> > > > > > > > Cheers,
>>> > > > > > > > Dmitri.
>>> > > > > > > >
>>> > > > > > > > On Mon, Feb 2, 2026 at 4:18 PM Srinivas Rishindra <
>>> > > > > > > [email protected]>
>>> > > > > > > > wrote:
>>> > > > > > > >
>>> > > > > > > > > Hi all,
>>> > > > > > > > >
>>> > > > > > > > > We had an opportunity to discuss the community sprint
>>> last
>>> > week.
>>> > > > > > Based
>>> > > > > > > on
>>> > > > > > > > > that discussion, I have created a new design doc which I
>>> am
>>> > > > > attaching
>>> > > > > > > > here.
>>> > > > > > > > > In this design instead of passing credentials via table
>>> > > > properties,
>>> > > > > > > this
>>> > > > > > > > > design introduces Inheritable Storage Configurations as a
>>> > > > > first-class
>>> > > > > > > > > feature. Please let me know your thoughts on the
>>> document.
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> >
>>> https://docs.google.com/document/d/1hbDkE-w84Pn_112iW2vCnlDKPDtyg8flaYcFGjvD120/edit?usp=sharing
>>> > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > > > On Mon, Jan 26, 2026 at 10:42 PM Yufei Gu <
>>> > [email protected]>
>>> > > > > > > wrote:
>>> > > > > > > > >
>>> > > > > > > > > > Hi Srinivas,
>>> > > > > > > > > >
>>> > > > > > > > > > Thanks for sharing this proposal. Persisting long lived
>>> > > > > credentials
>>> > > > > > > > such
>>> > > > > > > > > as
>>> > > > > > > > > > an S3 secret access key directly in table properties
>>> raises
>>> > > > > > > significant
>>> > > > > > > > > > security concerns. Here is an alternative approach
>>> > previously
>>> > > > > > > > discussed,
>>> > > > > > > > > > which enables storage configuration at the table or
>>> > namespace
>>> > > > > > level,
>>> > > > > > > > and
>>> > > > > > > > > it
>>> > > > > > > > > > is probably a more secure and promising direction
>>> overall.
>>> > > > > > > > > >
>>> > > > > > > > > > Yufei
>>> > > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > > > On Mon, Jan 26, 2026 at 8:18 PM Srinivas Rishindra <
>>> > > > > > > > > [email protected]
>>> > > > > > > > > > >
>>> > > > > > > > > > wrote:
>>> > > > > > > > > >
>>> > > > > > > > > > > Dear All,
>>> > > > > > > > > > >
>>> > > > > > > > > > > I have developed a design proposal for Table-Level
>>> > Storage
>>> > > > > > > Credential
>>> > > > > > > > > > > Overrides in Apache Polaris.
>>> > > > > > > > > > >
>>> > > > > > > > > > > The core objective is to allow specific storage
>>> > properties to
>>> > > > > be
>>> > > > > > > > > defined
>>> > > > > > > > > > at
>>> > > > > > > > > > > the table level rather than the catalog level,
>>> enabling a
>>> > > > > single
>>> > > > > > > > > logical
>>> > > > > > > > > > > catalog to support tables across disparate storage
>>> > systems.
>>> > > > > > > > Crucially,
>>> > > > > > > > > > the
>>> > > > > > > > > > > implementation ensures these overrides participate
>>> in the
>>> > > > > > > credential
>>> > > > > > > > > > > vending process to maintain secure, scoped access.
>>> > > > > > > > > > >
>>> > > > > > > > > > > I have also implemented a Proof of Concept (POC) pull
>>> > request
>>> > > > > to
>>> > > > > > > > > > > demonstrate the idea. While the current MVP focuses
>>> on
>>> > S3, I
>>> > > > > > intend
>>> > > > > > > > to
>>> > > > > > > > > > > expand scope to include Azure and GCS pending
>>> community
>>> > > > > feedback.
>>> > > > > > > > > > >
>>> > > > > > > > > > > I look forward to your thoughts and suggestions on
>>> this
>>> > > > > proposal.
>>> > > > > > > > > > >
>>> > > > > > > > > > > Links:
>>> > > > > > > > > > >
>>> > > > > > > > > > > - Design Doc: Table-Level Storage Credential
>>> Overrides (
>>> > > > > > > > > > >
>>> > > > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> >
>>> https://docs.google.com/document/d/1tf4N8GKeyAAYNoP0FQ1zT1Ba3P1nVGgdw3nmnhSm-u0/edit?usp=sharing
>>> > > > > > > > > > > )
>>> > > > > > > > > > > - POC PR:
>>> https://github.com/apache/polaris/pull/3563 (
>>> > > > > > > > > > > https://github.com/apache/polaris/pull/3563)
>>> > > > > > > > > > >
>>> > > > > > > > > > > Best regards,
>>> > > > > > > > > > >
>>> > > > > > > > > > > Srinivas Rishindra Pothireddi
>>> > > > > > > > > > >
>>> > > > > > > > > >
>>> > > > > > > > >
>>> > > > > > > >
>>> > > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> >
>>>
>>
