Hi Sung and Adnan,

Thank you for your comments.

*To Sung:* While I don't have concrete production workflows available to me at the moment, I can offer an illustrative use case to highlight the broader vision. The general idea is to make the catalog abstraction much more of a logical construct, rather than one that tightly couples to a physical storage configuration or an IAM policy. Currently, a catalog is restricted to a single cloud provider or IAM role, forcing users into infrastructure-driven boundaries.

Consider an organization with multiple departments like Sales, Marketing, and Engineering, where each gets its own catalog. Within the Sales catalog, data governance mandates that US data resides in AWS, European data in GCP, and Chinese data in Alibaba Cloud. Today, these differing storage configurations would force the admin to artificially create separate catalogs per region. By decoupling storage from the catalog level, a sales associate can interact with their accounts as a unified logical unit (e.g., a namespace per associate, tables per account), while the admin handles the underlying geographic storage complexity behind the scenes.

*To Adnan:* I understand your concerns regarding the implementation complexity of Option 1, particularly how it would impact APIs like CreateTable. I agree that starting with Option 2 is a pragmatic first step to make progress, and we can evaluate migrating to Option 1 in the future as user needs evolve.

I also reviewed PR #3409 <https://github.com/apache/polaris/pull/3409> and its corresponding issue, #2970 (Support Per-Catalog AWS Credentials in MinIO Deployments) <https://github.com/apache/polaris/issues/2970>. The discussion in that issue correctly highlighted the security risks of persisting raw secrets directly in the configuration object. By leveraging the approach from PR #3409, where named storage credentials are predefined in the server config and referenced by a storageName property, we can cleanly implement Option 2.
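
To make the idea a bit more concrete, here is a rough sketch of how a storageName reference could be resolved. It is written in Python purely for illustration and is not taken from Polaris or PR #3409: the Entity class, the NAMED_STORAGE registry, and both function names are hypothetical. The assumption is that each table or namespace optionally carries a storageName, that unset values fall back to the nearest ancestor, and that the name is then looked up in a server-side registry so no secret is stored on the entity itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """Hypothetical catalog/namespace/table node; not a Polaris class."""
    name: str
    parent: Optional["Entity"] = None
    storage_name: Optional[str] = None  # a reference only, never a secret


# Hypothetical stand-in for named storage entries predefined in server config.
# The values are placeholder labels, not real credentials.
NAMED_STORAGE = {
    "sales-us": "aws-us-credentials",
    "sales-eu": "gcp-eu-credentials",
}


def effective_storage_name(entity: Entity) -> Optional[str]:
    """Walk table -> namespace -> catalog and return the first storageName set."""
    node = entity
    while node is not None:
        if node.storage_name is not None:
            return node.storage_name
        node = node.parent
    return None


def resolve_storage(entity: Entity) -> str:
    """Look up the effective storageName in the server-side registry."""
    name = effective_storage_name(entity)
    if name is None:
        raise LookupError(f"no storageName set on {entity.name} or its ancestors")
    if name not in NAMED_STORAGE:
        raise LookupError(f"storageName {name!r} is not defined in server config")
    return NAMED_STORAGE[name]


# Sales catalog defaults to the US storage; one account overrides it.
catalog = Entity("sales", storage_name="sales-us")
associate = Entity("associate_a", parent=catalog)
us_account = Entity("account_1", parent=associate)
eu_account = Entity("account_2", parent=associate, storage_name="sales-eu")
```

In this sketch, resolve_storage(us_account) falls back to the catalog's entry while resolve_storage(eu_account) uses the table-level override. Rotating a credential would only touch the registry entry; the entities keep pointing at the same name.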
Embedding just the storageName reference at the table or namespace level elegantly resolves the primary drawbacks I initially listed for Option 2: it prevents duplicating sensitive credentials, allows admins to rotate credentials centrally, and offers reusability without requiring a new top-level entity.

Unless there are any objections, I will work on implementing Option 2 and publish a PR. Please let me know if this sounds like a reasonable path forward.

Best regards,
Srinivas

On Fri, Feb 20, 2026 at 3:22 AM Adnan Hemani via dev <[email protected]> wrote:

> Hi all,
>
> Sorry for the late reply. I still have some concerns about Option 1's implementation details, which IMO may render it unusable or functionally handicapped - my comments are on the original design document. If we choose Option 1 in the future, I think we will eventually need further scoping or discussion on how APIs like CreateTable will work.
>
> Could we potentially implement Option 2 in the short-term using the approach in #3409 <https://github.com/apache/polaris/pull/3409>? Maybe that will help us keep more of the storage configs in alignment with each other (resolving the con about re-usability and solving some of the credential rotation concerns as well).
>
> Best,
> Adnan Hemani
>
> On Thu, Feb 19, 2026 at 8:58 AM Sung Yun <[email protected]> wrote:
>
> > Hi Srinivas,
> >
> > Thanks for the recap.
> >
> > I generally agree that Option 1 is the most semantically sound long term approach, assuming credentials themselves live in a secrets manager and the storage configuration only holds references. That feels like the most extensible direction as Polaris evolves.
> >
> > I also agree with Dmitri that there are really two different concerns here. One is how storage configuration is modeled and persisted in Polaris as an Entity. The other is how the effective configuration is resolved for a given table across catalog, namespace, and table boundaries.
> > Those do not have to be solved by the same abstraction.
> >
> > From that perspective, Option 4 is appealing from an implementation standpoint, but I share the concern about semantic confusion. Reusing the resolution and inheritance logic that Policy already has makes sense, but using the Policy entity itself to represent storage connectivity feels unintuitive and potentially confusing for future users and developers.
> >
> > Option 1 is IMHO probably the most correct model, but it also requires the most upfront investment. Building on Yufei’s point, it would really help to ground this in concrete user workflows. I think seeking answers to how common storage configuration reuse is across many tables, and how they are typically managed (at the namespace level, or at table level) would help us decide whether to invest in Option 1 now or phase toward it over time.
> >
> > Cheers,
> > Sung
> >
> > On 2026/02/17 23:44:23 Srinivas Rishindra wrote:
> > > I agree with YuFei. Until we identify more concrete use cases, the *inline model* seems to be the best starting point. It is particularly well-suited for sparse configurations, where only a few tables in a namespace require overrides while the rest remain unchanged.
> > >
> > > *Next Steps:* Unless there are any objections, I will update the design doc to reflect this approach. Once approved, I will proceed with implementation.
> > >
> > > On Wed, Feb 11, 2026 at 3:49 PM Yufei Gu <[email protected]> wrote:
> > >
> > > > I’d suggest we start from concrete use cases.
> > > >
> > > > If the inline model(Option 2) works well for the primary scenarios, e.g., relatively sparse table level storage overrides, we could adopt it as a first phase. It keeps the implementation simple and lets us validate real world needs before introducing additional abstractions.
> > > >
> > > > However, if we anticipate frequent configuration rotation or strong reuse requirements across many tables, Option 1 is more compelling. In that case, I'd recommend reusing the existing policy framework where possible, since it already provides inheritance and attachment semantics. That could help us avoid introducing significant new complexity into Polaris while still supporting the richer model.
> > > > Yufei
> > > >
> > > > On Wed, Feb 11, 2026 at 9:12 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > > >
> > > > > Hi Srinivas,
> > > > >
> > > > > Thanks for the discussion recap! It's very useful to keep the dev thread and meetings aligned.
> > > > >
> > > > > Option 1:
> > > > > Credential Rotation: Highly efficient. Because the configuration is referenced by ID, rotating a cloud IAM role or secret requires updating only the single StorageConfiguration entity. [...]
> > > > >
> > > > > This seems to imply that credentials are stored as part of the Storage Configuration Entity. If so, I do not think this approach is ideal. I believe the secret data should ideally be accessed via the Secrets Manager [1]. While that discussion is still in progress, I believe it interconnects with this proposal.
> > > > >
> > > > > [...] All thousands of downstream tables referencing it would immediately use the new credentials without metadata updates.
> > > > >
> > > > > Immediacy is probably from the end-user's perspective. Internally, different Polaris processes may switch to the updated config at different moments in time...
> > > > > I do not think it is a problem in this case, just wanted to highlight it to make sure distributed system aspects are not left out :)
> > > > >
> > > > > Option 2:
> > > > > Credential Rotation: Credential rotation is difficult [...]
> > > > >
> > > > > Again, I believe actual credentials should be accessed via the Secrets Manager [1] so some indirection will be present.
> > > > >
> > > > > Config updates will need to happen individually in each case, but actual secrets could be shared and updated centrally via the Secrets Manager.
> > > > >
> > > > > ATM, given the complexity points about option 1 that were brought up in the community sync, I tend to favour this option for implementing this proposal. However, this is not a strong requirement by any means, just my personal opinion. Other opinions are welcome.
> > > > >
> > > > > Depending on how secret references are handled in code (needs a POC, I guess), there could be some synergy with Tornike's approach from [3699].
> > > > >
> > > > > Option 3: Named Catalog-Level Configurations (Hybrid) [...]
> > > > >
> > > > > I would like to clarify the UX story in this case. Do we expect end users to manage Storage Configuration in this case or the Polaris owner?
> > > > >
> > > > > In the latter case, it seems similar to Tornike's proposal in [3699] but generalized to all storage types. The Polaris Admin / Owner could use a non-public API to work with this configuration (e.g. plain Quarkus configuration or possibly Admin CLI).
> > > > >
> > > > > Option 4: Leverage Existing Policy Framework [...]
> > > > >
> > > > > I tend to agree with the "semantic confusion" point.
> > > > >
> > > > > It should be fine to reuse policy-related code in the implementation (if possible), but I believe Storage Configuration and related credential management form a distinct use case / feature and deserve dedicated handling in Polaris and the API / UX level.
> > > > >
> > > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > > >
> > > > > [3699] https://github.com/apache/polaris/pull/3699
> > > > >
> > > > > Thanks,
> > > > > Dmitri.
> > > > >
> > > > > On Tue, Feb 10, 2026 at 10:19 PM Srinivas Rishindra <[email protected]> wrote:
> > > > >
> > > > > > Hi Everyone,
> > > > > >
> > > > > > We had an opportunity to discuss this feature and my recent proposal at the last community sync meeting. I would like to summarize our discussion and enumerate the various options we considered to help us reach a consensus.
> > > > > >
> > > > > > To recap, storage configuration is currently restricted at the catalog level. This limits flexibility for users who need to organize tables across different storage configurations or cloud providers within a single catalog. There appears to be general agreement on the utility of this feature; however, we still need to align on the specific implementation approach.
> > > > > >
> > > > > > Here are the various options that were considered.
> > > > > > *Option 0: Make Credentials available as part of table properties. *(This was my original proposal, but abandoned after becoming aware of the security implications.)
> > > > > >
> > > > > > *Option 1: First-Class Storage Configuration Entity *
> > > > > >
> > > > > > This approach proposes elevating StorageConfiguration to a standalone, top-level resource in the Polaris backend (similar to a Principal, Namespace or Table), independent of the Catalog or Table. This is the approach in my most recent proposal doc.
> > > > > >
> > > > > > - Data Model: A new StorageConfiguration entity is created with its own unique identifier and lifecycle. Tables and Namespaces would store a reference ID pointing to this entity rather than embedding the credentials directly.
> > > > > > - Security: This model offers the cleanest security boundary. We can introduce a specific USAGE privilege on the configuration entity. A user would need both CREATE_TABLE on the Namespace *and* USAGE on the specific StorageConfiguration to link them.
> > > > > > - Credential Rotation: Highly efficient. Because the configuration is referenced by ID, rotating a cloud IAM role or secret requires updating only the single StorageConfiguration entity. All thousands of downstream tables referencing it would immediately use the new credentials without metadata updates.
> > > > > > - Inheritance: The reference could be set at the Catalog, Namespace, or Table level. If a Table does not specify a reference, it would inherit the reference from its parent Namespace (and so on), preserving the current hierarchical behavior while adding granularity.
> > > > > >
> > > > > > • Pros: Maximum flexibility and reusability (Many-to-Many). Updating one config object propagates to all associated tables.
> > > > > >
> > > > > > • Cons: Highest engineering cost. Requires new CRUD APIs, DB schema changes (mapping tables), and complex authorization logic (two-stage auth checks). Risk of accumulating "orphaned" configs
> > > > > >
> > > > > > Option 2: The "Embedded Field" Model
> > > > > >
> > > > > > This approach extends the existing Table and Namespace entities to include a storageConfig field. The parameter can be defaulted to 'null' and use parent's storageConfig at runtime.
> > > > > >
> > > > > > *Data Model:* No new top-level entity is created. The storage details (e.g., roleArn) are stored directly into a new, dedicated column or structure within the existing Table/Namespace entity.
> > > > > >
> > > > > > Complexity: This could reduce the engineering overhead significantly. There are no new CRUD endpoints for configuration objects, no referential integrity checks (e.g., preventing the deletion of a config used by active tables).
> > > > > >
> > > > > > Credential Rotation: Credential rotation is difficult. If an IAM role changes, an administrator must identify and issue UPDATE operations for every individual table or namespace that uses that specific configuration, potentially affecting thousands of objects.
> > > > > >
> > > > > > • Pros: Lowest engineering cost. No new entities or complex mappings are required. Easy to reason about authorization (auth is tied strictly to the entity).
> > > > > >
> > > > > > • Cons: No reusability. Configs must be duplicated across tables; rotating credentials for 1,000 tables could require 1,000 update calls.
> > > > > >
> > > > > > Option 3: Named Catalog-Level Configurations (Hybrid)
> > > > > >
> > > > > > This can be a combination of Option1 and Option 2
> > > > > > Admin can define a registry of "Named Storage Configurations" stored within the Catalog. Sub-entities (Namespaces/Tables) reference these configs by name (e.g., storage-config: "finance-secure-role").
> > > > > >
> > > > > > *Data Model:* No separate top level entity is created. The Catalog Entity potentially needs to be modified to accommodate named storage configurations.
> > > > > >
> > > > > > Credential Rotation: Credential Rotation can be done at the catalog level for each named Storage Configuration.
> > > > > >
> > > > > > Inheritance: Works pretty much similar as proposed in option 1 & option2.
> > > > > >
> > > > > > Security: Not as secure as option1 but still useful. A principal with proper access can attach any named storage configuration defined at the catalog level to any arbitrary entity within the catalog.
> > > > > >
> > > > > > • Pros: Good balance of reusability and simplicity. Allows updating a config in one place (the Catalog definition) without needing a full-blown global entity system.
> > > > > >
> > > > > > • Cons: Scope is limited to the Catalog (cannot share configs across catalogs)
> > > > > >
> > > > > > Option 4: Leverage Existing Policy Framework
> > > > > >
> > > > > > This approach leverages the existing Apache Polaris Policy Framework (currently used for features like snapshot expiry) to manage storage settings.
> > > > > >
> > > > > > Data Model: Storage configurations are defined as "Policies" at the Catalog level.
> > > > > > These Policies contain the credential details and can be attached to Namespaces or Tables using the existing policy attachment APIs.
> > > > > >
> > > > > > Inheritance: This aligns naturally with Polaris's existing architecture, where policies cascade from Catalog → Namespace → Table. The vending logic would simply resolve the "effective" storage policy for a table at query time.
> > > > > >
> > > > > > Security: This utilizes the existing Polaris Privileges and attachment privileges. Administrators can define authorized storage policies centrally, and users can only select from these pre-approved policies, preventing them from inputting arbitrary or insecure role ARNs.
> > > > > >
> > > > > > • Pros:
> > > > > >   . Zero New Infrastructure: Reuses the existing "Policy" entity, persistence layer, and inheritance logic, significantly reducing engineering effort
> > > > > >   . Proven Inheritance: The logic for resolving policies from child to parent is already implemented and tested
> > > > > >
> > > > > > • Cons:
> > > > > >   . Semantic Confusion: Policies are typically used for "governance rules" (e.g., snapshot expiry, compaction) rather than "connectivity configuration." Using them for credentials might be unintuitive
> > > > > >   . Authorization Complexity: The authorizer would need to load and evaluate policies to determine how to access data, potentially coupling governance logic with data access paths
> > > > > >
> > > > > > We can potentially start with one of the options initially and as the feature and user needs develop we can migrate to other options as well.
> > > > > > Please let me know your thoughts about the various options above or if on anything that I might have missed so that we can work towards a consensus on how to implement this feature.
> > > > > >
> > > > > > On Thu, Feb 5, 2026 at 8:08 AM Tornike Gurgenidze <[email protected]> wrote:
> > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > To follow up on Dmitri's point about credentials, there's already a PR <https://github.com/apache/polaris/pull/3409> up that is going to allow predefining named storage credentials in polaris config like the following:
> > > > > > >
> > > > > > > - polaris.storage.aws.<storage-name>.access-key
> > > > > > > - polaris.storage.aws.<storage-name>.secret-key
> > > > > > >
> > > > > > > then storage configuration will simply refer to it by name and inherit credentials.
> > > > > > >
> > > > > > > I think that can go hand in hand with table-level overrides. Overriding each and every aws property for every table doesn't sound ideal. Defining a storage configuration upfront and referring to it by name should be a simpler solution. I can extend the scope of the PR above to allow predefining other aws properties as well like endpoint-url and region.
> > > > > > >
> > > > > > > Another point that came up in the discussion surrounding extra credentials is how to make sure anyone can't just hijack pre configured credentials. The simplest solution I see there is to ship off properties to OPA during catalog (and table) creation and allow users to write policies based on them.
> > > > > > > If we want to enable internal rbac to have a similar capability we can go further and move from config based storage definition to a separate `/storage-config` rest resource in management API that will come with necessary grants and permissions.
> > > > > > >
> > > > > > > On Thu, Feb 5, 2026 at 5:43 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi Srinivas,
> > > > > > > >
> > > > > > > > Thanks for the proposal. It looks good to me overall, a very timely feature to add to Polaris.
> > > > > > > >
> > > > > > > > I added some comments in the doc and I see this topic on the Community Sync agenda for Feb 5. Looking forward to discussing it online.
> > > > > > > >
> > > > > > > > I have three points to highlight:
> > > > > > > >
> > > > > > > > * Dealing with passwords probably connects to the Secrets Manager discussion [1]
> > > > > > > >
> > > > > > > > * Persistence needs to consider non-RDBMS backends. OSS code has both PostgreSQL and MongoDB, but private Persistence implementations are possible too. I believe we need a proper SPI for this, not just a relational schema example.
> > > > > > > >
> > > > > > > > * Associating entities (tables, namespaces) to Storage Configuration is likely a plugin point that downstream projects may want to customize. I'd propose making another SPI for this. This SPI is probably different from the new Persistence SPI mentioned above since the concern here is not persistence per se, but the logic of finding the right storage config.
> > > > > > > >
> > > > > > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > > > On Mon, Feb 2, 2026 at 4:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > We had an opportunity to discuss the community sprint last week. Based on that discussion, I have created a new design doc which I am attaching here. In this design instead of passing credentials via table properties, this design introduces Inheritable Storage Configurations as a first-class feature. Please let me know your thoughts on the document.
> > > > > > > > >
> > > > > > > > > https://docs.google.com/document/d/1hbDkE-w84Pn_112iW2vCnlDKPDtyg8flaYcFGjvD120/edit?usp=sharing
> > > > > > > > >
> > > > > > > > > On Mon, Jan 26, 2026 at 10:42 PM Yufei Gu <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Srinivas,
> > > > > > > > > >
> > > > > > > > > > Thanks for sharing this proposal. Persisting long lived credentials such as an S3 secret access key directly in table properties raises significant security concerns. Here is an alternative approach previously discussed, which enables storage configuration at the table or namespace level, and it is probably a more secure and promising direction overall.
> > > > > > > > > >
> > > > > > > > > > Yufei
> > > > > > > > > >
> > > > > > > > > > On Mon, Jan 26, 2026 at 8:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > > Dear All,
> > > > > > > > > > >
> > > > > > > > > > > I have developed a design proposal for Table-Level Storage Credential Overrides in Apache Polaris.
> > > > > > > > > > >
> > > > > > > > > > > The core objective is to allow specific storage properties to be defined at the table level rather than the catalog level, enabling a single logical catalog to support tables across disparate storage systems. Crucially, the implementation ensures these overrides participate in the credential vending process to maintain secure, scoped access.
> > > > > > > > > > >
> > > > > > > > > > > I have also implemented a Proof of Concept (POC) pull request to demonstrate the idea. While the current MVP focuses on S3, I intend to expand scope to include Azure and GCS pending community feedback.
> > > > > > > > > > >
> > > > > > > > > > > I look forward to your thoughts and suggestions on this proposal.
> > > > > > > > > > >
> > > > > > > > > > > Links:
> > > > > > > > > > >
> > > > > > > > > > > - Design Doc: Table-Level Storage Credential Overrides (https://docs.google.com/document/d/1tf4N8GKeyAAYNoP0FQ1zT1Ba3P1nVGgdw3nmnhSm-u0/edit?usp=sharing)
> > > > > > > > > > > - POC PR: https://github.com/apache/polaris/pull/3563
> > > > > > > > > > >
> > > > > > > > > > > Best regards,
> > > > > > > > > > >
> > > > > > > > > > > Srinivas Rishindra Pothireddi
