Hi all,

Sorry for the late reply. I still have some concerns about Option 1's implementation details, which IMO may render it unusable or functionally handicapped; my comments are on the original design document. If we choose Option 1 in the future, I think we will eventually need further scoping or discussion on how APIs like CreateTable will work.
Could we potentially implement Option 2 in the short term using the approach in #3409 <https://github.com/apache/polaris/pull/3409>? Maybe that will help us keep more of the storage configs in alignment with each other (resolving the con about re-usability and solving some of the credential rotation concerns as well).

Best,
Adnan Hemani

On Thu, Feb 19, 2026 at 8:58 AM Sung Yun <[email protected]> wrote:

> Hi Srinivas,
>
> Thanks for the recap.
>
> I generally agree that Option 1 is the most semantically sound long term approach, assuming credentials themselves live in a secrets manager and the storage configuration only holds references. That feels like the most extensible direction as Polaris evolves.
>
> I also agree with Dmitri that there are really two different concerns here. One is how storage configuration is modeled and persisted in Polaris as an Entity. The other is how the effective configuration is resolved for a given table across catalog, namespace, and table boundaries. Those do not have to be solved by the same abstraction.
>
> From that perspective, Option 4 is appealing from an implementation standpoint, but I share the concern about semantic confusion. Reusing the resolution and inheritance logic that Policy already has makes sense, but using the Policy entity itself to represent storage connectivity feels unintuitive and potentially confusing for future users and developers.
>
> Option 1 is IMHO probably the most correct model, but it also requires the most upfront investment. Building on Yufei’s point, it would really help to ground this in concrete user workflows. I think seeking answers to how common storage configuration reuse is across many tables, and how they are typically managed (at the namespace level, or at table level) would help us decide whether to invest in Option 1 now or phase toward it over time.
>
> Cheers,
> Sung
>
> On 2026/02/17 23:44:23 Srinivas Rishindra wrote:
> > I agree with YuFei.
> > Until we identify more concrete use cases, the *inline model* seems to be the best starting point. It is particularly well-suited for sparse configurations, where only a few tables in a namespace require overrides while the rest remain unchanged.
> >
> > *Next Steps:* Unless there are any objections, I will update the design doc to reflect this approach. Once approved, I will proceed with implementation.
> >
> > On Wed, Feb 11, 2026 at 3:49 PM Yufei Gu <[email protected]> wrote:
> >
> > > I’d suggest we start from concrete use cases.
> > >
> > > If the inline model (Option 2) works well for the primary scenarios, e.g., relatively sparse table level storage overrides, we could adopt it as a first phase. It keeps the implementation simple and lets us validate real world needs before introducing additional abstractions.
> > >
> > > However, if we anticipate frequent configuration rotation or strong reuse requirements across many tables, Option 1 is more compelling. In that case, I'd recommend reusing the existing policy framework where possible, since it already provides inheritance and attachment semantics. That could help us avoid introducing significant new complexity into Polaris while still supporting the richer model.
> > >
> > > Yufei
> > >
> > > On Wed, Feb 11, 2026 at 9:12 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > >
> > > > Hi Srinivas,
> > > >
> > > > Thanks for the discussion recap! It's very useful to keep the dev thread and meetings aligned.
> > > >
> > > > Option 1:
> > > > Credential Rotation: Highly efficient. Because the configuration is referenced by ID, rotating a cloud IAM role or secret requires updating only the single StorageConfiguration entity. [...]
> > > >
> > > > This seems to imply that credentials are stored as part of the Storage Configuration Entity.
> > > > If so, I do not think this approach is ideal. I believe the secret data should ideally be accessed via the Secrets Manager [1]. While that discussion is still in progress, I believe it interconnects with this proposal.
> > > >
> > > > [...] All thousands of downstream tables referencing it would immediately use the new credentials without metadata updates.
> > > >
> > > > Immediacy is probably from the end-user's perspective. Internally, different Polaris processes may switch to the updated config at different moments in time... I do not think it is a problem in this case, just wanted to highlight it to make sure distributed system aspects are not left out :)
> > > >
> > > > Option 2:
> > > > Credential Rotation: Credential rotation is difficult [...]
> > > >
> > > > Again, I believe actual credentials should be accessed via the Secrets Manager [1] so some indirection will be present.
> > > >
> > > > Config updates will need to happen individually in each case, but actual secrets could be shared and updated centrally via the Secrets Manager.
> > > >
> > > > ATM, given the complexity points about option 1 that were brought up in the community sync, I tend to favour this option for implementing this proposal. However, this is not a strong requirement by any means, just my personal opinion. Other opinions are welcome.
> > > >
> > > > Depending on how secret references are handled in code (needs a POC, I guess), there could be some synergy with Tornike's approach from [3699].
> > > >
> > > > Option 3: Named Catalog-Level Configurations (Hybrid) [...]
> > > >
> > > > I would like to clarify the UX story in this case. Do we expect end users to manage Storage Configuration in this case or the Polaris owner?
> > > >
> > > > In the latter case, it seems similar to Tornike's proposal in [3699] but generalized to all storage types. The Polaris Admin / Owner could use a non-public API to work with this configuration (e.g. plain Quarkus configuration or possibly Admin CLI).
> > > >
> > > > Option 4: Leverage Existing Policy Framework [...]
> > > >
> > > > I tend to agree with the "semantic confusion" point.
> > > >
> > > > It should be fine to reuse policy-related code in the implementation (if possible), but I believe Storage Configuration and related credential management form a distinct use case / feature and deserve dedicated handling in Polaris and the API / UX level.
> > > >
> > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > >
> > > > [3699] https://github.com/apache/polaris/pull/3699
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Tue, Feb 10, 2026 at 10:19 PM Srinivas Rishindra <[email protected]> wrote:
> > > >
> > > > > Hi Everyone,
> > > > >
> > > > > We had an opportunity to discuss this feature and my recent proposal at the last community sync meeting. I would like to summarize our discussion and enumerate the various options we considered to help us reach a consensus.
> > > > >
> > > > > To recap, storage configuration is currently restricted at the catalog level. This limits flexibility for users who need to organize tables across different storage configurations or cloud providers within a single catalog. There appears to be general agreement on the utility of this feature; however, we still need to align on the specific implementation approach.
> > > > >
> > > > > Here are the various options that were considered.
> > > > >
> > > > > *Option 0: Make Credentials available as part of table properties.
> > > > > *(This was my original proposal, but abandoned after becoming aware of the security implications.)
> > > > >
> > > > > *Option 1: First-Class Storage Configuration Entity*
> > > > >
> > > > > This approach proposes elevating StorageConfiguration to a standalone, top-level resource in the Polaris backend (similar to a Principal, Namespace or Table), independent of the Catalog or Table. This is the approach in my most recent proposal doc.
> > > > >
> > > > > - Data Model: A new StorageConfiguration entity is created with its own unique identifier and lifecycle. Tables and Namespaces would store a reference ID pointing to this entity rather than embedding the credentials directly.
> > > > > - Security: This model offers the cleanest security boundary. We can introduce a specific USAGE privilege on the configuration entity. A user would need both CREATE_TABLE on the Namespace *and* USAGE on the specific StorageConfiguration to link them.
> > > > > - Credential Rotation: Highly efficient. Because the configuration is referenced by ID, rotating a cloud IAM role or secret requires updating only the single StorageConfiguration entity. All thousands of downstream tables referencing it would immediately use the new credentials without metadata updates.
> > > > > - Inheritance: The reference could be set at the Catalog, Namespace, or Table level. If a Table does not specify a reference, it would inherit the reference from its parent Namespace (and so on), preserving the current hierarchical behavior while adding granularity.
> > > > >
> > > > > • Pros: Maximum flexibility and reusability (Many-to-Many).
> > > > > Updating one config object propagates to all associated tables.
> > > > >
> > > > > • Cons: Highest engineering cost. Requires new CRUD APIs, DB schema changes (mapping tables), and complex authorization logic (two-stage auth checks). Risk of accumulating "orphaned" configs.
> > > > >
> > > > > Option 2: The "Embedded Field" Model
> > > > >
> > > > > This approach extends the existing Table and Namespace entities to include a storageConfig field. The field can default to null, in which case the parent's storageConfig is used at runtime.
> > > > >
> > > > > *Data Model:* No new top-level entity is created. The storage details (e.g., roleArn) are stored directly in a new, dedicated column or structure within the existing Table/Namespace entity.
> > > > >
> > > > > Complexity: This could reduce the engineering overhead significantly. There are no new CRUD endpoints for configuration objects and no referential integrity checks (e.g., preventing the deletion of a config used by active tables).
> > > > >
> > > > > Credential Rotation: Credential rotation is difficult. If an IAM role changes, an administrator must identify and issue UPDATE operations for every individual table or namespace that uses that specific configuration, potentially affecting thousands of objects.
> > > > >
> > > > > • Pros: Lowest engineering cost. No new entities or complex mappings are required. Easy to reason about authorization (auth is tied strictly to the entity).
> > > > >
> > > > > • Cons: No reusability. Configs must be duplicated across tables; rotating credentials for 1,000 tables could require 1,000 update calls.
> > > > >
> > > > > Option 3: Named Catalog-Level Configurations (Hybrid)
> > > > >
> > > > > This can be a combination of Option 1 and Option 2. An admin can define a registry of "Named Storage Configurations" stored within the Catalog. Sub-entities (Namespaces/Tables) reference these configs by name (e.g., storage-config: "finance-secure-role").
> > > > >
> > > > > *Data Model:* No separate top-level entity is created. The Catalog Entity potentially needs to be modified to accommodate named storage configurations.
> > > > >
> > > > > Credential Rotation: Credential rotation can be done at the catalog level for each named Storage Configuration.
> > > > >
> > > > > Inheritance: Works much the same as proposed in Option 1 and Option 2.
> > > > >
> > > > > Security: Not as secure as Option 1 but still useful. A principal with proper access can attach any named storage configuration defined at the catalog level to any arbitrary entity within the catalog.
> > > > >
> > > > > • Pros: Good balance of reusability and simplicity. Allows updating a config in one place (the Catalog definition) without needing a full-blown global entity system.
> > > > >
> > > > > • Cons: Scope is limited to the Catalog (cannot share configs across catalogs).
> > > > >
> > > > > Option 4: Leverage Existing Policy Framework
> > > > >
> > > > > This approach leverages the existing Apache Polaris Policy Framework (currently used for features like snapshot expiry) to manage storage settings.
> > > > >
> > > > > Data Model: Storage configurations are defined as "Policies" at the Catalog level. These Policies contain the credential details and can be attached to Namespaces or Tables using the existing policy attachment APIs.
> > > > >
> > > > > Inheritance: This aligns naturally with Polaris's existing architecture, where policies cascade from Catalog → Namespace → Table. The vending logic would simply resolve the "effective" storage policy for a table at query time.
> > > > >
> > > > > Security: This utilizes the existing Polaris Privileges and attachment privileges. Administrators can define authorized storage policies centrally, and users can only select from these pre-approved policies, preventing them from inputting arbitrary or insecure role ARNs.
> > > > >
> > > > > • Pros:
> > > > >   - Zero New Infrastructure: Reuses the existing "Policy" entity, persistence layer, and inheritance logic, significantly reducing engineering effort.
> > > > >   - Proven Inheritance: The logic for resolving policies from child to parent is already implemented and tested.
> > > > >
> > > > > • Cons:
> > > > >   - Semantic Confusion: Policies are typically used for "governance rules" (e.g., snapshot expiry, compaction) rather than "connectivity configuration." Using them for credentials might be unintuitive.
> > > > >   - Authorization Complexity: The authorizer would need to load and evaluate policies to determine how to access data, potentially coupling governance logic with data access paths.
> > > > >
> > > > > We can potentially start with one of the options initially and, as the feature and user needs develop, migrate to other options as well. Please let me know your thoughts about the various options above, or on anything that I might have missed, so that we can work towards a consensus on how to implement this feature.
> > > > >
> > > > > On Thu, Feb 5, 2026 at 8:08 AM Tornike Gurgenidze <[email protected]> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > To follow up on Dmitri's point about credentials, there's already a PR <https://github.com/apache/polaris/pull/3409> up that is going to allow predefining named storage credentials in polaris config like the following:
> > > > > >
> > > > > > - polaris.storage.aws.<storage-name>.access-key
> > > > > > - polaris.storage.aws.<storage-name>.secret-key
> > > > > >
> > > > > > then storage configuration will simply refer to it by name and inherit credentials.
> > > > > >
> > > > > > I think that can go hand in hand with table-level overrides. Overriding each and every aws property for every table doesn't sound ideal. Defining a storage configuration upfront and referring to it by name should be a simpler solution. I can extend the scope of the PR above to allow predefining other aws properties as well like endpoint-url and region.
> > > > > >
> > > > > > Another point that came up in the discussion surrounding extra credentials is how to make sure anyone can't just hijack pre configured credentials. The simplest solution I see there is to ship off properties to OPA during catalog (and table) creation and allow users to write policies based on them. If we want to enable internal rbac to have a similar capability we can go further and move from config based storage definition to a separate `/storage-config` rest resource in management API that will come with necessary grants and permissions.
> > > > > >
> > > > > > On Thu, Feb 5, 2026 at 5:43 AM Dmitri Bourlatchkov <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Srinivas,
> > > > > > >
> > > > > > > Thanks for the proposal. It looks good to me overall, a very timely feature to add to Polaris.
> > > > > > >
> > > > > > > I added some comments in the doc and I see this topic on the Community Sync agenda for Feb 5. Looking forward to discussing it online.
> > > > > > >
> > > > > > > I have three points to highlight:
> > > > > > >
> > > > > > > * Dealing with passwords probably connects to the Secrets Manager discussion [1]
> > > > > > >
> > > > > > > * Persistence needs to consider non-RDBMS backends. OSS code has both PostgreSQL and MongoDB, but private Persistence implementations are possible too. I believe we need a proper SPI for this, not just a relational schema example.
> > > > > > >
> > > > > > > * Associating entities (tables, namespaces) to Storage Configuration is likely a plugin point that downstream projects may want to customize. I'd propose making another SPI for this. This SPI is probably different from the new Persistence SPI mentioned above since the concern here is not persistence per se, but the logic of finding the right storage config.
> > > > > > >
> > > > > > > [1] https://lists.apache.org/thread/68r3gcx70f0qhbtz3w4zhb8f9s4vvw1f
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Dmitri.
> > > > > > >
> > > > > > > On Mon, Feb 2, 2026 at 4:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > We had an opportunity to discuss the community sprint last week.
> > > > > > > > Based on that discussion, I have created a new design doc which I am attaching here. Instead of passing credentials via table properties, this design introduces Inheritable Storage Configurations as a first-class feature. Please let me know your thoughts on the document.
> > > > > > > >
> > > > > > > > https://docs.google.com/document/d/1hbDkE-w84Pn_112iW2vCnlDKPDtyg8flaYcFGjvD120/edit?usp=sharing
> > > > > > > >
> > > > > > > > On Mon, Jan 26, 2026 at 10:42 PM Yufei Gu <[email protected]> wrote:
> > > > > > > >
> > > > > > > > > Hi Srinivas,
> > > > > > > > >
> > > > > > > > > Thanks for sharing this proposal. Persisting long lived credentials such as an S3 secret access key directly in table properties raises significant security concerns. Here is an alternative approach previously discussed, which enables storage configuration at the table or namespace level, and it is probably a more secure and promising direction overall.
> > > > > > > > >
> > > > > > > > > Yufei
> > > > > > > > >
> > > > > > > > > On Mon, Jan 26, 2026 at 8:18 PM Srinivas Rishindra <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > > Dear All,
> > > > > > > > > >
> > > > > > > > > > I have developed a design proposal for Table-Level Storage Credential Overrides in Apache Polaris.
> > > > > > > > > >
> > > > > > > > > > The core objective is to allow specific storage properties to be defined at the table level rather than the catalog level, enabling a single logical catalog to support tables across disparate storage systems. Crucially, the implementation ensures these overrides participate in the credential vending process to maintain secure, scoped access.
> > > > > > > > > >
> > > > > > > > > > I have also implemented a Proof of Concept (POC) pull request to demonstrate the idea. While the current MVP focuses on S3, I intend to expand scope to include Azure and GCS pending community feedback.
> > > > > > > > > >
> > > > > > > > > > I look forward to your thoughts and suggestions on this proposal.
> > > > > > > > > >
> > > > > > > > > > Links:
> > > > > > > > > >
> > > > > > > > > > - Design Doc: Table-Level Storage Credential Overrides (https://docs.google.com/document/d/1tf4N8GKeyAAYNoP0FQ1zT1Ba3P1nVGgdw3nmnhSm-u0/edit?usp=sharing)
> > > > > > > > > > - POC PR: https://github.com/apache/polaris/pull/3563
> > > > > > > > > >
> > > > > > > > > > Best regards,
> > > > > > > > > >
> > > > > > > > > > Srinivas Rishindra Pothireddi
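
For anyone skimming the thread, here is a rough sketch of the inline model (Option 2) as described above: each table or namespace carries an optional storageConfig, and a null value falls back to the parent at runtime. It is only an illustration under stated assumptions. `Entity`, `resolve_storage_config`, `entities_to_update`, and the role ARNs are hypothetical names made up for this example; none of them are Polaris classes or APIs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entity:
    """Hypothetical stand-in for a catalog, namespace, or table (not a Polaris class)."""
    name: str
    parent: Optional["Entity"] = None
    # None means "no override here, inherit from the parent".
    storage_config: Optional[dict] = None


def resolve_storage_config(entity: Entity) -> dict:
    """Return the nearest non-null config, walking table -> namespace -> catalog."""
    current: Optional[Entity] = entity
    while current is not None:
        if current.storage_config is not None:
            return current.storage_config
        current = current.parent
    raise LookupError(f"no storage config found for {entity.name!r}")


def entities_to_update(entities: list, old_role_arn: str) -> list:
    """List entities that embed the given role inline; each one needs its own update."""
    return [
        e for e in entities
        if e.storage_config is not None
        and e.storage_config.get("roleArn") == old_role_arn
    ]


# Made-up hierarchy: one catalog default and one sparse table-level override.
catalog = Entity("catalog", storage_config={
    "roleArn": "arn:aws:iam::000000000000:role/catalog-default"})
finance = Entity("finance", parent=catalog)
orders = Entity("orders", parent=finance)
ledger = Entity("ledger", parent=finance, storage_config={
    "roleArn": "arn:aws:iam::000000000000:role/finance-secure"})

print(resolve_storage_config(orders)["roleArn"])  # inherited from the catalog
print(resolve_storage_config(ledger)["roleArn"])  # table-level override
```

The second helper makes the rotation trade-off concrete: with inline configs, every entity it returns has to be updated individually, whereas a by-reference or by-name model (Options 1 and 3, or the named credentials in the PR mentioned above) would need a single update.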
