Thanks for the discussion. I will open a separate refactor PR, following Dennis's sketch and applied to the persistences already on main. Concretely, the PR will:
1. Decouple BasePersistence — remove extends PolicyMappingPersistence and extends MetricsPersistence so all four (BasePersistence, PolicyMappingPersistence, MetricsPersistence, IntegrationPersistence) are disjoint top-level SPIs. 2. Update TransactionalPersistence — add extends PolicyMappingPersistence and extends MetricsPersistence so it keeps aggregating the same set of behaviors after step 1. 3. Update concrete impls — add PolicyMappingPersistence and MetricsPersistence to the implements list of JdbcBasePersistenceImpl and NonFunctionalBasePersistence / NoSqlMetaStore. Method bodies don't change, only the declared interfaces. 4. Add factory accessors on MetaStoreManagerFactory — getOrCreateBasePersistence, getOrCreatePolicyMappingPersistence, getOrCreateMetricsPersistence, getOrCreateIntegrationPersistence. Rename the existing getOrCreateSession() accordingly. 5. Mirror accessors on PolarisCallContext — getBasePersistence, getPolicyMappingPersistence, getMetricsPersistence, getIntegrationPersistence. Rename the existing getMetaStore() accordingly. 6. Remove the IntegrationPersistence blind cast — replace ((IntegrationPersistence) ms) in AtomicOperationMetaStoreManager with callCtx.getIntegrationPersistence(). Once that lands, I'll rebase my idempotency PR on top so IdempotencyPersistence follows the same pattern from the start (added as a fifth disjoint top-level SPI, with one new accessor on each of MetaStoreManagerFactory and PolarisCallContext, and JdbcBasePersistenceImpl getting one additional implements entry — no BasePersistence extends IdempotencyPersistence). I'll start a draft of the refactor PR and loop your folks in once it's open. Thanks, Huaxin On Wed, May 6, 2026 at 6:09 PM Dennis Huo <[email protected]> wrote: > TL;DR: I agree with needing to fix the interface decomposition and that it > shouldn't be too hard. I could go either way on whether it's done in the > existing PR though -- we probably don't want to introduce yet another > pattern if we might forget to refactor the existing MetricsPersistence and > PolicyMappingPersistence to match the same pattern. Re: coupling, I'm not > sure they're entirely "unrelated" but the coupling should be expressed > differently than making BasePersistence a big interface-aggregator anyways, > so I still agree with the refactor. > > Regarding the IdempotencyPersistence/BasePersistence hierarchy, though I > don't feel too strongly either way in the short term for expediency (since > internal layout consistency is generally good, and proper interface > decomposition could be done across all the "modular" *Persistence > interfaces at once in a consistent way), I do agree with Dmitri's point > that right now making the "BasePersistence" interface be an aggregator > interface that is directly used at callsites feels like we're sort of stuck > in a one-foot-in-the-door state in terms of interface decomposition. > > It seems the splitting of factory and accessor methods could potentially be > lightweight initially, since as I understand it, our first step would just > be to push down the multiple-interface-inheritance into the concrete impls > like "JdbcBasePersistenceImpl implements BasePersistence, > MetricsPersistence, PolicyMappingPersistence, IdempotencyPersistence, > IntegrationPersistence". > > At a glance, it looks like we'd need to split out some > "getOrCreateMetricsPersistence", "getOrCreateIdempotencyPersistence", etc., > in MetaStoreManagerFactory and probably also mirror that set of accessors > into PolarisCallContext alongside the current "BasePersistence > getMetaStore()". > > It would probably also be a good time to rename the "getOrCreateSession" > method to better capture what it's doing. And also rename > "PolarisCallContext.getMetaStore()". > > I remember this decomposition was the plan back when I split > BasePersistence vs IntegrationPersistence already -- ideally the > introduction of PolicyMappingPersistence and MetricsPersistence should've > followed the pattern of BasePersistence vs IntegrationPersistence already > (though we didn't quite finish pulling IntegrationPersistence into the > factory methods, instead just using a blind cast of > ((IntegrationPersistence) ms) in the relevant callsites as a "temporary" > placeholder that accidentally became permanent). Basically impls could > always multi-inherit BasePersistence and FooPersistence, etc., but > BasePersistence itself did *not* directly extend IntegrationPersistence. > > There are some caveats though: > > So, if we can agree > > that IdempotencyPersistence is not conceptually related to Metastore > > Persistence (my points above) then introducing tight coupling between > > BasePersistence and IdempotencyPersistence is pure tech debt. > > > I'm not sure I agree they're not conceptually related, though maybe I need > to deep dive review the overall design. I agree that "coincidental" tight > coupling doesn't help anything (so yes we should decompose the interfaces), > but I think we need to define exactly what type of coupling we intend to > support the correct semantics. > > The main thing: These idempotency keys are fundamentally to support > ACID-related semantics, unlike best-effort caching types of use cases. > Thus, the existence or nonexistence of idempotency records must be coupled > in *some way* to their associated request "side effects", which in this > case means core metastore persistence state. > > If we extend the original transactionality semantics, this is actually > somewhat well defined: > > 1. TransactionalPersistence will continue to be an aggregator of > BasePersistence and IdempotencyPersistence, with the implication that its > own "runInTransaction" block covers both the BasePersistence and > IdempotencyPersistence calls in a single atomic unit. > 2. AtomicOperationMetaStoreManager intentionally does *not* hold a > TransactionalPersistence, so we know that either we're sacrificing certain > race-condition cases without transactions, *or* we define some higher-level > "loose coupling" that enables a multi-phase commit semantic to ensure > cohesive state between PolarisEntity and IdempotencyRecords. > > If we wanted to be more "formal" with that "loose coupling" it might mean > adding a different interface like > "NonTransactionalIdempotencyToEntityCoordinator" with methods like > "PolarisEntity addIdempotencyBookkeepingToEntityBeforeChange(IdempotencyKey > key, PolarisEntity entityBeingChanged)" and maybe different > postEntityCommit or abortEntityCommitForIdempotencyKey methods related to > the multi-phase commit handling. > > > On Wed, May 6, 2026 at 2:27 PM Dmitri Bourlatchkov <[email protected]> > wrote: > > > Hi All, > > > > As far as I can tell, what I proposed in my previous email on this thread > > aligns with the SPI evolution direction in Metrics Persistence [1]. > > > > [1] https://lists.apache.org/thread/7kp36t8fykg2hvzkr96r5m7cdnnyopj9 > > > > Cheers, > > Dmitri. > > > > On Wed, May 6, 2026 at 4:55 PM Dmitri Bourlatchkov <[email protected]> > > wrote: > > > > > Hi Huaxin, > > > > > > I agree with your analysis. However, regarding the design direction > part, > > > from my POV, it would be preferable to introduce a parallel factory > > > dedicated to IdempotencyPersistence. > > > > > > Idempotency Persistence is not conceptually connected to Metastore > > > Persistence. > > > > > > Therefore, a BasePersistence (Metastore) implementation using a > > particular > > > database technology does not have to implement IdempotencyPersistence > > using > > > the same technology. > > > > > > Moreover, it is conceivable that a Polaris admin may want to use > > different > > > database technologies for IdempotencyPersistence and Metastore > > Persistence. > > > I'd even suggest that this approach may be advisable because the > > > performance and data consistency characteristics expected > > > of IdempotencyPersistence and the Metastore are very different due to > > their > > > distinct use cases. > > > > > > When SPIs are separated, a particular implementation of Metastore > > > Persistence can also implement IdempotencyPersistence and vice versa. > > > > > > Leaving `default` methods unimplemented (current state of [4269]) may > > > permit some Metastore implemetation to ignore the > IdempotencyPersistence > > > SPI, but in that situation the Polaris admin user would have no option > to > > > support IdempotencyPersistence via a different technology. > > > > > > All in all, from my POV I see no disadvantage to isolating these SPIs, > > > whereas I do see disadvantages in their coupling. > > > > > > This is also the direction I'd advocate in a more general Polaris SPI > > > discussion (which has not fully started on `dev` yet). > > > > > > Re: PR [4269] specifically, I believe isolating IdempotencyPersistence > > > from BasePersistence is not technically difficult. So, if we can agree > > > that IdempotencyPersistence is not conceptually related to Metastore > > > Persistence (my points above) then introducing tight coupling between > > > BasePersistence and IdempotencyPersistence is pure tech debt. Instead > of > > > adding this coupling only to remove it in a follow-up PR, I believe it > > > would require less work to restructure the code and avoid introducing > the > > > coupling upfront. > > > > > > [4269] https://github.com/apache/polaris/pull/4269 > > > > > > Cheers, > > > Dmitri. > > > > > > On Tue, May 5, 2026 at 7:45 PM huaxin gao <[email protected]> > > wrote: > > > > > >> Hi Dimitri, > > >> > > >> Thanks for raising this. I'd prefer email so it's async and easier for > > >> everyone to participate at their own pace. A dedicated sync can come > > later > > >> if email gets stuck. > > >> > > >> My understanding is that BasePersistence already extends > > >> PolicyMappingPersistence and MetricsPersistence, plus has event > > >> persistence > > >> folded in directly. That's because AtomicOperationMetaStoreManager > > >> operates > > >> against a single BasePersistence instance per realm, and each backend > > >> (in-memory, JDBC, NoSQL) has one factory chain producing it. So to > > follow > > >> the current architecture, we need to make BasePersistence extend > > >> IdempotencyPersistence too. Splitting IdempotencyPersistence out as a > > >> top-level SPI would mean adding parallel factory plumbing in every > > >> backend. > > >> > > >> Happy to discuss the broader SPI shape on this thread. I'd prefer to > > land > > >> #4269 with the current shape since it follows the agreed-upon > > >> architecture, > > >> and revisit the layering uniformly in a follow-up once we converge — > > that > > >> broader discussion may take a while. > > >> > > >> Thanks, > > >> > > >> Huaxin > > >> > > >> On Tue, May 5, 2026 at 2:29 PM Dmitri Bourlatchkov <[email protected]> > > >> wrote: > > >> > > >> > Hi Huaxin, Yufei, > > >> > > > >> > I left some comments on [4269] mostly about SPI design concerns. > > >> > > > >> > Since we wanted to reopen a more general SPI discussion anyway, it > > >> might be > > >> > worth having a dedicated sync about that... > > >> > > > >> > Alternatively, we could have that discussion by email now, if you > > >> prefer. I > > >> > have some starting points to share for discussion. > > >> > > > >> > WDYT? > > >> > > > >> > [4269] https://github.com/apache/polaris/pull/4269 > > >> > > > >> > Thanks, > > >> > Dmitri. > > >> > > > >> > On Fri, Apr 17, 2026 at 7:07 PM huaxin gao <[email protected]> > > >> wrote: > > >> > > > >> > > Hi Robert, > > >> > > > > >> > > Thanks, and I’m glad we’re aligned on the handler-level > direction. I > > >> also > > >> > > appreciate the detailed review and feedback throughout this > > >> discussion. > > >> > > > > >> > > Since the old implementation is going away and the handler-level > > >> > > implementation will be in a fresh PR, would you be OK removing the > > -1 > > >> > from > > >> > > PR #3803 <https://github.com/apache/polaris/pull/3803>? If so, > I’ll > > >> > close > > >> > > the old PR and open a new one for the handler-level approach. > > >> > > > > >> > > Best, > > >> > > > > >> > > Huaxin > > >> > > > > >> > > On Thu, Apr 16, 2026 at 12:57 AM Robert Stupp <[email protected]> > > wrote: > > >> > > > > >> > > > i Yufei, Huaxin, > > >> > > > > > >> > > > Thanks for the thoughtful replies, and for opening the issue for > > >> > > > `createTableStaged` [1]. I’m glad we’re converging on the > > >> handler-level > > >> > > > direction. > > >> > > > > > >> > > > I agree benchmarking can run in parallel with implementation. > For > > my > > >> > own > > >> > > > review, I’d prefer to wait until reproducible benchmark results > > are > > >> > > posted > > >> > > > (using the scope from my April 14 note, or with any scope > changes > > >> > called > > >> > > > out). > > >> > > > > > >> > > > Happy to review once that is available. > > >> > > > > > >> > > > Best, > > >> > > > Robert > > >> > > > > > >> > > > [1] https://github.com/apache/polaris/issues/4200 > > >> > > > > > >> > > > > > >> > > > On Tue, Apr 14, 2026 at 7:54 PM huaxin gao < > > [email protected]> > > >> > > wrote: > > >> > > > > > >> > > > > Hi Robert, > > >> > > > > > > >> > > > > Thanks for the feedback. Glad the handler-level design > addresses > > >> the > > >> > > > > concerns. > > >> > > > > > > >> > > > > I also feel we should converge on the design and > implementation > > >> > first, > > >> > > > and > > >> > > > > run benchmarks in parallel. The access patterns will be > > different > > >> > with > > >> > > > the > > >> > > > > handler-level approach, so benchmarking the current > filter-based > > >> code > > >> > > > > wouldn't be representative. > > >> > > > > > > >> > > > > I'll open a GitHub issue to track createTableStaged support, > and > > >> I'll > > >> > > > open > > >> > > > > a fresh PR for the handler-level implementation. > > >> > > > > > > >> > > > > Best, > > >> > > > > > > >> > > > > Huaxin > > >> > > > > > > >> > > > > On Tue, Apr 14, 2026 at 10:30 AM Yufei Gu < > [email protected] > > > > > >> > > wrote: > > >> > > > > > > >> > > > > > Hi Robert, > > >> > > > > > > > >> > > > > > Thanks for the thorough review and for clearly laying out > the > > >> > > concerns, > > >> > > > > > this is very helpful. > > >> > > > > > > > >> > > > > > For point 1(Q3), I do not think this needs to be a hard > > >> > prerequisite > > >> > > > > before > > >> > > > > > looking at the implementation. > > >> > > > > > > > >> > > > > > The handler level design and correctness concerns are > largely > > >> > > > independent > > >> > > > > > from the exact performance characteristics. It would be more > > >> > > productive > > >> > > > > to > > >> > > > > > first converge on the right architecture, then validate and > > >> iterate > > >> > > on > > >> > > > > > performance with concrete measurements. Also, while the > store > > is > > >> > > > reused, > > >> > > > > > the access pattern and critical sections change with the > > handler > > >> > > level > > >> > > > > > approach. Measuring the current implementation in isolation > > may > > >> not > > >> > > > > reflect > > >> > > > > > the actual behavior of the final design. > > >> > > > > > > > >> > > > > > We already have a similar situation with the NoSQL > benchmark, > > >> where > > >> > > the > > >> > > > > > code has been merged for a while but the benchmarking work > is > > >> still > > >> > > in > > >> > > > > > progress. That has not blocked design or code review, and we > > are > > >> > > > > iterating > > >> > > > > > on performance as we go. > > >> > > > > > > > >> > > > > > I agree that realistic benchmarking is important. But I > would > > >> > suggest > > >> > > > we > > >> > > > > > treat this as a parallel track, not a gate. Otherwise we > risk > > >> > > blocking > > >> > > > > > progress on the design without knowing whether the measured > > >> > overhead > > >> > > is > > >> > > > > > even representative of the final system. > > >> > > > > > > > >> > > > > > Happy to help define a benchmark plan once the design is > > >> settled. > > >> > > > > > > > >> > > > > > Yufei > > >> > > > > > > > >> > > > > > > > >> > > > > > On Tue, Apr 14, 2026 at 6:28 AM Robert Stupp < > [email protected]> > > >> > wrote: > > >> > > > > > > > >> > > > > > > Hi Huaxin, > > >> > > > > > > > > >> > > > > > > yeah, the handler-level approach is the right call, I'm > glad > > >> you > > >> > > went > > >> > > > > > with > > >> > > > > > > it. > > >> > > > > > > > > >> > > > > > > To be explicit about what your proposal aims to resolve: > > >> > > > > > > > > >> > > > > > > - Authorization bypass is fixed. The check now runs after > > >> > > > > resource-level > > >> > > > > > > authorization. > > >> > > > > > > - No credentials in the database. Fresh credentials > > re-vended > > >> on > > >> > > > replay > > >> > > > > > via > > >> > > > > > > the normal vending path. > > >> > > > > > > - Phase 2 architecture problem gone. The handler has the > > >> > principal, > > >> > > > > > storage > > >> > > > > > > config, and IAM access that the filter never could have. > > >> > > > > > > - Caller identity is in the binding. > > >> > > > > > > > > >> > > > > > > Three things still need adressing: > > >> > > > > > > > > >> > > > > > > First, and this is a hard prerequisite before I look at > any > > >> > > > > > implementation > > >> > > > > > > code: Q3 from April 5 is still open. The store is being > > reused > > >> > > as-is, > > >> > > > > so > > >> > > > > > > the overhead can be measured right now without new handler > > >> code. > > >> > 3+ > > >> > > > DB > > >> > > > > > > round-trips per mutation on the catalog connection pool, > > >> > write-hot > > >> > > > > table, > > >> > > > > > > polling loops under duplicate traffic - those numbers need > > to > > >> > come > > >> > > > > from a > > >> > > > > > > realistic setup: networked database under concurrent load, > > >> not a > > >> > > > local > > >> > > > > > > Postgres. The latency concerns are specifically about > Aurora > > >> and > > >> > > > > > > CockroachDB, and the polling storm only shows up under > > >> concurrent > > >> > > > > > duplicate > > >> > > > > > > traffic. Please share the methodology so the results are > > >> > > > reproducible. > > >> > > > > > > > > >> > > > > > > Second: please open a GitHub issue for the > createTableStaged > > >> > > > deferral. > > >> > > > > > > > > >> > > > > > > Third: since the filter is going away entirely and the > > >> critical > > >> > > > > sections > > >> > > > > > > move to the handlers, does it make sense to close #3803 > and > > >> open > > >> > a > > >> > > > > fresh > > >> > > > > > > PR? The old diff is mostly code that's being deleted, and > it > > >> > > carries > > >> > > > a > > >> > > > > > lot > > >> > > > > > > of review history. A clean PR would be easier for everyone > > to > > >> > > review > > >> > > > - > > >> > > > > > just > > >> > > > > > > reference the old one for context. > > >> > > > > > > > > >> > > > > > > Happy to do a fresh review once the new implementation is > > up. > > >> > > > > > > > > >> > > > > > > Best, > > >> > > > > > > Robert > > >> > > > > > > > > >> > > > > > > > > >> > > > > > > On Tue, Apr 14, 2026 at 1:44 AM huaxin gao < > > >> > [email protected] > > >> > > > > > >> > > > > > wrote: > > >> > > > > > > > > >> > > > > > > > Hi Robert, > > >> > > > > > > > > > >> > > > > > > > Here is a revised design proposal for the idempotency > > replay > > >> > > > > mechanism. > > >> > > > > > > I'd > > >> > > > > > > > appreciate your feedback. Thanks! > > >> > > > > > > > > > >> > > > > > > > *Moving idempotency to handler level* > > >> > > > > > > > > > >> > > > > > > > The idempotency check moves from the HTTP filter into > the > > >> > handler > > >> > > > > > > methods, > > >> > > > > > > > after authorization. The filter is removed. The caller's > > >> > > principal > > >> > > > > hash > > >> > > > > > > is > > >> > > > > > > > included in the idempotency binding and checked on > replay. > > >> > > > > > > > > > >> > > > > > > > *Credential-bearing mutations (createTable)* > > >> > > > > > > > > > >> > > > > > > > On replay, instead of returning a stored response, the > > >> handler > > >> > > > calls > > >> > > > > > > > buildLoadTableResponseWithDelegationCredentials with the > > >> > existing > > >> > > > > > table's > > >> > > > > > > > metadata. This vends fresh credentials for the current > > >> caller > > >> > > > through > > >> > > > > > the > > >> > > > > > > > normal credential vending path. No credentials are > stored > > in > > >> > the > > >> > > > > > > database — > > >> > > > > > > > the idempotency record only stores the key, principal > > hash, > > >> > table > > >> > > > > > > > identifier, and status code. > > >> > > > > > > > > > >> > > > > > > > *Non-credential mutations (dropTable, renameTable, > > >> > namespace/view > > >> > > > > > > > operations)* > > >> > > > > > > > > > >> > > > > > > > These do not vend credentials, so there is no leakage > > >> concern. > > >> > On > > >> > > > > > replay, > > >> > > > > > > > the handler returns the stored status code and minimal > > >> response > > >> > > > body. > > >> > > > > > > > > > >> > > > > > > > *createTableStaged* > > >> > > > > > > > > > >> > > > > > > > Staged tables are not persisted to the catalog, so the > > >> method > > >> > can > > >> > > > > > safely > > >> > > > > > > > re-run on retry. Idempotency support for this endpoint > can > > >> be > > >> > > > > deferred. > > >> > > > > > > > > > >> > > > > > > > *What is reused from the current implementation* > > >> > > > > > > > > > >> > > > > > > > The IdempotencyStore interface, database schema, > > >> configuration, > > >> > > and > > >> > > > > > > > background purge logic are reused. The main change is > > moving > > >> > the > > >> > > > > > > > check/finalize logic from the filter into the handler. > > >> > > > > > > > > > >> > > > > > > > Best, > > >> > > > > > > > > > >> > > > > > > > Huaxin > > >> > > > > > > > > > >> > > > > > > > On Sun, Apr 5, 2026 at 11:53 AM huaxin gao < > > >> > > [email protected] > > >> > > > > > > >> > > > > > > wrote: > > >> > > > > > > > > > >> > > > > > > > > Hi Robert, > > >> > > > > > > > > > > >> > > > > > > > > Thanks for the follow-up. > > >> > > > > > > > > > > >> > > > > > > > > I appreciate the clarification. I read the listed > items > > as > > >> > fix > > >> > > > > > options > > >> > > > > > > > and > > >> > > > > > > > > picked option 2, and the Phase 1/Phase 2 split was > > purely > > >> to > > >> > > keep > > >> > > > > the > > >> > > > > > > PR > > >> > > > > > > > at > > >> > > > > > > > > a reviewable size, not a design choice. > > >> > > > > > > > > > > >> > > > > > > > > I agree that the caller's identity should be part of > the > > >> > > binding. > > >> > > > > > That > > >> > > > > > > is > > >> > > > > > > > > straightforward to add. > > >> > > > > > > > > > > >> > > > > > > > > On the architecture question, whether re-vending can > > work > > >> > > within > > >> > > > > the > > >> > > > > > > > > filter or needs a handler-level approach, I want to > give > > >> this > > >> > > > > proper > > >> > > > > > > > > thought rather than rush an answer. I may not be able > to > > >> get > > >> > > back > > >> > > > > to > > >> > > > > > > you > > >> > > > > > > > on > > >> > > > > > > > > this until next week (the week of 4/13). > > >> > > > > > > > > > > >> > > > > > > > > Thanks, > > >> > > > > > > > > > > >> > > > > > > > > Huaxin > > >> > > > > > > > > > > >> > > > > > > > > On Sat, Apr 4, 2026 at 11:42 PM Robert Stupp < > > >> [email protected] > > >> > > > > >> > > > > wrote: > > >> > > > > > > > > > > >> > > > > > > > >> Hi Huaxin, > > >> > > > > > > > >> > > >> > > > > > > > >> Thanks for responding. A few corrections and > > >> clarifications. > > >> > > > > > > > >> > > >> > > > > > > > >> My PR review described three consequences of the > > missing > > >> > > caller > > >> > > > > > > > >> identity binding and the credential leakage attack > > >> vector. > > >> > > Those > > >> > > > > > > > >> were not suggestions for a Phase 1/Phase 2 split. The > > >> > offline > > >> > > > > > > > >> decision to split into phases, and what each phase > > >> contains, > > >> > > was > > >> > > > > > > > >> yours [1]. > > >> > > > > > > > >> > > >> > > > > > > > >> On the phasing: the concern is not PR size. The > concern > > >> is > > >> > > that > > >> > > > > > > > >> Phase 1 alone ships a full authorization bypass. The > > >> filter > > >> > > > > replays > > >> > > > > > > > >> cached responses before Polaris's resource-level > > >> > authorization > > >> > > > > > > > >> runs. Stripping credentials does not fix that. Any > > >> > > authenticated > > >> > > > > > > > >> user with a valid idempotency key can receive a > cached > > >> > > response > > >> > > > > > > > >> for a resource they are not authorized to access. > That > > is > > >> > the > > >> > > > case > > >> > > > > > > > >> with or without credentials in the response. > > >> > > > > > > > >> > > >> > > > > > > > >> "We will not release it until entire idempotency > coding > > >> is > > >> > > > > > > > >> finished" is reassuring, but the architecture > question > > >> > > remains: > > >> > > > > > > > >> can the generic filter approach support re-vending at > > >> all? > > >> > My > > >> > > > > > > > >> email explained why I believe it cannot, since it > would > > >> > > require > > >> > > > > > > > >> the filter to authorize the caller, resolve storage > > >> > locations, > > >> > > > > > > > >> call cloud IAM, and manage credential caching. That > is > > >> the > > >> > > > catalog > > >> > > > > > > > >> handler's job, not a filter's. > > >> > > > > > > > >> > > >> > > > > > > > >> The questions from my original email that still need > > >> > answers: > > >> > > > > > > > >> > > >> > > > > > > > >> How does the binding prevent cross-principal cache > hits > > >> > > without > > >> > > > > > > > >> including the caller's identity? > > >> > > > > > > > >> > > >> > > > > > > > >> How does Phase 2 re-vend credentials from a generic > > HTTP > > >> > > filter > > >> > > > > > > > >> without reimplementing the catalog handler's > > >> authorization > > >> > and > > >> > > > > > > > >> credential vending logic? > > >> > > > > > > > >> > > >> > > > > > > > >> Has the operational overhead (3 DB round-trips per > > >> mutation > > >> > on > > >> > > > > > > > >> the catalog's connection pool) been benchmarked on > any > > >> > > backend? > > >> > > > > > > > >> > > >> > > > > > > > >> I'd like to resolve these on this thread before more > > >> code is > > >> > > > > > > > >> written. > > >> > > > > > > > >> > > >> > > > > > > > >> Best, > > >> > > > > > > > >> Robert > > >> > > > > > > > >> > > >> > > > > > > > >> [1] > > >> > > > > > > > > >> > > > https://github.com/apache/polaris/pull/3803#issuecomment-4180969838 > > >> > > > > > > > >> > > >> > > > > > > > >> On Sun, Apr 5, 2026 at 7:55 AM huaxin gao < > > >> > > > [email protected] > > >> > > > > > > > >> > > > > > > > wrote: > > >> > > > > > > > >> > > >> > > > > > > > >> > Hi Robert, > > >> > > > > > > > >> > > > >> > > > > > > > >> > Thanks for raising this on the mailing list. I > > >> acknowledge > > >> > > the > > >> > > > > > > design > > >> > > > > > > > >> > concerns you listed. > > >> > > > > > > > >> > > > >> > > > > > > > >> > To clarify: I picked your option 2 (strip > credentials > > >> and > > >> > > > > re-vend > > >> > > > > > on > > >> > > > > > > > >> > replay) you suggested in the PR review. The offline > > >> > > discussion > > >> > > > > > with > > >> > > > > > > > >> Yufei > > >> > > > > > > > >> > and Prashant was simply to verify that I picked the > > >> best > > >> > > > option > > >> > > > > > from > > >> > > > > > > > all > > >> > > > > > > > >> > the three suggestions you listed in the review. > > >> > > > > > > > >> > > > >> > > > > > > > >> > The reason I split it into two phases is that the > > >> current > > >> > PR > > >> > > > is > > >> > > > > > > > already > > >> > > > > > > > >> > around 3000 lines. Adding re-vending logic on top > > would > > >> > make > > >> > > > it > > >> > > > > > very > > >> > > > > > > > >> hard > > >> > > > > > > > >> > to review. This is a big feature and we need to do > it > > >> > piece > > >> > > by > > >> > > > > > > piece. > > >> > > > > > > > We > > >> > > > > > > > >> > will not release it until entire idempotency coding > > is > > >> > > > finished. > > >> > > > > > > > >> > > > >> > > > > > > > >> > I'm happy to discuss the architecture further on > this > > >> > > thread, > > >> > > > > > > whether > > >> > > > > > > > >> the > > >> > > > > > > > >> > filter approach can support re-vending, or whether > we > > >> > need a > > >> > > > > > > > >> handler-level > > >> > > > > > > > >> > design. I may be slow to respond next week but I'll > > >> follow > > >> > > up > > >> > > > > > after > > >> > > > > > > > >> that. > > >> > > > > > > > >> > > > >> > > > > > > > >> > Thanks, > > >> > > > > > > > >> > > > >> > > > > > > > >> > Huaxin > > >> > > > > > > > >> > > > >> > > > > > > > >> > On Sat, Apr 4, 2026 at 9:34 AM Robert Stupp < > > >> > [email protected] > > >> > > > > > >> > > > > > wrote: > > >> > > > > > > > >> > > > >> > > > > > > > >> > > Hi all, > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > I need to raise some fundamental design concerns > > with > > >> > the > > >> > > > > > current > > >> > > > > > > > >> > > Idempotency-Key implementation in PR #3803 [1]. > > Some > > >> of > > >> > > > these > > >> > > > > > were > > >> > > > > > > > >> > > raised in November 2025 [2] and were acknowledged > > [3] > > >> > but > > >> > > > > never > > >> > > > > > > > >> > > addressed in the implementation. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > I raised the caller identity concern on the > > >> > > > > > > > >> > > mailing list in November 2025 [2]. In December > 2025 > > >> > [4], I > > >> > > > > > > > >> > > reiterated that all existing implementations I > > could > > >> > find > > >> > > do > > >> > > > > > > > >> > > request fingerprinting including the request > body, > > >> and > > >> > > > warned > > >> > > > > > that > > >> > > > > > > > >> > > the idempotency subsystem itself must not become > a > > >> > source > > >> > > of > > >> > > > > > > > >> > > failures. The PR was opened in February 2026 > > without > > >> > > > > addressing > > >> > > > > > > > >> > > either point. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > The credential leakage consequences surfaced > during > > >> code > > >> > > > > review > > >> > > > > > > > >> > > [5]. The response was an offline decision between > > >> three > > >> > > > > > > > contributors, > > >> > > > > > > > >> > > without mailing list discussion [6]. The decision > > >> was to > > >> > > > split > > >> > > > > > > > >> > > into Phase 1 (strip credentials, which produces > > >> broken > > >> > > > > responses > > >> > > > > > > > >> > > but does not fix the authorization bypass) and > > Phase > > >> 2 > > >> > > > > (re-vend > > >> > > > > > > > >> > > on replay, which would require a complete > redesign > > >> and > > >> > > > > > > > >> > > reimplementation). Details below. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > I think we need to resolve these at the design > > level > > >> > > before > > >> > > > > more > > >> > > > > > > > >> > > code is written. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > I will not do further code reviews on this PR > until > > >> > these > > >> > > > > design > > >> > > > > > > > >> > > concerns are resolved. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > 1. Caller identity is not part of the idempotency > > >> > binding > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > The caller's identity is not part of the binding. > > The > > >> > > > > > > > >> > > filter replays cached responses before Polaris's > > >> > > > > resource-level > > >> > > > > > > > >> > > authorization runs. Any authenticated user with a > > >> valid > > >> > > > > > > idempotency > > >> > > > > > > > >> > > key can receive a cached response for a resource > > they > > >> > are > > >> > > > not > > >> > > > > > > > >> > > authorized to access, including vended storage > > >> > > credentials. > > >> > > > > This > > >> > > > > > > > >> > > is a full authorization bypass. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > Idempotency keys are HTTP headers that end up in > > >> access > > >> > > > logs, > > >> > > > > > > > >> > > observability tooling, debug output. They cannot > > be a > > >> > > > security > > >> > > > > > > > >> > > boundary. I described the attack vector in my PR > > >> review > > >> > > [5]. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > The proposed Phase 2 [6] says it will > "re-authorize > > >> the > > >> > > user > > >> > > > > and > > >> > > > > > > > >> > > re-vend fresh credentials on replay." That > > addresses > > >> > > > > credentials > > >> > > > > > > > >> > > but not the binding itself. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > 2. Credential stripping breaks Polaris's > deployment > > >> > model > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > Without mailing list discussion, the decision was > > >> made > > >> > > > offline > > >> > > > > > > > >> > > to strip credentials before storing ("Phase 1") > > [6]. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > Polaris's purpose is to vend down-scoped > > credentials. > > >> > > > Clients > > >> > > > > > have > > >> > > > > > > > >> > > no ambient storage access. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > A stripped response returns a successful response > > >> > ("table > > >> > > > > > > created") > > >> > > > > > > > >> > > to the client, which then fails hard because it > > >> cannot > > >> > > > access > > >> > > > > > the > > >> > > > > > > > >> > > object storage. Query engines (Spark, Trino, > etc.) > > >> will > > >> > > fail > > >> > > > > on > > >> > > > > > > > >> > > the first storage operation against that table. > > Query > > >> > > engine > > >> > > > > > users > > >> > > > > > > > >> > > have no way to resolve this, they don't control > the > > >> REST > > >> > > > > catalog > > >> > > > > > > > >> > > interaction. For stage-create, the client gets a > > >> > > successful > > >> > > > > > > > >> > > response telling it to write data but has no > > >> credentials > > >> > > to > > >> > > > > > write > > >> > > > > > > > >> > > with. No fallback exists. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > The stripped response is spec-compliant, but > > >> > > spec-compliant > > >> > > > > and > > >> > > > > > > > >> > > operationally correct are not the same thing. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > 3. Phase 2 is architecturally incompatible with > > >> Phase 1 > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > The filter is a generic HTTP filter below the > > >> > application > > >> > > > > layer. > > >> > > > > > > It > > >> > > > > > > > >> > > treats responses as opaque JSON. Credential > vending > > >> > > happens > > >> > > > > deep > > >> > > > > > > > >> > > inside IcebergCatalogHandler, scoped to specific > > >> table > > >> > > > > locations > > >> > > > > > > > >> > > and principals, through cloud IAM calls (STS > > >> AssumeRole, > > >> > > GCS > > >> > > > > > token > > >> > > > > > > > >> > > exchange). > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > Re-vending on replay would require the filter to > > >> resolve > > >> > > the > > >> > > > > > > > >> > > table's storage location, authorize the caller, > > call > > >> > cloud > > >> > > > IAM > > >> > > > > > to > > >> > > > > > > > >> > > mint scoped credentials, and manage credential > > >> caching. > > >> > > That > > >> > > > > is > > >> > > > > > > > >> > > not a filter, that is reimplementing the catalog > > >> > handler's > > >> > > > > > > > >> > > authorization and credential vending logic. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > Phase 1 and Phase 2 need to be designed and > shipped > > >> > > > together. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > 4. Operational overhead > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > Every idempotent mutation adds 3 database > > round-trips > > >> > > > (INSERT, > > >> > > > > > > > UPDATE, > > >> > > > > > > > >> > > heartbeat UPDATEs), each borrowing its own > > connection > > >> > > > > > > > >> > > (autoCommit=true, no reuse), sharing the > catalog's > > >> > > > connection > > >> > > > > > pool > > >> > > > > > > > >> > > and RDBMS. This added latency increases the > > >> probability > > >> > of > > >> > > > > > client > > >> > > > > > > > >> > > timeouts, which are exactly the transient > failures > > >> that > > >> > > > > > > idempotency > > >> > > > > > > > >> > > is supposed to prevent. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > The idempotency table is high-churn and > write-hot, > > >> > > competing > > >> > > > > > with > > >> > > > > > > > >> > > catalog operations for connection pool capacity. > > The > > >> > > > overhead > > >> > > > > > gets > > >> > > > > > > > >> > > worse on managed and distributed database > backends. > > >> > > > Duplicate > > >> > > > > > > > >> > > requests trigger polling loops with repeated > > database > > >> > > reads. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > No benchmark data has been presented for any > > backend. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > What needs to change > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > The current generic-filter approach cannot > support > > >> these > > >> > > > > > > > >> > > requirements. This needs a different design. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > A new design must address the following: > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > a) The idempotency binding MUST include the > > caller's > > >> > > > principal > > >> > > > > > > > >> > > identity. > > >> > > > > > > > >> > > b) Credential-bearing responses must not be > stored > > >> > as-is. > > >> > > On > > >> > > > > > > > >> > > replay, the handler must build a fresh > response > > >> with > > >> > > > fresh > > >> > > > > > > > >> > > credentials for the current caller. > > >> > > > > > > > >> > > c) The operational overhead needs to be > > benchmarked. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > I am happy to review code once the design > addresses > > >> > these > > >> > > > > > points. > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > Best, > > >> > > > > > > > >> > > Robert > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > [1] https://github.com/apache/polaris/pull/3803 > > >> > > > > > > > >> > > [2] > > >> > > > > > > > > > >> > https://lists.apache.org/thread/qqrgr2bzsp69629jdj8kf39m10pzwy6l > > >> > > > > > > > >> > > (Nov 24, 2025: "Building a service-internal > key > > >> from > > >> > > the > > >> > > > > > > client > > >> > > > > > > > >> > > provided idempotency key plus more data from > > the > > >> > > request > > >> > > > > > gives > > >> > > > > > > > >> > > better uniqueness. That more data can come > > from: > > >> > > > operation > > >> > > > > > id, > > >> > > > > > > > >> > > all operation parameters, including the whole > > >> > payload, > > >> > > > > > quite a > > >> > > > > > > > >> > > few HTTP request headers like user-agent, > > >> > > authorization, > > >> > > > > > host, > > >> > > > > > > > >> > > accept-* and also the client's address.") > > >> > > > > > > > >> > > [3] > > >> > > > > > > > > > >> > https://lists.apache.org/thread/7vy44mxsq2cgtp8gsj0r8pk8blp37h5b > > >> > > > > > > > >> > > (Dec 8, 2025: "I agree that a bare > > >> Idempotency-Key + > > >> > > > > entity > > >> > > > > > > > >> > > identifier is not enough to protect against > > >> buggy or > > >> > > > > > malicious > > >> > > > > > > > >> > > clients; fingerprinting the full request > would > > be > > >> > > > > > stronger.") > > >> > > > > > > > >> > > [4] > > >> > > > > > > > > > >> > https://lists.apache.org/thread/jt3gmpnjh6z490dvxorjs6ms00kvo264 > > >> > > > > > > > >> > > (Dec 8, 2025: "We should avoid failing > Polaris > > if > > >> > this > > >> > > > > > > subsystem > > >> > > > > > > > >> > > fails, or letting this subsystem be a reason > > for > > >> its > > >> > > > > > existence > > >> > > > > > > > >> > > (aka retry due to timeouts because the > > >> > > > idempotent-request > > >> > > > > > > > >> > > subsystem hangs).") > > >> > > > > > > > >> > > [5] > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > >> > > > > > > > > > >> > > > > > > > >> > > > > > >> > > > >> > > https://github.com/apache/polaris/pull/3803#pullrequestreview-4051690055 > > >> > > > > > > > >> > > [6] > > >> > > > > > > > >> > > >> > > > > > > >> https://github.com/apache/polaris/pull/3803#issuecomment-4180969838 > > >> > > > > > > > >> > > > > >> > > > > > > > >> > > > >> > > > > > > > >> > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > > > >
