Hi all, thanks for the thoughtful feedback (JB, Dmitri, Yufei). Sorry for the late reply. I missed the emails because I wasn’t subscribed to the dev list.
The companion Iceberg REST Catalog proposal is here (discussion + spec text):
https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i
The discussion thread is here:
https://lists.apache.org/[email protected]

The intent of the Polaris proposal is to implement those Iceberg client-visible semantics.

I see the conceptual link to tasks/operations. For this proposal, I’d like to keep the scope small and client-visible. Concretely:

- Same key + same payload -> return the original result (no re-run)
- Same key + different payload -> 422
- If the first request is still running -> don’t start another (409 or wait-then-replay)
- Only replay finalized results (2xx or terminal 4xx)
- Advertise support via GET /v1/config (idempotency-key-supported, idempotency-key-lifetime)

This aligns with the IETF Idempotency-Key draft. It’s intentionally lightweight, so I’d prefer to keep the scope simple for now rather than tie it to a tasks framework.

Thanks,
Huaxin

On 2025/09/16 19:12:31 Yufei Gu wrote:
> Hi Dimitri,
>
> Thanks for sharing your feedback. I see the analogy you’re making with
> tasks/operations, but I don’t think it holds in this case.
>
> Idempotency and tasks solve very different problems:
>
> - Idempotency is about safely retrying a synchronous request without
>   causing side effects or inconsistencies. See more details in the IETF
>   doc[1].
> - Tasks are about orchestrating and tracking asynchronous execution.
>
> Merging the two conceptually risks overcomplicating the problem. The client
> doesn’t need or want to manage a long-lived task here, only to ensure that
> a retry of the same payload doesn’t break consistency.
>
> I don’t think we should couple the idempotency proposal to tasks. Doing so
> creates unnecessary dependencies and delays for a feature that is valuable
> in its own right.
> Tasks may be useful for other scenarios, but idempotency
> should remain a lightweight mechanism, independent of whether the server
> executes work synchronously or asynchronously.
>
> So while there might be some surface-level similarities, I believe treating
> idempotency as a standalone concern is the more pragmatic path forward.
>
> [1] https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-06.txt
>
> Yufei
>
>
> On Tue, Sep 16, 2025 at 11:08 AM Dmitri Bourlatchkov <[email protected]>
> wrote:
>
> > Hi Yufei,
> >
> > As I understand, the proposal is to allow re-submitted requests to succeed
> > when the previous request with the same key (ID) completed on the server
> > side while the client was in the process of re-trying.
> >
> > I agree that this is a valuable feature.
> >
> > However, my point about connecting it to tasks / operations is completely
> > at the conceptual level.
> >
> > If one considers the server as a black box, the client effectively submits
> > a request with some key and payload. Then the client submits another
> > request with the same key and payload and expects the second request to
> > return the result of the first request (success or failure). This is
> > conceptually equivalent to the server executing requests as asynchronous
> > tasks (the key being the task's ID).
> >
> > Therefore, before we consider any implementation for the "idempotency"
> > feature, I believe it would be wise to consider possible synergies with the
> > tasks proposal [1].
> >
> > I personally think these synergies are in fact quite large and powerful,
> > especially considering Polaris Servers in a distributed environment.
> >
> > To put it another way, I believe implementing the "idempotency" proposal on
> > top of the "tasks" proposal is going to be quite easy (if not trivial).
> >
> > [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> >
> > Cheers,
> > Dmitri.
> >
> > On Sat, Sep 13, 2025 at 11:44 AM Yufei Gu <[email protected]> wrote:
> >
> > > Hi Dimitri,
> > >
> > > The idempotency key is mainly to reduce the chance of inconsistencies of
> > > table commit. Here is a typical scenario: A commit succeeds but the client
> > > misses the response (due to timeout, restart, etc.), client retries and
> > > hits 409 Conflict, then mistakenly treats this as a failed commit and
> > > cleans up related files, leading to inconsistency. Without an idempotency
> > > key, the best a client could do is to treat most of the non-200 response as
> > > unknown status, and reload the table to verify, which is also inefficient.
> > >
> > > It could enable more use cases, as you mentioned, getting status of an
> > > async task, but I would not complicate it at this point. That probably
> > > deserves a separate proposal, since task status goes beyond simple HTTP
> > > codes and would require clients to interpret them correctly. For now,
> > > http_status is sufficient, as it’s already well-supported by all existing
> > > clients.
> > >
> > > Yufei
> > >
> > >
> > > On Fri, Sep 12, 2025 at 7:58 PM Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > >
> > > > Hi Huaxin.
> > > >
> > > > Thanks again for starting this proposal.
> > > >
> > > > After a quick initial read of the docs, this feels like a proposal to treat
> > > > each REST change request as an identifiable "task" or "operation" where the
> > > > ID is allocated by the client and the execution status is tracked by the
> > > > server.
> > > >
> > > > Essentially, with the proposed feature implemented, it would be possible to
> > > > provide a "get status" API for each "idempotency key".
> > > >
> > > > Does that make sense?
> > > >
> > > > Do you think the proposal could be reformulated in terms of
> > > > "tasks/operations"? From my POV it might be clearer and easier to
> > > > understand.
> > > >
> > > > In that regard, as far as Polaris is concerned, I think it connected well
> > > > and could build on top of the tasks proposal [1].
> > > >
> > > > [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > >
> > > > On Thu, Sep 11, 2025 at 2:37 PM huaxin gao <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I’d like to propose adding *optional idempotent retries* to Polaris REST
> > > > > mutation endpoints via an Idempotency-Key header.
> > > > >
> > > > > *The Problem*
> > > > > When a mutation (e.g., POST /v1/tables/{name}/commit) succeeds but the
> > > > > client doesn’t receive the response (timeout, restart, etc.), a retry can
> > > > > hit 409 Conflict. Some clients then treat this as a failed commit and clean
> > > > > up files that actually belong to the committed snapshot, causing
> > > > > catalog/storage inconsistency.
> > > > >
> > > > > *The Proposed Solution*
> > > > > Accept Idempotency-Key on mutation routes. The server binds the key to a
> > > > > canonical payload hash and enforces:
> > > > >
> > > > > - *Same key + same payload* -> return the original result (no
> > > > >   re-execution).
> > > > > - *Same key + different payload* -> 422 Unprocessable Content.
> > > > > - *In-flight duplicate* -> 409 Conflict.
> > > > >
> > > > > Transient 5xx are never cached.
> > > > >
> > > > > *Discovery & Compatibility*
> > > > > When serving Iceberg REST, Polaris can advertise support and retention via
> > > > > GET /v1/config, e.g.:
> > > > >
> > > > > { "properties": { "idempotency-key-supported": "true",
> > > > > "idempotency-key-lifetime": "PT30M" } }
> > > > >
> > > > > Fully backward compatible: servers may ignore the header; clients enable
> > > > > retries only when discovery indicates support.
> > > > >
> > > > > This is the *server-side counterpart* to the Iceberg REST Catalog
> > > > > Idempotency proposal
> > > > > <https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i>
> > > > > I sent to the Iceberg community.
> > > > > Here is the Polaris proposal
> > > > > <https://docs.google.com/document/d/1L5A8Cspugsk1dW4Ij4dy5wB5FKa8KnaXT0LHBJdgq9w/edit?usp=sharing>.
> > > > > Feedback welcome!
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Huaxin
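
P.S. To make the replay rules at the top of this message concrete, here is a minimal Python sketch of the server-side bookkeeping. It is only an illustration under my own assumptions, not the Polaris implementation: `IdempotencyStore`, `Entry`, `payload_hash`, and the `run` callback are hypothetical names, the canonical form (sorted-key JSON hashed with SHA-256) is just one possible choice, every 4xx is treated as terminal, an in-flight duplicate gets 409 rather than wait-then-replay, and there is no expiry, locking, or persistence.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]  # (http_status, body)


@dataclass
class Entry:
    payload_hash: str
    response: Optional[Response] = None  # None while the first request is in flight


def payload_hash(payload: dict) -> str:
    # Assumed canonical form: JSON with sorted keys and no extra whitespace.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class IdempotencyStore:
    """Hypothetical in-memory store, for illustration only."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def handle(self, key: str, payload: dict,
               run: Callable[[dict], Response]) -> Response:
        digest = payload_hash(payload)
        entry = self._entries.get(key)
        if entry is not None:
            if entry.payload_hash != digest:
                # Same key + different payload
                return 422, {"error": "key reused with a different payload"}
            if entry.response is None:
                # First request is still running: do not start another
                return 409, {"error": "original request still in flight"}
            # Same key + same payload: replay the stored result, no re-run
            return entry.response

        self._entries[key] = Entry(digest)
        try:
            response = run(payload)
        except Exception:
            del self._entries[key]  # nothing finalized, allow a clean retry
            raise

        status = response[0]
        if 200 <= status < 300 or 400 <= status < 500:
            # Finalized result (2xx or, by assumption, any 4xx): keep for replay
            self._entries[key].response = response
        else:
            # Transient 5xx (or anything else): never cached
            del self._entries[key]
        return response
```

Calling `handle` twice with the same key and an equivalent payload runs the `run` callback once and returns the stored response the second time.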

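
On the client side, the discovery step could look roughly like the sketch below. Again, this is an assumption-laden illustration rather than any existing client's code: `parse_lifetime` and `idempotency_headers` are hypothetical helpers, the key is a random UUID purely for convenience, and only the `PT#H#M#S` subset of ISO-8601 durations is handled. The property names and the `Idempotency-Key` header come from the proposal.

```python
import re
import uuid
from datetime import timedelta
from typing import Optional

# Only the time part of an ISO-8601 duration, e.g. "PT30M" or "PT1H30M".
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_lifetime(value: str) -> Optional[timedelta]:
    """Return the advertised retention window, or None if not understood."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        return None
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def idempotency_headers(config: dict) -> dict:
    """Build the header for one logical mutation from a GET /v1/config body.

    Returns an empty dict when the server does not advertise support, so the
    request is sent exactly as it would be today.
    """
    props = config.get("properties", {})
    if props.get("idempotency-key-supported", "").lower() != "true":
        return {}
    # Assumption: a random UUID is an acceptable key format.
    return {"Idempotency-Key": str(uuid.uuid4())}
```

A client would call `idempotency_headers` once per logical operation and reuse the returned header on every retry of that operation, retrying for no longer than the advertised lifetime.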