Hi Dmitri,

Thanks for sharing your feedback. I see the analogy you’re making with tasks/operations, but I don’t think it holds in this case.
Idempotency and tasks solve very different problems:

- Idempotency is about safely retrying a synchronous request without causing side effects or inconsistencies. See more details in the IETF doc [1].
- Tasks are about orchestrating and tracking asynchronous execution.

Merging the two conceptually risks overcomplicating the problem. The client doesn’t need or want to manage a long-lived task here, only to ensure that a retry of the same payload doesn’t break consistency.

I don’t think we should couple the idempotency proposal to tasks. Doing so creates unnecessary dependencies and delays for a feature that is valuable in its own right. Tasks may be useful for other scenarios, but idempotency should remain a lightweight mechanism, independent of whether the server executes work synchronously or asynchronously.

So while there might be some surface-level similarities, I believe treating idempotency as a standalone concern is the more pragmatic path forward.

[1] https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-06.txt

Yufei

On Tue, Sep 16, 2025 at 11:08 AM Dmitri Bourlatchkov <[email protected]> wrote:

> Hi Yufei,
>
> As I understand, the proposal is to allow re-submitted requests to succeed
> when the previous request with the same key (ID) completed on the server
> side while the client was in the process of re-trying.
>
> I agree that this is a valuable feature.
>
> However, my point about connecting it to tasks / operations is completely
> at the conceptual level.
>
> If one considers the server as a black box, the client effectively submits
> a request with some key and payload. Then the client submits another
> request with the same key and payload and expects the second request to
> return the result of the first request (success or failure). This is
> conceptually equivalent to the server executing requests as asynchronous
> tasks (the key being the task's ID).
>
> Therefore, before we consider any implementation for the "idempotency"
> feature, I believe it would be wise to consider possible synergies with the
> tasks proposal [1].
>
> I personally think these synergies are in fact quite large and powerful,
> especially considering Polaris Servers in a distributed environment.
>
> To put it another way, I believe implementing the "idempotency" proposal on
> top of the "tasks" proposal is going to be quite easy (if not trivial).
>
> [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
>
> Cheers,
> Dmitri.
>
> On Sat, Sep 13, 2025 at 11:44 AM Yufei Gu <[email protected]> wrote:
>
> > Hi Dmitri,
> >
> > The idempotency key is mainly meant to reduce the chance of inconsistencies
> > in table commits. Here is a typical scenario: a commit succeeds but the
> > client misses the response (due to timeout, restart, etc.), the client
> > retries and hits 409 Conflict, then mistakenly treats this as a failed
> > commit and cleans up related files, leading to inconsistency. Without an
> > idempotency key, the best a client could do is to treat most non-200
> > responses as unknown status and reload the table to verify, which is also
> > inefficient.
> >
> > It could enable more use cases, as you mentioned, such as getting the status
> > of an async task, but I would not complicate it at this point. That probably
> > deserves a separate proposal, since task status goes beyond simple HTTP
> > codes and would require clients to interpret them correctly. For now,
> > http_status is sufficient, as it’s already well-supported by all existing
> > clients.
> >
> > Yufei
> >
> >
> > On Fri, Sep 12, 2025 at 7:58 PM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Huaxin.
> > >
> > > Thanks again for starting this proposal.
> > >
> > > After a quick initial read of the docs, this feels like a proposal to treat
> > > each REST change request as an identifiable "task" or "operation" where the
> > > ID is allocated by the client and the execution status is tracked by the
> > > server.
> > >
> > > Essentially, with the proposed feature implemented, it would be possible to
> > > provide a "get status" API for each "idempotency key".
> > >
> > > Does that make sense?
> > >
> > > Do you think the proposal could be reformulated in terms of
> > > "tasks/operations"? From my POV it might be clearer and easier to
> > > understand.
> > >
> > > In that regard, as far as Polaris is concerned, I think it connects well
> > > with and could build on top of the tasks proposal [1].
> > >
> > > [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > >
> > > On Thu, Sep 11, 2025 at 2:37 PM huaxin gao <[email protected]> wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’d like to propose adding *optional idempotent retries* to Polaris REST
> > > > mutation endpoints via an Idempotency-Key header.
> > > >
> > > > *The Problem*
> > > > When a mutation (e.g., POST /v1/tables/{name}/commit) succeeds but the
> > > > client doesn’t receive the response (timeout, restart, etc.), a retry can
> > > > hit 409 Conflict. Some clients then treat this as a failed commit and clean
> > > > up files that actually belong to the committed snapshot, causing
> > > > catalog/storage inconsistency.
> > > >
> > > > *The Proposed Solution*
> > > > Accept Idempotency-Key on mutation routes. The server binds the key to a
> > > > canonical payload hash and enforces:
> > > >
> > > > - *Same key + same payload* -> return the original result (no
> > > >   re-execution).
> > > > - *Same key + different payload* -> 422 Unprocessable Content.
> > > > - *In-flight duplicate* -> 409 Conflict.
> > > > Transient 5xx are never cached.
> > > >
> > > > *Discovery & Compatibility*
> > > > When serving Iceberg REST, Polaris can advertise support and retention via
> > > > GET /v1/config, e.g.:
> > > >
> > > > { "properties": { "idempotency-key-supported": "true",
> > > > "idempotency-key-lifetime": "PT30M" } }
> > > >
> > > > Fully backward compatible: servers may ignore the header; clients enable
> > > > retries only when discovery indicates support.
> > > >
> > > > This is the *server-side counterpart* to the Iceberg REST Catalog
> > > > Idempotency proposal
> > > > <https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i>
> > > > I sent to the Iceberg community.
> > > > Here is the Polaris proposal
> > > > <https://docs.google.com/document/d/1L5A8Cspugsk1dW4Ij4dy5wB5FKa8KnaXT0LHBJdgq9w/edit?usp=sharing>.
> > > > Feedback welcome!
> > > >
> > > > Thanks,
> > > >
> > > > Huaxin
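
For readers following the thread, the key-handling rules in Huaxin's proposal (same key + same payload, same key + different payload, in-flight duplicate, transient 5xx) can be sketched roughly as below. This is only an illustration under assumptions, not Polaris code: the names IdempotencyStore, Entry, payload_hash, and handle are hypothetical, the "canonical payload hash" is assumed to be SHA-256 over sorted-key JSON, and the store is a plain in-memory dict with no locking and no expiry after the advertised lifetime.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]  # (HTTP status, response body)


def payload_hash(payload: dict) -> str:
    """Hash a canonical JSON form so that key order does not matter (assumed scheme)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass
class Entry:
    payload_hash: str
    response: Optional[Response] = None  # None means the request is still in flight


class IdempotencyStore:
    """Hypothetical in-memory store; a real server would need locking and expiry."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def handle(self, key: str, payload: dict,
               execute: Callable[[dict], Response]) -> Response:
        digest = payload_hash(payload)
        entry = self._entries.get(key)
        if entry is not None:
            if entry.payload_hash != digest:
                # Same key + different payload
                return 422, {"error": "idempotency key reused with a different payload"}
            if entry.response is None:
                # In-flight duplicate
                return 409, {"error": "a request with this key is still in flight"}
            # Same key + same payload: replay the original result, no re-execution
            return entry.response

        entry = Entry(digest)
        self._entries[key] = entry  # bind the key before executing
        try:
            status, body = execute(payload)
        except Exception:
            del self._entries[key]  # nothing to replay if execution blew up
            raise
        if status >= 500:
            del self._entries[key]  # transient 5xx results are never cached
        else:
            entry.response = (status, body)
        return status, body
```

With a store like this, a client that lost the first response and retries with the same key and payload would get the original result back instead of a 409, which is the scenario the proposal is trying to address.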
