Hi Dmitri,

Thanks for sharing your feedback. I see the analogy you’re making with
tasks/operations, but I don’t think it holds in this case.

Idempotency and tasks solve very different problems:

   - Idempotency is about safely retrying a synchronous request without
   causing duplicate side effects or inconsistencies. See the IETF draft [1]
   for more details.
   - Tasks are about orchestrating and tracking asynchronous execution.

Merging the two conceptually risks overcomplicating the problem. The client
doesn’t need or want to manage a long-lived task here, only to ensure that
a retry of the same payload doesn’t break consistency.
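
As a rough illustration of what that means on the client side, here is a
minimal Python sketch. The `send` callable and the `commit_with_retries`
helper are hypothetical names used only for this example, not part of any
existing client; the only detail taken from the proposal is the
Idempotency-Key header. The point is that the client generates one key and
reuses it on every retry of the same payload, with no task to track.

```python
import uuid
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]


def commit_with_retries(
    send: Callable[[Dict[str, str], dict], Response],
    payload: dict,
    max_attempts: int = 3,
) -> Response:
    """Retry one logical commit, reusing a single idempotency key.

    `send` is a hypothetical transport function: it takes headers and a
    payload, returns (status, body), and raises TimeoutError when the
    response is lost.
    """
    # One key per logical request, generated once and reused on every retry.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            status, body = send(headers, payload)
        except TimeoutError as exc:
            # Outcome unknown: retry with the same key and the same payload.
            last_error = exc
            continue
        if status >= 500:
            # Assumption: every 5xx is treated as transient and retried.
            continue
        return status, body
    raise RuntimeError("commit outcome unknown after retries") from last_error
```

If the first attempt was actually committed but its response was lost, the
retry carries the same key, so a server that supports the header can return
the original result instead of a 409.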

I don’t think we should couple the idempotency proposal to tasks. Doing so
creates unnecessary dependencies and delays for a feature that is valuable
in its own right. Tasks may be useful for other scenarios, but idempotency
should remain a lightweight mechanism, independent of whether the server
executes work synchronously or asynchronously.
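
To show how small the server-side logic can stay, below is a minimal sketch
of the key-handling rules from the original proposal quoted further down
(same key + same payload, same key + different payload, in-flight duplicate,
5xx not cached). Everything here is an assumption made for illustration: the
`IdempotencyStore` class, the in-memory dict, the JSON-based canonical hash,
and treating every 5xx as transient. It is single-threaded and ignores key
expiry, so it is not how Polaris would implement it.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]


@dataclass
class Entry:
    payload_hash: str
    response: Optional[Response] = None  # None while the request is in flight


class IdempotencyStore:
    """Hypothetical in-memory store (no locking, no expiry)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    @staticmethod
    def _hash(payload: dict) -> str:
        # Assumed canonical form: JSON with sorted keys and no extra spaces.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(
        self, key: str, payload: dict, execute: Callable[[dict], Response]
    ) -> Response:
        digest = self._hash(payload)
        entry = self._entries.get(key)
        if entry is not None:
            if entry.payload_hash != digest:
                return 422, {"error": "key reused with a different payload"}
            if entry.response is None:
                return 409, {"error": "request with this key is in flight"}
            return entry.response  # replay the original result, no re-execution
        self._entries[key] = Entry(digest)
        try:
            status, body = execute(payload)
        except Exception:
            del self._entries[key]  # nothing to replay if execution blew up
            raise
        if status >= 500:
            del self._entries[key]  # assumed transient: never cached
        else:
            self._entries[key].response = (status, body)
        return status, body
```

Whether `execute` runs synchronously or hands off to something asynchronous
is invisible to the caller; the store only needs the key, the payload hash,
and the final HTTP result.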

So while there might be some surface-level similarities, I believe treating
idempotency as a standalone concern is the more pragmatic path forward.

[1] https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-06.txt

Yufei


On Tue, Sep 16, 2025 at 11:08 AM Dmitri Bourlatchkov <[email protected]>
wrote:

> Hi Yufei,
>
> As I understand, the proposal is to allow re-submitted requests to succeed
> when the previous request with the same key (ID) completed on the server
> side while the client was in the process of re-trying.
>
> I agree that this is a valuable feature.
>
> However, my point about connecting it to tasks / operations is completely
> at the conceptual level.
>
> If one considers the server as a black box, the client effectively submits
> a request with some key and payload. Then the client submits another
> request with the same key and payload and expects the second request to
> return the result of the first request (success or failure). This is
> conceptually equivalent to the server executing requests as asynchronous
> tasks (the key being the task's ID).
>
> Therefore, before we consider any implementation for the "idempotency"
> feature, I believe it would be wise to consider possible synergies with the
> tasks proposal [1].
>
> I personally think these synergies are in fact quite large and powerful,
> especially considering Polaris Servers in a distributed environment.
>
> To put it another way, I believe implementing the "idempotency" proposal on
> top of the "tasks" proposal is going to be quite easy (if not trivial).
>
> [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
>
> Cheers,
> Dmitri.
>
> On Sat, Sep 13, 2025 at 11:44 AM Yufei Gu <[email protected]> wrote:
>
> > Hi Dmitri,
> >
> > The idempotency key is mainly meant to reduce the chance of
> > inconsistencies in table commits. Here is a typical scenario: a commit
> > succeeds but the client misses the response (due to a timeout, restart,
> > etc.); the client retries and hits 409 Conflict, then mistakenly treats
> > this as a failed commit and cleans up the related files, leading to
> > inconsistency. Without an idempotency key, the best a client can do is
> > treat most non-200 responses as unknown status and reload the table to
> > verify, which is also inefficient.
> >
> > It could enable more use cases, as you mentioned, getting status of an
> > async task, but I would not complicate it at this point. That probably
> > deserves a separate proposal, since task status goes beyond simple HTTP
> > codes and would require clients to interpret them correctly. For now,
> > http_status is sufficient, as it’s already well-supported by all existing
> > clients.
> >
> > Yufei
> >
> >
> > On Fri, Sep 12, 2025 at 7:58 PM Dmitri Bourlatchkov <[email protected]>
> > wrote:
> >
> > > Hi Huaxin.
> > >
> > > Thanks again for starting this proposal.
> > >
> > > After a quick initial read of the docs, this feels like a proposal to
> > > treat each REST change request as an identifiable "task" or "operation"
> > > where the ID is allocated by the client and the execution status is
> > > tracked by the server.
> > >
> > > Essentially, with the proposed feature implemented, it would be
> > > possible to provide a "get status" API for each "idempotency key".
> > >
> > > Does that make sense?
> > >
> > > Do you think the proposal could be reformulated in terms of
> > > "tasks/operations"? From my POV it might be clearer and easier to
> > > understand.
> > >
> > > In that regard, as far as Polaris is concerned, I think it connects
> > > well with, and could build on top of, the tasks proposal [1].
> > >
> > > [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > >
> > > On Thu, Sep 11, 2025 at 2:37 PM huaxin gao <[email protected]>
> > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I’d like to propose adding *optional idempotent retries* to Polaris
> > > > REST mutation endpoints via an Idempotency-Key header.
> > > >
> > > > *The Problem*
> > > > When a mutation (e.g., POST /v1/tables/{name}/commit) succeeds but
> > > > the client doesn’t receive the response (timeout, restart, etc.), a
> > > > retry can hit 409 Conflict. Some clients then treat this as a failed
> > > > commit and clean up files that actually belong to the committed
> > > > snapshot, causing catalog/storage inconsistency.
> > > >
> > > > *The Proposed Solution*
> > > > Accept Idempotency-Key on mutation routes. The server binds the key
> > > > to a canonical payload hash and enforces:
> > > >
> > > >    - *Same key + same payload* -> return the original result (no
> > > >    re-execution).
> > > >    - *Same key + different payload* -> 422 Unprocessable Content.
> > > >    - *In-flight duplicate* -> 409 Conflict.
> > > >
> > > > Transient 5xx are never cached.
> > > >
> > > > *Discovery & Compatibility*
> > > > When serving Iceberg REST, Polaris can advertise support and
> > > > retention via GET /v1/config, e.g.:
> > > >
> > > > { "properties": { "idempotency-key-supported": "true",
> > > >                   "idempotency-key-lifetime": "PT30M" } }
> > > >
> > > > Fully backward compatible: servers may ignore the header; clients
> > > > enable retries only when discovery indicates support.
> > > >
> > > > This is the *server-side counterpart* to the Iceberg REST Catalog
> > > > Idempotency proposal
> > > > <https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i>
> > > > I sent to the Iceberg community.
> > > > Here is the Polaris proposal
> > > > <https://docs.google.com/document/d/1L5A8Cspugsk1dW4Ij4dy5wB5FKa8KnaXT0LHBJdgq9w/edit?usp=sharing>.
> > > > Feedback welcome!
> > > >
> > > > Thanks,
> > > >
> > > > Huaxin
> > > >
