Hi all, thanks for the thoughtful feedback (JB, Dmitri, Yufei). Sorry for the 
late reply. I missed the emails because I wasn’t subscribed to the dev list. 

The companion Iceberg REST Catalog proposal is here (discussion + spec text):
https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i

The discussion thread is here: 
https://lists.apache.org/[email protected]

The intent of the Polaris proposal is to implement those Iceberg client-visible 
semantics.

I see the conceptual link to tasks/operations. For this proposal, I’d like to 
keep the scope small and client-visible. Concretely:

   - Same key + same payload -> return the original result (no re-run)
   - Same key + different payload -> 422
   - If the first request is still running -> don’t start another (409 or
   wait-then-replay)
   - Only replay finalized results (2xx or terminal 4xx)
   - Advertise support via GET /v1/config (idempotency-key-supported,
   idempotency-key-lifetime)
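
To make the rules above concrete, here is a minimal sketch of the
server-side decision logic. It is only an illustration under assumptions:
IdempotencyStore, Entry and handle are hypothetical names that appear in
neither proposal, the store is in-memory with no expiry or locking, every
4xx is treated as terminal, and the in-flight case always returns 409
rather than waiting.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]  # (http_status, body)


@dataclass
class Entry:
    payload_hash: str
    response: Optional[Response] = None  # None while the first request runs


class IdempotencyStore:
    """Hypothetical in-memory store; a real server needs shared, expiring storage."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    @staticmethod
    def _hash(payload: dict) -> str:
        # Canonicalize (sorted keys, no extra whitespace) before hashing.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, key: str, payload: dict,
               run: Callable[[dict], Response]) -> Response:
        digest = self._hash(payload)
        entry = self._entries.get(key)
        if entry is not None:
            if entry.payload_hash != digest:
                # Same key + different payload
                return 422, {"error": "key reused with a different payload"}
            if entry.response is None:
                # First request still running: do not start another
                return 409, {"error": "request with this key is in progress"}
            # Same key + same payload: replay the stored result, no re-run
            return entry.response

        self._entries[key] = Entry(digest)
        try:
            status, body = run(payload)
        except Exception:
            del self._entries[key]  # nothing finalized, allow a clean retry
            raise
        if 200 <= status < 300 or 400 <= status < 500:
            # Assumption: all 4xx are terminal; only finalized results are kept
            self._entries[key].response = (status, body)
        else:
            del self._entries[key]  # 5xx is not finalized and is never replayed
        return status, body
```

With this shape, a retried commit carrying the same key and payload gets
the stored response back instead of running again, and a retry after a 5xx
simply executes as a fresh request.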

This aligns with the IETF Idempotency-Key draft. It’s intentionally 
lightweight, so I’d prefer to keep the scope simple for now rather than tie it 
to a tasks framework.
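
On the client side, the discovery step could look roughly like the sketch
below. The property names and the PT30M lifetime format come from the
original proposal email; the function names, the UUID key format and the
restriction to hour/minute/second durations are assumptions made for
illustration, not part of either proposal.

```python
import re
import uuid
from datetime import timedelta
from typing import Optional

# Only the PT..H..M..S subset of ISO-8601 durations is handled here.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")


def parse_lifetime(value: str) -> timedelta:
    """Parse a simple ISO-8601 duration such as PT30M."""
    match = _DURATION.match(value)
    if not match or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def idempotency_headers(config: dict, key: Optional[str] = None) -> dict:
    """Return the header to send, or {} when the server does not advertise support.

    `config` is the parsed JSON body of GET /v1/config. Pass the key from the
    first attempt when retrying; a fresh UUID is an assumed key format.
    """
    props = config.get("properties", {})
    if props.get("idempotency-key-supported", "false").lower() != "true":
        return {}
    return {"Idempotency-Key": key or str(uuid.uuid4())}
```

A client would generate the key once per logical mutation, reuse it on
every retry, and stop reusing it once the advertised lifetime has elapsed.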

Thanks,
Huaxin

On 2025/09/16 19:12:31 Yufei Gu wrote:
> Hi Dmitri,
> 
> Thanks for sharing your feedback. I see the analogy you’re making with
> tasks/operations, but I don’t think it holds in this case.
> 
> Idempotency and tasks solve very different problems:
> 
>    - Idempotency is about safely retrying a synchronous request without
>    causing side effects or inconsistencies. See more details in the IETF
>    doc[1].
>    - Tasks are about orchestrating and tracking asynchronous execution.
> 
> Merging the two conceptually risks overcomplicating the problem. The client
> doesn’t need or want to manage a long-lived task here, only to ensure that
> a retry of the same payload doesn’t break consistency.
> 
> I don’t think we should couple the idempotency proposal to tasks. Doing so
> creates unnecessary dependencies and delays for a feature that is valuable
> in its own right. Tasks may be useful for other scenarios, but idempotency
> should remain a lightweight mechanism, independent of whether the server
> executes work synchronously or asynchronously.
> 
> So while there might be some surface-level similarities, I believe treating
> idempotency as a standalone concern is the more pragmatic path forward.
> 
> [1]
> https://www.ietf.org/archive/id/draft-ietf-httpapi-idempotency-key-header-06.txt
> 
> Yufei
> 
> 
> On Tue, Sep 16, 2025 at 11:08 AM Dmitri Bourlatchkov <[email protected]>
> wrote:
> 
> > Hi Yufei,
> >
> > As I understand, the proposal is to allow re-submitted requests to succeed
> > when the previous request with the same key (ID) completed on the server
> > side while the client was in the process of re-trying.
> >
> > I agree that this is a valuable feature.
> >
> > However, my point about connecting it to tasks / operations is completely
> > at the conceptual level.
> >
> > If one considers the server as a black box, the client effectively submits
> > a request with some key and payload. Then the client submits another
> > request with the same key and payload and expects the second request to
> > return the result of the first request (success or failure). This is
> > conceptually equivalent to the server executing requests as asynchronous
> > tasks (the key being the task's ID).
> >
> > Therefore, before we consider any implementation for the "idempotency"
> > feature, I believe it would be wise to consider possible synergies with the
> > tasks proposal [1].
> >
> > I personally think these synergies are in fact quite large and powerful,
> > especially considering Polaris Servers in a distributed environment.
> >
> > To put it another way, I believe implementing the "idempotency" proposal on
> > top of the "tasks" proposal is going to be quite easy (if not trivial).
> >
> > [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> >
> > Cheers,
> > Dmitri.
> >
> > On Sat, Sep 13, 2025 at 11:44 AM Yufei Gu <[email protected]> wrote:
> >
> > > Hi Dmitri,
> > >
> > > The idempotency key is mainly to reduce the chance of inconsistencies in
> > > table commits. Here is a typical scenario: a commit succeeds but the
> > > client misses the response (due to timeout, restart, etc.); the client
> > > retries and hits 409 Conflict, then mistakenly treats this as a failed
> > > commit and cleans up related files, leading to inconsistency. Without an
> > > idempotency key, the best a client could do is to treat most non-200
> > > responses as unknown status and reload the table to verify, which is
> > > also inefficient.
> > >
> > > It could enable more use cases, as you mentioned, getting status of an
> > > async task, but I would not complicate it at this point. That probably
> > > deserves a separate proposal, since task status goes beyond simple HTTP
> > > codes and would require clients to interpret them correctly. For now,
> > > http_status is sufficient, as it’s already well-supported by all existing
> > > clients.
> > >
> > > Yufei
> > >
> > >
> > > On Fri, Sep 12, 2025 at 7:58 PM Dmitri Bourlatchkov <[email protected]>
> > > wrote:
> > >
> > > > Hi Huaxin.
> > > >
> > > > Thanks again for starting this proposal.
> > > >
> > > > After a quick initial read of the docs, this feels like a proposal to
> > > > treat each REST change request as an identifiable "task" or
> > > > "operation" where the ID is allocated by the client and the execution
> > > > status is tracked by the server.
> > > >
> > > > Essentially, with the proposed feature implemented, it would be
> > > > possible to provide a "get status" API for each "idempotency key".
> > > >
> > > > Does that make sense?
> > > >
> > > > Do you think the proposal could be reformulated in terms of
> > > > "tasks/operations"? From my POV it might be clearer and easier to
> > > > understand.
> > > >
> > > > In that regard, as far as Polaris is concerned, I think it connects
> > > > well with, and could build on top of, the tasks proposal [1].
> > > >
> > > > [1] https://lists.apache.org/thread/gg0kn89vmblmjgllxn7jkn8ky2k28f5l
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > >
> > > > On Thu, Sep 11, 2025 at 2:37 PM huaxin gao <[email protected]>
> > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I’d like to propose adding *optional idempotent retries* to Polaris
> > > > > REST mutation endpoints via an Idempotency-Key header.
> > > > >
> > > > > *The Problem*
> > > > > When a mutation (e.g., POST /v1/tables/{name}/commit) succeeds but
> > > > > the client doesn’t receive the response (timeout, restart, etc.), a
> > > > > retry can hit 409 Conflict. Some clients then treat this as a failed
> > > > > commit and clean up files that actually belong to the committed
> > > > > snapshot, causing catalog/storage inconsistency.
> > > > >
> > > > > *The Proposed Solution*
> > > > > Accept Idempotency-Key on mutation routes. The server binds the key
> > > > > to a canonical payload hash and enforces:
> > > > >
> > > > >    - *Same key + same payload* -> return the original result (no
> > > > >    re-execution).
> > > > >    - *Same key + different payload* -> 422 Unprocessable Content.
> > > > >    - *In-flight duplicate* -> 409 Conflict.
> > > > >    Transient 5xx are never cached.
> > > > >
> > > > > *Discovery & Compatibility*
> > > > > When serving Iceberg REST, Polaris can advertise support and
> > > > > retention via GET /v1/config, e.g.:
> > > > >
> > > > > { "properties": { "idempotency-key-supported": "true",
> > > > >                   "idempotency-key-lifetime": "PT30M" } }
> > > > >
> > > > > Fully backward compatible: servers may ignore the header; clients
> > > > > enable retries only when discovery indicates support.
> > > > >
> > > > > This is the *server-side counterpart* to the Iceberg REST Catalog
> > > > > Idempotency proposal
> > > > > <https://docs.google.com/document/d/1WyiIk08JRe8AjWh63txIP4i2xcIUHYQWFrF_1CCS3uw/edit?tab=t.0#heading=h.jfecktgonj1i>
> > > > > I sent to the Iceberg community.
> > > > > Here is the Polaris proposal
> > > > > <https://docs.google.com/document/d/1L5A8Cspugsk1dW4Ij4dy5wB5FKa8KnaXT0LHBJdgq9w/edit?usp=sharing>.
> > > > > Feedback welcome!
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Huaxin
> > > > >
> > > >
> > >
> >
> 
