Sorry, Robert, it might be my fault for causing this new thread. My
intention was to have a more focused discussion about java code changes
here.

Cheers,
Dmitri.

On Mon, Dec 8, 2025 at 5:57 AM Robert Stupp <[email protected]> wrote:

> Hi,
>
> ah, just saw this email thread, but I replied to the other discussion
> thread earlier.
> Can we consider discussing this topic in just one thread?
>
> On Sun, Dec 7, 2025 at 2:19 AM huaxin gao <[email protected]> wrote:
> >
> > Hi Dmitri,
> >
> > Thanks a lot for the review and for calling this out!  You’re right that
> PR
> > 3205 only shows the persistence SPI, so the cross‑node behavior isn’t
> > obvious from the code alone.I’ve updated the design doc with a new
> section
> > “Multi‑node Coordination, Liveness, and Cleanup” that spells this out,
> but
> > let me summarize the answers to your questions here:
> >
> > Retry hits a different node while the original is still running
> >
> >    - Each request with an Idempotency-Key goes through an HTTP
> idempotency
> >    filter (IdempotencyFilter) that calls
> >
> > reserve(realmId, key, operationType, normalizedResourceId, expiresAt,
> > executorId, now).
> >
> >
> >    - The Postgres implementation uses INSERT … ON CONFLICT DO NOTHING
> > on (realm_id,
> >    idempotency_key), so only one node wins globally (OWNED); all others
> see
> >    DUPLICATE + the existing record.
> >    - The winning node is the owner and periodically calls
> >    updateHeartbeat(...) while the record is IN_PROGRESS, which keeps
> >    heartbeat_at fresh.
> >    - When a duplicate lands on another node, it looks at heartbeat_at
> and a
> >    short lease (lease_ttl, e.g. ~25s): if now − heartbeat_at ≤ lease_ttl,
> >    the original owner is treated as active. The duplicate does not
> re‑execute
> >    the mutation; it waits (with a small bounded poll) for the record to
> become
> >    FINALIZED, then replays the stored result.
> >
> > Original node crashes without finalizing the record
> >
> >    - In that case we still have an IN_PROGRESS row with an old
> heartbeat_at
> >    and the original executor_id.
> >    - After lease_ttl has passed, subsequent requests with the same key
> see
> >    IN_PROGRESS and a stale heartbeat (now − heartbeat_at > lease_ttl), so
> >    the filter treats the owner as dead and invokes a follow‑up Reconciler
> >    SPI.
> >    - The Reconciler does “finalize‑gap” reconciliation: it inspects the
> >    catalog state for the (operationType, resourceId, realmId) to decide
> >    whether the original mutation actually landed.
> >       - If the state already reflects the intended change, it can mark
> the
> >       record FINALIZED and allow duplicates to replay an equivalent
> success
> >       response without re‑executing the commit.
> >       - If the state does not reflect the change, it can re‑execute once
> on
> >       the new node (“takeover”).
> >    - The current PR just introduces the fields needed for this
> (executor_id,
> >    heartbeat_at, http_status, finalized_at, response_*); the filter +
> >    Reconciler wiring comes in a follow‑up change.
> >
> > Cleanup of old idempotency records / collaboration across nodes
> >
> >    - On first acceptance we set expires_at = firstAcceptance +
> >    idempotency-key-lifetime + grace. This is a long‑lived retention
> >    horizon, separate from the short liveness lease:
> >       - lease_ttl (seconds) → “is the current owner still alive?”
> >       - expires_at (minutes/hours+) → “how long must we remember this key
> >       for duplicates/replay and when can we delete it?”
> >    - The SPI exposes purgeExpired(before) which is implemented as DELETE
> >    FROM idempotency_records WHERE expires_at < ?. Any node (or a
> >    maintenance job) can call this periodically. The delete is idempotent
> and
> >    keyed only by time, so multiple nodes can run cleanup concurrently
> without
> >    coordination; it’s intentionally a collaborative, best‑effort process.
> >
> > Thanks,
> > Huaxin
> >
> >
> > On Fri, Dec 5, 2025 at 4:56 PM Dmitri Bourlatchkov <[email protected]>
> wrote:
> >
> > > Hi Huaxin,
> > >
> > > I looked at the PR briefly. It appears to add a persistence SPI for
> storing
> > > information related to the processing of Idempotency Keys.
> > >
> > > I do not have any particular comment about that interface itself ATM,
> but I
> > > suppose it will be used (eventually) in conjunction with coordinating
> > > request processing in multiple Polaris server nodes. This aspect is not
> > > apparent in PR 3205 and IIRC it is not discussed in depth in the
> proposal
> > > doc (please correct me if I missed that). The updateHeartbeat() method
> is
> > > declared, but how it is supposed to be used is not clear to me.
> > >
> > > So, in order to better understand how the proposed persistence SPI can
> work
> > > end-to-end, could you expand on your proposal for coordinating retry
> > > processing across nodes and how that will relate to persistence?
> > >
> > > Specifically, I mean the case when a retry is submitted by a client to
> one
> > > node while another node is still processing the original request.
> Another
> > > case is when the original request node crashes without updating the
> > > persisted data.
> > >
> > > Additionally, what is your vision regarding the cleanup of old
> Idempotency
> > > records? Is this a collaborative process where all server nodes
> participate
> > > equally?
> > >
> > > Thanks,
> > > Dmitri.
> > >
> > > On Thu, Dec 4, 2025 at 4:53 PM huaxin gao <[email protected]>
> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Following the earlier idempotency proposal, I’ve opened a PR
> > > > <https://github.com/apache/polaris/pull/3205> that adds the first
> > > > implementation of idempotency retries to Polaris core and would like
> > > > feedback on the Java design.
> > > >
> > > > The PR introduces:
> > > >
> > > >    -
> > > >
> > > >    org.apache.polaris.core.persistence.IdempotencyStore – an SPI to
> > > >    reserve/load/finalize/purge idempotency records (with
> ReserveResult /
> > > >    ReserveResultType).
> > > >
> > > >
> > > >    -
> > > >
> > > >    org.apache.polaris.idempotency.IdempotencyRecord – the value type
> for
> > > >    stored idempotency metadata.
> > > >
> > > >
> > > >    -
> > > >
> > > >    PostgresIdempotencyStore + PostgresIdempotencyStoreIT – a
> > > >    relational‑jdbc implementation backed by
> > > >    POLARIS_SCHEMA.idempotency_records, tested via
> > > Testcontainers/Postgres.
> > > >
> > > > Please take a look at the PR and reply here or comment on the change
> with
> > > > any concerns or suggestions.
> > > >
> > > > Thanks,
> > > >
> > > > Huaxin
> > > >
> > >
>

Reply via email to