noob-se7en opened a new pull request, #18559:
URL: https://github.com/apache/pinot/pull/18559
## Summary
Fixes a race condition in `IdealStateGroupCommit` where a leader's own
queued entry could be silently applied by a subsequent batch after its caller
had already concluded the commit failed and run cleanup. In the pauseless
segment commit path, this produces "in IdealState, no ZK metadata" orphans that
block `RealtimeSegmentValidationManager` from making any further progress.
## Problem
When a leader thread's batch fails (typically because a co-batched entry
throws `PermanentUpdaterException` after exceeding the max segment completion
time deadline), the leader's `commit()` exits with the exception. But if the
leader's own entry was never iterated — because iteration stopped at the
throwing entry earlier in FIFO order — the entry remains in `_pending`. A
subsequent leader picks it up and applies its updater.
Meanwhile, the original leader's caller (e.g.,
`PinotLLCRealtimeSegmentManager.commitSegmentMetadataInternal`) catches the
thrown exception and runs cleanup — calling `removeSegmentZKMetadataBestEffort`
on the newly-created segment from Step 2. By the time the cleanup completes,
the subsequent leader has applied the queued update and written the segment
into IdealState. End state: segment is in IdealState as CONSUMING, but its ZK
metadata is gone.
## Fix
Add a `volatile boolean _cancelled` flag to `Entry`. In `commit()`'s catch
path, set the flag on the leader's own entry. The next leader's iteration check
skips and removes any cancelled entry.
```java
// In the iteration loop:
if (ent._cancelled) {
it.remove();
continue;
}
// In the catch block:
entry._cancelled = true;
processed.add(entry);
```
Why a flag instead of `ConcurrentLinkedQueue.remove(entry)`: the flag is
O(1); `remove(Object)` is O(n) because `ConcurrentLinkedQueue` is a linked list.
The fix preserves the pre-existing all-or-nothing batch semantics (any
thrown exception aborts the CAS write, all processed entries get the exception,
the caller still throws). The only additional behavior: the leader's own entry
can no longer outlive a failed batch and be silently applied later.
## Test plan
- [x] `IdealStateGroupCommitTest` — 8/8 pass
- `testFreshUpdaterAppliedAfterCallerThrows` (regression): caller throws,
drainer skips the cancelled entry, no orphan
- `testMultipleStuckAndFreshConsistency`: every owner's observed outcome
matches IdealState ground truth
- `testNoOrphanWhenCoBatchedEntryThrowsDuringStepThree`: integration test
walking Step 2 + Step 3 + cleanup chain; asserts no orphan can form
- `testGroupCommit` (5× invocation): unchanged, all pass
- [x] `HelixHelperTest` — 2/2 pass (incl. single-caller
`PermanentUpdaterException` contract test)
- [x] `PinotLLCRealtimeSegmentManagerTest*` — 39/39 pass
- [x] `spotless:apply`, `checkstyle:check`, `license:check` on
`pinot-common` and `pinot-controller`
🤖 Generated with [Claude Code](https://claude.com/claude-code)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]