Shiyang-Zhao opened a new pull request, #18621:
URL: https://github.com/apache/druid/pull/18621
This PR fixes nondeterministic behavior in the following flaky tests:
- `org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorTest.testIsTaskCurrent`
- `org.apache.druid.indexing.kinesis.supervisor.KinesisSupervisorTest.testIsTaskCurrent`
### Description
These tests occasionally failed due to nondeterministic behavior in how the
supervisor evaluated whether a task was “current.”
The issue originated from unstable iteration order inside the `DataSchema`
serialization process, which affected equality checks between supervisor and
task configurations.
```
KafkaSupervisorTest.testIsTaskCurrent()
↓
KafkaSupervisor.isTaskCurrent()
↓
SeekableStreamSupervisor.isTaskCurrent()
↓
SeekableStreamSupervisor.generateSequenceName()
↓
ObjectMapper.writeValueAsString(DataSchema)
↓
DataSchema → DimensionsSpec.dimensionExclusions used a HashSet (non-deterministic iteration)
```
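The last two steps of this chain can be sketched with the JDK alone. This is a rough illustration under stated assumptions, not Druid's implementation: `sequenceName`, the local `isTaskCurrent` helper, the `index_kafka` prefix, the choice of SHA-256, and the sample JSON strings are all hypothetical. It only shows that a name derived from a digest of the serialized schema changes when the serialized element order changes.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class SequenceNameSketch {
  // Hypothetical stand-in for a sequence name: a prefix plus the first
  // 8 bytes (16 hex characters) of a digest of the serialized schema.
  static String sequenceName(String prefix, String serializedSchema) {
    try {
      MessageDigest digest = MessageDigest.getInstance("SHA-256");
      byte[] hash = digest.digest(serializedSchema.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder();
      for (int i = 0; i < 8; i++) {
        hex.append(String.format("%02x", hash[i]));
      }
      return prefix + "_" + hex;
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
    }
  }

  // Hypothetical check: a task counts as current when the name recorded at
  // task creation matches the name recomputed from the supervisor's schema.
  static boolean isTaskCurrent(String taskSequenceName, String prefix, String serializedSchema) {
    return taskSequenceName.equals(sequenceName(prefix, serializedSchema));
  }

  public static void main(String[] args) {
    // Same logical content, different element order in the JSON array.
    String atTaskCreation = "{\"dimensionExclusions\":[\"a\",\"b\"]}";
    String reordered = "{\"dimensionExclusions\":[\"b\",\"a\"]}";

    String taskName = sequenceName("index_kafka", atTaskCreation);
    System.out.println(isTaskCurrent(taskName, "index_kafka", atTaskCreation)); // true
    System.out.println(isTaskCurrent(taskName, "index_kafka", reordered));      // false
  }
}
```

Because the comparison is made on a digest of text rather than on the objects themselves, any instability in how the text is produced shows up as a mismatch.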
Here is a sample of an error from running
**org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorTest.testIsTaskCurrent**:
```
[ERROR] org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorTest.testIsTaskCurrent[numThreads = 1] -- Time elapsed: 0.79 s <<< FAILURE!
java.lang.AssertionError
    at org.junit.Assert.fail(Assert.java:87)
    at org.junit.Assert.assertTrue(Assert.java:42)
    at org.junit.Assert.assertTrue(Assert.java:53)
    at org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorTest.testIsTaskCurrent(KafkaSupervisorTest.java:4740)
[ERROR] org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorTest.testIsTaskCurrent[numThreads = 8] -- Time elapsed: 0.04 s <<< FAILURE!
java.lang.AssertionError
    at org.junit.Assert.fail(Assert.java:87)
    at org.junit.Assert.assertTrue(Assert.java:42)
    at org.junit.Assert.assertTrue(Assert.java:53)
    at org.apache.druid.indexing.kafka.supervisor.KafkaSupervisorTest.testIsTaskCurrent(KafkaSupervisorTest.java:4741)
```
**Problem:**
`SeekableStreamSupervisor.isTaskCurrent()` verifies task consistency by
recomputing a sequence name hash using `generateSequenceName()`, which
serializes the `DataSchema` object to JSON. Within `DataSchema`, the
`DimensionsSpec` class stored its `dimensionExclusions` field as a
`HashSet`, which does not guarantee iteration order. Because JSON
serialization writes the set's elements in iteration order, equal
`DimensionsSpec` instances could produce different serialized strings across
runs. The hashes computed for otherwise equivalent `DataSchema` objects then
differed, and `isTaskCurrent()` incorrectly treated valid tasks as outdated.
The tests failed nondeterministically because the order of the serialized
`dimensionExclusions` elements, and therefore the resulting hash, varied
between executions.
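To make the failure mode concrete, here is a minimal sketch that uses only the JDK, not Druid or Jackson. `serialize` and `hashSetOf` are hypothetical helpers; `serialize` stands in for JSON array serialization by joining elements in iteration order. The strings `"Aa"` and `"BB"` are chosen because they share a hash code, so on OpenJDK's `HashSet` they fall into the same bucket, where iteration order follows insertion order. That behavior is an implementation detail, not something the `HashSet` contract promises.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

public class ExclusionOrderDemo {
  // Stand-in for JSON array serialization: joins elements in iteration order.
  static String serialize(Collection<String> values) {
    return "[" + String.join(",", values) + "]";
  }

  // Builds a HashSet by adding the values in the order given.
  static Set<String> hashSetOf(String... values) {
    Set<String> set = new HashSet<>();
    for (String value : values) {
      set.add(value);
    }
    return set;
  }

  public static void main(String[] args) {
    // Same elements, inserted in opposite orders.
    Set<String> supervisorSide = hashSetOf("Aa", "BB");
    Set<String> taskSide = hashSetOf("BB", "Aa");

    // Set equality ignores order, so this prints true.
    System.out.println(supervisorSide.equals(taskSide));

    // The serialized forms are not required to match, because HashSet
    // makes no promise about iteration order.
    System.out.println(serialize(supervisorSide));
    System.out.println(serialize(taskSide));
  }
}
```

On OpenJDK the last two lines typically list the elements in opposite orders. Whatever a given runtime prints, the point is the same: two sets that are `equals()` can still serialize to different strings, so a hash taken over the string is not a reliable identity for the object.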
**Proposed Changes:**
- Replaced `HashSet` with `LinkedHashSet` in the `DimensionsSpec` constructor.
  `LinkedHashSet` iterates in insertion order, so the serialized output is
  deterministic for a given input.
- As a result, `DataSchema` serialization and the equality checks derived from
  it are stable and repeatable.
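A minimal sketch of the pattern, assuming a hypothetical `StableExclusions` holder rather than Druid's actual `DimensionsSpec` code: copying the input into a `LinkedHashSet` removes duplicates while keeping first-insertion order, so the same input list always serializes to the same string.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class StableExclusions {
  private final Set<String> exclusions;

  // Hypothetical constructor: defensively copies into a LinkedHashSet so that
  // iteration order is the order in which elements were first inserted.
  public StableExclusions(List<String> exclusions) {
    if (exclusions == null) {
      this.exclusions = Collections.<String>emptySet();
    } else {
      this.exclusions = Collections.unmodifiableSet(new LinkedHashSet<>(exclusions));
    }
  }

  // Stand-in for JSON array serialization: joins elements in iteration order.
  public String serialize() {
    return "[" + String.join(",", exclusions) + "]";
  }

  public static void main(String[] args) {
    List<String> input = Arrays.asList("BB", "Aa", "BB");
    StableExclusions first = new StableExclusions(input);
    StableExclusions second = new StableExclusions(input);

    System.out.println(first.serialize());                            // [BB,Aa]
    System.out.println(first.serialize().equals(second.serialize())); // true
  }
}
```

Note that insertion order is stable only relative to the input: two callers that supply the same exclusions in different orders would still serialize differently. If order-independence were needed, sorting (for example with a `TreeSet`) would be an alternative, though that is not what this PR does.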
Here is a sample of an error from running
**org.apache.druid.indexing.kinesis.supervisor.KinesisSupervisorTest.testIsTaskCurrent**:
```
[ERROR] org.apache.druid.indexing.kinesis.supervisor.KinesisSupervisorTest.testIsTaskCurrent[numThreads = 1] -- Time elapsed: 0.84 s <<< FAILURE!
java.lang.AssertionError
    at org.junit.Assert.fail(Assert.java:87)
    at org.junit.Assert.assertTrue(Assert.java:42)
    at org.junit.Assert.assertTrue(Assert.java:53)
    at org.apache.druid.indexing.kinesis.supervisor.KinesisSupervisorTest.testIsTaskCurrent(KinesisSupervisorTest.java:4821)
```
**Problem:**
The same serialization nondeterminism affected Kinesis supervisor tests,
since `KinesisSupervisor` extends the same
`SeekableStreamSupervisor` base class and reuses the same `isTaskCurrent()`
logic.
**Proposed Changes:**
- No Kinesis-specific changes were required; the fix in `DimensionsSpec`
resolved both cases.
---
This PR has:
- [x] been self-reviewed.
- [x] ensured no production logic changes beyond test stabilization.