Shiyang-Zhao opened a new pull request, #18690:
URL: https://github.com/apache/druid/pull/18690

   This PR fixes nondeterministic behavior in the following flaky tests:
   - `org.apache.druid.data.input.protobuf.InlineDescriptorProtobufBytesDecoderTest.testSingleDescriptorNoMessageType`
   - `org.apache.druid.data.input.protobuf.FileBasedProtobufBytesDecoderTest.testSingleDescriptorNoMessageType`
   - `org.apache.druid.indexing.scheduledbatch.QuartzCronSchedulerConfigTest.testInvalidCronExpression`
   
   ### **Description**
   
   The tests 
`InlineDescriptorProtobufBytesDecoderTest.testSingleDescriptorNoMessageType` 
and `FileBasedProtobufBytesDecoderTest.testSingleDescriptorNoMessageType` 
failed intermittently because Protobuf does not guarantee a consistent order 
when iterating over message descriptors.
   
   Depending on internal ordering, the decoder could return either 
`google.protobuf.Timestamp` or `prototest.ProtoTestEvent`. The tests originally 
asserted equality against a single expected name, causing nondeterministic 
failures even though both outcomes were correct.
   
   **Failure messages:**
   ```
   [ERROR] org.apache.druid.data.input.protobuf.InlineDescriptorProtobufBytesDecoderTest.testSingleDescriptorNoMessageType -- Time elapsed: 0.118 s <<< FAILURE!
   org.opentest4j.AssertionFailedError: expected: <google.protobuf.Timestamp> but was: <prototest.ProtoTestEvent>
        at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1145)
        at org.apache.druid.data.input.protobuf.InlineDescriptorProtobufBytesDecoderTest.testSingleDescriptorNoMessageType(InlineDescriptorProtobufBytesDecoderTest.java:129)
   
   [WARNING] Flakes: 
   [WARNING] org.apache.druid.data.input.protobuf.InlineDescriptorProtobufBytesDecoderTest.testSingleDescriptorNoMessageType
   [ERROR]   Run 1: InlineDescriptorProtobufBytesDecoderTest.testSingleDescriptorNoMessageType:129 expected: <google.protobuf.Timestamp> but was: <prototest.ProtoTestEvent>
   [INFO]   Run 2: PASS
   ```
   ```
   [ERROR] org.apache.druid.data.input.protobuf.FileBasedProtobufBytesDecoderTest.testSingleDescriptorNoMessageType -- Time elapsed: 0.111 s <<< FAILURE!
   org.opentest4j.AssertionFailedError: expected: <google.protobuf.Timestamp> but was: <prototest.ProtoTestEvent>
        at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1145)
        at org.apache.druid.data.input.protobuf.FileBasedProtobufBytesDecoderTest.testSingleDescriptorNoMessageType(FileBasedProtobufBytesDecoderTest.java:135)
   
   [WARNING] Flakes: 
   [WARNING] org.apache.druid.data.input.protobuf.FileBasedProtobufBytesDecoderTest.testSingleDescriptorNoMessageType
   [ERROR]   Run 1: FileBasedProtobufBytesDecoderTest.testSingleDescriptorNoMessageType:135 expected: <google.protobuf.Timestamp> but was: <prototest.ProtoTestEvent>
   [INFO]   Run 2: PASS
   ```
   
   **Proposed Changes:**  
   - Adjusted test expectations to recognize multiple valid descriptor 
outcomes.  
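
A minimal sketch of what "recognizing multiple valid outcomes" could look like, assuming the test only needs to check that the resolved descriptor name is one of the two names seen in the failure output. The class `DescriptorNameCheck` and its method are hypothetical helpers for illustration and are not taken from the PR's diff.

```java
import java.util.Set;

// Hypothetical helper: accepts either descriptor name the decoder may resolve.
final class DescriptorNameCheck {
    // The two names come from the failure messages above; treating both as
    // valid is the assumption this sketch illustrates.
    static final Set<String> ACCEPTED_NAMES =
        Set.of("google.protobuf.Timestamp", "prototest.ProtoTestEvent");

    // Returns true when the resolved full name is one of the accepted names,
    // regardless of which one descriptor iteration happened to yield first.
    static boolean isAcceptedName(String fullName) {
        return fullName != null && ACCEPTED_NAMES.contains(fullName);
    }
}
```

In a JUnit test this could replace a strict `assertEquals` with something like `assertTrue(DescriptorNameCheck.isAcceptedName(actualName))`, where `actualName` stands for whatever name the decoder resolved.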
   
   ---
   
   The test `QuartzCronSchedulerConfigTest.testInvalidCronExpression` failed due to 
nondeterministic ordering in Quartz’s validation error messages.
   
   When validating a malformed cron expression, Quartz reports the set of 
accepted part counts, and the order of those values is not stable (e.g., 
`[6, 7]` vs `[7, 6]`). The previous test compared messages by strict string 
equality, so it failed whenever the order changed.
   
   **Failure messages:**
   ```
   [ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.069 s <<< FAILURE! -- in org.apache.druid.indexing.scheduledbatch.QuartzCronSchedulerConfigTest
   [ERROR] org.apache.druid.indexing.scheduledbatch.QuartzCronSchedulerConfigTest.testInvalidCronExpression -- Time elapsed: 0.059 s <<< FAILURE!
   java.lang.AssertionError: 
   Expected: (message: "Quartz schedule[0 15 10 * *] is invalid: [Cron expression contains 5 parts but we expect one of [6, 7]]" and targetPersona: is <USER> and category: is <INVALID_INPUT> and errorCode: is "invalidInput")
        but: message: "Quartz schedule[0 15 10 * *] is invalid: [Cron expression contains 5 parts but we expect one of [6, 7]]" was "Quartz schedule[0 15 10 * *] is invalid: [Cron expression contains 5 parts but we expect one of [7, 6]]"
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
        at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
        at org.apache.druid.indexing.scheduledbatch.QuartzCronSchedulerConfigTest.testInvalidCronExpression(QuartzCronSchedulerConfigTest.java:109)
   
   [WARNING] Flakes: 
   [WARNING] org.apache.druid.indexing.scheduledbatch.QuartzCronSchedulerConfigTest.testInvalidCronExpression
   [ERROR]   Run 1: QuartzCronSchedulerConfigTest.testInvalidCronExpression:109 
   Expected: (message: "Quartz schedule[0 15 10 * *] is invalid: [Cron expression contains 5 parts but we expect one of [6, 7]]" and targetPersona: is <USER> and category: is <INVALID_INPUT> and errorCode: is "invalidInput")
        but: message: "Quartz schedule[0 15 10 * *] is invalid: [Cron expression contains 5 parts but we expect one of [6, 7]]" was "Quartz schedule[0 15 10 * *] is invalid: [Cron expression contains 5 parts but we expect one of [7, 6]]"
   [INFO]   Run 2: PASS
   ```
   
   **Proposed Changes:**  
   - Updated the test to tolerate equivalent error lists with differing element 
order.  
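
One way to tolerate differing element order, sketched under the assumption that the only unstable part of the message is the bracketed list after `one of`. The class `CronMessageNormalizer` is a hypothetical helper, not code from this PR: it sorts the numbers inside that list so both orderings compare equal.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: rewrites "one of [7, 6]" as "one of [6, 7]" so that
// messages differing only in list order become identical strings.
final class CronMessageNormalizer {
    // Matches the list of expected part counts, e.g. "one of [6, 7]".
    private static final Pattern ONE_OF = Pattern.compile("one of \\[([0-9, ]+)\\]");

    static String normalize(String message) {
        Matcher m = ONE_OF.matcher(message);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Parse the comma-separated counts, sort them, and re-join.
            String sorted = Arrays.stream(m.group(1).split(","))
                .map(String::trim)
                .map(Integer::valueOf)
                .sorted()
                .map(String::valueOf)
                .collect(Collectors.joining(", "));
            m.appendReplacement(out, Matcher.quoteReplacement("one of [" + sorted + "]"));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

A test could then compare `CronMessageNormalizer.normalize(expected)` with `CronMessageNormalizer.normalize(actual)` instead of comparing the raw strings; an alternative is to accept either of the two literal messages.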
   
   ---
   
   This PR has:
   
   - [x] been self-reviewed.  
   - [x] ensured no production logic changes beyond test stabilization.



