chungen0126 opened a new pull request, #10396: URL: https://github.com/apache/ozone/pull/10396
## What changes were proposed in this pull request?

### Summary

Fix intermittent failures in `TestContainerStateMachine#testApplyTransactionFailure`, `TestContainerStateMachine#testContainerStateMachineRestartWithDNChangePipeline`, and `testWriteStateMachineDataIdempotencyWithClosedContainer`.

### Changes

#### For `testWriteStateMachineDataIdempotencyWithClosedContainer`

The failure stemmed from a race between a retried write operation and a close container request. The test expects idempotency for identical data, but it failed intermittently because the initial write and the retry write contained different data.

- Case A (Success): If the close container request executes first, no error occurs.
- Case B (Failure): If the retry write executes before the close container request, a mismatch occurs between the written data "hello" and the committed metadata. The container still closes successfully, but the scanner later marks it as "unhealthy" due to a checksum mismatch.

Fix: Updated the test to ensure data consistency during retries or adjusted the timing expectations to handle the race condition correctly.

#### For `testContainerStateMachineRestartWithDNChangePipeline` & `testApplyTransactionFailure`

These tests failed because of `testContainerStateMachineFailures`, which triggers a Ratis storage reset that breaks existing pipelines. Because these pipelines are closed passively via client-side retries instead of by the ScrubbingService, they remain in the PipelineManager, and subsequent tests that inadvertently select them fail.

Fix: Make `testContainerStateMachineFailures` run at the end of the class.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-13482
https://issues.apache.org/jira/browse/HDDS-12215
https://issues.apache.org/jira/browse/HDDS-14962

## How was this patch tested?

Before changes: TestContainerStateMachine failed 22 times in 20 * 10 iterations.
https://github.com/chungen0126/ozone/actions/runs/26375145366/job/77634198274

After changes: TestContainerStateMachine passed all 20 * 10 iterations.
https://github.com/chungen0126/ozone/actions/runs/26699872167
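
For reviewers who want a concrete picture of the race described for `testWriteStateMachineDataIdempotencyWithClosedContainer`, the following is a toy model only. It is not Ozone code: `RetryRaceSketch`, `ToyContainer`, `run`, and the initial payload `"Hello"` are hypothetical names and values chosen for illustration, and the assumption that a closed container silently ignores a late write is a simplification. It sketches why a retry with different data is harmless when the close wins (Case A) and leaves a checksum mismatch when the retry wins (Case B).

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/** Toy model of the write-retry vs. close race; not Ozone's implementation. */
public class RetryRaceSketch {

  /** Hypothetical stand-in for a container holding a single chunk. */
  static final class ToyContainer {
    private byte[] data = new byte[0];
    private long committedChecksum;
    private boolean closed;

    /** Overwrites the chunk; assumed to be a no-op once the container is closed. */
    void write(String payload) {
      if (closed) {
        return;
      }
      data = payload.getBytes(StandardCharsets.UTF_8);
    }

    /** Records the checksum of the data present at commit time. */
    void commit() {
      committedChecksum = crc(data);
    }

    void close() {
      closed = true;
    }

    /** Mimics a scanner: healthy only if the data still matches the committed checksum. */
    boolean isHealthy() {
      return crc(data) == committedChecksum;
    }
  }

  static long crc(byte[] bytes) {
    CRC32 crc = new CRC32();
    crc.update(bytes, 0, bytes.length);
    return crc.getValue();
  }

  /**
   * Writes and commits the initial payload, then applies the retry write and
   * the close in the requested order. Returns whether the scan finds the
   * container healthy afterwards.
   */
  public static boolean run(String initial, String retry, boolean closeFirst) {
    ToyContainer container = new ToyContainer();
    container.write(initial);
    container.commit();
    if (closeFirst) {
      container.close();
      container.write(retry);   // ignored: container already closed
    } else {
      container.write(retry);   // overwrites data after metadata was committed
      container.close();
    }
    return container.isHealthy();
  }

  public static void main(String[] args) {
    System.out.println("Case A, close first, different data: healthy="
        + run("Hello", "hello", true));
    System.out.println("Case B, retry first, different data: healthy="
        + run("Hello", "hello", false));
    System.out.println("Retry first, identical data: healthy="
        + run("hello", "hello", false));
  }
}
```

In this sketch the third call shows why making the retry carry the same data as the initial write removes the dependence on ordering: the retry is then idempotent regardless of which request wins the race.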
