HyukjinKwon commented on PR #56721: URL: https://github.com/apache/spark/pull/56721#issuecomment-4788112096
✅ **Updated with a robust fix and re-validated; back to ready for review.**

The previous patch was insufficient: the `avro` variant could still time out, as the integration run showed. The fix now waits for the version-2 snapshot **while version 2 is still the current version** (right after the 2nd batch), via a `waitForStateSnapshot(version, partitions)` helper. Because the maintenance thread only ever snapshots the current version, waiting at this point makes the snapshot creation deterministic.

**Re-validated across both encodings, 8×** (the gap last time was that only `unsaferow` was run multiple times): [run](https://github.com/HyukjinKwon/spark/actions/runs/28081816565) shows 8 consecutive sbt runs, each reporting `Tests: succeeded 2, failed 0` (both `unsaferow` and `avro`), for **16/16 green**, including the previously flaky `avro` variant.

The on-timeout diagnostic, which prints the state-dir contents, is retained for future debuggability.
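For readers who have not seen the patch, the wait-then-diagnose pattern described in this comment could look roughly like the sketch below. This is not the helper from the PR. The object name `SnapshotWait`, the extra parameters (`stateDir`, `snapshotName`, `timeoutMs`, `pollMs`), the one-subdirectory-per-partition layout, and the snapshot file naming are all assumptions made for illustration.

```scala
import java.nio.file.{Files, Path}
import java.util.concurrent.TimeoutException
import scala.jdk.CollectionConverters._

// Hypothetical sketch, not the code from the PR.
object SnapshotWait {

  // List file names in a directory; returns an empty list if it does not exist.
  def listNames(dir: Path): Seq[String] = {
    if (!Files.isDirectory(dir)) {
      Seq.empty
    } else {
      val stream = Files.list(dir)
      try stream.iterator().asScala.map(_.getFileName.toString).toList.sorted
      finally stream.close()
    }
  }

  // Poll until every partition has a snapshot file for `version`.
  // Assumption: each partition lives in `stateDir/<partition>` and the caller
  // supplies the snapshot file name for a version via `snapshotName`.
  def waitForStateSnapshot(
      stateDir: Path,
      version: Long,
      partitions: Seq[Int],
      snapshotName: Long => String,
      timeoutMs: Long = 30000L,
      pollMs: Long = 100L): Unit = {
    val deadline = System.nanoTime() + timeoutMs * 1000000L

    // Partitions that do not yet have the expected snapshot file.
    def missing: Seq[Int] = partitions.filterNot { p =>
      Files.exists(stateDir.resolve(p.toString).resolve(snapshotName(version)))
    }

    var pending = missing
    while (pending.nonEmpty && System.nanoTime() < deadline) {
      Thread.sleep(pollMs)
      pending = missing
    }

    if (pending.nonEmpty) {
      // On timeout, include the directory contents so a failure is debuggable.
      val listing = partitions
        .map(p => s"$p -> [${listNames(stateDir.resolve(p.toString)).mkString(", ")}]")
        .mkString("; ")
      throw new TimeoutException(
        s"No snapshot for version $version in partitions ${pending.mkString(", ")} " +
          s"after $timeoutMs ms. State dir contents: $listing")
    }
  }
}
```

In a test, such a helper would be called immediately after the batch that produces the target version and before the next batch runs, so the version being waited on is still the current one.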
