HyukjinKwon opened a new pull request, #56724:
URL: https://github.com/apache/spark/pull/56724

   ### What changes were proposed in this pull request?
   Change the abort-path test `MetricsFailureInjectionSuite."Force checksum 
mismatch aborts a downstream ResultStage"` to group by the high-cardinality `id` 
column instead of the 5-value `low_cardinality_col`, so that every one of the 20 
reducer partitions reads output from the corrupted mapper-0.
   
   ### Why are the changes needed?
   The test was flaky under Maven (~3/10 scheduled runs), although it always 
passed on SBT. Only ~5 of the 20 reducer partitions held mapper-0's few 
low-cardinality keys, and the mapper-0 corruption is applied **asynchronously** 
after the first result task succeeds (`RESULT_STAGE_DELAY=1`). The 
indeterminate-stage abort therefore fired only if one of those few partitions 
happened to be scheduled *after* the corruption landed, which is a scheduling 
race. Grouping by the high-cardinality `id` makes every reducer depend on 
mapper-0. Once the corruption lands, the remaining result tasks (dispatched only 
after the first completes, on `local[2]`) deterministically hit it, so the abort 
always fires.
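
   The partition-coverage argument above can be illustrated with a small toy 
model. This is only a sketch under stated assumptions: plain modulo on Python's 
integer hash stands in for Spark's real hash partitioning, and the contents of 
mapper-0 (`mapper0_ids`, 100 rows) and the helper `reducers_reading` are 
hypothetical, not taken from the suite.

```python
# Toy model: which reducer partitions receive at least one key from mapper-0?
# ASSUMPTION: `hash(k) % n` stands in for Spark's actual partitioner.
NUM_REDUCERS = 20


def reducers_reading(keys, num_reducers=NUM_REDUCERS):
    """Return the set of reducer partitions that receive any of `keys`."""
    return {hash(k) % num_reducers for k in keys}


# ASSUMPTION: hypothetical mapper-0 contents, 100 rows with ids 0..99.
mapper0_ids = range(100)

# Grouping key before the fix: only 5 distinct values.
low_card_keys = [i % 5 for i in mapper0_ids]
# Grouping key after the fix: one distinct value per row.
high_card_keys = list(mapper0_ids)

low = reducers_reading(low_card_keys)
high = reducers_reading(high_card_keys)

print(len(low), len(high))  # prints "5 20"
```

   In this model, 5 distinct keys can reach at most 5 of the 20 reducers, so 
most result tasks never touch mapper-0, whereas the high-cardinality key 
reaches all 20. The real test may distribute keys differently, but the upper 
bound of 5 partitions for 5 keys holds for any hash partitioner.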
   
   ### Does this PR introduce any user-facing change?
   No, test only.
   
   ### How was this patch tested?
   Ran the suite **20×** under Maven (the environment where it flaked) on a 
fork; all 20 runs passed.
   
   - ❌ Before (flaky, scheduled `Build / Maven (Scala 2.13, JDK 21)`): 
https://github.com/apache/spark/actions/runs/28035705490
   - ❌ Before (flaky, scheduled `Build / Maven (Scala 2.13, JDK 25)`): 
https://github.com/apache/spark/actions/runs/28035606804
   - ✅ After (this fix, MetricsFailureInjectionSuite ×20 under Maven, all 
green): https://github.com/HyukjinKwon/spark/actions/runs/28066715792
   
   ### Was this patch authored or co-authored using generative AI tooling?
   Yes, Generated-by: Claude Code
   
   This pull request and its description were written by Isaac.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

