HyukjinKwon commented on PR #56724: URL: https://github.com/apache/spark/pull/56724#issuecomment-4786345367
<!-- ai-code-review --> **Automated code review** (posted as a comment, not an approval) — via `spark-dev:code-review`. **Verdict: no blocking findings.** Test-only change; the fix is sound and well-justified. **What it does:** Groups the abort-path test by the high-cardinality `id` column instead of the 5-value `low_cardinality_col`, so all 20 reducer partitions depend on the corrupted mapper-0 and the indeterminate-stage abort fires deterministically (previously only ~5 partitions read mapper-0, so the abort hinged on scheduling order — the flake). **Design (Pass A):** Clean. No architecture/layer impact. The `local[2]` + `SHUFFLE_PARTITIONS=20` reasoning matches the sibling test in the same suite (`MetricsFailureInjectionSuite.scala:597`), so it's consistent with the established pattern. The scenario the test is named for (2-stage query → ResultStage abort under `RESULT_STAGE_DELAY=1` + forced checksum mismatch) is unchanged — only the grouping key's cardinality changes. **Correctness (Pass B):** `id` exists in the fixture (`setUpTestTable`, line 48). The assertion (`intercept[SparkException]` containing `"indeterminate"`) is unaffected. No metric-value assertions depend on the grouping key in this test (unlike the sibling tests), so switching to `id` is safe. **Nit (non-blocking):** the new comment says every one of the 20 reducers reads mapper-0 and at least one is *guaranteed* to hit the corrupted mapper. That's a practical near-certainty rather than a strict guarantee — it depends on 300 `id` values hashing across all 20 partitions and on `local[2]` dispatching the remaining result tasks after the corruption event is processed. Both hold overwhelmingly, but "guaranteed"/"every" slightly overstate it; consider "effectively always" to be precise. Not worth blocking on. **Verification:** suite run 20× under Maven (the env where it flaked) — all green: https://github.com/HyukjinKwon/spark/actions/runs/28066715792 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
