Hi all,

I'd like to propose backporting the fix for SPARK-53759 [2] to the active
release branches (4.1, 4.0, and 3.5).

SPARK-53759 is a critical bug where PySpark crashes deterministically on
Windows with Python 3.12+. Windows always uses the simple-worker codepath
(because os.fork() is unavailable), and the worker's socket connection was
missing an explicit flush() before close(). On Python 3.12+, a change in GC
finalization ordering [1] causes the underlying socket to close before the
write buffer is flushed, silently losing task results. The JVM then sees an
EOFException.
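
To illustrate the mechanism, here is a minimal standalone sketch (not the
PySpark worker code). It assumes a buffered file object created with
socket.makefile(), and the function name buffered_write_demo is hypothetical.
It shows that a small write stays in the user-space buffer and does not reach
the peer until flush() is called:

```python
import socket


def buffered_write_demo(payload: bytes = b"task-result") -> tuple[bool, bytes]:
    """Return (arrived_before_flush, bytes_received_after_flush)."""
    a, b = socket.socketpair()
    try:
        # Buffered writer on top of the socket; the buffer is far larger
        # than the payload, so write() does not touch the socket yet.
        outfile = a.makefile("wb", buffering=65536)
        outfile.write(payload)

        # Probe the peer without blocking: nothing should be readable yet.
        b.setblocking(False)
        try:
            b.recv(1024)
            arrived_before_flush = True
        except BlockingIOError:
            arrived_before_flush = False

        # Only an explicit flush() pushes the buffered bytes to the socket.
        outfile.flush()
        b.setblocking(True)
        received = b.recv(1024)
        outfile.close()
        return arrived_before_flush, received
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(buffered_write_demo())
```

If the socket is torn down before that buffer is written out, the peer simply
sees the connection end with no data, which is consistent with the
EOFException on the JVM side.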

This was incidentally fixed on master by PR #54458 (SPARK-55665) [3], which
unified worker socket handling across 14 files. I confirmed the fix is
present in pyspark==4.2.0.dev3 on PyPI but not in any stable release; all
versions through 4.1.1 are affected.

Since PR #54458 is a large refactor, a clean cherry-pick to the release
branches may not be straightforward. However, the actual fix for SPARK-53759
is small: it just adds flush() before close() in the worker's finally block,
mirroring what daemon.py already does. I've prepared minimal backport
branches for review:

- branch-4.1:
https://github.com/anblanco/spark/tree/fix/SPARK-53759-simple-worker-flush
- (Can prepare branch-4.0 and branch-3.5 variants if there's interest)
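
For readers who want the shape of the change without opening the branch, the
flush-before-close pattern can be sketched roughly as follows. This is an
illustration under assumptions, not the actual patch: send_result and
read_all are hypothetical helpers, and the real worker code differs.

```python
import socket


def send_result(sock: socket.socket, payload: bytes) -> None:
    """Write payload through a buffered file, flushing explicitly before close."""
    outfile = sock.makefile("wb", buffering=65536)
    try:
        outfile.write(payload)
    finally:
        try:
            # Explicit flush so delivery does not depend on the order in
            # which the interpreter finalizes the file and the socket.
            outfile.flush()
        finally:
            outfile.close()
            sock.close()


def read_all(sock: socket.socket) -> bytes:
    """Read from sock until the peer closes the connection."""
    chunks = []
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
    return b"".join(chunks)


if __name__ == "__main__":
    writer, reader = socket.socketpair()
    send_result(writer, b"x" * 1000)
    print(len(read_all(reader)))
    reader.close()
```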

I put together a reproducer with a test matrix and full root cause analysis
here: https://github.com/anblanco/spark53759-reproducer

The bug has been open since September 2025 and affects all Windows users on
Python 3.12+, which is now the default Python on most systems. I think the
impact warrants backporting, especially given how small the fix is.

Note that branch-3.5 LTS ends April 12, so if a backport is appropriate
there, it would need to happen soon.

Happy to prepare the backport PRs if maintainers agree this is worth doing.

Thank you for your time,
Antonio Blanco

[1] https://github.com/python/cpython/issues/97922
[2] https://issues.apache.org/jira/browse/SPARK-53759
[3] https://github.com/apache/spark/pull/54458