Re: [PR] [SPARK-54930][PYTHON] Remove redundant _accumulatorRegistry.clear() call in worker.py [spark]

via GitHub Wed, 07 Jan 2026 00:02:14 -0800


HyukjinKwon commented on code in PR #53708:
URL: https://github.com/apache/spark/pull/53708#discussion_r2667432960



##########
python/pyspark/worker.py:
##########
@@ -3493,7 +3493,6 @@ def main(infile, outfile):
 
         shuffle.MemoryBytesSpilled = 0
         shuffle.DiskBytesSpilled = 0
-        _accumulatorRegistry.clear()

Review Comment:
   I suspect it's a safeguard: the accumulator server runs asynchronously, 
and when the worker is reused ... but I'm not sure, I'm just speculating. Can 
we dig through the commit history and see if there is any hint related to it?
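
To make the worker-reuse concern concrete, here is a minimal toy sketch of how module-level state can leak from one task to the next when a single Python process serves several tasks. It is not PySpark's worker code: `registry`, `run_task`, and `reused_worker` are hypothetical stand-ins, and the sketch says nothing about whether `worker.py` already clears the registry elsewhere, which is the question the commit history would have to answer.

```python
# Toy model only: `registry`, `run_task`, and `reused_worker` are hypothetical
# stand-ins for illustration, not PySpark's actual worker implementation.

registry = {}  # module-level state that survives across tasks in one process


def run_task(updates, clear_first):
    """Simulate one task on a reused worker and return what it would report."""
    if clear_first:
        registry.clear()  # the kind of safeguard under discussion
    for acc_id, value in updates:
        registry[acc_id] = registry.get(acc_id, 0) + value
    # Report a snapshot of everything currently held in the registry.
    return dict(registry)


def reused_worker(clear_first):
    """Run two tasks back to back in the same process."""
    registry.clear()  # start each experiment from a clean slate
    first = run_task([(0, 5)], clear_first)
    second = run_task([(1, 7)], clear_first)
    return first, second


if __name__ == "__main__":
    print(reused_worker(clear_first=True))
    print(reused_worker(clear_first=False))
```

With `clear_first=False`, the second task's report still contains the first task's entry. If some other code path already resets the registry before each task, the extra call would indeed be redundant.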



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


