bersprockets opened a new pull request #31826:
URL: https://github.com/apache/spark/pull/31826
### What changes were proposed in this pull request?
Change DAGScheduler to pass a clone of the Properties object, rather than
the original object, to the SparkListenerJobStart event.
### Why are the changes needed?
DAGScheduler might modify the Properties object (e.g., in
addPySparkConfigsToProperties) after firing off the SparkListenerJobStart
event. Since the handler for that event (onJobStart in EventLoggingListener)
will iterate over the elements of the Properties object, this sometimes results
in a ConcurrentModificationException.
I've not actually seen a ConcurrentModificationException in onStageSubmitted,
only in onJobStart. However, they both iterate over the Properties object, so
for safety's sake I pass a clone to SparkListenerStageSubmitted as well.
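
As a rough illustration of the idea (not the actual DAGScheduler code), the sketch below shows how handing a clone of a `java.util.Properties` object to an event isolates the event from later changes to the original. The class `PropertiesSnapshotSketch`, the `JobStartEvent` type, the `eventWithSnapshot` helper, and the property keys are all hypothetical names made up for this example. It only demonstrates the snapshot behavior of `Properties.clone()`; it does not try to reproduce the race itself.

```java
import java.util.Properties;

public class PropertiesSnapshotSketch {
    // Hypothetical stand-in for a listener event that carries job properties.
    static final class JobStartEvent {
        final Properties properties;

        JobStartEvent(Properties properties) {
            this.properties = properties;
        }
    }

    // Build the event from a clone, so entries added to or removed from the
    // original afterwards are not visible to whoever iterates the event's copy.
    // Note: clone() is shallow, which is fine here because keys and values
    // are immutable strings.
    static JobStartEvent eventWithSnapshot(Properties original) {
        Properties snapshot = (Properties) original.clone();
        return new JobStartEvent(snapshot);
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("example.key", "demo");

        // Fire the (hypothetical) event with a snapshot of the properties.
        JobStartEvent event = eventWithSnapshot(props);

        // Simulate the scheduler modifying the original after the event fired.
        props.setProperty("added.later", "value");

        System.out.println(event.properties.containsKey("added.later")); // false
        System.out.println(props.containsKey("added.later"));            // true
    }
}
```

With this arrangement, a handler iterating over `event.properties` works on its own copy, so a later modification of the original object cannot interfere with that iteration.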
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
By repeatedly running the reproduction steps outlined in SPARK-34731.