This is an automated email from the ASF dual-hosted git repository.
gurwls223 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 7614819884ca [SPARK-50430][CORE] Use the standard Properties.clone instead of manual clone
7614819884ca is described below
commit 7614819884ca192fab45ee2ace8a8e081ec8becc
Author: Hyukjin Kwon <[email protected]>
AuthorDate: Wed Nov 27 14:22:01 2024 +0900
[SPARK-50430][CORE] Use the standard Properties.clone instead of manual clone
### What changes were proposed in this pull request?
This PR proposes to use the standard Properties.clone in Utils.cloneProperties
instead of copying the entries manually.
### Why are the changes needed?
In very rare cases, when the properties are modified concurrently while they
are being cloned, the manual clone might throw an exception such as the following:
```
: java.util.ConcurrentModificationException
at java.util.Hashtable$Enumerator.next(Hashtable.java:1408)
at java.util.Hashtable.putAll(Hashtable.java:523)
at org.apache.spark.util.Utils$.cloneProperties(Utils.scala:3474)
at org.apache.spark.SparkContext.getCredentialResolvedProperties(SparkContext.scala:523)
at org.apache.spark.SparkContext.runJobInternal(SparkContext.scala:3157)
at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1104)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:165)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:125)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:454)
at org.apache.spark.rdd.RDD.collect(RDD.scala:1102)
at org.apache.spark.mllib.evaluation.AreaUnderCurve$.of(AreaUnderCurve.scala:44)
at org.apache.spark.mllib.evaluation.BinaryClassificationMetrics.areaUnderROC(BinaryClassificationMetrics.scala:127)
at org.apache.spark.ml.evaluation.BinaryClassificationEvaluator.evaluate(BinaryClassificationEvaluator.scala:101)
at sun.reflect.GeneratedMethodAccessor323.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:397)
at py4j.Gateway.invoke(Gateway.java:306)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:199)
at py4j.ClientServerConnection.run(ClientServerConnection.java:119)
at java.lang.Thread.run(Thread.java:750)
```
We should use the standard clone method.
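For illustration only, here is a minimal Java sketch of the two copying styles. The class and method names (`ClonePropertiesSketch`, `manualClone`, `standardClone`) are hypothetical and not part of Spark. The sketch only shows that both styles yield an equivalent, independent copy in single-threaded use. It does not attempt to reproduce the race, which depends on timing and on the JDK version.

```java
import java.util.Properties;

public class ClonePropertiesSketch {
    // Hypothetical helper: entry-by-entry copy, similar in spirit to the
    // manual clone that this commit removes.
    static Properties manualClone(Properties props) {
        if (props == null) {
            return null;
        }
        Properties result = new Properties();
        props.forEach((k, v) -> result.put(k, v));
        return result;
    }

    // Hypothetical helper: delegates the copy to the JDK's own
    // Properties.clone(), as the patched code does.
    static Properties standardClone(Properties props) {
        if (props == null) {
            return null;
        }
        return (Properties) props.clone();
    }

    public static void main(String[] args) {
        Properties source = new Properties();
        source.setProperty("spark.job.description", "demo");

        Properties copy = standardClone(source);

        // Mutating the source afterwards does not affect the copy.
        source.setProperty("spark.job.description", "changed");
        System.out.println(copy.getProperty("spark.job.description")); // prints "demo"
    }
}
```

Delegating to `clone()` leaves the copy to the JDK implementation rather than to caller-side iteration, which is the design choice this change makes.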
### Does this PR introduce _any_ user-facing change?
Yes. It fixes the rare corner-case bug described above.
### How was this patch tested?
It is difficult to test because the issue only arises under concurrent execution.
### Was this patch authored or co-authored using generative AI tooling?
No.
Closes #48978 from HyukjinKwon/SPARK-50430.
Authored-by: Hyukjin Kwon <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
---
core/src/main/scala/org/apache/spark/util/Utils.scala | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 5703128aacbb..109db36d4069 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2982,9 +2982,7 @@ private[spark] object Utils
     if (props == null) {
       return props
     }
-    val resultProps = new Properties()
-    props.forEach((k, v) => resultProps.put(k, v))
-    resultProps
+    props.clone().asInstanceOf[Properties]
   }
 
   /**