Oleksii Kostyliev created SPARK-7233:
----------------------------------------
Summary: ClosureCleaner#clean blocks concurrent job submitter threads
Key: SPARK-7233
URL: https://issues.apache.org/jira/browse/SPARK-7233
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 1.3.1, 1.4.0
Reporter: Oleksii Kostyliev
The {{org.apache.spark.util.ClosureCleaner#clean}} method contains logic to
determine whether Spark is running in interpreter mode:
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala#L120
While this check is valuable in particular situations, it also causes
concurrent submitter threads to block on the native call to
{{java.lang.Class#forName0}}, since it appears that only one thread at a time
can make that call.
This becomes a major issue when multiple threads concurrently submit
short-lived jobs. This is one of the ways we use Spark in production, and the
number of parallel requests is expected to be quite high, up to a couple of
thousand at a time.
A typical stack trace of a blocked thread looks like this:
{code}
http-bio-8091-exec-14 [BLOCKED] [DAEMON]
java.lang.Class.forName0(String, boolean, ClassLoader, Class) Class.java (native)
java.lang.Class.forName(String) Class.java:260
org.apache.spark.util.ClosureCleaner$.clean(Object, boolean) ClosureCleaner.scala:122
org.apache.spark.SparkContext.clean(Object, boolean) SparkContext.scala:1623
org.apache.spark.rdd.RDD.reduce(Function2) RDD.scala:883
org.apache.spark.rdd.RDD.takeOrdered(int, Ordering) RDD.scala:1240
org.apache.spark.api.java.JavaRDDLike$class.takeOrdered(JavaRDDLike, int, Comparator) JavaRDDLike.scala:586
org.apache.spark.api.java.AbstractJavaRDDLike.takeOrdered(int, Comparator) JavaRDDLike.scala:46
...
{code}
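One possible mitigation, offered only as a sketch and not as the fix adopted by the project, would be to perform the class lookup once and cache the result, so that submitter threads no longer reach {{Class#forName0}} on every call. The names {{InterpreterCheck}}, {{inInterpreter}}, {{lookups}} and the probed class name below are hypothetical and not taken from Spark. The sketch also assumes that the outcome of the lookup cannot change during the lifetime of the JVM, which may not hold for the real check.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class InterpreterCheck {
    // Counts how often the expensive lookup actually runs (illustration only).
    public static final AtomicInteger lookups = new AtomicInteger();

    // Hypothetical class name; stands in for whatever class the real check probes.
    static final String PROBE_CLASS = "example.hypothetical.ReplMain";

    private static boolean classPresent(String name) {
        lookups.incrementAndGet();
        try {
            Class.forName(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    // Initialization-on-demand holder: the JVM runs this initializer once,
    // on first access, and publishes the result safely to all threads.
    private static final class Holder {
        static final boolean PRESENT = classPresent(PROBE_CLASS);
    }

    // Cheap after the first call: a plain read of a static final field.
    public static boolean inInterpreter() {
        return Holder.PRESENT;
    }

    public static void main(String[] args) throws InterruptedException {
        // Simulate many concurrent submitter threads hitting the check.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 1000; i++) {
            pool.execute(() -> inInterpreter());
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        System.out.println("lookups performed: " + lookups.get());
    }
}
```

In this sketch the counter stays at 1 no matter how many threads call {{inInterpreter}}, because the holder's static initializer runs only once. Whether caching is acceptable depends on whether the real check must observe state that changes after startup.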