[ https://issues.apache.org/jira/browse/SPARK-7233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14519169#comment-14519169 ]

Oleksii Kostyliev commented on SPARK-7233:
------------------------------------------

To illustrate the issue, I ran a test against a local Spark instance.
Attached is a screenshot of the Threads view in the YourKit profiler.
Even though the test generated only 20 concurrent requests, the job submitter 
threads spent most of their time blocked on one another.
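For reference, a minimal sketch (plain Java, no Spark; the thread count matches the test, the class name is arbitrary) of the kind of load involved: N submitter threads concurrently hitting {{Class.forName}}, the same lookup that {{ClosureCleaner#clean}} performs on every job submission.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ForNameLoad {
    public static void main(String[] args) throws Exception {
        final int requests = 20; // same concurrency level as the test above
        ExecutorService pool = Executors.newFixedThreadPool(requests);
        List<Future<Class<?>>> results = new ArrayList<>();
        for (int i = 0; i < requests; i++) {
            // Each "submitter" performs a Class.forName lookup, which funnels
            // into the native forName0 call where the profiler shows the
            // threads piling up.
            results.add(pool.submit(() -> Class.forName("java.util.ArrayList")));
        }
        for (Future<Class<?>> f : results) {
            System.out.println(f.get().getName());
        }
        pool.shutdown();
    }
}
```

Attaching a profiler to a loop like this (with the lookup on a hot path) is enough to see the monitor contention on the class-loading machinery.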

> ClosureCleaner#clean blocks concurrent job submitter threads
> ------------------------------------------------------------
>
>                 Key: SPARK-7233
>                 URL: https://issues.apache.org/jira/browse/SPARK-7233
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.3.1, 1.4.0
>            Reporter: Oleksii Kostyliev
>         Attachments: blocked_threads_closurecleaner.png
>
>
> {{org.apache.spark.util.ClosureCleaner#clean}} method contains logic to 
> determine if Spark is run in interpreter mode: 
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/ClosureCleaner.scala#L120
> While this check is valuable in certain situations, it also causes concurrent 
> submitter threads to block on a native call to {{java.lang.Class#forName0}}, 
> since apparently only one thread at a time can make that call.
> This becomes a major issue when multiple threads concurrently submit 
> short-lived jobs. This is one of the patterns in which we use Spark in 
> production, and the number of parallel requests is expected to be quite high, 
> up to a couple of thousand at a time.
> A typical stacktrace of a blocked thread looks like:
> {code}
> http-bio-8091-exec-14 [BLOCKED] [DAEMON]
> java.lang.Class.forName0(String, boolean, ClassLoader, Class) Class.java (native)
> java.lang.Class.forName(String) Class.java:260
> org.apache.spark.util.ClosureCleaner$.clean(Object, boolean) ClosureCleaner.scala:122
> org.apache.spark.SparkContext.clean(Object, boolean) SparkContext.scala:1623
> org.apache.spark.rdd.RDD.reduce(Function2) RDD.scala:883
> org.apache.spark.rdd.RDD.takeOrdered(int, Ordering) RDD.scala:1240
> org.apache.spark.api.java.JavaRDDLike$class.takeOrdered(JavaRDDLike, int, Comparator) JavaRDDLike.scala:586
> org.apache.spark.api.java.AbstractJavaRDDLike.takeOrdered(int, Comparator) JavaRDDLike.scala:46
> ...
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
