[
https://issues.apache.org/jira/browse/SPARK-14540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226995#comment-16226995
]
Sean Owen commented on SPARK-14540:
-----------------------------------
[~joshrosen] was right that this is actually the hard part. A few notes from
working on this:
Almost all tests pass with no change to the closure cleaner, other than not
attempting to treat lambdas as inner-class closures. That was somewhat
surprising; I assume that because they are implemented as lambdas, many of
the synthetic links the cleaner previously had to snip simply don't exist.
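For illustration, here is a minimal sketch (plain Java; the class and method names are hypothetical, not Spark's actual cleaner code) of how a cleaner could distinguish a synthetic lambda class from an anonymous inner-class closure, which is roughly the check needed to skip the inner-class handling:

```java
public class LambdaCheck {
    // Heuristic: javac/scalac emit lambdas as synthetic classes whose
    // names contain "$$Lambda"; anonymous inner classes (e.g. Foo$1)
    // are ordinary, non-synthetic classes.
    static boolean isLambdaClosure(Object closure) {
        Class<?> cls = closure.getClass();
        return cls.isSynthetic() && cls.getName().contains("$$Lambda");
    }

    public static void main(String[] args) {
        Runnable lambda = () -> {};
        Runnable inner = new Runnable() { public void run() {} };
        System.out.println(isLambdaClosure(lambda)); // true
        System.out.println(isLambdaClosure(inner));  // false
    }
}
```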
I am still not clear whether you can extract referenced fields from the
synthetic lambda class itself; the "bsmArgs" (bootstrap method args) aren't
quite that. However, it looks like you can manually serialize the lambda,
obtain the resulting SerializedLambda, and examine its captured args. That's
the next thing to try.
Still, even without this change, I find that a lot of code already just works.
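To make the SerializedLambda idea concrete, here is a rough sketch in plain Java (hypothetical helper names, not actual Spark code) of extracting the captured args: a serializable lambda's synthetic class carries a compiler-generated private writeReplace method that returns its java.lang.invoke.SerializedLambda, which can be invoked reflectively without going through a full serialize/deserialize round trip:

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;
import java.util.function.Function;

public class LambdaInspect {
    // A serializable functional interface, so the compiler makes the
    // lambda serializable and emits writeReplace on its synthetic class.
    interface SerFunction<T, R> extends Function<T, R>, Serializable {}

    static SerializedLambda inspect(Object lambda) {
        try {
            Method writeReplace =
                lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            return (SerializedLambda) writeReplace.invoke(lambda);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("not a serializable lambda", e);
        }
    }

    public static void main(String[] args) {
        int captured = 41;  // local variable captured by the closure
        SerFunction<Integer, Integer> f = x -> x + captured;
        SerializedLambda sl = inspect(f);
        System.out.println(sl.getCapturedArgCount());  // 1
        System.out.println(sl.getCapturedArg(0));      // 41
        System.out.println(sl.getImplMethodName());
    }
}
```

The captured args are exactly the values the cleaner would want to null out or inspect for serializability, which is presumably why this route looks promising.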
> Support Scala 2.12 closures and Java 8 lambdas in ClosureCleaner
> ----------------------------------------------------------------
>
> Key: SPARK-14540
> URL: https://issues.apache.org/jira/browse/SPARK-14540
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Reporter: Josh Rosen
>
> Using https://github.com/JoshRosen/spark/tree/build-for-2.12, I tried running
> ClosureCleanerSuite with Scala 2.12 and ran into two bad test failures:
> {code}
> [info] - toplevel return statements in closures are identified at cleaning
> time *** FAILED *** (32 milliseconds)
> [info] Expected exception
> org.apache.spark.util.ReturnStatementInClosureException to be thrown, but no
> exception was thrown. (ClosureCleanerSuite.scala:57)
> {code}
> and
> {code}
> [info] - user provided closures are actually cleaned *** FAILED *** (56
> milliseconds)
> [info] Expected ReturnStatementInClosureException, but got
> org.apache.spark.SparkException: Job aborted due to stage failure: Task not
> serializable: java.io.NotSerializableException: java.lang.Object
> [info] - element of array (index: 0)
> [info] - array (class "[Ljava.lang.Object;", size: 1)
> [info] - field (class "java.lang.invoke.SerializedLambda", name:
> "capturedArgs", type: "class [Ljava.lang.Object;")
> [info] - object (class "java.lang.invoke.SerializedLambda",
> SerializedLambda[capturingClass=class
> org.apache.spark.util.TestUserClosuresActuallyCleaned$,
> functionalInterfaceMethod=scala/runtime/java8/JFunction1$mcII$sp.apply$mcII$sp:(I)I,
> implementation=invokeStatic
> org/apache/spark/util/TestUserClosuresActuallyCleaned$.org$apache$spark$util$TestUserClosuresActuallyCleaned$$$anonfun$69:(Ljava/lang/Object;I)I,
> instantiatedMethodType=(I)I, numCaptured=1])
> [info] - element of array (index: 0)
> [info] - array (class "[Ljava.lang.Object;", size: 1)
> [info] - field (class "java.lang.invoke.SerializedLambda", name:
> "capturedArgs", type: "class [Ljava.lang.Object;")
> [info] - object (class "java.lang.invoke.SerializedLambda",
> SerializedLambda[capturingClass=class org.apache.spark.rdd.RDD,
> functionalInterfaceMethod=scala/Function3.apply:(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;,
> implementation=invokeStatic
> org/apache/spark/rdd/RDD.org$apache$spark$rdd$RDD$$$anonfun$20$adapted:(Lscala/Function1;Lorg/apache/spark/TaskContext;Ljava/lang/Object;Lscala/collection/Iterator;)Lscala/collection/Iterator;,
>
> instantiatedMethodType=(Lorg/apache/spark/TaskContext;Ljava/lang/Object;Lscala/collection/Iterator;)Lscala/collection/Iterator;,
> numCaptured=1])
> [info] - field (class "org.apache.spark.rdd.MapPartitionsRDD", name:
> "f", type: "interface scala.Function3")
> [info] - object (class "org.apache.spark.rdd.MapPartitionsRDD",
> MapPartitionsRDD[2] at apply at Transformer.scala:22)
> [info] - field (class "scala.Tuple2", name: "_1", type: "class
> java.lang.Object")
> [info] - root object (class "scala.Tuple2", (MapPartitionsRDD[2] at
> apply at
> Transformer.scala:22,org.apache.spark.SparkContext$$Lambda$957/431842435@6e803685)).
> [info] This means the closure provided by user is not actually cleaned.
> (ClosureCleanerSuite.scala:78)
> {code}
> We'll need to figure out a closure cleaning strategy which works for 2.12
> lambdas.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]