Github user choochootrain commented on a diff in the pull request:
https://github.com/apache/spark/pull/9367#discussion_r43942556
--- Diff: core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala ---
@@ -366,6 +367,72 @@ class SparkSubmitSuite
}
}
+  // SPARK-11195
+  test("classes are correctly loaded when tasks fail") {
+    // Compile a simple jar that throws a user defined exception on the driver
+    val tempDir = Utils.createTempDir()
+    val srcDir = new File(tempDir, "repro/")
+    srcDir.mkdirs()
+    // scalastyle:off line.size.limit
+    val mainSource = new JavaSourceFromString(new File(srcDir, "MyJob").getAbsolutePath,
+      """package repro;
+        |
+        |import java.util.*;
+        |import java.util.regex.*;
+        |import org.apache.spark.*;
+        |import org.apache.spark.api.java.*;
+        |import org.apache.spark.api.java.function.*;
+        |
+        |public class MyJob {
+        |  public static class MyException extends Exception {
+        |  }
+        |
+        |  public static void main(String[] args) {
+        |    SparkConf conf = new SparkConf();
+        |    JavaSparkContext sc = new JavaSparkContext(conf);
+        |
+        |    JavaRDD rdd = sc.parallelize(Arrays.asList(new Integer[]{1}), 1).map(new Function<Integer, Boolean>() {
+        |      public Boolean call(Integer x) throws MyException {
+        |        throw new MyException();
+        |      }
+        |    });
+        |
+        |    try {
+        |      rdd.collect();
+        |
+        |      assert(false); // should be unreachable
+        |    } catch (Exception e) {
+        |      // the driver should not have any problems resolving the exception class and determining
+        |      // why the task failed.
+        |
+        |      Pattern unknownFailure = Pattern.compile(".*Lost task.*: UnknownReason.*", Pattern.DOTALL);
+        |      Pattern expectedFailure = Pattern.compile(".*Lost task.*: repro.MyJob\\$MyException.*", Pattern.DOTALL);
+        |
+        |      assert(!unknownFailure.matcher(e.getMessage()).matches());
+        |      assert(expectedFailure.matcher(e.getMessage()).matches());
+        |    }
+        |  }
+        |}
+      """.stripMargin)
+    // scalastyle:on line.size.limit
+    val sparkJar = "../assembly/target/scala-2.10/spark-assembly-1.5.1-hadoop2.2.0.jar"
--- End diff --
All this patch does is make `TaskResult` deserialization use `Utils.getContextOrSparkClassLoader` (the classloader that can see the jar passed to `spark-submit`) instead of `Utils.getSparkClassLoader` (the `AppClassLoader`, which only has Spark classes in it). Without this patch, a failed task cannot deserialize its exception if the exception class is not visible to `Utils.getSparkClassLoader`.
In order to reproduce this issue, I set up a situation where `Utils.getContextOrSparkClassLoader` contains `MyException` but `Utils.getSparkClassLoader` does not (see https://issues.apache.org/jira/browse/SPARK-11195). This is easy to test manually with `spark-submit` and a user-defined exception, but turning it into an automated test is proving much trickier. Here are the 3 options:
* :x: If I place all of the code into `SparkSubmitSuite`, the bug won't be hit, because `MyException` will be in the root classloader and my patch makes no difference.
* :grey_question: If I place all of the code into an external jar and run `spark-submit`, I can set up the same situation as the repro that found this bug. The issue I am running into is that I need a Spark classpath in order to compile my jar. I can use the assembled jar, but its path changes depending on which Maven profiles are enabled, and so on.
* :grey_question: I can try @brkyvz & @yhuai's hybrid approach of putting only the exception into a jar and the rest of the code into `SparkSubmitSuite`. I will have to do the following in order to repro this issue (rough sketch at the end of this comment):
  * load the jar with `MyException` in a new classloader whose parent is the root classloader
  * somehow allow this classloader to be used by the driver and the executor *without* changing `Utils.getSparkClassLoader`.

  At this point, am I not reimplementing `spark-submit`? :)
The final approach is certainly worth trying; I'll take a look at it later today.
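For reference, a rough sketch of what the hybrid setup might look like, assuming the exception class is compiled on the fly with `javax.tools` and loaded in an isolated classloader; every name and path here is illustrative, and it still leaves open the hard part of getting the driver and executor to actually use that loader:

```scala
import java.io.File
import java.net.URLClassLoader
import java.nio.file.Files
import javax.tools.ToolProvider

object HybridReproSketch {
  def main(args: Array[String]): Unit = {
    // 1. Compile a class directory (or jar) that contains only MyException.
    val outDir = Files.createTempDirectory("repro").toFile
    val src = new File(outDir, "MyException.java")
    Files.write(src.toPath,
      "package repro;\npublic class MyException extends Exception {}\n".getBytes("UTF-8"))
    ToolProvider.getSystemJavaCompiler.run(null, null, null,
      "-d", outDir.getAbsolutePath, src.getAbsolutePath)

    // 2. Load it in a classloader whose parent is the root classloader, so the
    //    class is invisible to the loader that loaded Spark and the test suite.
    val isolated = new URLClassLoader(Array(outDir.toURI.toURL), null)

    // 3. The hard part: the driver and executor would have to pick this up as
    //    their context classloader, which is exactly what spark-submit arranges.
    Thread.currentThread().setContextClassLoader(isolated)
    println(Class.forName("repro.MyException", true, isolated))
  }
}
```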