GitHub user tmyklebu commented on the pull request:
https://github.com/apache/spark/pull/4261#issuecomment-72037926
I don't think these test failures are my fault, unless I need to handle
SparkContext lifetimes differently. One thing I see in the test failure
log is this:
```
[info] Test org.apache.spark.sql.api.java.JavaAPISuite.udf1Test started
20:59:18.609 WARN org.apache.spark.SparkContext: Multiple running SparkContexts detected in the same JVM!
org.apache.spark.SparkException: Only one SparkContext may be running in this JVM (see SPARK-2243). To ignore this error, set spark.driver.allowMultipleContexts = true. The currently running SparkContext was created at:
org.apache.spark.SparkContext.<init>(SparkContext.scala:124)
org.apache.spark.sql.test.TestSQLContext$.<init>(TestSQLContext.scala:29)
org.apache.spark.sql.test.TestSQLContext$.<clinit>(TestSQLContext.scala)
[...]
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:159)
  at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:67)
  at org.apache.spark.sql.api.java.JavaAPISuite.setUp(JavaAPISuite.java:40)
```
`JavaAPISuite` spins up a new SparkContext:
```java
@Before
public void setUp() {
  sc = new JavaSparkContext("local", "JavaAPISuite");
  sqlContext = new SQLContext(sc);
}
```
and destroys it when it's done:
```java
@After
public void tearDown() {
  sc.stop();
  sc = null;
}
```
Should it? There's already a SQLContext out there; it's called
`TestSQLContext$.MODULE$`.
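As a rough sketch, `setUp`/`tearDown` could reuse that shared context instead of creating a second one. This assumes `TestSQLContext` is on the test classpath and guesses at the suite's field types; it is an illustration, not a patch:

```java
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.test.TestSQLContext$;

// Hypothetical rework: borrow the JVM-wide test context rather than
// spinning up a second SparkContext.
@Before
public void setUp() {
  sqlContext = TestSQLContext$.MODULE$;
  sc = new JavaSparkContext(sqlContext.sparkContext());
}

@After
public void tearDown() {
  // Deliberately do NOT stop() the shared context; TestSQLContext and
  // any other suites in the same JVM still depend on it.
  sc = null;
}
```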
I can't reproduce the test failures on my side. All three failures (in
`JavaJDBCTest`) look like this:
```
[info] Test org.apache.spark.sql.jdbc.JavaJDBCTest.basicTest started
[error] Test org.apache.spark.sql.jdbc.JavaJDBCTest.basicTest failed: Task not serializable
[error]     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
[error]     at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
[error]     at org.apache.spark.SparkContext.clean(SparkContext.scala:1488)
[error]     at org.apache.spark.rdd.RDD.map(RDD.scala:290)
[error]     at org.apache.spark.sql.DataFrame.rdd(DataFrame.scala:527)
[error]     at org.apache.spark.sql.DataFrame.collect(DataFrame.scala:484)
[error]     at org.apache.spark.sql.jdbc.JavaJDBCTest.basicTest(JavaJDBCTest.java:62)
[error]     ...
[error] Caused by: java.lang.NullPointerException
[error]     at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:164)
[error]     ... 44 more
```
Line 164 of `ClosureCleaner.scala` is

```scala
SparkEnv.get.closureSerializer.newInstance().serialize(func)
```
I think `SparkEnv.get` is returning null there. When you spin up a
SparkContext, it creates a `SparkEnv` and calls `SparkEnv.set(env)`; when
you `stop()` it, it calls `SparkEnv.set(null)`. So once any suite in the
JVM stops its context, the shared env is gone for everyone else.
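A minimal sketch of the suspected sequence against the Spark API of this era (the class name is mine; the `SparkEnv` calls are the real ones):

```java
import org.apache.spark.SparkEnv;
import org.apache.spark.api.java.JavaSparkContext;

// Hypothetical standalone illustration of the env lifecycle, not suite code.
public class SparkEnvLifecycleSketch {
  public static void main(String[] args) {
    JavaSparkContext first = new JavaSparkContext("local", "firstSuite");
    System.out.println(SparkEnv.get()); // a live SparkEnv

    first.stop(); // SparkContext.stop() ends up calling SparkEnv.set(null)
    System.out.println(SparkEnv.get()); // null

    // Anything that now dereferences SparkEnv.get -- like line 164 above --
    // throws a NullPointerException.
  }
}
```

If that's what's happening, any test that runs after another suite's `tearDown` and leans on the already-constructed `TestSQLContext` will hit exactly this NPE.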