[ 
https://issues.apache.org/jira/browse/SPARK-28877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16916218#comment-16916218
 ] 

Dongjoon Hyun commented on SPARK-28877:
---------------------------------------

Thank you for filing a Jira issue with the detailed analysis.

> Investigate/fix JAXB failure running Pyspark tests on JDK 11
> ------------------------------------------------------------
>
>                 Key: SPARK-28877
>                 URL: https://issues.apache.org/jira/browse/SPARK-28877
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Build, PySpark
>    Affects Versions: 3.0.0
>            Reporter: Sean Owen
>            Priority: Major
>
> It looks like we might have a test failure in Pyspark with JDK 11:
> https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109686/console
> {code}
> ======================================================================
> ERROR: test_linear_regression_pmml_basic (pyspark.ml.tests.test_persistence.PersistenceTest)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/tests/test_persistence.py", line 69, in test_linear_regression_pmml_basic
>     model.write().format("pmml").save(lr_path)
>   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/ml/util.py", line 175, in save
>     self._jwrite.save(path)
>   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1286, in __call__
>     answer, self.gateway_client, self.target_id, self.name)
>   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/pyspark/sql/utils.py", line 89, in deco
>     return f(*a, **kw)
>   File "/home/jenkins/workspace/SparkPullRequestBuilder/python/lib/py4j-0.10.8.1-src.zip/py4j/protocol.py", line 328, in get_return_value
>     format(target_id, ".", name), value)
> Py4JJavaError: An error occurred while calling o529.save.
> : javax.xml.bind.JAXBException
>  - with linked exception:
> [java.lang.ClassNotFoundException: com.sun.xml.internal.bind.v2.ContextFactory]
>       at javax.xml.bind.ContextFinder.newInstance(ContextFinder.java:241)
>       at javax.xml.bind.ContextFinder.find(ContextFinder.java:477)
>       at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:656)
>       at javax.xml.bind.JAXBContext.newInstance(JAXBContext.java:599)
>       at org.jpmml.model.JAXBUtil.getContext(JAXBUtil.java:103)
>       at org.jpmml.model.JAXBUtil.createMarshaller(JAXBUtil.java:132)
>       at org.jpmml.model.JAXBUtil.marshal(JAXBUtil.java:77)
>       at org.jpmml.model.JAXBUtil.marshalPMML(JAXBUtil.java:67)
>       at org.apache.spark.mllib.pmml.PMMLExportable.toPMML(PMMLExportable.scala:44)
>       at org.apache.spark.mllib.pmml.PMMLExportable.toPMML(PMMLExportable.scala:78)
> ...
> {code}
> The error is typical of other JDK 11-related incompatibilities, because the 
> built-in JAXB implementation from Sun was deprecated in Java 9 and removed 
> in Java 11. It appears that somehow the classpath is trying to load the 
> 'old' JAXB implementation.
> It's curious because the JVM-based tests appear to pass. This suggests it may 
> be more about how the Pyspark test classpath is constructed, and perhaps 
> there is an old dependency or something selecting this implementation via a 
> META-INF/MANIFEST.MF entry. 
> It's also curious because we seemed to observe Pyspark tests passing with JDK 
> 11 during earlier testing. This is likely to be more related to how Pyspark 
> tests are run, but still needs a reproduction and an answer.
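
One way to check the classpath hypothesis above is to scan the jars for entries that a JAXB lookup may consult. The following is a minimal sketch, not part of the original report: the helper names (find_jaxb_hints, scan_jar_dir) are hypothetical, and the list of entries is an assumption about what JAXB 2.x discovery reads (service files, jaxb.properties, and the factory class of the standalone reference implementation), so it may be incomplete.

```python
import zipfile
from pathlib import Path

# Service-provider entries a JAXB 2.x lookup may consult (assumed, not exhaustive).
SERVICE_ENTRIES = (
    "META-INF/services/javax.xml.bind.JAXBContext",
    "META-INF/services/javax.xml.bind.JAXBContextFactory",
)
# Factory class of the standalone reference implementation (assumption).
RI_FACTORY_CLASS = "com/sun/xml/bind/v2/ContextFactory.class"


def find_jaxb_hints(jar_paths):
    """Return {jar path: [matching entry names]} for jars with JAXB-related entries."""
    hints = {}
    for jar in jar_paths:
        try:
            with zipfile.ZipFile(jar) as zf:
                names = zf.namelist()
        except (zipfile.BadZipFile, OSError):
            continue  # skip unreadable or non-jar files
        matches = [
            name for name in names
            if name in SERVICE_ENTRIES
            or name == RI_FACTORY_CLASS
            or name == "jaxb.properties"
            or name.endswith("/jaxb.properties")
        ]
        if matches:
            hints[str(jar)] = matches
    return hints


def scan_jar_dir(jar_dir):
    """Scan every *.jar directly under jar_dir (hypothetical helper)."""
    return find_jaxb_hints(sorted(Path(jar_dir).glob("*.jar")))
```

Running scan_jar_dir on the jar directory used by the JVM tests and on the one used by the PySpark tests, then comparing the two results, might show whether a JAXB provider is present in one classpath and missing from the other. Which directories those are depends on the build and is not stated in the report.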



