HyukjinKwon opened a new pull request #27808: [SPARK-30994][BUILD][FOLLOW-UP] Change scope of xml-apis to include it
URL: https://github.com/apache/spark/pull/27808
 
 
   ### What changes were proposed in this pull request?
   
   This PR proposes to explicitly include xml-apis. xml-apis is already part of xerces 2.12.0 (https://repo1.maven.org/maven2/xerces/xercesImpl/2.12.0/xercesImpl-2.12.0.pom); however, we currently exclude it by setting its `scope` to `test`. This appears to cause `spark-shell`, when built with Maven, to fail.
   
   It seems xml-apis was not needed previously, but after the xerces upgrade it is now required. Therefore, this PR proposes to include it.
   
   Hadoop 3 does not appear to need this, since it replaced xerces as of [HDFS-12221](https://issues.apache.org/jira/browse/HDFS-12221).
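
   For reference, the change boils down to a Maven dependency declaration along the lines of the sketch below. This is only an illustration of the kind of `pom.xml` entry involved; the exact module, version, and whether the `<scope>` element is removed or changed are assumptions rather than the literal diff.

    ```xml
    <!-- Sketch only (not the literal diff): put xml-apis on the compile/runtime
         classpath instead of scope "test". The version shown is the one pulled
         in transitively by xerces 2.12.0 and may differ in Spark's poms. -->
    <dependency>
      <groupId>xml-apis</groupId>
      <artifactId>xml-apis</artifactId>
      <version>1.4.01</version>
      <!-- previously: <scope>test</scope> -->
    </dependency>
    ```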
   
   ### Why are the changes needed?
   
   To make `spark-shell` work with the Maven build.
   
   ### Does this PR introduce any user-facing change?
   
   No, it's master only.
   
   ### How was this patch tested?
   
   ```bash
   ./build/mvn -DskipTests -Psparkr -Phive clean package
   ./bin/spark-shell
   ```
   
   **Before:**
   
   ```
    Exception in thread "main" java.lang.NoClassDefFoundError: org/w3c/dom/ElementTraversal
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at org.apache.xerces.parsers.AbstractDOMParser.startDocument(Unknown Source)
        at org.apache.xerces.xinclude.XIncludeHandler.startDocument(Unknown Source)
        at org.apache.xerces.impl.dtd.XMLDTDValidator.startDocument(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentScannerImpl.startEntity(Unknown Source)
        at org.apache.xerces.impl.XMLVersionDetector.startDocumentParsing(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2482)
        at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2470)
        at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2541)
        at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2494)
        at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2407)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1143)
        at org.apache.hadoop.conf.Configuration.set(Configuration.java:1115)
        at org.apache.spark.deploy.SparkHadoopUtil$.org$apache$spark$deploy$SparkHadoopUtil$$appendS3AndSparkHadoopHiveConfigurations(SparkHadoopUtil.scala:456)
        at org.apache.spark.deploy.SparkHadoopUtil$.newConfiguration(SparkHadoopUtil.scala:427)
        at org.apache.spark.deploy.SparkSubmit.$anonfun$prepareSubmitEnvironment$2(SparkSubmit.scala:342)
        at scala.Option.getOrElse(Option.scala:189)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:342)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.ClassNotFoundException: org.w3c.dom.ElementTraversal
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 42 more
   ```
   
   **After:**
   
   
   
   ```
   ...
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /___/ .__/\_,_/_/ /_/\_\   version 3.1.0-SNAPSHOT
         /_/
   
    Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_202)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   scala>
   ```
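
   As a side note, a quick way to verify from the shell that `org.w3c.dom.ElementTraversal` (which, as the stack trace above suggests, comes from xml-apis rather than the JDK 8 runtime) is now resolvable is a plain class-loading check. This is only an illustrative verification step, not something added by this PR:

    ```scala
    // Hypothetical check to paste into spark-shell: ElementTraversal is supplied
    // by xml-apis, so loading it only succeeds when that jar is on the classpath.
    try {
      Class.forName("org.w3c.dom.ElementTraversal")
      println("xml-apis is on the classpath")
    } catch {
      case _: ClassNotFoundException => println("xml-apis is missing")
    }
    ```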
   
