[ https://issues.apache.org/jira/browse/HIVE-14240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15506745#comment-15506745 ]
Ferdinand Xu commented on HIVE-14240:
-------------------------------------

Hi [~stakiar], do you have any updates for this ticket? I am trying to move HIVE-14029 forward. Thanks, Ferd

> HoS itests shouldn't depend on a Spark distribution
> ---------------------------------------------------
>
>                 Key: HIVE-14240
>                 URL: https://issues.apache.org/jira/browse/HIVE-14240
>             Project: Hive
>          Issue Type: Improvement
>          Components: Spark
>    Affects Versions: 2.0.0, 2.1.0, 2.0.1
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>
> The HoS integration tests download a full Spark distribution (a tar-ball) from CloudFront and use it to run Spark locally. A few tests run Spark in embedded mode, and some run against a local Spark-on-YARN cluster. The {{itests/pom.xml}} contains scripts that download the tar-ball from a pre-defined location.
> This is problematic because the Spark distribution shades all of its dependencies, including the Hadoop dependencies. This can cause problems when upgrading the Hadoop version used by Hive (ref: HIVE-13930).
> Removing it would also avoid downloading the tar-ball on every build, and would simplify the build process for the itests module.
> The Hive itests should instead depend directly on the Spark artifacts published in Maven Central. Getting this working will require some effort. The current Hive Spark Client uses a launch script from the Spark installation to run Spark jobs. That script basically does some setup work and then invokes org.apache.spark.deploy.SparkSubmit. It is possible to invoke this class directly, which avoids the need for a full Spark distribution to be available locally (in fact this option already exists, but isn't tested).
> There may be other issues around classpath conflicts between Hive and Spark. For example, Hive and Spark require different versions of Kryo. One solution would be to take the Spark artifacts and shade Kryo inside them.
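To illustrate the proposal above: instead of shelling out to the {{bin/spark-submit}} launch script from a downloaded distribution, the client could launch a child JVM that invokes {{org.apache.spark.deploy.SparkSubmit}} directly, with the Spark jars resolved from Maven Central on the classpath. The sketch below is a hypothetical illustration, not Hive's actual implementation; the classpath, jar name, and driver class passed in {{main}} are placeholder assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: build the child-JVM command line that invokes
 * org.apache.spark.deploy.SparkSubmit directly, bypassing the
 * bin/spark-submit shell script from a full Spark distribution.
 */
public class DirectSparkSubmit {

    /** Assemble the command to run SparkSubmit in a child JVM. */
    static List<String> buildSubmitCommand(String sparkClasspath,
                                           String appJar,
                                           String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(sparkClasspath);                       // Spark jars from the local Maven repo
        cmd.add("org.apache.spark.deploy.SparkSubmit"); // entry point the launch script invokes
        cmd.add("--master");
        cmd.add("local[*]");                           // embedded mode, as in the itests
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add(appJar);
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder paths and class names for illustration only.
        List<String> cmd = buildSubmitCommand(
                "/path/to/spark-jars/*",
                "remote-driver.jar",
                "org.apache.hive.spark.client.RemoteDriver");
        System.out.println(String.join(" ", cmd));
        // A real client would then spawn the process, e.g.:
        // new ProcessBuilder(cmd).inheritIO().start();
    }
}
```

The same classpath conflicts the description mentions (e.g. Kryo versions) would still need to be handled, since the child JVM's classpath is now assembled from individual Maven artifacts rather than a pre-shaded distribution.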
-- This message was sent by Atlassian JIRA (v6.3.4#6332)