[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yuming Wang updated SPARK-23710:
--------------------------------
    Description:
Spark fails to run on Hadoop 3.x because Hive's ShimLoader treats Hadoop 3.x as an unknown Hadoop version. See SPARK-18673 and HIVE-16081 for more details. We therefore need to upgrade the built-in Hive for Hadoop 3.x. This is an umbrella JIRA to track the upgrade.

*Upgrade Plan*:
# SPARK-27054 Remove the Calcite dependency. This avoids some jar conflicts.
# SPARK-23749 Replace the built-in Hive API (isSub/toKryo) and remove OrcProto.Type usage.
# SPARK-27158, SPARK-27130 Update dev/* to support dynamically changing profiles when testing.
# Fix the ORC dependency conflict so that tests pass on Hive 1.2.1 and compilation passes on Hive 2.3.4.
# Add an empty hive-thriftserverV2 module so that all test cases can be run in the next step.
# Make the tests pass on Hadoop-3.1 with Hive 2.3.4.
# Adapt hive-thriftserverV2 from hive-thriftserver using Hive 2.3.4's [TCLIService.thrift|https://github.com/apache/hive/blob/rel/release-2.3.4/service-rpc/if/TCLIService.thrift].

I have completed the [initial work|https://github.com/apache/spark/pull/24044] and plan to finish this upgrade step by step.

  was:
Upgrade the built-in Hive to 2.3.4 for Hadoop-3.1 (please note that this upgrade applies only to Hadoop-3.1).

To achieve this, we need to change at least the sql/core, sql/hive, and sql/hive-thriftserver modules:
 *sql/core*: Add two source directories (sql/core/v1.2.1 and sql/core/v2.3.4) to separate the code for the different built-in Hive versions.
 *sql/hive:* Use Java reflection or a shim to support Hive 1.2.1 and Hive 2.3.4 at the same time.
 *sql/hive-thriftserver:* Add a new Thrift server named hive-thriftserverV2 based on Hive 2.3.4's [TCLIService.thrift|https://github.com/apache/hive/blob/rel/release-2.3.4/service-rpc/if/TCLIService.thrift].

Spark fails to run on Hadoop 3.x because Hive's ShimLoader treats Hadoop 3.x as an unknown Hadoop version.
See [SPARK-18673|https://issues.apache.org/jira/browse/SPARK-18673] and [HIVE-16081|https://issues.apache.org/jira/browse/HIVE-16081] for more details.

Upgrade Plan:
# SPARK-27054 Remove the Calcite dependency. This avoids some jar conflicts.
# SPARK-23749 Replace the built-in Hive API (isSub/toKryo) and remove OrcProto.Type usage.
# SPARK-27158, SPARK-27130 Update dev/* to support dynamically changing profiles when testing.
# Fix the ORC dependency conflict so that tests pass on Hive 1.2.1 and compilation passes on Hive 2.3.4.
# Add an empty hive-thriftserverV2 module so that all test cases can be run in the next step.
# Make the tests pass on Hadoop-3.1 with Hive 2.3.4.
# Adapt hive-thriftserverV2 from hive-thriftserver using Hive 2.3.4's [TCLIService.thrift|https://github.com/apache/hive/blob/rel/release-2.3.4/service-rpc/if/TCLIService.thrift].


> Upgrade the built-in Hive to 2.3.4 for hadoop-3.1
> -------------------------------------------------
>
>                 Key: SPARK-23710
>                 URL: https://issues.apache.org/jira/browse/SPARK-23710
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Yuming Wang
>            Priority: Critical
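
To illustrate the failure this issue describes, here is a minimal sketch of a version-gated shim lookup. It is a simplified model and not Hive's actual ShimLoader code: the function name `shim_for`, the table `KNOWN_SHIMS`, the shim identifiers, and the error text are all assumptions made for illustration. It only shows why a loader that knows Hadoop major versions 1 and 2 would reject a 3.x version string until an entry for 3 is added.

```python
# Simplified model of a version-gated shim lookup.
# All names and values here are hypothetical; this is not Hive's code.
KNOWN_SHIMS = {1: "0.20S", 2: "0.23"}  # assumed mapping: Hadoop major -> shim id


def shim_for(hadoop_version, known=KNOWN_SHIMS):
    """Return the shim id for a Hadoop version string such as '2.7.4'."""
    parts = hadoop_version.split(".")
    if len(parts) < 2:
        raise ValueError("Illegal Hadoop version: %r" % hadoop_version)
    major = int(parts[0])
    if major not in known:
        # A Hadoop 3.x version string ends up here while only 1 and 2 are known.
        raise ValueError(
            "Unrecognized Hadoop major version number: %s" % hadoop_version
        )
    return known[major]


if __name__ == "__main__":
    print(shim_for("2.7.4"))
    try:
        shim_for("3.1.0")
    except ValueError as err:
        print("rejected:", err)
    # Assumption: a newer loader maps major version 3 to an existing shim.
    print(shim_for("3.1.0", {**KNOWN_SHIMS, 3: "0.23"}))
```

In this model the lookup succeeds only once the table contains an entry for major version 3, which mirrors the motivation for moving to a built-in Hive release that recognizes Hadoop 3.x.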