[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java 17
[ https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Darek updated SPARK-37091:
--------------------------
    Description: 
Please bump Java version to <= 17 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
[PR|https://github.com/apache/spark/pull/34371] has been created for this issue already.

  was:
Please bump Java version to <= 17 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
[PR|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION#L16] has been created for this issue already.


> Bump SystemRequirements to use Java 17
> --------------------------------------
>
>                 Key: SPARK-37091
>                 URL: https://issues.apache.org/jira/browse/SPARK-37091
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SparkR
>    Affects Versions: 3.3.0
>            Reporter: Darek
>            Priority: Major
>              Labels: newbie
>             Fix For: 3.2.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Please bump Java version to <= 17 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]
> Currently it is set to be:
> {code:java}
> SystemRequirements: Java (>= 8, < 12){code}
> [PR|https://github.com/apache/spark/pull/34371] has been created for this issue already.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
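For context on what a `SystemRequirements: Java (>= 8, < N)` bound enforces: SparkR checks the installed JRE's version against this range at runtime. The sketch below is a hypothetical, self-contained version gate; the class and method names are illustrative and not SparkR's actual implementation (SparkR's check lives in its R sources, not Java). It handles both the legacy "1.8.0_292" and the modern "11.0.2" / "17" version-string schemes.

```java
// Hypothetical sketch of a Java-version gate like the one implied by a
// DESCRIPTION bound such as "Java (>= 8, < 18)". Names are illustrative;
// this is not SparkR's actual implementation.
public class JavaVersionGate {

    // Extract the major version from a java.version string, handling the
    // legacy "1.8.0_292" scheme (major is the second component) and the
    // modern "11.0.2" / "17" scheme (major is the first component).
    static int majorVersion(String version) {
        String[] parts = version.split("[._-]");
        int first = Integer.parseInt(parts[0]);
        return first == 1 ? Integer.parseInt(parts[1]) : first;
    }

    // True when the given version's major number lies in [min, max).
    static boolean satisfies(String version, int minInclusive, int maxExclusive) {
        int major = majorVersion(version);
        return major >= minInclusive && major < maxExclusive;
    }

    public static void main(String[] args) {
        String running = System.getProperty("java.version");
        System.out.println(running + " in [8, 18): " + satisfies(running, 8, 18));
    }
}
```

Under such a gate, Java 8 through 17 would pass a "(>= 8, < 18)" bound, while the current "(>= 8, < 12)" bound rejects Java 12 and newer, which is exactly the complaint in this ticket.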
[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java 17
[ https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Darek updated SPARK-37091:
--------------------------
    Description: 
Please bump Java version to <= 17 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
[PR|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION#L16] has been created for this issue already.

  was:
Please bump Java version to <= 17 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java 17
[ https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Darek updated SPARK-37091:
--------------------------
    Target Version/s: 3.3.0  (was: 3.2.0)
   Affects Version/s: 3.3.0  (was: 3.2.0)
         Description: 
Please bump Java version to <= 17 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}

  was:
Please bump Java version to > 11 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}

             Summary: Bump SystemRequirements to use Java 17  (was: Bump SystemRequirements to use Java > 11)
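Concretely, the requested edit to R/pkg/DESCRIPTION would relax the upper bound of the version range. A sketch of the resulting line follows; note that the new bound `< 18` is an assumption (it is one natural way to admit Java 17 in this syntax; these messages do not state the exact bound the PR chose):

```
SystemRequirements: Java (>= 8, < 18)
```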
[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java > 11
[ https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Darek updated SPARK-37091:
--------------------------
              Parent: SPARK-33772
          Issue Type: Sub-task  (was: Improvement)

> Bump SystemRequirements to use Java > 11
> ----------------------------------------
>
>                 Key: SPARK-37091
>                 URL: https://issues.apache.org/jira/browse/SPARK-37091
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SparkR
>    Affects Versions: 3.2.0
>            Reporter: Darek
>            Priority: Major
>              Labels: newbie
>             Fix For: 3.2.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Please bump Java version to > 11 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]
> Currently it is set to be:
> {code:java}
> SystemRequirements: Java (>= 8, < 12){code}
[jira] [Created] (SPARK-37091) Bump SystemRequirements to use Java > 11
Darek created SPARK-37091:
--------------------------

             Summary: Bump SystemRequirements to use Java > 11
                 Key: SPARK-37091
                 URL: https://issues.apache.org/jira/browse/SPARK-37091
             Project: Spark
          Issue Type: Improvement
          Components: SparkR
    Affects Versions: 3.2.0
            Reporter: Darek
             Fix For: 3.2.1


Please bump Java version to > 11 in [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767195#comment-16767195 ]

Darek commented on SPARK-23534:
-------------------------------

It's NOT just Hadoop; it's Java 1.8, which is EOL and needs to be upgraded, the Azure blob storage library, and a host of others. Life needs to move on if we're to continue using Spark; otherwise it's time to move to something that uses current technologies.

> Spark run on Hadoop 3.0.0
> -------------------------
>
>                 Key: SPARK-23534
>                 URL: https://issues.apache.org/jira/browse/SPARK-23534
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 2.3.0
>            Reporter: Saisai Shao
>            Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark run on Hadoop 3.0.
> The work includes:
> # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
> # Test to see if there's dependency issues with Hadoop 3.0.
> # Investigating the feasibility to use shaded client jars (HADOOP-11804).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2
[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522319#comment-16522319 ]

Darek commented on SPARK-23710:
-------------------------------

The work was done in [PR20659|https://github.com/apache/spark/pull/20659] but for some reason it was not merged in.

> Upgrade Hive to 2.3.2
> ---------------------
>
>                 Key: SPARK-23710
>                 URL: https://issues.apache.org/jira/browse/SPARK-23710
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 2.4.0
>            Reporter: Yuming Wang
>            Priority: Critical
>
> h1. Mainly changes
> * Maven dependency:
> hive.version from {{1.2.1.spark2}} to {{2.3.2}} and change {{hive.classifier}} to {{core}}
> calcite.version from {{1.2.0-incubating}} to {{1.10.0}}
> datanucleus-core.version from {{3.2.10}} to {{4.1.17}}
> remove {{orc.classifier}}, it means orc use the {{hive.storage.api}}, see: ORC-174
> add new dependency {{avatica}} and {{hive.storage.api}}
> * ORC compatibility changes:
> OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
> * hive-thriftserver java file update:
> update {{sql/hive-thriftserver/if/TCLIService.thrift}} to hive 2.3.2
> update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to hive 2.3.2
> * TestSuite should update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from org.apache.hive.service.cli.Type.NULL_TYPE to org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo change to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (This change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite|java.lang.ClassCastException: org.datanucleus.identity.DatastoreIdImpl cannot be cast to org.datanucleus.identity.OID|
> * Other changes:
> Close hive schema verification: [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251] and [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> Update [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]
> Because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't connect to Hive 1.x metastore, We should use {{HiveMetaStoreClient.getDelegationToken}} instead of {{Hive.getDelegationToken}} and update {{HiveClientImpl.toHiveTable}}
> All changes can be found at [PR-20659|https://github.com/apache/spark/pull/20659].
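The Maven changes listed at the top of the quoted description amount to bumping a handful of version properties in Spark's root pom.xml. A rough sketch follows; the property names mirror the description above, and the exact names and placement in Spark's actual pom may differ, so treat this as illustrative rather than the authoritative diff (which is PR-20659):

```xml
<!-- Illustrative pom.xml fragment only; see PR-20659 for the real diff. -->
<properties>
  <!-- hive.version: 1.2.1.spark2 -> 2.3.2, consumed with classifier "core" -->
  <hive.version>2.3.2</hive.version>
  <hive.classifier>core</hive.classifier>
  <!-- calcite.version: 1.2.0-incubating -> 1.10.0 -->
  <calcite.version>1.10.0</calcite.version>
  <!-- datanucleus-core.version: 3.2.10 -> 4.1.17 -->
  <datanucleus-core.version>4.1.17</datanucleus-core.version>
</properties>
```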
[jira] [Comment Edited] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466088#comment-16466088 ]

Darek edited comment on SPARK-18673 at 5/7/18 4:09 PM:
-------------------------------------------------------

[PR20819|https://github.com/apache/spark/pull/20819] for Spark => Hive 2.x was done but not merged and deleted.

was (Author: bidek):
PR20819 for Spark => Hive 2.x was done but not merged and deleted.

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> ------------------------------------------------------------------
>
>                 Key: SPARK-18673
>                 URL: https://issues.apache.org/jira/browse/SPARK-18673
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0
>         Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT
>            Reporter: Steve Loughran
>            Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it will need to be updated to match.
[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466088#comment-16466088 ]

Darek commented on SPARK-18673:
-------------------------------

PR20819 for Spark => Hive 2.x was done but not merged and deleted.
[jira] [Comment Edited] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465934#comment-16465934 ]

Darek edited comment on SPARK-18673 at 5/7/18 1:59 PM:
-------------------------------------------------------

Based on the recent PR, the community is moving toward Hadoop 3.1, why do you even bother with this ticket? Check the recent PR like SPARK-23807

was (Author: bidek):
Based on the recent PR, the community is moving toward Hadoop 3.1, why do you event bother with this ticket? Check the recent PR like SPARK-23807
[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465934#comment-16465934 ]

Darek commented on SPARK-18673:
-------------------------------

Based on the recent PR, the community is moving toward Hadoop 3.1, why do you even bother with this ticket? Check the recent PR like SPARK-23807
[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2
[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437224#comment-16437224 ]

Darek commented on SPARK-23710:
-------------------------------

Spark is on Hive 1.2 and there's no appetite in the community to merge [PR 20659|https://github.com/apache/spark/pull/20659], although the work on the upgrade has been completed.
[jira] [Commented] (SPARK-19076) Upgrade Hive dependence to Hive 2.x
[ https://issues.apache.org/jira/browse/SPARK-19076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417293#comment-16417293 ]

Darek commented on SPARK-19076:
-------------------------------

It's done and has passed all the tests; it just needs to be merged in. I'm not sure who can merge it; I have been asking for a while now, but no one is willing to step in and merge it. If you know anyone who can merge it, it would be a great help. Thanks

> Upgrade Hive dependence to Hive 2.x
> -----------------------------------
>
>                 Key: SPARK-19076
>                 URL: https://issues.apache.org/jira/browse/SPARK-19076
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Dapeng Sun
>            Priority: Major
>
> Currently the upstream Spark depends on Hive 1.2.1 to build package, and Hive 2.0 has been released in February 2016, Hive 2.0.1 and 2.1.0 also released for a long time, at Spark side, it is better to support Hive 2.0 and above.
[jira] [Commented] (SPARK-19076) Upgrade Hive dependence to Hive 2.x
[ https://issues.apache.org/jira/browse/SPARK-19076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417284#comment-16417284 ]

Darek commented on SPARK-19076:
-------------------------------

[PR 20659|https://github.com/apache/spark/pull/20659] for this issue already exists, just needs to be merged into master.
[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2
[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413234#comment-16413234 ]

Darek commented on SPARK-23710:
-------------------------------

It passed all the tests, and we need it for Hadoop 3.0; we need to merge the PR ASAP.
[jira] [Comment Edited] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408316#comment-16408316 ]

Darek edited comment on SPARK-23534 at 3/21/18 6:05 PM:
--------------------------------------------------------

It seems that the Hive upgrade to 2.3.2 is almost done ([SPARK-23710|https://issues.apache.org/jira/browse/SPARK-23710]); once it's done, hopefully Hadoop 3.0 will build.

was (Author: bidek):
It seems that Hive upgrade to 2.3.2 is almost done ( https://issues.apache.org/jira/browse/SPARK-23710 ), once it's done, hopefully Hadoop 3.0 will build.
[jira] [Comment Edited] (SPARK-23710) Upgrade Hive to 2.3.2
[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408324#comment-16408324 ]

Darek edited comment on SPARK-23710 at 3/21/18 6:04 PM:
--------------------------------------------------------

Can we merge PR 20659 into master? It's blocking a lot of tickets. Thanks
[SPARK-23534|https://issues.apache.org/jira/browse/SPARK-23534]
[SPARK-18673|https://issues.apache.org/jira/browse/SPARK-18673]

was (Author: bidek):
Can we merge PR 20659 into master? it's blocking a lot of tickets Thanks [#SPARK-23534] [#SPARK-18673]
[jira] [Comment Edited] (SPARK-23710) Upgrade Hive to 2.3.2
[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408324#comment-16408324 ] Darek edited comment on SPARK-23710 at 3/21/18 6:00 PM:

Can we merge PR 20659 into master? It's blocking a lot of tickets. Thanks. [#SPARK-23534] [#SPARK-18673]

> Upgrade Hive to 2.3.2
> ---------------------
>
> Key: SPARK-23710
> URL: https://issues.apache.org/jira/browse/SPARK-23710
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: Yuming Wang
> Priority: Major
>
> h1. Main changes
> * Maven dependencies:
> ** {{hive.version}} from {{1.2.1.spark2}} to {{2.3.2}}, and change {{hive.classifier}} to {{core}}
> ** {{calcite.version}} from {{1.2.0-incubating}} to {{1.10.0}}
> ** {{datanucleus-core.version}} from {{3.2.10}} to {{4.1.17}}
> ** remove {{orc.classifier}}, which means ORC uses {{hive.storage.api}} (see ORC-174)
> ** add new dependencies {{avatica}} and {{hive.storage.api}}
> * ORC compatibility changes: OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
> * hive-thriftserver Java file updates:
> ** update {{sql/hive-thriftserver/if/TCLIService.thrift}} to Hive 2.3.2
> ** update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to Hive 2.3.2
> * Test suites to update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from org.apache.hive.service.cli.Type.NULL_TYPE to org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo changed to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (this change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite|java.lang.ClassCastException: org.datanucleus.identity.DatastoreIdImpl cannot be cast to org.datanucleus.identity.OID|
> * Other changes:
> ** Disable Hive schema verification: [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251] and [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> ** Update [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]: because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't connect to a Hive 1.x metastore, we should use {{HiveMetaStoreClient.getDelegationToken}} instead of {{Hive.getDelegationToken}}, and update {{HiveClientImpl.toHiveTable}}
>
> All changes can be found at [PR-20659|https://github.com/apache/spark/pull/20659].

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
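The Maven dependency bumps listed in the description above can be sketched as POM property changes. This is an illustrative fragment only, not the actual diff (which lives in PR 20659); the property names follow the conventions Spark's root pom.xml uses for version properties, but the exact names and surrounding structure here are assumptions.

```xml
<!-- Illustrative sketch of the version-property bumps described in SPARK-23710. -->
<!-- The real change is in PR 20659 and may differ in detail. -->
<properties>
  <!-- was 1.2.1.spark2 (Spark's old Hive fork) -->
  <hive.version>2.3.2</hive.version>
  <!-- use the plain "core" artifacts instead of the forked ones -->
  <hive.classifier>core</hive.classifier>
  <!-- was 1.2.0-incubating -->
  <calcite.version>1.10.0</calcite.version>
  <!-- was 3.2.10 -->
  <datanucleus-core.version>4.1.17</datanucleus-core.version>
</properties>
```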
[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2
[ https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408324#comment-16408324 ] Darek commented on SPARK-23710:
---
Can we merge PR 20659 into master? It's blocking a lot of tickets. Thanks. https://issues.apache.org/jira/browse/SPARK-23534 https://issues.apache.org/jira/browse/SPARK-18673
[jira] [Comment Edited] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408316#comment-16408316 ] Darek edited comment on SPARK-23534 at 3/21/18 5:45 PM:

It seems the Hive upgrade to 2.3.2 is almost done (https://issues.apache.org/jira/browse/SPARK-23710); once it is merged, the Hadoop 3.0 build should hopefully succeed.

> Spark run on Hadoop 3.0.0
> -------------------------
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
> Issue Type: Improvement
> Components: Build
> Affects Versions: 2.3.0
> Reporter: Saisai Shao
> Priority: Major
>
> Major Hadoop vendors have already stepped, or will soon step, into Hadoop 3.0, so we should make sure Spark can run with it. This JIRA tracks the work to make Spark run on Hadoop 3.0. The work includes:
> # Add a new Hadoop 3.0.0 profile to make Spark buildable with Hadoop 3.0.
> # Test whether there are dependency issues with Hadoop 3.0.
> # Investigate the feasibility of using shaded client jars (HADOOP-11804).
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408316#comment-16408316 ] Darek commented on SPARK-23534:
---
It seems the Hive upgrade to 2.3.2 is almost done (https://issues.apache.org/jira/browse/SPARK-23710); once it is merged, the Hadoop 3.0 build should hopefully succeed.
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407247#comment-16407247 ] Darek commented on SPARK-23534:
---
https://github.com/Azure/azure-storage-java 7.0 will only work with org.apache.hadoop/hadoop-azure/3.0.0. I am wary of using an older version of azure-storage because of all the security issues that have been found and fixed in the newer versions, not to mention all the new features Azure has added in the last two years. Running old software against a public cloud is a bad idea.
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405653#comment-16405653 ] Darek commented on SPARK-23534:
---
It's not about pure Hadoop 3.0; it's about the rest of the libraries that require Hadoop 3.0 jars. For example, you cannot run Spark on Java 9 because of Hadoop 2.7, and Java 1.8 is very old by now. The new Azure Blob libraries won't work with Hadoop 2.7, and the list goes on. I tried building Spark with Hadoop 3.0, and the blocker is that Spark does not use mainstream Hive; it uses an old fork of Hive. We need to convince the community to switch to a current release of Hive instead of the old fork.
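For reference, the kind of build attempt the comment above describes might look like the following. The {{-Phadoop-3.0}} profile name is hypothetical (adding such a profile is exactly what this ticket tracks, so it did not exist yet); {{-Dhadoop.version}}, {{-Phive}}, {{-Phive-thriftserver}}, and {{-DskipTests}} are standard Spark Maven build flags.

```
# Hypothetical sketch: building Spark against Hadoop 3.0, from a Spark
# source checkout, assuming a hadoop-3.0 profile has been added.
./build/mvn -Phadoop-3.0 -Dhadoop.version=3.0.0 \
  -Phive -Phive-thriftserver \
  -DskipTests clean package
```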
[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405604#comment-16405604 ] Darek commented on SPARK-18673:
---
[~joshrosen] Would you know who could help add the current version of Hive to the Spark stack?

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> ------------------------------------------------------------------
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT
> Reporter: Steve Loughran
> Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x because hive.jar's shim loader considers 3.x to be an unknown Hadoop version. Hive itself will have to fix this; since Spark uses its own Hive 1.2.x JAR, that JAR will need to be updated to match.
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405592#comment-16405592 ] Darek commented on SPARK-23534:
---
There has been no progress on this ticket. Do you know anyone who can help with the upgrade?
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389629#comment-16389629 ] Darek commented on SPARK-23534:
---
We need a resolution sooner rather than later. Most vendors have already moved on to Hadoop 3.0, Java 9 has been out for six months, and Java 10 will be out in two weeks. Who can help move this issue forward? Thanks.
[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388757#comment-16388757 ] Darek commented on SPARK-18673:
---
When running the PySpark tests against Hadoop 3.0.0, I no longer get the java.lang.IllegalArgumentException, but I do get ClassNotFoundException: org.apache.hadoop.hive.ql.metadata.HiveException. Who can help move this ticket forward? Thanks.
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388017#comment-16388017 ] Darek commented on SPARK-23534:
---
SPARK-18673 should be closed, since HIVE-15016 and HIVE-18550 are closed. There should be no blockers at this point.
[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388014#comment-16388014 ] Darek commented on SPARK-18673:
---
The Hive tickets are already closed; can we close this ticket?