[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java 17

2021-10-22 Thread Darek (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darek updated SPARK-37091:
--
Description: 
Please bump Java version to <= 17 in 
[DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
 [PR|https://github.com/apache/spark/pull/34371] has been created for this 
issue already.

  was:
Please bump Java version to <= 17 in 
[DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
 
[PR|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION#L16]
 has been created for this issue already.


> Bump SystemRequirements to use Java 17
> --
>
> Key: SPARK-37091
> URL: https://issues.apache.org/jira/browse/SPARK-37091
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Affects Versions: 3.3.0
>Reporter: Darek
>Priority: Major
>  Labels: newbie
> Fix For: 3.2.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Please bump Java version to <= 17 in 
> [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]
> Currently it is set to be:
> {code:java}
> SystemRequirements: Java (>= 8, < 12){code}
>  [PR|https://github.com/apache/spark/pull/34371] has been created for this 
> issue already.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java 17

2021-10-22 Thread Darek (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darek updated SPARK-37091:
--
Description: 
Please bump Java version to <= 17 in 
[DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
 
[PR|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION#L16]
 has been created for this issue already.

  was:
Please bump Java version to <= 17 in 
[DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
 


> Bump SystemRequirements to use Java 17
> --
>
> Key: SPARK-37091
> URL: https://issues.apache.org/jira/browse/SPARK-37091
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Affects Versions: 3.3.0
>Reporter: Darek
>Priority: Major
>  Labels: newbie
> Fix For: 3.2.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Please bump Java version to <= 17 in 
> [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]
> Currently it is set to be:
> {code:java}
> SystemRequirements: Java (>= 8, < 12){code}
>  
> [PR|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION#L16]
>  has been created for this issue already.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java 17

2021-10-22 Thread Darek (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darek updated SPARK-37091:
--
 Target Version/s: 3.3.0  (was: 3.2.0)
Affects Version/s: (was: 3.2.0)
   3.3.0
  Description: 
Please bump Java version to <= 17 in 
[DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
 

  was:
Please bump Java version to > 11 in 
[DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
 

  Summary: Bump SystemRequirements to use Java 17  (was: Bump 
SystemRequirements to use Java > 11)

> Bump SystemRequirements to use Java 17
> --
>
> Key: SPARK-37091
> URL: https://issues.apache.org/jira/browse/SPARK-37091
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Affects Versions: 3.3.0
>Reporter: Darek
>Priority: Major
>  Labels: newbie
> Fix For: 3.2.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Please bump Java version to <= 17 in 
> [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]
> Currently it is set to be:
> {code:java}
> SystemRequirements: Java (>= 8, < 12){code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-37091) Bump SystemRequirements to use Java > 11

2021-10-22 Thread Darek (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-37091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Darek updated SPARK-37091:
--
Parent: SPARK-33772
Issue Type: Sub-task  (was: Improvement)

> Bump SystemRequirements to use Java > 11
> 
>
> Key: SPARK-37091
> URL: https://issues.apache.org/jira/browse/SPARK-37091
> Project: Spark
>  Issue Type: Sub-task
>  Components: SparkR
>Affects Versions: 3.2.0
>Reporter: Darek
>Priority: Major
>  Labels: newbie
> Fix For: 3.2.1
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Please bump Java version to > 11 in 
> [DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]
> Currently it is set to be:
> {code:java}
> SystemRequirements: Java (>= 8, < 12){code}
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-37091) Bump SystemRequirements to use Java > 11

2021-10-21 Thread Darek (Jira)
Darek created SPARK-37091:
-

 Summary: Bump SystemRequirements to use Java > 11
 Key: SPARK-37091
 URL: https://issues.apache.org/jira/browse/SPARK-37091
 Project: Spark
  Issue Type: Improvement
  Components: SparkR
Affects Versions: 3.2.0
Reporter: Darek
 Fix For: 3.2.1


Please bump Java version to > 11 in 
[DESCRIPTION|https://github.com/apache/spark/blob/f9f95686cb397271f55aaff29ec4352b4ef9aade/R/pkg/DESCRIPTION]

Currently it is set to be:
{code:java}
SystemRequirements: Java (>= 8, < 12){code}
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2019-02-13 Thread Darek (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16767195#comment-16767195
 ] 

Darek commented on SPARK-23534:
---

It's NOT just Hadoop, it's Java 1.8 which is EOL and needs to be upgraded, 
Azure blob storage library and host of others. Life needs to move on if were to 
continue using Spark, otherwise it's time to move to something that uses 
current technologies.

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2

2018-06-25 Thread Darek (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16522319#comment-16522319
 ] 

Darek commented on SPARK-23710:
---

The work was done in [PR20659|https://github.com/apache/spark/pull/20659] but 
for some reason it was not merged in.

> Upgrade Hive to 2.3.2
> -
>
> Key: SPARK-23710
> URL: https://issues.apache.org/jira/browse/SPARK-23710
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuming Wang
>Priority: Critical
>
> h1. Mainly changes
>  * Maven dependency:
>  hive.version from {{1.2.1.spark2}} to {{2.3.2}} and change 
> {{hive.classifier}} to {{core}}
>  calcite.version from {{1.2.0-incubating}} to {{1.10.0}}
>  datanucleus-core.version from {{3.2.10}} to {{4.1.17}}
>  remove {{orc.classifier}}, it means orc use the {{hive.storage.api}}, see: 
> ORC-174
>  add new dependency {{avatica}} and {{hive.storage.api}}
>  * ORC compatibility changes:
>  OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, 
> OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
>  * hive-thriftserver java file update:
>  update {{sql/hive-thriftserver/if/TCLIService.thrift}} to hive 2.3.2
>  update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to 
> hive 2.3.2
>  * TestSuite should update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, 
> SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to 
> hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from 
> org.apache.hive.service.cli.Type.NULL_TYPE to 
> org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo 
> change to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") 
> to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (This change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite| java.lang.ClassCastException: 
> org.datanucleus.identity.DatastoreIdImpl cannot be cast to 
> org.datanucleus.identity.OID|
>  * Other changes:
> Close hive schema verification:  
> [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251]
>  and 
> [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> Update 
> [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]
> Because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't 
> connect to Hive 1.x metastore, We should use 
> {{HiveMetaStoreClient.getDelegationToken}} instead of 
> {{Hive.getDelegationToken}} and update {{HiveClientImpl.toHiveTable}}
> All changes can be found at 
> [PR-20659|https://github.com/apache/spark/pull/20659].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-05-07 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466088#comment-16466088
 ] 

Darek edited comment on SPARK-18673 at 5/7/18 4:09 PM:
---

[PR20819|https://github.com/apache/spark/pull/20819] for Spark => Hive 2.x was 
done but not merged and deleted.


was (Author: bidek):
PR20819 for Spark => Hive 2.x was done but not merged and deleted.

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> --
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>Reporter: Steve Loughran
>Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader 
> considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it 
> will need to be updated to match.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-05-07 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16466088#comment-16466088
 ] 

Darek commented on SPARK-18673:
---

PR20819 for Spark => Hive 2.x was done but not merged and deleted.

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> --
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>Reporter: Steve Loughran
>Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader 
> considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it 
> will need to be updated to match.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-05-07 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465934#comment-16465934
 ] 

Darek edited comment on SPARK-18673 at 5/7/18 1:59 PM:
---

Based on the recent PR, the community is moving toward Hadoop 3.1, why do you 
even bother with this ticket? Check the recent PR like SPARK-23807


was (Author: bidek):
Based on the recent PR, the community is moving toward Hadoop 3.1, why do you 
event bother with this ticket? Check the recent PR like SPARK-23807

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> --
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>Reporter: Steve Loughran
>Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader 
> considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it 
> will need to be updated to match.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-05-07 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465934#comment-16465934
 ] 

Darek commented on SPARK-18673:
---

Based on the recent PR, the community is moving toward Hadoop 3.1, why do you 
event bother with this ticket? Check the recent PR like SPARK-23807

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> --
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>Reporter: Steve Loughran
>Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader 
> considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it 
> will need to be updated to match.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2

2018-04-13 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16437224#comment-16437224
 ] 

Darek commented on SPARK-23710:
---

Spark is on Hive 1.2 and there's no appetite in the community to merge [PR 
20659|https://github.com/apache/spark/pull/20659], although the work on the 
upgrade has been completed.

> Upgrade Hive to 2.3.2
> -
>
> Key: SPARK-23710
> URL: https://issues.apache.org/jira/browse/SPARK-23710
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuming Wang
>Priority: Major
>
> h1. Mainly changes
>  * Maven dependency:
>  hive.version from {{1.2.1.spark2}} to {{2.3.2}} and change 
> {{hive.classifier}} to {{core}}
>  calcite.version from {{1.2.0-incubating}} to {{1.10.0}}
>  datanucleus-core.version from {{3.2.10}} to {{4.1.17}}
>  remove {{orc.classifier}}, it means orc use the {{hive.storage.api}}, see: 
> ORC-174
>  add new dependency {{avatica}} and {{hive.storage.api}}
>  * ORC compatibility changes:
>  OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, 
> OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
>  * hive-thriftserver java file update:
>  update {{sql/hive-thriftserver/if/TCLIService.thrift}} to hive 2.3.2
>  update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to 
> hive 2.3.2
>  * TestSuite should update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, 
> SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to 
> hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from 
> org.apache.hive.service.cli.Type.NULL_TYPE to 
> org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo 
> change to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") 
> to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (This change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite| java.lang.ClassCastException: 
> org.datanucleus.identity.DatastoreIdImpl cannot be cast to 
> org.datanucleus.identity.OID|
>  * Other changes:
> Close hive schema verification:  
> [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251]
>  and 
> [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> Update 
> [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]
> Because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't 
> connect to Hive 1.x metastore, We should use 
> {{HiveMetaStoreClient.getDelegationToken}} instead of 
> {{Hive.getDelegationToken}} and update {{HiveClientImpl.toHiveTable}}
> All changes can be found at 
> [PR-20659|https://github.com/apache/spark/pull/20659].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19076) Upgrade Hive dependence to Hive 2.x

2018-03-28 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417293#comment-16417293
 ] 

Darek commented on SPARK-19076:
---

It's done, passed all the tests, just needs to be merged in, not sure who can 
merge it, I have been asking for a while now, but no one is willing to step in 
and merge it. If you know know anyone who can merge it, it would be great help. 
Thanks

> Upgrade Hive dependence to Hive 2.x
> ---
>
> Key: SPARK-19076
> URL: https://issues.apache.org/jira/browse/SPARK-19076
> Project: Spark
>  Issue Type: Improvement
>Reporter: Dapeng Sun
>Priority: Major
>
> Currently the upstream Spark depends on Hive 1.2.1 to build package, and Hive 
> 2.0 has been released in February 2016, Hive 2.0.1 and 2.1.0  also released 
> for a long time, at Spark side, it is better to support Hive 2.0 and above.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-19076) Upgrade Hive dependence to Hive 2.x

2018-03-28 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417284#comment-16417284
 ] 

Darek commented on SPARK-19076:
---

[PR 20659|https://github.com/apache/spark/pull/20659] for this issue already 
exists, just needs to be merged into master.

> Upgrade Hive dependence to Hive 2.x
> ---
>
> Key: SPARK-19076
> URL: https://issues.apache.org/jira/browse/SPARK-19076
> Project: Spark
>  Issue Type: Improvement
>Reporter: Dapeng Sun
>Priority: Major
>
> Currently the upstream Spark depends on Hive 1.2.1 to build package, and Hive 
> 2.0 has been released in February 2016, Hive 2.0.1 and 2.1.0  also released 
> for a long time, at Spark side, it is better to support Hive 2.0 and above.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2

2018-03-25 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413234#comment-16413234
 ] 

Darek commented on SPARK-23710:
---

It passed all the tests and we need it for Hadoop 3.0, we need to merge the PR 
ASAP.

> Upgrade Hive to 2.3.2
> -
>
> Key: SPARK-23710
> URL: https://issues.apache.org/jira/browse/SPARK-23710
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuming Wang
>Priority: Major
>
> h1. Mainly changes
>  * Maven dependency:
>  hive.version from {{1.2.1.spark2}} to {{2.3.2}} and change 
> {{hive.classifier}} to {{core}}
>  calcite.version from {{1.2.0-incubating}} to {{1.10.0}}
>  datanucleus-core.version from {{3.2.10}} to {{4.1.17}}
>  remove {{orc.classifier}}, it means orc use the {{hive.storage.api}}, see: 
> ORC-174
>  add new dependency {{avatica}} and {{hive.storage.api}}
>  * ORC compatibility changes:
>  OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, 
> OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
>  * hive-thriftserver java file update:
>  update {{sql/hive-thriftserver/if/TCLIService.thrift}} to hive 2.3.2
>  update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to 
> hive 2.3.2
>  * TestSuite should update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, 
> SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to 
> hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from 
> org.apache.hive.service.cli.Type.NULL_TYPE to 
> org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo 
> change to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") 
> to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (This change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite| java.lang.ClassCastException: 
> org.datanucleus.identity.DatastoreIdImpl cannot be cast to 
> org.datanucleus.identity.OID|
>  * Other changes:
> Close hive schema verification:  
> [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251]
>  and 
> [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> Update 
> [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]
> Because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't 
> connect to Hive 1.x metastore, We should use 
> {{HiveMetaStoreClient.getDelegationToken}} instead of 
> {{Hive.getDelegationToken}} and update {{HiveClientImpl.toHiveTable}}
> All changes can be found at 
> [PR-20659|https://github.com/apache/spark/pull/20659].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-21 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408316#comment-16408316
 ] 

Darek edited comment on SPARK-23534 at 3/21/18 6:05 PM:


It seems that Hive upgrade to 2.3.2 is almost done 
[SPARK-23710|https://issues.apache.org/jira/browse/SPARK-23710]
, once it's done, hopefully Hadoop 3.0 will build.


was (Author: bidek):
It seems that Hive upgrade to 2.3.2 is almost done ( 
https://issues.apache.org/jira/browse/SPARK-23710 ), once it's done, hopefully 
Hadoop 3.0 will build.

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23710) Upgrade Hive to 2.3.2

2018-03-21 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408324#comment-16408324
 ] 

Darek edited comment on SPARK-23710 at 3/21/18 6:04 PM:


Can we merge PR 20659 into master? it's blocking a lot of tickets Thanks
[SPARK-23534|https://issues.apache.org/jira/browse/SPARK-23534]
[SPARK-18673|https://issues.apache.org/jira/browse/SPARK-18673]


was (Author: bidek):
Can we merge PR 20659 into master? it's blocking a lot of tickets Thanks

[#SPARK-23534]
[#SPARK-18673]

> Upgrade Hive to 2.3.2
> -
>
> Key: SPARK-23710
> URL: https://issues.apache.org/jira/browse/SPARK-23710
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuming Wang
>Priority: Major
>
> h1. Mainly changes
>  * Maven dependency:
>  hive.version from {{1.2.1.spark2}} to {{2.3.2}} and change 
> {{hive.classifier}} to {{core}}
>  calcite.version from {{1.2.0-incubating}} to {{1.10.0}}
>  datanucleus-core.version from {{3.2.10}} to {{4.1.17}}
>  remove {{orc.classifier}}, it means orc use the {{hive.storage.api}}, see: 
> ORC-174
>  add new dependency {{avatica}} and {{hive.storage.api}}
>  * ORC compatibility changes:
>  OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, 
> OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
>  * hive-thriftserver java file update:
>  update {{sql/hive-thriftserver/if/TCLIService.thrift}} to hive 2.3.2
>  update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to 
> hive 2.3.2
>  * TestSuite should update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, 
> SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to 
> hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from 
> org.apache.hive.service.cli.Type.NULL_TYPE to 
> org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo 
> change to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") 
> to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (This change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite| java.lang.ClassCastException: 
> org.datanucleus.identity.DatastoreIdImpl cannot be cast to 
> org.datanucleus.identity.OID|
>  * Other changes:
> Close hive schema verification:  
> [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251]
>  and 
> [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> Update 
> [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]
> Because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't 
> connect to Hive 1.x metastore, We should use 
> {{HiveMetaStoreClient.getDelegationToken}} instead of 
> {{Hive.getDelegationToken}} and update {{HiveClientImpl.toHiveTable}}
> All changes can be found at 
> [PR-20659|https://github.com/apache/spark/pull/20659].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23710) Upgrade Hive to 2.3.2

2018-03-21 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408324#comment-16408324
 ] 

Darek edited comment on SPARK-23710 at 3/21/18 6:00 PM:


Can we merge PR 20659 into master? it's blocking a lot of tickets Thanks

[#SPARK-23534]
[#SPARK-18673]


was (Author: bidek):
Can we merge PR 20659 into master? it's blocking a lot of tickets Thanks

https://issues.apache.org/jira/browse/SPARK-23534
https://issues.apache.org/jira/browse/SPARK-18673

> Upgrade Hive to 2.3.2
> -
>
> Key: SPARK-23710
> URL: https://issues.apache.org/jira/browse/SPARK-23710
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuming Wang
>Priority: Major
>
> h1. Mainly changes
>  * Maven dependency:
>  hive.version from {{1.2.1.spark2}} to {{2.3.2}} and change 
> {{hive.classifier}} to {{core}}
>  calcite.version from {{1.2.0-incubating}} to {{1.10.0}}
>  datanucleus-core.version from {{3.2.10}} to {{4.1.17}}
>  remove {{orc.classifier}}, it means orc use the {{hive.storage.api}}, see: 
> ORC-174
>  add new dependency {{avatica}} and {{hive.storage.api}}
>  * ORC compatibility changes:
>  OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, 
> OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
>  * hive-thriftserver java file update:
>  update {{sql/hive-thriftserver/if/TCLIService.thrift}} to hive 2.3.2
>  update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to 
> hive 2.3.2
>  * TestSuite should update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, 
> SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to 
> hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from 
> org.apache.hive.service.cli.Type.NULL_TYPE to 
> org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo 
> change to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") 
> to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (This change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite| java.lang.ClassCastException: 
> org.datanucleus.identity.DatastoreIdImpl cannot be cast to 
> org.datanucleus.identity.OID|
>  * Other changes:
> Close hive schema verification:  
> [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251]
>  and 
> [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> Update 
> [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]
> Because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't 
> connect to Hive 1.x metastore, We should use 
> {{HiveMetaStoreClient.getDelegationToken}} instead of 
> {{Hive.getDelegationToken}} and update {{HiveClientImpl.toHiveTable}}
> All changes can be found at 
> [PR-20659|https://github.com/apache/spark/pull/20659].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23710) Upgrade Hive to 2.3.2

2018-03-21 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408324#comment-16408324
 ] 

Darek commented on SPARK-23710:
---

Can we merge PR 20659 into master? it's blocking a lot of tickets Thanks

https://issues.apache.org/jira/browse/SPARK-23534
https://issues.apache.org/jira/browse/SPARK-18673

> Upgrade Hive to 2.3.2
> -
>
> Key: SPARK-23710
> URL: https://issues.apache.org/jira/browse/SPARK-23710
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.4.0
>Reporter: Yuming Wang
>Priority: Major
>
> h1. Mainly changes
>  * Maven dependency:
>  hive.version from {{1.2.1.spark2}} to {{2.3.2}} and change 
> {{hive.classifier}} to {{core}}
>  calcite.version from {{1.2.0-incubating}} to {{1.10.0}}
>  datanucleus-core.version from {{3.2.10}} to {{4.1.17}}
>  remove {{orc.classifier}}, it means orc use the {{hive.storage.api}}, see: 
> ORC-174
>  add new dependency {{avatica}} and {{hive.storage.api}}
>  * ORC compatibility changes:
>  OrcColumnVector.java, OrcColumnarBatchReader.java, OrcDeserializer.scala, 
> OrcFilters.scala, OrcSerializer.scala, OrcFilterSuite.scala
>  * hive-thriftserver java file update:
>  update {{sql/hive-thriftserver/if/TCLIService.thrift}} to hive 2.3.2
>  update {{sql/hive-thriftserver/src/main/java/org/apache/hive/service/*}} to 
> hive 2.3.2
>  * TestSuite should update:
> ||TestSuite||Reason||
> |StatisticsSuite|HIVE-16098|
> |SessionCatalogSuite|Similar to [VersionsSuite.scala#L427|#L427]|
> |CliSuite, HiveThriftServer2Suites, HiveSparkSubmitSuite, HiveQuerySuite, 
> SQLQuerySuite|Update hive-hcatalog-core-0.13.1.jar to 
> hive-hcatalog-core-2.3.2.jar|
> |SparkExecuteStatementOperationSuite|Interface changed from 
> org.apache.hive.service.cli.Type.NULL_TYPE to 
> org.apache.hadoop.hive.serde2.thrift.Type.NULL_TYPE|
> |ClasspathDependenciesSuite|org.apache.hive.com.esotericsoftware.kryo.Kryo 
> change to com.esotericsoftware.kryo.Kryo|
> |HiveMetastoreCatalogSuite|Result format changed from Seq("1.1\t1", "2.1\t2") 
> to Seq("1.100\t1", "2.100\t2")|
> |HiveOrcFilterSuite|Result format changed|
> |HiveDDLSuite|Remove $ (This change needs to be reconsidered)|
> |HiveExternalCatalogVersionsSuite| java.lang.ClassCastException: 
> org.datanucleus.identity.DatastoreIdImpl cannot be cast to 
> org.datanucleus.identity.OID|
>  * Other changes:
> Close hive schema verification:  
> [HiveClientImpl.scala#L251|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L251]
>  and 
> [HiveExternalCatalog.scala#L58|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala#L58]
> Update 
> [IsolatedClientLoader.scala#L189-L192|https://github.com/wangyum/spark/blob/75e4cc9e80f85517889e87a35da117bc361f2ff3/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala#L189-L192]
> Because Hive 2.3.2's {{org.apache.hadoop.hive.ql.metadata.Hive}} can't 
> connect to Hive 1.x metastore, We should use 
> {{HiveMetaStoreClient.getDelegationToken}} instead of 
> {{Hive.getDelegationToken}} and update {{HiveClientImpl.toHiveTable}}
> All changes can be found at 
> [PR-20659|https://github.com/apache/spark/pull/20659].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-21 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408316#comment-16408316
 ] 

Darek edited comment on SPARK-23534 at 3/21/18 5:45 PM:


It seems that Hive upgrade to 2.3.2 is almost done ( 
https://issues.apache.org/jira/browse/SPARK-23710 ), once it's done, hopefully 
Hadoop 3.0 will build.


was (Author: bidek):
It seems that Hive upgrade to 2.3.2 is almost done 
(https://issues.apache.org/jira/browse/SPARK-23710), once it's done, hopefully 
Hadoop 3.0 will build.

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-21 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16408316#comment-16408316
 ] 

Darek commented on SPARK-23534:
---

It seems that Hive upgrade to 2.3.2 is almost done 
(https://issues.apache.org/jira/browse/SPARK-23710), once it's done, hopefully 
Hadoop 3.0 will build.

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-20 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407247#comment-16407247
 ] 

Darek commented on SPARK-23534:
---

https://github.com/Azure/azure-storage-java 7.0 will only work with 
org.apache.hadoop/hadoop-azure/3.0.0.

I am afraid of using of older version of azure-storage because of all the 
security issues that have been found and fixed in the newer version, not to 
mention all the new features that Azure has added in the last 2 years. Using 
old software and public cloud = bad idea.



> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-19 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405653#comment-16405653
 ] 

Darek commented on SPARK-23534:
---

It's not about pure Hadoop 3.0, it's about the rest of the libraries that 
require Hadoop 3.0 jars.
For example you can not run Spark on Java 9 because of Hadoop 2.7 and Java 1.8 
is very old by now.
New Azure Blob libraries won't work with Hadoop 2.7 and list goes on.

I tried building Spark with Hadoop 3.0 and the blocker is the fact that Spark 
is not using mainstream Hive, it uses an old fork of Hive. Need to convince the 
community that we need to switch to current release of Hive instead of the old 
fork. 

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-03-19 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405604#comment-16405604
 ] 

Darek commented on SPARK-18673:
---

[~joshrosen] Would you know who can help adding the current version of HIVE to 
Spark stack?

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> --
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>Reporter: Steve Loughran
>Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader 
> considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it 
> will need to be updated to match.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-19 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16405592#comment-16405592
 ] 

Darek commented on SPARK-23534:
---

There has been no progress on this ticket, do you know anyone that can help 
with the upgrade?

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-07 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16389629#comment-16389629
 ] 

Darek commented on SPARK-23534:
---

We need a resolution sooner than later. Most vendors have moved on to Hadoop 
3.0 already, Java 9.0 has been out for 6 months now, Java 10 will be out in 2 
weeks. Who can help to move this issue forward? Thanks

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-03-06 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388757#comment-16388757
 ] 

Darek commented on SPARK-18673:
---

When running the pyspark tests using Hadoop 3.0.0 I am not getting the 
java.lang.IllegalArgumentException but I am getting ClassNotFoundException: 
org.apache.hadoop.hive.sql.metadata.HiveException.

Who can help to move this ticket forward?

Thanks

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> --
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>Reporter: Steve Loughran
>Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader 
> considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it 
> will need to be updated to match.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0

2018-03-06 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388017#comment-16388017
 ] 

Darek commented on SPARK-23534:
---

SPARK-18673 should be closed since HIVE-15016 and HIVE-18550 are closed.

There should be not blockers are this point.

> Spark run on Hadoop 3.0.0
> -
>
> Key: SPARK-23534
> URL: https://issues.apache.org/jira/browse/SPARK-23534
> Project: Spark
>  Issue Type: Improvement
>  Components: Build
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make 
> sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark 
> run on Hadoop 3.0.
> The work includes:
>  # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0.
>  # Test to see if there's dependency issues with Hadoop 3.0.
>  # Investigating the feasibility to use shaded client jars (HADOOP-11804).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-03-06 Thread Darek (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388014#comment-16388014
 ] 

Darek commented on SPARK-18673:
---

HIVE tickets are closed already, can we close this ticket?

> Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version
> --
>
> Key: SPARK-18673
> URL: https://issues.apache.org/jira/browse/SPARK-18673
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.1.0
> Environment: Spark built with -Dhadoop.version=3.0.0-alpha2-SNAPSHOT 
>Reporter: Steve Loughran
>Priority: Major
>
> Spark Dataframes fail to run on Hadoop 3.0.x, because hive.jar's shimloader 
> considers 3.x to be an unknown Hadoop version.
> Hive itself will have to fix this; as Spark uses its own hive 1.2.x JAR, it 
> will need to be updated to match.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org