[jira] [Commented] (SPARK-23807) Add Hadoop 3 profile with relevant POM fix ups, cloud-storage artifacts and binding

2018-03-29 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419017#comment-16419017
 ] 

Steve Loughran commented on SPARK-23807:


Yes, this profile is part of the Hadoop 3 support: "necessary but not
sufficient", given the Hive issue.

> Add Hadoop 3 profile with relevant POM fix ups, cloud-storage artifacts and 
> binding
> ---
>
> Key: SPARK-23807
> URL: https://issues.apache.org/jira/browse/SPARK-23807
> Project: Spark
> Issue Type: Sub-task
> Components: Build
> Affects Versions: 2.4.0
> Reporter: Steve Loughran
> Priority: Major
>
> Hadoop 3, and in particular Hadoop 3.1, adds:
>  * Java 8 as the minimum (and currently sole) supported Java version
>  * A new "hadoop-cloud-storage" module, intended to be a minimal dependency
> POM for all the cloud connectors in the version of Hadoop built against
>  * The ability to declare a committer for any FileOutputFormat which
> supersedes the classic FileOutputCommitter, both for a job and for a specific
> FS URI
>  * A shaded client JAR, though not yet one complete enough for Spark
>  * Lots of other features and fixes
> The basic work of building Spark with Hadoop 3 is just doing the build
> with {{-Dhadoop.version=3.x.y}}; however, that
>  * Doesn't build on SBT (dependency resolution of the ZooKeeper JAR)
>  * Misses the new cloud features
> The ZK dependency can be fixed everywhere by explicitly declaring the ZK
> artifact instead of relying on Curator to pull it in; this needs a profile
> to declare the right ZK version.
> To use the cloud features, the hadoop-3 profile should declare that the
> spark-hadoop-cloud module depends on (and only on) the
> hadoop/hadoop-cloud-storage module for its transitive dependencies on cloud
> storage, and should add a source package which is only built and tested when
> building against Hadoop 3.1+
>  
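
The POM changes described above could be sketched roughly as follows. This is an illustration only, not the contents of the actual pull request: the profile id hadoop-3 comes from the description, while the property names (hadoop.version, zookeeper.version), the version numbers, and the split between the root POM and the spark-hadoop-cloud module POM are assumptions.

```xml
<!-- Root POM (assumed location): a hadoop-3 profile that pins the Hadoop
     version and declares ZooKeeper explicitly, instead of relying on
     Curator to pull it in transitively. -->
<profile>
  <id>hadoop-3</id>
  <properties>
    <!-- Placeholder versions; choose the ZooKeeper release that matches
         the Hadoop release being built against. -->
    <hadoop.version>3.1.0</hadoop.version>
    <zookeeper.version>3.4.9</zookeeper.version>
  </properties>
  <dependencies>
    <dependency>
      <groupId>org.apache.zookeeper</groupId>
      <artifactId>zookeeper</artifactId>
      <version>${zookeeper.version}</version>
    </dependency>
  </dependencies>
</profile>

<!-- spark-hadoop-cloud module POM (assumed location): under the same
     profile, depend only on hadoop-cloud-storage and let it bring in the
     cloud connectors transitively. -->
<profile>
  <id>hadoop-3</id>
  <dependencies>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-cloud-storage</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
</profile>
```

With a profile along these lines, the build would be selected with -Phadoop-3 rather than by passing -Dhadoop.version alone, so the ZooKeeper version and the cloud-storage dependency travel together with the Hadoop version.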



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23807) Add Hadoop 3 profile with relevant POM fix ups, cloud-storage artifacts and binding

2018-03-28 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418298#comment-16418298
 ] 

Saisai Shao commented on SPARK-23807:
-

Hi [~ste...@apache.org], this is a dup of SPARK-23534. I think we can convert
this to a subtask of SPARK-23534; what do you think?




[jira] [Commented] (SPARK-23807) Add Hadoop 3 profile with relevant POM fix ups, cloud-storage artifacts and binding

2018-03-28 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417864#comment-16417864
 ] 

Apache Spark commented on SPARK-23807:
--

User 'steveloughran' has created a pull request for this issue:
https://github.com/apache/spark/pull/20923
