[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361673#comment-14361673 ]

Apache Spark commented on SPARK-5134:
-------------------------------------

User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/5027

> Bump default Hadoop version to 2+
> ---------------------------------
>
>                 Key: SPARK-5134
>                 URL: https://issues.apache.org/jira/browse/SPARK-5134
>             Project: Spark
>          Issue Type: Improvement
>          Components: Build
>    Affects Versions: 1.2.0
>            Reporter: Ryan Williams
>            Priority: Minor
>
> [~srowen] and I discussed bumping [the default hadoop version in the parent POM|https://github.com/apache/spark/blob/bb38ebb1abd26b57525d7d29703fd449e40cd6de/pom.xml#L122] from {{1.0.4}} to something more recent. There doesn't seem to be a good reason that it was set/kept at {{1.0.4}}.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14353047#comment-14353047 ]

Sean Owen commented on SPARK-5134:
----------------------------------

Yep, I confirmed that ...

{code}
[INFO] \- org.apache.spark:spark-core_2.10:jar:1.2.1:compile
...
[INFO]    +- org.apache.hadoop:hadoop-client:jar:2.2.0:compile
[INFO]    |  +- org.apache.hadoop:hadoop-common:jar:2.2.0:compile
[INFO]    |  |  +- commons-cli:commons-cli:jar:1.2:compile
...
{code}

Well, FWIW, although unintentional, I do think there are upsides to this change. It would be good to codify that in the build, I suppose, by updating the default version number. How about updating to 2.2.0 to match what has actually happened? This would not entail activating the Hadoop build profiles by default or anything. [~rdub], would you care to do the honors?
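For concreteness, the change being discussed amounts to bumping the {{hadoop.version}} property in the parent pom.xml (the property name comes from the POM line linked in the issue description; surrounding elements are elided here, so treat this as a sketch rather than the exact diff):

{code:xml}
<!-- pom.xml (parent POM): change the default Hadoop version -->
<properties>
  ...
  <hadoop.version>2.2.0</hadoop.version>
  ...
</properties>
{code}

As noted above, this only changes the default; the hadoop-2.x build profiles would still be opt-in.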
[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352268#comment-14352268 ]

Patrick Wendell commented on SPARK-5134:
----------------------------------------

Hey [~rdub] [~srowen],

As part of the 1.3 release cycle I did some more forensics on the actual artifacts we publish. It turns out that, because of the changes made for Scala 2.11 combined with the way our publishing works, we've actually been publishing poms that link against Hadoop 2.2 as of Spark 1.2. And in general, the published pom's Hadoop version is now decoupled from the default one in the build itself, because of our use of the effective-pom plugin.

https://github.com/apache/spark/blob/master/dev/create-release/create-release.sh#L119

I'm actually a bit bummed that we (unintentionally) made this change in 1.2, because I do fear it likely screwed things up for some users. But on the plus side, since we now decouple the publishing from the default version in the pom, I don't see a big issue with updating the POM. So I withdraw my objection on the PR.
[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352317#comment-14352317 ]

Shivaram Venkataraman commented on SPARK-5134:
----------------------------------------------

Yeah, so this did change in 1.2, and I think I mentioned it to Patrick when it affected a couple of other projects of mine. The main problem there was that even if you have an explicit Hadoop 1 dependency in your project, SBT picks up the highest version required while building an assembly jar for the project -- thus, with Spark linked against Hadoop 2.2, one would require an exclusion rule to use Hadoop 1.

It might be good to add this to the docs, or to some of the example Quick Start documentation we have.
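To make the exclusion-rule workaround concrete, here is a minimal build.sbt sketch. The version numbers and the choice to exclude only {{hadoop-client}} (which carries the other {{hadoop-*}} artifacts transitively) are illustrative assumptions, not a tested recipe:

{code}
// build.sbt -- link against Hadoop 1 despite Spark's transitive Hadoop 2.2 dependency
libraryDependencies ++= Seq(
  // Exclude the Hadoop client that Spark 1.2+ poms pull in transitively ...
  "org.apache.spark" %% "spark-core" % "1.2.1"
    exclude("org.apache.hadoop", "hadoop-client"),
  // ... and depend explicitly on the Hadoop 1 client instead.
  "org.apache.hadoop" % "hadoop-client" % "1.0.4"
)
{code}

Without the exclusion, dependency resolution keeps the higher Hadoop version when assembling the jar, which is the behavior described above.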
[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352341#comment-14352341 ]

Patrick Wendell commented on SPARK-5134:
----------------------------------------

[~shivaram] did it end up working alright if you just excluded Spark's Hadoop dependency? If so, we can just document this.
[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14352374#comment-14352374 ]

Shivaram Venkataraman commented on SPARK-5134:
----------------------------------------------

Yeah, if you exclude Spark's Hadoop dependency, things work correctly for Hadoop 1. There are some additional issues that come up in 1.2 due to the Guava changes, but those are not related to the default Hadoop version change.

I think the documentation to update would be [1], but I am thinking it would be good to mention this in the Quick Start guide [2] as well.

[1] https://github.com/apache/spark/blob/55b1b32dc8b9b25deea8e5864b53fe802bb92741/docs/hadoop-third-party-distributions.md#linking-applications-to-the-hadoop-version
[2] https://github.com/apache/spark/blob/55b1b32dc8b9b25deea8e5864b53fe802bb92741/docs/quick-start.md#self-contained-applications
[jira] [Commented] (SPARK-5134) Bump default Hadoop version to 2+
[ https://issues.apache.org/jira/browse/SPARK-5134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14267867#comment-14267867 ]

Apache Spark commented on SPARK-5134:
-------------------------------------

User 'ryan-williams' has created a pull request for this issue:
https://github.com/apache/spark/pull/3917