[
https://issues.apache.org/jira/browse/SPARK-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013378#comment-14013378
]
Sandy Ryza commented on SPARK-1518:
-----------------------------------
bq. I don't think there's a single release I'd call "Hadoop 2", and it would be
confusing to users to link to the "Hadoop 2" artifact and not have it run on
their cluster.
While Hadoop releases have not historically been amazing at maintaining
compatibility, I think this is a bit of an overstatement. There is a definitive
Hadoop 2, which became GA starting at 2.2. It has a set of public/stable APIs
that have not been broken since then, a promise not to break them for the
remainder of 2.x, and a comprehensive compatibility guide that describes
exactly what "break" means -
http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/Compatibility.html.
All the major distributions (CDH, Pivotal, HDP, I think MapR?) support these
APIs. The Hadoop 2 releases that preceded 2.2 were labeled alpha and beta, and
did not come with these same guarantees.
While, with my Cloudera hat on, I'd love for the CDH5 Spark artifacts to become
the canonical Spark Hadoop 2 artifacts, with my Apache hat on, I do see some
value in publishing Spark Hadoop 2 artifacts. Though, as Matei and Patrick
pointed out, these only matter when bundling Spark inside your own application.
In most cases, it's better to point to Spark jars installed on one's laptop or
cluster.
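To sketch what "pointing at installed jars" rather than bundling looks like: in a Maven build, an application can declare Spark with `provided` scope so it compiles against the Spark APIs but relies on the jars already installed on the cluster (or laptop) at runtime, instead of shipping its own copy. The version below is illustrative; use whatever matches your cluster.

```xml
<!-- Compile against Spark, but don't bundle it: the cluster's installed
     Spark jars are used at runtime. Version shown is an example. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.0.0</version>
  <scope>provided</scope>
</dependency>
```

With this scope, an assembly or shaded jar of the application stays small and avoids clashing with the Spark/Hadoop versions deployed on the cluster.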
> Spark master doesn't compile against hadoop-common trunk
> --------------------------------------------------------
>
> Key: SPARK-1518
> URL: https://issues.apache.org/jira/browse/SPARK-1518
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Reporter: Marcelo Vanzin
> Assignee: Colin Patrick McCabe
> Priority: Critical
>
> FSDataOutputStream::sync() has disappeared from trunk in Hadoop;
> FileLogger.scala is calling it.
> I've changed it locally to hsync() so I can compile the code, but haven't
> checked yet whether those are equivalent. hsync() seems to have been there
> forever, so it hopefully works with all versions Spark cares about.
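One hedged way to stay source-compatible across Hadoop versions where sync()/hflush()/hsync() come and go is a small reflection shim that invokes whichever method the running Hadoop build actually has. This is only an illustrative sketch, not the actual Spark fix; `SyncCompat` and `FakeStream` are hypothetical names, with `FakeStream` standing in for an FSDataOutputStream so the sketch is self-contained.

```java
public class SyncCompat {
    // Try the stronger/newer methods first, falling back to the
    // deprecated sync() on older Hadoop versions.
    public static String flushOrSync(Object stream) {
        for (String name : new String[] {"hsync", "hflush", "sync"}) {
            try {
                stream.getClass().getMethod(name).invoke(stream);
                return name;  // report which method was used
            } catch (NoSuchMethodException e) {
                // this Hadoop version lacks the method; try the next one
            } catch (ReflectiveOperationException e) {
                throw new RuntimeException(e);
            }
        }
        throw new UnsupportedOperationException("no flush/sync method found");
    }

    // Stand-in for FSDataOutputStream on a build that only exposes hsync().
    public static class FakeStream {
        public boolean synced = false;
        public void hsync() { synced = true; }
    }
}
```

Whether hsync() is a drop-in replacement is a separate question: hsync() promises a flush all the way to disk, which may be stronger (and slower) than what the old sync() did, so it is worth confirming the intended semantics before settling on it.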
--
This message was sent by Atlassian JIRA
(v6.2#6252)