[ 
https://issues.apache.org/jira/browse/SPARK-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14013378#comment-14013378
 ] 

Sandy Ryza commented on SPARK-1518:
-----------------------------------

bq. I don't think there's a single release I'd call "Hadoop 2", and it would be 
confusing to users to link to the "Hadoop 2" artifact and not have it run on 
their cluster.

While Hadoop releases have not historically been amazing at maintaining 
compatibility, I think this is a bit of an overstatement.  There is a definitive 
Hadoop 2, which became GA starting at 2.2.  It has a set of public/stable APIs 
that have not been broken since then, a promise not to break them for the 
remainder of 2.x, and a comprehensive compatibility guide that describes 
exactly what "break" means - 
http://hadoop.apache.org/docs/r2.3.0/hadoop-project-dist/hadoop-common/Compatibility.html.
  All the major distributions (CDH, Pivotal, HDP, I think MapR?) support these 
APIs.  The Hadoop 2 releases that preceded 2.2 were labeled alpha and beta, and 
did not come with these same guarantees.

While, with my Cloudera hat on, I'd love for the CDH5 Spark artifacts to become 
the canonical Spark Hadoop 2 artifacts, with my Apache hat on, I do see some 
value in publishing Spark Hadoop 2 artifacts.  Though, as Matei and Patrick 
pointed out, these only matter when bundling Spark inside your own application. 
In most cases, it's better to point to the Spark jars installed on your laptop 
or cluster.


> Spark master doesn't compile against hadoop-common trunk
> --------------------------------------------------------
>
>                 Key: SPARK-1518
>                 URL: https://issues.apache.org/jira/browse/SPARK-1518
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>
> FSDataOutputStream::sync() has disappeared from trunk in Hadoop; 
> FileLogger.scala is calling it.
> I've changed it locally to hsync() so I can compile the code, but haven't 
> checked yet whether those are equivalent. hsync() seems to have been there 
> forever, so it hopefully works with all versions Spark cares about.
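
For illustration only, here is one way a caller could tolerate both method 
names without compiling against either: look the method up by name at 
runtime. This is a rough sketch, not what FileLogger.scala does. SyncCompat, 
flushWith, LegacyStream and CurrentStream are hypothetical names, and the two 
stand-in classes only mimic a stream that has sync() or hsync(). It says 
nothing about whether the two calls are equivalent, which the description 
above leaves unchecked.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;

/** Hypothetical helper: calls whichever no-arg flush method a stream exposes. */
final class SyncCompat {
    private SyncCompat() {}

    /**
     * Tries each candidate name in order, invokes the first public no-arg
     * method found on the stream's class, and returns that method's name.
     */
    public static String flushWith(Object stream, String... candidates) {
        for (String name : candidates) {
            Method method;
            try {
                method = stream.getClass().getMethod(name);
            } catch (NoSuchMethodException e) {
                continue; // not present on this class, try the next name
            }
            try {
                method.invoke(stream);
            } catch (InvocationTargetException e) {
                // The method itself threw; surface I/O failures as such.
                Throwable cause = e.getCause();
                if (cause instanceof IOException) {
                    throw new UncheckedIOException((IOException) cause);
                }
                throw new IllegalStateException(name + "() failed", cause);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(name + "() is not accessible", e);
            }
            return name;
        }
        throw new UnsupportedOperationException(
            "none of " + Arrays.toString(candidates) + " found on "
                + stream.getClass().getName());
    }

    /** Stand-in for a stream that only has sync(). Not a Hadoop class. */
    public static final class LegacyStream {
        public int calls;
        public void sync() { calls++; }
    }

    /** Stand-in for a stream that only has hsync(). Not a Hadoop class. */
    public static final class CurrentStream {
        public int calls;
        public void hsync() { calls++; }
    }
}
```

A call such as SyncCompat.flushWith(out, "hsync", "sync") prefers hsync() and 
falls back to sync() when hsync() is missing. The cost is that a misspelled 
method name is only caught at runtime, so a direct call is simpler when a 
single method exists in every supported version.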



--
This message was sent by Atlassian JIRA
(v6.2#6252)
