[ https://issues.apache.org/jira/browse/SPARK-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012895#comment-14012895 ]

Colin Patrick McCabe commented on SPARK-1518:
---------------------------------------------

bq. The concrete problem is that Hadoop has been extremely fickle with 
compatibility even within a major release series (1.x or 2.x). HDFS protocol 
versions change and you can't access the cluster, YARN versions change, etc. I 
don't think there's a single release I'd call "Hadoop 2", and it would be 
confusing to users to link to the "Hadoop 2" artifact and not have it run on 
their cluster.

I know that there was an RPC compatibility break between {{2.1.0-beta}} and 
{{2.1.1-beta}}.  Around the 2.3 time-frame, Hadoop froze the RPC format at 
version 9 and committed to maintaining compatibility going forward.  You are 
right that bundling the appropriate version of the Hadoop client is the usual 
approach for projects that depend on Hadoop, precisely to avoid these kinds of 
worries.

> Spark master doesn't compile against hadoop-common trunk
> --------------------------------------------------------
>
>                 Key: SPARK-1518
>                 URL: https://issues.apache.org/jira/browse/SPARK-1518
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>
> FSDataOutputStream::sync() has disappeared from trunk in Hadoop; 
> FileLogger.scala is calling it.
> I've changed it locally to hsync() so I can compile the code, but haven't 
> checked yet whether those are equivalent. hsync() seems to have been there 
> forever, so it hopefully works with all versions Spark cares about.
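The fix described above amounts to a one-line substitution in FileLogger.scala. A sketch of the change (the surrounding variable name is assumed for illustration, not taken from the actual source):

{code}
// before: fails to compile against hadoop-common trunk, where sync() was removed
hadoopDataStream.foreach(_.sync())

// after: hsync() has been in the Syncable interface for a long time
hadoopDataStream.foreach(_.hsync())
{code}

One subtlety worth checking before merging: in the Syncable interface, {{hflush()}} is the closer semantic match to the old {{sync()}} (flush to the datanodes), while {{hsync()}} additionally asks the datanodes to sync data to disk, so it may be stronger (and slower) than what the old call provided.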



--
This message was sent by Atlassian JIRA
(v6.2#6252)
