[ 
https://issues.apache.org/jira/browse/SPARK-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14012651#comment-14012651
 ] 

Matei Zaharia commented on SPARK-1518:
--------------------------------------

Okay, got it. But this only applies when you run the job on your laptop, 
right? On a cluster you'll pick up the right Hadoop from the installation 
that is already there.

For this use case I still think it's fine to require use of hadoop-client. It's 
been like that for the past 2 releases and nobody has asked questions about it. 
It's just one more entry to add to your pom.xml.
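
For reference, a minimal sketch of what that extra entry might look like. The 
hadoop.version property is a placeholder assumed to be defined elsewhere in 
your pom.xml; set it to whatever Hadoop release your cluster actually runs.

```xml
<!-- Sketch only: goes inside the <dependencies> section of your pom.xml. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <!-- Placeholder property: must match the Hadoop version on your cluster. -->
  <version>${hadoop.version}</version>
</dependency>
```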

The concrete problem is that Hadoop has been extremely fickle with 
compatibility even within a major release series (1.x or 2.x). HDFS protocol 
versions change so that a client built against one release can't access a 
cluster running another, YARN versions change, etc. I don't think there's a 
single release I'd call "Hadoop 2", and it would be confusing to users to link 
to the "Hadoop 2" artifact and not have it run on their cluster.

> Spark master doesn't compile against hadoop-common trunk
> --------------------------------------------------------
>
>                 Key: SPARK-1518
>                 URL: https://issues.apache.org/jira/browse/SPARK-1518
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>
> FSDataOutputStream::sync() has disappeared from trunk in Hadoop; 
> FileLogger.scala is calling it.
> I've changed it locally to hsync() so I can compile the code, but haven't 
> checked yet whether those are equivalent. hsync() seems to have been there 
> forever, so it hopefully works with all versions Spark cares about.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
