[ 
https://issues.apache.org/jira/browse/SPARK-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014536#comment-14014536
 ] 

Matei Zaharia commented on SPARK-1518:
--------------------------------------

Got it, the excludes have indeed gotten more painful, and I can see that being 
a problem. Maybe the solution would be to publish some kind of 
"spark-core-hadoopX" for each version of Hadoop, which depends on hadoop-client 
and spark-core. But then we'll need a list of supported versions to publish 
for. By the way, AFAIK Maven classifiers do not solve this issue, as versions of 
an artifact with different classifiers must have the same dependency tree (they 
can differ in other things, e.g. maybe they're compiled on different Java 
versions).

BTW, in terms of the Hadoop 2 thing, I just meant that there was a lot of 
variability before the community decided to go GA, and unfortunately a lot of 
users are on older versions (ironically, sometimes because of this volatility). 
I definitely appreciate the move to GA and the compatibility policies. When I 
said Hadoop 2, I meant, for example: do you consider 0.23 to be Hadoop 2? 
It's YARN-based and it was (and still is AFAIK) widely used at Yahoo. What 
about 2.0.x? Some users are on that too. What about CDH4? Its version number is 
2.0.0-something. In any case we will be sensible about old versions, but my 
philosophy is always to support the broadest range possible, and from 
everything I've seen it's paid off -- users appreciate when you do not force 
their hand to upgrade. This is why Yahoo, for example, continues to be our 
biggest contributor on YARN support, even though their YARN is pretty different.

> Spark master doesn't compile against hadoop-common trunk
> --------------------------------------------------------
>
>                 Key: SPARK-1518
>                 URL: https://issues.apache.org/jira/browse/SPARK-1518
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Marcelo Vanzin
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>
> FSDataOutputStream::sync() has disappeared from trunk in Hadoop; 
> FileLogger.scala is calling it.
> I've changed it locally to hsync() so I can compile the code, but haven't 
> checked yet whether those are equivalent. hsync() seems to have been there 
> forever, so it hopefully works with all versions Spark cares about.
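
One way to stay source-compatible when a method is present in some library versions and absent in others is to look it up by reflection at runtime, instead of calling it directly. The sketch below is only an illustration of that idea: OldStream and NewStream are hypothetical stand-ins, not Hadoop's FSDataOutputStream, and the helper name flush is made up. It prefers hsync() and falls back to sync(); it says nothing about whether the two calls are equivalent, which the description above leaves open.

```java
import java.lang.reflect.Method;

public class FlushCompat {
    // Hypothetical stand-in for a stream class that only offers sync().
    public static class OldStream {
        public void sync() { /* pretend to flush */ }
    }

    // Hypothetical stand-in for a stream class that only offers hsync().
    public static class NewStream {
        public void hsync() { /* pretend to flush */ }
    }

    /**
     * Calls hsync() if the stream has it, otherwise sync().
     * Returns the name of the method that was invoked.
     */
    public static String flush(Object stream) {
        for (String name : new String[] {"hsync", "sync"}) {
            try {
                // Look up a public no-argument method with this name.
                Method m = stream.getClass().getMethod(name);
                m.invoke(stream);
                return name;
            } catch (NoSuchMethodException e) {
                // Not available on this class; try the next candidate.
            } catch (ReflectiveOperationException e) {
                // The method exists but could not be invoked, or it threw.
                throw new RuntimeException(e);
            }
        }
        throw new UnsupportedOperationException("no sync()/hsync() on " + stream.getClass());
    }

    public static void main(String[] args) {
        System.out.println(flush(new OldStream()));
        System.out.println(flush(new NewStream()));
    }
}
```

The trade-off is that the compiler no longer checks the call, so a typo in the method name only shows up at runtime. If hsync() really is available in every supported version, the direct call is simpler.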



--
This message was sent by Atlassian JIRA
(v6.2#6252)