[ https://issues.apache.org/jira/browse/SPARK-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14014536#comment-14014536 ]
Matei Zaharia commented on SPARK-1518:
--------------------------------------
Got it, the excludes have indeed gotten more painful, and I can see that being
a problem. Maybe the solution would be to publish some kind of
"spark-core-hadoopX" for each version of Hadoop, which depends on hadoop-client
and spark-core. But then we'll need a list of supported versions to publish
for. By the way, AFAIK Maven classifiers do not solve this issue, as versions of
an artifact with different classifiers must have the same dependency tree (they
can differ in other things, e.g. maybe they're compiled on different Java
versions).
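For illustration, the per-Hadoop artifact idea might look roughly like the POM below. The artifactId "spark-core-hadoop2" and all version numbers here are hypothetical, not published coordinates; it just shows the shape of a thin artifact that pins spark-core together with a specific hadoop-client:

```xml
<!-- Hypothetical sketch of a per-Hadoop-version artifact; artifactId and
     versions are illustrative only. -->
<project>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core-hadoop2</artifactId>
  <version>1.0.0</version>
  <dependencies>
    <!-- the real Spark core bits -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.10</artifactId>
      <version>1.0.0</version>
    </dependency>
    <!-- pinned Hadoop client for this flavor -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.2.0</version>
    </dependency>
  </dependencies>
</project>
```

Users would then depend on one of these flavors instead of writing excludes by hand, at the cost of maintaining a list of supported Hadoop versions to publish for.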
BTW, in terms of the Hadoop 2 thing, I just meant that there was a lot of
variability before the community decided to go GA, and unfortunately a lot of
users are on older versions. (Ironically sometimes because of this volatility).
I definitely appreciate the move to GA and the compatibility policies. When I
said Hadoop 2, I meant questions like: do you consider 0.23 to be Hadoop 2?
It's YARN-based and it was (and still is AFAIK) widely used at Yahoo. What
about 2.0.x? Some users are on that too. What about CDH4? Its version number is
2.0.0-something. In any case we will be sensible about old versions, but my
philosophy is always to support the broadest range possible, and from
everything I've seen it's paid off -- users appreciate it when you do not force
them to upgrade. This is why Yahoo, for example, continues to be our
biggest contributor on YARN support, even though their YARN is pretty different.
> Spark master doesn't compile against hadoop-common trunk
> --------------------------------------------------------
>
> Key: SPARK-1518
> URL: https://issues.apache.org/jira/browse/SPARK-1518
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Reporter: Marcelo Vanzin
> Assignee: Colin Patrick McCabe
> Priority: Critical
>
> FSDataOutputStream::sync() has disappeared from trunk in Hadoop;
> FileLogger.scala is calling it.
> I've changed it locally to hsync() so I can compile the code, but haven't
> checked yet whether those are equivalent. hsync() seems to have been there
> forever, so it hopefully works with all versions Spark cares about.
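A hedged sketch of how code could tolerate both APIs while that equivalence is checked: try hsync() reflectively and fall back to sync() when it is absent. This is not Spark's actual FileLogger code, and the stream classes below are hypothetical stand-ins for FSDataOutputStream, since hadoop-client is not assumed to be on the classpath here:

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a newer stream that only exposes hsync().
class ModernStream {
    public void hsync() { /* durable flush on newer Hadoop APIs */ }
}

// Hypothetical stand-in for an older stream that only exposes sync().
class LegacyStream {
    public void sync() { /* durable flush on older Hadoop APIs */ }
}

class Compat {
    // Invoke hsync() if the stream has it, otherwise fall back to sync().
    // Returns the name of the method actually called, for illustration.
    static String flush(Object stream) throws Exception {
        Method m;
        try {
            m = stream.getClass().getMethod("hsync");
        } catch (NoSuchMethodException e) {
            m = stream.getClass().getMethod("sync");
        }
        m.invoke(stream);
        return m.getName();
    }
}
```

The reflective lookup avoids a hard compile-time dependency on either method name, which is one way to keep a single build compiling against both old and new hadoop-common.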
--
This message was sent by Atlassian JIRA
(v6.2#6252)