Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/5786#issuecomment-101466543
So, I had a chat offline (well, off-github) with Sean and these are my
conclusions:
- There is a real issue, addressed by this PR, that the "default build"
generates an assembly that cannot talk to any version of HDFS.
- In my view, the fix proposed here is the right way forward; it
standardizes on hadoop-2 as the preferred Hadoop version by making it the
default, so that the default build works against a Hadoop 2 cluster.
- The smallest fix for the issue would be to revert to 1.0.4 as the
default version. Because we publish effective POMs, that would not change the
version of Hadoop for any artifacts except spark-parent; that is not a big
problem because it would only affect someone who depends on `${hadoop.version}`
and uses `spark-parent_2.10` as the parent of their own project, which
I'd guess is a very small set of people (if it even exists). See the sketch
after this list for the kind of project that would be affected.
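For concreteness, here is a minimal sketch (the group/artifact ids and version numbers of the downstream project are illustrative) of the only kind of project that would notice a change to the spark-parent default: one that declares `spark-parent_2.10` as its parent and reuses `${hadoop.version}` directly.

```xml
<!-- Illustrative downstream pom: inherits spark-parent_2.10, so it picks up
     that POM's <properties>, including whatever default hadoop.version it sets. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <parent>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-parent_2.10</artifactId>
    <version>1.4.0</version> <!-- illustrative parent version -->
  </parent>

  <groupId>com.example</groupId>
  <artifactId>my-spark-app</artifactId>
  <version>0.1.0</version>

  <dependencies>
    <!-- This is the dependency whose resolved version would shift if the
         default hadoop.version in spark-parent changed. -->
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>${hadoop.version}</version>
    </dependency>
  </dependencies>
</project>
```

Projects that pin their own Hadoop version, or that do not inherit from spark-parent at all, would be unaffected either way.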
As for whether the "default build" should work or whether we should disallow it, I
don't really have a strong opinion. If there's an easy fix, sure, but if it
gets complicated, then it's probably not worth it.