Github user ryan-williams commented on the pull request:
https://github.com/apache/spark/pull/6599#issuecomment-108484037
@srowen has convinced me that, rather than try to publish one Spark that
works for both Hadoop 1 **and** 2, "we" should publish separate artifacts for
the Hadoop versions that are not compatible with each other (hopefully just
one for 1.* and one for 2.*).
Conveniently, such artifacts are already built and published at
[https://spark.apache.org/downloads.html](https://spark.apache.org/downloads.html);
they're just not published anywhere that projects can easily build against
programmatically, e.g. a Maven repository.
It seems to me that the "correct" solution is to take those
already-published artifacts, which people can manually download and run against
today, and also publish them to a Maven repository.
Maybe I don't fully understand what is meant by "embedded" Spark, but
shouldn't [people who want to "embed" Spark and run against Hadoop 1] simply
"embed" one of the Spark JARs that is already built for Hadoop 1 and published
at apache.org? Is it important that they "embed" it via a Maven
dependency?
If so, again, we should publish Maven JARs that are built to support Hadoop
1.
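For concreteness, here's a minimal sbt sketch of what that could look like
from an embedding application's side, assuming a hypothetical `hadoop1`
classifier on the published Spark artifacts (no such coordinate exists today;
it's purely illustrative):

```scala
// Hypothetical sketch only: Spark does not currently publish Hadoop-1-specific
// artifacts to a Maven repository, so the "hadoop1" classifier below is an
// assumption, not a real coordinate.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.0" classifier "hadoop1"

// The embedding app would pin the matching Hadoop 1 client alongside it.
libraryDependencies += "org.apache.hadoop" % "hadoop-client" % "1.2.1"
```

Whether that ends up being a classifier, a separate artifact name, or something
else is a detail; the point is just that the Hadoop-1 build be resolvable from a
Maven repository at all.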
Thanks, let me know if I'm misunderstanding something.