Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/507#issuecomment-41323534
This is a bit of a weird setup. Thrift has a Hadoop 1 and Hadoop 2 version,
but usually you deploy those as separate artifacts. The Hadoop 1 profile is the
default, which specifies Thrift 0.7.0, but you have to parse profiles to know
that. I think it would have been better to set 0.7.0 as the version outside a
profile, and I think that's why SBT doesn't pick up a version.
Next, I note that Flume is telling us it wants to use Thrift 0.7.0 with
Hadoop 1 and 0.8.0 with Hadoop 2. Ideally we just get away with 0.8.0 for both
and don't try to version this independently.
You have the SBT build using 0.8.0, but I bet that if you ran
`mvn dependency:tree` you'd find that the Maven build is actually pulling in
0.7.0, given that it's the 'default'.
(But wait, there's more. In Examples, Hive 0.12 wants Thrift 0.9.0! But
that's irrelevant here, since it's not a dependency on external.)
I suggest that the Maven build force Thrift 0.8.0 too, and document why. It
can be `<scope>runtime</scope>`. I think that's pretty tidy, and at least this
is contained to a particular leaf module.
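Roughly what I have in mind for that module's pom.xml, as a sketch only. I'm
assuming the artifact is `org.apache.thrift:libthrift`, and I haven't run this
against the build:

```xml
<!-- Sketch only: declare Thrift directly so this module resolves 0.8.0
     regardless of which Hadoop profile Flume's POM would otherwise select.
     The libthrift coordinates are an assumption; check them against
     the output of mvn dependency:tree. -->
<dependency>
  <groupId>org.apache.thrift</groupId>
  <artifactId>libthrift</artifactId>
  <version>0.8.0</version>
  <scope>runtime</scope>
</dependency>
```

A direct declaration like this should win over the transitive 0.7.0 under
Maven's nearest-definition rule, which would bring the Maven build in line with
what the SBT build already uses.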
Once Hadoop 1 and YARN alpha support and such go away, some of this jar
hell will too. And maybe more of it will when there's one build of reference.