GitHub user pwendell commented on the pull request:
https://github.com/apache/spark/pull/237#issuecomment-38715156
Thanks Aaron - looks good. From what I can tell, you've just copied the
model used for Ganglia and, previously, YARN.
The only ugliness is that SPARK_HIVE is now needed at runtime to determine
whether to include the Datanucleus jars, and this is conflated with the
compile-time setting, which has a different meaning. This is a bit
unfortunate - is there no better way here? We could give the assembly a
different name if Hive is included, but that's sort of ugly too. We could also
just include the Datanucleus jars on the classpath if they are present. The only
downside is that if someone did a build for Hive and then did a normal build,
the jars left over from the Hive build would still be included. But in that case
maybe it's just okay to include them - it's not a widely used library.
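
To make the "include them if they are present" option concrete, here is a minimal sketch in Python. It is not the actual classpath script from this pull request: the function name `build_classpath`, the `jar_dir` argument, and the `datanucleus-*.jar` file pattern are all assumptions made for illustration. The point is only that the decision depends on what is on disk rather than on SPARK_HIVE.

```python
import glob
import os


def build_classpath(assembly_jar, jar_dir):
    """Return a classpath string (hypothetical helper, for illustration only).

    Datanucleus jars are appended whenever they exist in jar_dir,
    so no runtime flag such as SPARK_HIVE is consulted.
    """
    entries = [assembly_jar]
    # Assumed naming pattern; sorted so the resulting order is deterministic.
    pattern = os.path.join(jar_dir, "datanucleus-*.jar")
    entries.extend(sorted(glob.glob(pattern)))
    # os.pathsep is ':' on Unix-like systems and ';' on Windows.
    return os.pathsep.join(entries)
```

With this approach a stale jar from an earlier Hive build would still be picked up, which is exactly the downside described above.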