GitHub user vanzin commented on the pull request:
https://github.com/apache/spark/pull/3238#issuecomment-65141491
I thought about making this a generic "add all the jars in this directory to the distributed cache and to the app's classpath" feature. That would make sense for regular application dependencies: all the jars would be added to the app's classpath (similar to `--jars`). But the datanucleus jars are "special": they have to be on Spark's main classpath so that Spark's own classes pick them up.
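To illustrate the distinction, a rough sketch (the paths and jar names below are made up; `--jars` and the `extraClassPath` settings are standard spark-submit options):

```
# Regular app dependencies: shipped to the cluster and added to the
# application's classpath, isolated from Spark's own classes.
spark-submit --jars /opt/deps/foo.jar,/opt/deps/bar.jar ...

# The datanucleus jars instead need to be visible to Spark's own classes.
# Note the jars must also exist on (or be shipped to) the cluster nodes,
# which is the part this PR handles for YARN.
spark-submit \
  --conf spark.driver.extraClassPath=/opt/hive/lib/datanucleus-core.jar \
  --conf spark.executor.extraClassPath=/opt/hive/lib/datanucleus-core.jar \
  ...
```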
I think making this generic is a little dangerous, since we shouldn't be encouraging people to add things to Spark's classpath; instead we should encourage the use of options like `userClassPathFirst` once we're comfortable that it works properly.
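For reference, the knob in question looks roughly like this (the exact property name has varied across Spark versions, e.g. `spark.files.userClassPathFirst` in early 1.x, later split into driver/executor variants):

```
# Give the user's jars precedence over Spark's bundled classes when a
# class appears in both places (experimental at the time of writing).
spark-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  ...
```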
Another thing about the datanucleus jars is that most people shouldn't actually need them; Hive's preferred way to connect to the metastore is through the metastore server, and that path doesn't require these jars. But I don't know how Spark SQL is generally deployed these days, so maybe that doesn't apply here.
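For context, going through a metastore server is just a matter of pointing `hive.metastore.uris` at the Thrift service in `hive-site.xml` (host and port below are placeholders); the JDO/datanucleus machinery is only exercised when talking to the backing database directly:

```xml
<!-- hive-site.xml: talk to a remote metastore server over Thrift
     instead of opening a direct JDBC/JDO connection to the database. -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host.example.com:9083</value>
</property>
```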
In light of all this, I'd rather keep this change as specific to these jars as possible, along with Tom's suggestion of investigating whether it's possible to add them to the uber jar somehow. That would be the best solution.
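For anyone picking up that investigation, the naive sbt-assembly approach would look roughly like the sketch below. This is exactly the part that needs verifying: the datanucleus jars each ship a `plugin.xml` plugin registry, and simply keeping one of them (as `MergeStrategy.first` does) is likely to break DataNucleus at runtime, so a real fix would have to merge the XML contents.

```scala
// build.sbt sketch (unverified); assumes the sbt-assembly plugin.
// Folds the datanucleus jars into the assembly and picks a merge
// strategy for their conflicting plugin.xml registries.
assemblyMergeStrategy in assembly := {
  // Keeps a single plugin.xml; probably insufficient for DataNucleus.
  case "plugin.xml" => MergeStrategy.first
  case x =>
    val defaultStrategy = (assemblyMergeStrategy in assembly).value
    defaultStrategy(x)
}
```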