Boy, it's a long story, but I think the short answer is that it's only worth manually fixing the mismatches that are clearly going to cause a problem.
Dependency version mismatch is inevitable, and Maven will always settle on one version of a particular group/artifact using a nearest-wins rule (SBT uses latest-wins). However, it's possible to end up with different versions of closely related artifacts. Often it doesn't matter; sometimes it does.

It's always possible to force a version of a group/artifact with <dependencyManagement>. The drawback is that, as dependencies evolve, you may be silently forcing it to an older version than other dependencies want. That builds up its own quiet legacy problem.

On Sat, Sep 17, 2016 at 11:08 PM, Jacek Laskowski <ja...@japila.pl> wrote:
> Hi Sean,
>
> Thanks a lot for help understanding the different jars.
>
> Do you think there's anything that should be reported as an
> enhancement/issue/task in JIRA?
>
> Regards,
> Jacek Laskowski
> ----
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Sat, Sep 17, 2016 at 11:34 PM, Sean Owen <so...@cloudera.com> wrote:
>> No, these are different major versions of these components, each of
>> which gets used by something in the transitive dependency graph. They
>> are not redundant because they're not actually presenting roughly the
>> same component in the same namespace.
>>
>> However the parquet-hadoop bit looks wrong, in that it should be
>> harmonized to one 1.x version. It's not that Spark uses inconsistent
>> versions but that transitive deps do. We can still harmonize them in
>> the build if it causes problems.
>>
>> On Sat, Sep 17, 2016 at 8:14 PM, Jacek Laskowski <ja...@japila.pl> wrote:
>>> Hi,
>>>
>>> Just noticed in assembly/target/scala-2.11/jars that similar libraries
>>> have different versions:
>>>
>>> -rw-r--r-- 1 jacek staff 1230201 17 wrz 09:51 netty-3.8.0.Final.jar
>>> -rw-r--r-- 1 jacek staff 2305335 17 wrz 09:51 netty-all-4.0.41.Final.jar
>>>
>>> and
>>>
>>> -rw-r--r-- 1 jacek staff 218076 17 wrz 09:51 parquet-hadoop-1.8.1.jar
>>> -rw-r--r-- 1 jacek staff 2796935 17 wrz 09:51 parquet-hadoop-bundle-1.6.0.jar
>>>
>>> and
>>>
>>> -rw-r--r-- 1 jacek staff 46983 17 wrz 09:51 jackson-annotations-2.6.5.jar
>>> -rw-r--r-- 1 jacek staff 258876 17 wrz 09:51 jackson-core-2.6.5.jar
>>> -rw-r--r-- 1 jacek staff 232248 17 wrz 09:51 jackson-core-asl-1.9.13.jar
>>> -rw-r--r-- 1 jacek staff 1171380 17 wrz 09:51 jackson-databind-2.6.5.jar
>>> -rw-r--r-- 1 jacek staff 18336 17 wrz 09:51 jackson-jaxrs-1.9.13.jar
>>> -rw-r--r-- 1 jacek staff 780664 17 wrz 09:51 jackson-mapper-asl-1.9.13.jar
>>> -rw-r--r-- 1 jacek staff 41263 17 wrz 09:51 jackson-module-paranamer-2.6.5.jar
>>> -rw-r--r-- 1 jacek staff 515604 17 wrz 09:51 jackson-module-scala_2.11-2.6.5.jar
>>> -rw-r--r-- 1 jacek staff 27084 17 wrz 09:51 jackson-xc-1.9.13.jar
>>>
>>> and
>>>
>>> -rw-r--r-- 1 jacek staff 188671 17 wrz 09:51 commons-beanutils-1.7.0.jar
>>> -rw-r--r-- 1 jacek staff 206035 17 wrz 09:51 commons-beanutils-core-1.8.0.jar
>>>
>>> and
>>>
>>> -rw-r--r-- 1 jacek staff 445288 17 wrz 09:51 antlr-2.7.7.jar
>>> -rw-r--r-- 1 jacek staff 164368 17 wrz 09:51 antlr-runtime-3.4.jar
>>> -rw-r--r-- 1 jacek staff 302248 17 wrz 09:51 antlr4-runtime-4.5.3.jar
>>>
>>> Even if that does not cause any class mismatches, it might be worth to
>>> exclude them to minimize the size of the Spark distro.
>>>
>>> What do you think?
>>>
>>> Regards,
>>> Jacek Laskowski
>>> ----
>>> https://medium.com/@jaceklaskowski/
>>> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
>>> Follow me at https://twitter.com/jaceklaskowski
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
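
P.S. For anyone who wants to see the nearest-wins versus latest-wins difference from the top of this reply in concrete terms, here is a rough toy sketch. It is not how Maven or SBT actually implement resolution. The graph, the artifact names (app, lib-a, lib-b, lib-c, shared) and the function names are all hypothetical, and the `managed` argument only stands in for the effect of a <dependencyManagement> entry. The latest-wins function simply takes the highest version found anywhere in the graph, which is a simplification.

```python
from collections import deque

# Hypothetical toy graph: (artifact, version) -> direct dependencies, in declaration order.
ROOT = ("app", "1.0")
GRAPH = {
    ("app", "1.0"): [("lib-a", "1.0"), ("lib-b", "1.0")],
    ("lib-a", "1.0"): [("shared", "1.6.0")],   # depth 2 from the root
    ("lib-b", "1.0"): [("lib-c", "1.0")],
    ("lib-c", "1.0"): [("shared", "1.8.1")],   # depth 3 from the root
}


def version_key(version):
    """Compare dotted numeric versions as integer tuples (toy versions only)."""
    return tuple(int(part) for part in version.split("."))


def nearest_wins(graph, root, managed=None):
    """Breadth-first walk: the first version seen for an artifact wins.

    `managed` maps artifact -> forced version, mimicking a pinned version.
    """
    managed = managed or {}
    chosen = {root[0]: root[1]}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for name, version in graph.get(node, []):
            if name in chosen:
                continue  # a nearer (or earlier-declared) version already won
            version = managed.get(name, version)
            chosen[name] = version
            queue.append((name, version))
    return chosen


def latest_wins(graph, root):
    """Walk the whole graph and keep the highest version of each artifact."""
    chosen = {}
    seen = set()
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        name, version = node
        if name not in chosen or version_key(version) > version_key(chosen[name]):
            chosen[name] = version
        stack.extend(graph.get(node, []))
    return chosen


print(nearest_wins(GRAPH, ROOT)["shared"])
print(latest_wins(GRAPH, ROOT)["shared"])
print(nearest_wins(GRAPH, ROOT, managed={"shared": "1.8.1"})["shared"])
```

In this toy graph the nearest-wins walk settles on the older version of `shared` because it sits closer to the root, while the latest-wins walk picks the newer one. Passing `managed` forces the version regardless of depth, which is also how a pin can end up silently holding an artifact back once the rest of the graph moves on.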