Finally broke through the classpath problem by "relocating" Jackson. Nothing like stepping away from a problem for a couple days. In case anyone cares, here are some details:
https://maven.apache.org/plugins/maven-shade-plugin/examples/class-relocation.html

<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <shadedArtifactAttached>true</shadedArtifactAttached>
        <shadedClassifierName>shaded</shadedClassifierName>
        <createDependencyReducedPom>false</createDependencyReducedPom>
        <filters>
          <filter>
            <artifact>*:*</artifact>
            <excludes>
              <exclude>META-INF/*.SF</exclude>
              <exclude>META-INF/*.DSA</exclude>
              <exclude>META-INF/*.RSA</exclude>
            </excludes>
          </filter>
        </filters>
        <relocations>
          <!-- fixes Spark overriding the classpath with older version of Jackson -->
          <relocation>
            <pattern>com.fasterxml.jackson</pattern>
            <shadedPattern>com.kochava.repackaged.com.fasterxml.jackson</shadedPattern>
          </relocation>
        </relocations>
        <transformers>
          <!-- fixes SparkRunner class not found -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </execution>
  </executions>
</plugin>

Jacob

On Mon, Oct 2, 2017 at 10:35 PM, Jacob Marble <[email protected]> wrote:

> Romain-
>
> I have been using dependency:tree to check myself. Also,
> META-INF/maven/com.fasterxml.jackson.core/jackson-core/pom.* in this
> shaded jar definitely indicates version 2.8.9 of Jackson.
>
> Just tried the Spark runtime properties spark.driver.userClassPathFirst
> and spark.executor.userClassPathFirst. Setting spark.driver.userClassPathFirst
> does change things, but not in a helpful way. See the trace below.
>
> It looks like Spark 2 moved to Jackson 2.6.5, and jbonofre's spark2 runner
> follows. The conversation around BEAM-1920 sounds like that is no small
> change, jbonofre do you think that will land before Beam 2.2.0?
>
> (I hate to just wait around for something else to version, but I spent all
> day on this, need to get back to being productive).
>
> Exception in thread "main" java.lang.LinkageError: loader constraint
> violation: loader (instance of org/apache/spark/util/ChildFirstURLClassLoader)
> previously initiated loading for a different type with name
> "com/codahale/metrics/MetricRegistry"
> at java.lang.ClassLoader.defineClass1(Native Method)
> at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
> at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:467)
>
> Jacob
>
> On Mon, Oct 2, 2017 at 9:52 PM, Romain Manni-Bucau <[email protected]>
> wrote:
>
>> NoSuchMethodError doesn't mean it is not shaded but more it is not the right
>> version. You should be able to check META-INF/maven in the shaded jar or
>> maybe share your mvn output in verbose mode (-X) and a dependency:tree
>>
>> On Oct 3, 2017 02:16, "Jacob Marble" <[email protected]> wrote:
>>
>> > I gave up on running a Spark pipeline locally, tried AWS EMR/Spark instead.
>> > Now this:
>> >
>> > 17/10/02 23:53:17 ERROR ApplicationMaster: User class threw exception:
>> > java.lang.NoSuchMethodError:
>> > com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>> > java.lang.NoSuchMethodError:
>> > com.google.common.base.Preconditions.checkArgument(ZLjava/lang/String;Ljava/lang/Object;)V
>> > at com.kochava.beam.s3.S3FileSystem.matchNewResource(S3FileSystem.java:133)
>> > at com.kochava.beam.s3.S3FileSystem.matchNewResource(S3FileSystem.java:38)
>> > at org.apache.beam.sdk.io.FileSystems.matchNewResource(FileSystems.java:518)
>> >
>> > It doesn't make sense. Guava is definitely shaded; `maven package` tells me
>> > so and the jar contains com/google/common/base/Preconditions.class. Here's
>> > my shade config, in case someone sees something.
>> >
>> > <plugin>
>> >   <groupId>org.apache.maven.plugins</groupId>
>> >   <artifactId>maven-shade-plugin</artifactId>
>> >   <version>3.0.0</version>
>> >   <executions>
>> >     <execution>
>> >       <phase>package</phase>
>> >       <goals>
>> >         <goal>shade</goal>
>> >       </goals>
>> >       <configuration>
>> >         <shadedArtifactAttached>true</shadedArtifactAttached>
>> >         <shadedClassifierName>shaded</shadedClassifierName>
>> >         <createDependencyReducedPom>false</createDependencyReducedPom>
>> >         <filters>
>> >           <filter>
>> >             <artifact>*:*</artifact>
>> >             <excludes>
>> >               <exclude>META-INF/*.SF</exclude>
>> >               <exclude>META-INF/*.DSA</exclude>
>> >               <exclude>META-INF/*.RSA</exclude>
>> >             </excludes>
>> >           </filter>
>> >         </filters>
>> >         <!-- fixes SparkRunner class not found -->
>> >         <transformers>
>> >           <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
>> >         </transformers>
>> >       </configuration>
>> >     </execution>
>> >   </executions>
>> > </plugin>
>> >
>> > Jacob
>> >
>> > On Mon, Oct 2, 2017 at 11:17 AM, Jacob Marble <[email protected]> wrote:
>> >
>> > > There is a lot of chatter about AWS and Jackson on their forums, etc. I
>> > > have been using the AWS SDK and Jackson 2.8.9 for a couple of weeks without
>> > > problems. Adding Spark to the mix is what changes this.
>> > >
>> > > Jacob
>> > >
>> > > On Mon, Oct 2, 2017 at 11:14 AM, Romain Manni-Bucau <[email protected]> wrote:
>> > >
>> > >> Hi Jacob,
>> > >>
>> > >> isn't aws API only supporting jackson 2.6 and not 2.8?
>> > >>
>> > >> Romain Manni-Bucau
>> > >> @rmannibucau <https://twitter.com/rmannibucau> | Blog
>> > >> <https://rmannibucau.metawerx.net/> | Old Blog
>> > >> <http://rmannibucau.wordpress.com> | Github
>> > >> <https://github.com/rmannibucau> | LinkedIn
>> > >> <https://www.linkedin.com/in/rmannibucau>
>> > >>
>> > >> 2017-10-02 20:13 GMT+02:00 Jacob Marble <[email protected]>:
>> > >>
>> > >> > Yes, I'm using spark-submit, and I'm giving it a shaded jar.
>> > >> >
>> > >> > What do you mean "aligning the dependencies"?
>> > >> >
>> > >> > Jacob
>> > >> >
>> > >> > On Mon, Oct 2, 2017 at 11:06 AM, Jean-Baptiste Onofré <[email protected]> wrote:
>> > >> >
>> > >> > > Hi
>> > >> > >
>> > >> > > Do you start your pipeline with spark-submit ? If so you can provide the
>> > >> > > packages. You can also create a shaded jar.
>> > >> > >
>> > >> > > I have a similar issue in the spark 2 runner that I worked around by
>> > >> > > aligning the dependencies.
>> > >> > >
>> > >> > > Regards
>> > >> > > JB
>> > >> > >
>> > >> > > On Oct 2, 2017, 20:04, at 20:04, Jacob Marble <[email protected]> wrote:
>> > >> > > >My Beam pipeline runs fine with DirectRunner and DataflowRunner, but
>> > >> > > >fails with SparkRunner. That stack trace is after this message.
>> > >> > > >
>> > >> > > >The exception indicates that
>> > >> > > >com.fasterxml.jackson.databind.ObjectMapper.enable doesn't exist.
>> > >> > > >ObjectMapper.enable() didn't exist until Jackson 2.5. `mvn dependency:tree
>> > >> > > >-Dverbose` shows that spark-core_2.10 (1.6.3) and beam-runners-spark
>> > >> > > >(2.1.0) both request versions of Jackson before 2.5.
>> > >> > > >
>> > >> > > >Since I'm using a local, standalone Spark cluster for development, I
>> > >> > > >have to include spark-core_2.10 version 1.6.3 in dependencies.
>> > >> > > >
>> > >> > > >I have added explicit dependencies to my pom.xml, so that I can be
>> > >> > > >certain that the more recent version of Jackson is included in my shaded
>> > >> > > >jar. `mvn clean package` confirms this:
>> > >> > > >
>> > >> > > >[INFO] Including com.fasterxml.jackson.core:jackson-core:jar:2.8.9 in the shaded jar.
>> > >> > > >[INFO] Including com.fasterxml.jackson.core:jackson-annotations:jar:2.8.9 in the shaded jar.
>> > >> > > >[INFO] Including com.fasterxml.jackson.core:jackson-databind:jar:2.8.9 in the shaded jar.
>> > >> > > >[INFO] Including com.fasterxml.jackson.module:jackson-module-scala_2.10:jar:2.8.9 in the shaded jar.
>> > >> > > >[INFO] Including com.fasterxml.jackson.module:jackson-module-paranamer:jar:2.8.9 in the shaded jar.
>> > >> > > >[INFO] Including com.fasterxml.jackson.dataformat:jackson-dataformat-cbor:jar:2.8.9 in the shaded jar.
>> > >> > > >
>> > >> > > >Beyond jar creation, is there anything I can do to ensure that my
>> > >> > > >chosen version of a dependency is used when Spark runs my pipeline? I
>> > >> > > >can't be the first to encounter this problem.
>> > >> > > >
>> > >> > > >Thanks!
>> > >> > > >
>> > >> > > >Jacob
>> > >> > > >
>> > >> > > >--------
>> > >> > > >
>> > >> > > >Exception in thread "main" java.lang.RuntimeException:
>> > >> > > >java.lang.NoSuchMethodError:
>> > >> > > >com.fasterxml.jackson.databind.ObjectMapper.enable([Lcom/fasterxml/jackson/core/JsonParser$Feature;)Lcom/fasterxml/jackson/databind/ObjectMapper;
>> > >> > > >at org.apache.beam.runners.spark.SparkPipelineResult.runtimeExceptionFrom(SparkPipelineResult.java:55)
>> > >> > > >at org.apache.beam.runners.spark.SparkPipelineResult.beamExceptionFrom(SparkPipelineResult.java:72)
>> > >> > > >at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:99)
>> > >> > > >at org.apache.beam.runners.spark.SparkPipelineResult.waitUntilFinish(SparkPipelineResult.java:87)
>> > >> > > >at com.kochava.beam.jobs.ExampleS3.main(ExampleS3.java:46)
>> > >> > > >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> > >> > > >at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> > >> > > >at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> > >> > > >at java.lang.reflect.Method.invoke(Method.java:498)
>> > >> > > >at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
>> > >> > > >at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
>> > >> > > >at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
>> > >> > > >at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
>> > >> > > >at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>> > >> > > >Caused by: java.lang.NoSuchMethodError:
>> > >> > > >com.fasterxml.jackson.databind.ObjectMapper.enable([Lcom/fasterxml/jackson/core/JsonParser$Feature;)Lcom/fasterxml/jackson/databind/ObjectMapper;
>> > >> > > >at com.amazonaws.partitions.PartitionsLoader.<clinit>(PartitionsLoader.java:54)
>> > >> > > >at com.amazonaws.regions.RegionMetadataFactory.create(RegionMetadataFactory.java:30)
>> > >> > > >at com.amazonaws.regions.RegionUtils.initialize(RegionUtils.java:64)
>> > >> > > >at com.amazonaws.regions.RegionUtils.getRegionMetadata(RegionUtils.java:52)
>> > >> > > >at com.amazonaws.regions.RegionUtils.getRegion(RegionUtils.java:105)
>> > >> > > >at com.amazonaws.client.builder.AwsClientBuilder.withRegion(AwsClientBuilder.java:239)
>> > >> > > >at com.kochava.beam.s3.S3Util.<init>(S3Util.java:103)
>> > >> > > >at com.kochava.beam.s3.S3Util.<init>(S3Util.java:53)
>> > >> > > >at com.kochava.beam.s3.S3Util$S3UtilFactory.create(S3Util.java:81)
>> > >> > > >at com.kochava.beam.s3.S3Util$S3UtilFactory.create(S3Util.java:55)
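
P.S. For anyone chasing a similar NoSuchMethodError: both traces quoted above come down to a class being resolved from a different jar than the one that was shaded. A quick way to see where a class actually came from is to ask the JVM for its code source and class loader. The sketch below is only an illustration, not something from this thread: the class name ClassOrigin and the method describe are made up, and it uses nothing beyond the standard library.

```java
import java.net.URL;
import java.security.CodeSource;

// Hypothetical helper (name is illustrative): reports which jar or directory
// a class was loaded from and which class loader defined it.
public class ClassOrigin {

    public static String describe(Class<?> type) {
        // Classes defined by the bootstrap loader have no code source.
        CodeSource source = type.getProtectionDomain().getCodeSource();
        URL location = source == null ? null : source.getLocation();
        ClassLoader loader = type.getClassLoader();
        return type.getName()
            + " <- " + (location == null ? "(no code source)" : location.toString())
            + " via " + (loader == null ? "bootstrap loader" : loader.getClass().getName());
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass fully qualified class names as arguments, for example
        // com.fasterxml.jackson.databind.ObjectMapper or its relocated name.
        for (String name : args) {
            System.out.println(describe(Class.forName(name)));
        }
    }
}
```

Calling describe(...) from the pipeline's own main, so that it runs under spark-submit, should be more telling than running it standalone, since the point is to see what Spark's class loaders resolve. With the relocation in place, I would expect com.kochava.repackaged.com.fasterxml.jackson.databind.ObjectMapper to report the shaded jar, though I have not verified that on every Spark setup.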
