Posted a JIRA: https://issues.apache.org/jira/browse/SPARK-1952

On Wed, May 28, 2014 at 1:14 PM, Ryan Compton <compton.r...@gmail.com> wrote:
> Note: just including the jar built by sbt will produce the same
> error, i.e. this Pig script will fail:
>
> REGISTER 
> /usr/share/osi1/spark-1.0.0/assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop0.20.2-cdh3u4.jar;
>
> edgeList0 = LOAD
> '/user/rfcompton/twitter-mention-networks/bidirectional-network-current/part-r-00001'
> USING PigStorage() AS (id1:long, id2:long, weight:int);
> ttt = LIMIT edgeList0 10;
> DUMP ttt;
>
> On Wed, May 28, 2014 at 12:55 PM, Ryan Compton <compton.r...@gmail.com> wrote:
>> It appears to be Spark 1.0 related. I made a pom.xml with a single
>> dependency on Spark; registering the resulting jar reproduced the error.
>>
>> Spark 1.0 was compiled via $ SPARK_HADOOP_VERSION=0.20.2-cdh3u4 sbt/sbt 
>> assembly
>>
>> The pom.xml, as well as some other information, is below. The only
>> thing that should not be standard is the inclusion of my in-house
>> repository (it's where I host the spark jar I compiled above).
>>
>> <project xmlns="http://maven.apache.org/POM/4.0.0";
>> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance";
>>         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
>> http://maven.apache.org/xsd/maven-4.0.0.xsd";>
>>     <modelVersion>4.0.0</modelVersion>
>>
>>     <groupId>com.mycompany.app</groupId>
>>     <artifactId>my-app</artifactId>
>>     <version>1.0-SNAPSHOT</version>
>>     <packaging>jar</packaging>
>>
>>     <name>my-app</name>
>>     <url>http://maven.apache.org</url>
>>
>>     <properties>
>>         <maven.compiler.source>1.6</maven.compiler.source>
>>         <maven.compiler.target>1.6</maven.compiler.target>
>>         <encoding>UTF-8</encoding>
>>         <scala.version>2.10.4</scala.version>
>>     </properties>
>>
>>     <build>
>>         <pluginManagement>
>>             <plugins>
>>                 <plugin>
>>                     <groupId>net.alchim31.maven</groupId>
>>                     <artifactId>scala-maven-plugin</artifactId>
>>                     <version>3.1.5</version>
>>                 </plugin>
>>                 <plugin>
>>                     <groupId>org.apache.maven.plugins</groupId>
>>                     <artifactId>maven-compiler-plugin</artifactId>
>>                     <version>2.0.2</version>
>>                 </plugin>
>>             </plugins>
>>         </pluginManagement>
>>
>>         <plugins>
>>
>>             <plugin>
>>                 <groupId>net.alchim31.maven</groupId>
>>                 <artifactId>scala-maven-plugin</artifactId>
>>                 <executions>
>>                     <execution>
>>                         <id>scala-compile-first</id>
>>                         <phase>process-resources</phase>
>>                         <goals>
>>                             <goal>add-source</goal>
>>                             <goal>compile</goal>
>>                         </goals>
>>                     </execution>
>>                     <execution>
>>                         <id>scala-test-compile</id>
>>                         <phase>process-test-resources</phase>
>>                         <goals>
>>                             <goal>testCompile</goal>
>>                         </goals>
>>                     </execution>
>>                 </executions>
>>             </plugin>
>>
>>             <!-- Plugin to create a single jar that includes all
>> dependencies -->
>>             <plugin>
>>                 <artifactId>maven-assembly-plugin</artifactId>
>>                 <version>2.4</version>
>>                 <configuration>
>>                     <descriptorRefs>
>>                         <descriptorRef>jar-with-dependencies</descriptorRef>
>>                     </descriptorRefs>
>>                 </configuration>
>>                 <executions>
>>                     <execution>
>>                         <id>make-assembly</id>
>>                         <phase>package</phase>
>>                         <goals>
>>                             <goal>single</goal>
>>                         </goals>
>>                     </execution>
>>                 </executions>
>>             </plugin>
>>
>>         </plugins>
>>     </build>
>>
>>       <repositories>
>>
>>         <!-- needed for cdh build of Spark -->
>>         <repository>
>>             <id>releases</id>
>>             <url>10.10.1.29:8081/nexus/content/repositories/releases</url>
>>         </repository>
>>
>>         <repository>
>>             <id>cloudera</id>
>>             
>> <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
>>         </repository>
>>
>>     </repositories>
>>
>>     <dependencies>
>>
>>         <dependency>
>>             <groupId>org.scala-lang</groupId>
>>             <artifactId>scala-library</artifactId>
>>             <version>${scala.version}</version>
>>         </dependency>
>>
>>         <!--on node29-->
>>         <dependency>
>>             <groupId>org.apache.spark</groupId>
>>             <artifactId>spark-assembly</artifactId>
>>             <version>1.0.0-cdh3u4</version>
>>             <classifier>cdh3u4</classifier>
>>         </dependency>
>>
>>         <!--Spark docs say I need hadoop-client; the cdh3u3 repo no
>> longer exists-->
>>         <dependency>
>>             <groupId>org.apache.hadoop</groupId>
>>             <artifactId>hadoop-client</artifactId>
>>             <version>0.20.2-cdh3u4</version>
>>         </dependency>
>>
>>     </dependencies>
>> </project>
>>
>>
>> Here's what I get in the dependency tree:
>>
>> [INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ my-app ---
>> [INFO] com.mycompany.app:my-app:jar:1.0-SNAPSHOT
>> [INFO] +- org.scala-lang:scala-library:jar:2.10.4:compile
>> [INFO] +- org.apache.spark:spark-assembly:jar:cdh3u4:1.0.0-cdh3u4:compile
>> [INFO] \- org.apache.hadoop:hadoop-client:jar:0.20.2-cdh3u4:compile
>> [INFO]    \- org.apache.hadoop:hadoop-core:jar:0.20.2-cdh3u4:compile
>> [INFO]       +- com.cloudera.cdh:hadoop-ant:pom:0.20.2-cdh3u4:compile
>> [INFO]       +- xmlenc:xmlenc:jar:0.52:compile
>> [INFO]       +- 
>> org.apache.hadoop.thirdparty.guava:guava:jar:r09-jarjar:compile
>> [INFO]       +- commons-codec:commons-codec:jar:1.4:compile
>> [INFO]       +- commons-net:commons-net:jar:1.4.1:compile
>> [INFO]       |  \- (oro:oro:jar:2.0.8:compile - omitted for duplicate)
>> [INFO]       +- org.codehaus.jackson:jackson-core-asl:jar:1.5.2:compile
>> [INFO]       +- org.codehaus.jackson:jackson-mapper-asl:jar:1.5.2:compile
>> [INFO]       |  \-
>> (org.codehaus.jackson:jackson-core-asl:jar:1.5.2:compile - omitted for
>> duplicate)
>> [INFO]       +- commons-el:commons-el:jar:1.0:compile
>> [INFO]       |  \- commons-logging:commons-logging:jar:1.0.3:compile
>> [INFO]       +- hsqldb:hsqldb:jar:1.8.0.7:compile
>> [INFO]       \- oro:oro:jar:2.0.8:compile
>>
>>
>> While I don't see slf4j anywhere in the tree, it does manage to find
>> its way into the jar somehow:
>> rfcompton@node19 /u/s/o/n/my-app> find . -name "*.jar" | xargs -tn1
>> jar tvf | grep -i "slf" | grep LocationAware
>> jar tvf ./target/my-app-1.0-SNAPSHOT.jar
>> jar tvf ./target/my-app-1.0-SNAPSHOT-jar-with-dependencies.jar
>>   3259 Mon Mar 25 21:49:34 PDT 2013
>> org/apache/commons/logging/impl/SLF4JLocationAwareLog.class
>>    455 Mon Mar 25 21:49:22 PDT 2013 org/slf4j/spi/LocationAwareLogger.class
>>    479 Fri Dec 13 16:44:40 PST 2013
>> parquet/org/slf4j/spi/LocationAwareLogger.class
>>
>> Here's a pig script that will fail with the slf4j error:
>>
>> REGISTER 
>> /usr/share/osi1/nonhome/my-app/target/my-app-1.0-SNAPSHOT-jar-with-dependencies.jar;
>>
>> edgeList0 = LOAD
>> '/user/rfcompton/twitter-mention-networks/bidirectional-network-current/part-r-00001'
>> USING PigStorage() AS (id1:long, id2:long, weight:int);
>>
>> ttt = LIMIT edgeList0 10;
>> DUMP ttt;
>>
>> (the error)
>> rfcompton@node19 /u/s/o/n/my-app> pig src/main/pig/testSparkJar.pig
>> 2014-05-28 12:43:58,076 [main] INFO  org.apache.pig.Main - Apache Pig
>> version 0.12.1 (r1585011) compiled Apr 05 2014, 01:41:34
>> 2014-05-28 12:43:58,078 [main] INFO  org.apache.pig.Main - Logging
>> error messages to:
>> /usr/share/osi1/nonhome/my-app/pig_1401306238074.log
>> 2014-05-28 12:43:58,722 [main] INFO  org.apache.pig.impl.util.Utils -
>> Default bootup file /home/isl/rfcompton/.pigbootup not found
>> 2014-05-28 12:43:59,195 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>> Connecting to hadoop file system at: hdfs://master:8020/
>> 2014-05-28 12:43:59,811 [main] INFO
>> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
>> Connecting to map-reduce job tracker at: node4:8021
>> 2014-05-28 12:44:00,987 [main] ERROR org.apache.pig.tools.grunt.Grunt
>> - ERROR 2998: Unhandled internal error.
>> org.slf4j.spi.LocationAwareLogger.log(Lorg/slf4j/Marker;Ljava/lang/String;ILjava/lang/String;[Ljava/lang/Object;Ljava/lang/Throwable;)V
>> Details at logfile: /usr/share/osi1/nonhome/my-app/pig_1401306238074.log
>>
>>
>> To confirm this is 1.0 related, I rebuilt with the pom.xml pointing
>> at 0.9.1 and saw no problems from Pig. Looking into the 0.9.1 jar
>> revealed less dependence on slf4j (e.g.
>> "parquet/org/slf4j/spi/LocationAwareLogger.class" appears only in
>> Spark 1.0).
>>
>> (after recompiling for 0.9.1)
>> rfcompton@node19 /u/s/o/n/my-app> find . -name "*.jar" | xargs -tn1
>> jar tvf | grep -i "slf" | grep LocationAware
>> jar tvf ./target/my-app-1.0-SNAPSHOT.jar
>> jar tvf ./target/my-app-1.0-SNAPSHOT-jar-with-dependencies.jar
>>    455 Mon Mar 25 21:49:22 PDT 2013 org/slf4j/spi/LocationAwareLogger.class
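>>
>> If the extra slf4j classes inside the Spark 1.0 assembly jar are the
>> culprit, one possible (untested) workaround is to replace
>> maven-assembly-plugin with maven-shade-plugin and filter the
>> conflicting classes out of the fat jar. A sketch, using the same
>> artifact coordinates as the pom above:
>>
>>             <plugin>
>>                 <groupId>org.apache.maven.plugins</groupId>
>>                 <artifactId>maven-shade-plugin</artifactId>
>>                 <version>2.3</version>
>>                 <executions>
>>                     <execution>
>>                         <phase>package</phase>
>>                         <goals>
>>                             <goal>shade</goal>
>>                         </goals>
>>                         <configuration>
>>                             <filters>
>>                                 <!-- drop the slf4j classes bundled
>>                                      inside the Spark assembly jar -->
>>                                 <filter>
>>                                     <artifact>org.apache.spark:spark-assembly</artifact>
>>                                     <excludes>
>>                                         <exclude>org/slf4j/**</exclude>
>>                                         <exclude>parquet/org/slf4j/**</exclude>
>>                                     </excludes>
>>                                 </filter>
>>                             </filters>
>>                         </configuration>
>>                     </execution>
>>                 </executions>
>>             </plugin>
>>
>> Note that a plain <exclusions> entry on the spark-assembly dependency
>> would not help here, since the assembly is a single jar and Maven
>> exclusions only drop whole artifacts, not classes within one.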
>>
>>
>> On Tue, May 27, 2014 at 2:53 PM, Sean Owen <so...@cloudera.com> wrote:
>>> Spark uses 1.7.5, and you should probably see 1.7.{4,5} in use through
>>> Hadoop. But those are compatible.
>>>
>>> That method appears to have been around since 1.3. What version does Pig 
>>> want?
>>>
>>> I usually do "mvn -Dverbose dependency:tree" to see both what the
>>> final dependencies are, and what got overwritten, to diagnose things
>>> like this.
>>>
>>> My hunch is that something is depending on an old slf4j in your build
>>> and it's overriding the version Spark et al. pull in.
>>>
>>> On Tue, May 27, 2014 at 10:45 PM, Ryan Compton <compton.r...@gmail.com> 
>>> wrote:
>>>> I use both Pig and Spark. All my code is built with Maven into a giant
>>>> *-jar-with-dependencies.jar. I recently upgraded to Spark 1.0 and now
>>>> all my pig scripts fail with:
>>>>
>>>> Caused by: java.lang.RuntimeException: Could not resolve error that
>>>> occured when launching map reduce job: java.lang.NoSuchMethodError:
>>>> org.slf4j.spi.LocationAwareLogger.log(Lorg/slf4j/Marker;Ljava/lang/String;ILjava/lang/String;[Ljava/lang/Object;Ljava/lang/Throwable;)V
>>>> at 
>>>> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:598)
>>>> at java.lang.Thread.dispatchUncaughtException(Thread.java:1874)
>>>>
>>>>
>>>> Did Spark 1.0 change the version of slf4j? I can't seem to find it via
>>>> mvn dependency:tree
