Hello Dmitriy, thanks for your help.
My Thrift classes should be available. I copied elephant-bird*.jar and
example.jar (which contains com.example.thrift.VectorSequence and the Thrift
base classes) into the Pig lib folder,
and I also registered the jars within Pig.
Here is my complete Pig code:
register dist/elephant-bird-2.0-SNAPSHOT.jar;
register dist/example.jar;
register lib/google-collect-1.0.jar;
raw_data = load '/tmp/thrift/vi_base64.txt.lzo' using
com.twitter.elephantbird.pig8.load.LzoThriftB64LinePigLoader('com.example.thrift.VectorSequence');
DUMP raw_data;
I tried the -secretDebugCmd parameter to see whether the jars got registered.
Here is my complete call and its output:
pig -f -secretDebugCmd ./examples/src/pig/example.pig
/usr/lib/jvm/java-6-sun/bin/java -Xmx1000m
-Djava.library.path=/usr/lib/pig/lib/native/Linux-amd64-64
-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
-Dpig.log.dir=/usr/lib/pig/bin/../logs -Dpig.log.file=pig.log
-Dpig.home.dir=/usr/lib/pig/bin/.. -Dpig.root.logger=INFO,console,DRFA
-classpath
/usr/lib/pig/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/usr/lib/pig/bin/../pig-0.8.0-CDH3B4-core.jar:/usr/lib/pig/bin/../build/pig-*-SNAPSHOT.jar:/usr/lib/pig/bin/../lib/automaton.jar:/usr/lib/pig/bin/../lib/elephant-bird-2.0-SNAPSHOT.jar:/usr/lib/pig/bin/../lib/hbase-0.20.6.jar:/usr/lib/pig/bin/../lib/hbase-0.20.6-test.jar:/usr/lib/pig/bin/../lib/example.jar:/usr/lib/pig/bin/../lib/zookeeper-hbase-1329.jar:/usr/lib/pig/bin/../build/ivy/lib/Pig/*.jar:/usr/lib/hadoop/hadoop-core-0.20.2-CDH3B4.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-CDH3B4.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.4.9.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jdiff:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/jsp-2.1:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/example.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/conf
org.apache.pig.Main -f ./examples/src/pig/example.pig
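
As a rough cross-check of the output above, here is a minimal sketch that
splits a -classpath value on ':' and reports which entries match a given jar
name. It is a hypothetical helper, not part of Pig or elephant-bird: the name
find_entries is invented for illustration, and the sample string is shortened
from the classpath above. It only inspects the client-side JVM classpath,
whereas the exception in the quoted trace is thrown inside the map task.

```python
# Hypothetical helper (not part of Pig): list the classpath entries whose
# file name contains one of the given fragments.
import os


def find_entries(classpath, fragments):
    """Return {fragment: [matching entries]} for a ':'-separated classpath."""
    entries = [e for e in classpath.split(":") if e]
    return {
        frag: [e for e in entries if frag in os.path.basename(e)]
        for frag in fragments
    }


# Shortened sample taken from the -classpath value printed above.
sample = (
    "/usr/lib/pig/bin/../conf:"
    "/usr/lib/pig/bin/../lib/elephant-bird-2.0-SNAPSHOT.jar:"
    "/usr/lib/pig/bin/../lib/example.jar:"
    "/usr/lib/hadoop/lib/example.jar:"
    "/usr/lib/hadoop/lib/hadoop-lzo-0.4.9.jar"
)

if __name__ == "__main__":
    wanted = ["elephant-bird", "example.jar", "google-collect"]
    for frag, hits in find_entries(sample, wanted).items():
        print(frag, "->", hits if hits else "no literal entry")
```

For the sample it finds one elephant-bird entry, two example.jar entries and
no google-collect entry. The full classpath above has no literal
google-collect entry either, although the build/ivy/lib/Pig/*.jar wildcard
could still cover it.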
Best regards,
Torben
2011/3/16 Dmitriy Ryaboy <[email protected]>
> Is the jar that has com.example.thrift.VectorSequence both "register"ed and
> on the pig classpath?
>
> D
>
>
> On Wed, Mar 16, 2011 at 3:23 AM, Torben Brodt <[email protected]> wrote:
>
>> Hey folks,
>> I am still trying to set up elephant-bird in Pig. I am using the pig-08 branch of
>> dvryaboy.
>> I managed to create my example loader using pig8.util.ThriftToPig.
>>
>> My Pig code looks like this:
>> raw_data = load '/tmp/thrift/vi_base64.txt.lzo' using
>> com.twitter.elephantbird.pig8.load.LzoThriftB64LinePigLoader('com.example.thrift.VectorSequence');
>>
>> When I run it, I get the following exception after the map/reduce phase:
>>
>> elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat is not set
>>
>> It seems I am missing a classpath entry again? But the elephant-bird libs must
>> be included, otherwise the script would fail much earlier because
>> LzoThriftB64LinePigLoader itself would be missing.
>> Could the data be corrupted?
>>
>> See the stack trace below; I hope you have some idea.
>>
>> Best regards,
>> Torben
>>
>> Backend error message
>> ---------------------
>> java.lang.RuntimeException: java.lang.RuntimeException:
>> elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat
>> is not set
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:236)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.<init>(PigRecordReader.java:109)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:118)
>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:396)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>> at org.apache.hadoop.mapred.Child.main(Child.java:234)
>> Caused by: java.lang.RuntimeException:
>> elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat
>> is not set
>> at com.twitter.elephantbird.util.ThriftUtils.getTypeRef(Unknown Source)
>> at com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat.createRecordReader(Unknown Source)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:227)
>> ... 9 more
>>
>> Pig Stack Trace
>> ---------------
>> ERROR 2997: Unable to recreate exception from backed error:
>> java.lang.RuntimeException: java.lang.RuntimeException:
>> elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat
>> is not set
>>
>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to
>> open iterator for alias raw_data. Backend error : Unable to recreate
>> exception from backed error: java.lang.RuntimeException:
>> java.lang.RuntimeException:
>> elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat
>> is not set
>> at org.apache.pig.PigServer.openIterator(PigServer.java:742)
>> at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>> at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
>> at org.apache.pig.Main.run(Main.java:406)
>> at org.apache.pig.Main.main(Main.java:107)
>> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997:
>> Unable to recreate exception from backed error: java.lang.RuntimeException:
>> java.lang.RuntimeException:
>> elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat
>> is not set
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:221)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:151)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:337)
>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
>> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1198)
>> at org.apache.pig.PigServer.storeEx(PigServer.java:874)
>> at org.apache.pig.PigServer.store(PigServer.java:816)
>> at org.apache.pig.PigServer.openIterator(PigServer.java:728)
>> ... 7 more
>>
>> ================================================================================
>>
>
>