The exception still says that the configuration for the class name is not
set.

I added a "print" to ThriftUtils:setClassConf (I have not set up Pig in
Eclipse yet). The variable is set to my Thrift class, but it cannot be
accessed later. It seems to get lost somewhere. What could be the reason?

INFO  org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
set conf elephantbird.thirft.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat to com.example.thrift.VectorSequence (<< my "print")
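
One way a value can be set and still be missing later is sketched below. This is only an illustration, not elephant-bird's or Pig's actual code: `java.util.Properties` stands in for a Hadoop `Configuration`, the class `ConfSketch` and the snapshot step are hypothetical, and the method names merely mirror the ones mentioned above. It shows the case where the setter writes into one configuration object while the reader later sees a copy that was taken before the value was set.

```java
import java.util.Properties;

public class ConfSketch {
    // Assumed prefix for illustration; the real constant lives in ThriftUtils.
    static final String CLASS_CONF_PREFIX = "elephantbird.thrift.class.for.";

    // Store the Thrift class name under a key derived from the owning class.
    static void setClassConf(Properties conf, Class<?> owner, String className) {
        conf.setProperty(CLASS_CONF_PREFIX + owner.getName(), className);
    }

    // Look the class name up again; returns null when the key is absent.
    static String getClassName(Properties conf, Class<?> owner) {
        return conf.getProperty(CLASS_CONF_PREFIX + owner.getName());
    }

    public static void main(String[] args) {
        Properties frontend = new Properties();

        // Hypothetical snapshot taken BEFORE the value is set, standing in
        // for a configuration copy that gets shipped to the map tasks.
        Properties shipped = new Properties();
        shipped.putAll(frontend);

        // String.class is just a placeholder for the input format class.
        setClassConf(frontend, String.class, "com.example.thrift.VectorSequence");

        // The object that was written to sees the value ...
        System.out.println(getClassName(frontend, String.class));
        // ... the earlier copy does not, so a lookup there yields null.
        System.out.println(getClassName(shipped, String.class));
    }
}
```

The same symptom would also appear if the setter and the getter built slightly different keys, so comparing the key in the "print" output with the key in the exception message character by character is a cheap check.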

I thought the config was the problem, so I bypassed the Hadoop
Configuration and hard-coded the class name in ThriftUtils:getTypeRef:

    if (className == null) {
      if (genericClass.getName().equals("com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat")) {
        className = "com.example.thrift.VectorSequence";
      } else {
        throw new RuntimeException(CLASS_CONF_PREFIX + genericClass.getName() + " is not set (1)");
      }
    }

But then I got another exception:

java.lang.ClassCastException: com.twitter.elephantbird.mapreduce.io.ThriftWritable cannot be cast to org.apache.thrift.TBase
    at com.twitter.elephantbird.pig8.load.LzoThriftB64LinePigLoader.getNext(Unknown Source)
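
The message reads as if a wrapper object is being cast directly to the Thrift type it wraps. A minimal sketch of that pattern follows, using made-up stand-in classes (`Wrapper` and `Payload` are hypothetical and are not elephant-bird code). Whether this is what actually happens inside LzoThriftB64LinePigLoader is an assumption, not something confirmed here.

```java
public class CastSketch {
    // Hypothetical stand-in for a Thrift-generated class.
    static class Payload {
        final String name;
        Payload(String name) { this.name = name; }
    }

    // Hypothetical stand-in for a writable that wraps a payload object.
    static class Wrapper<T> {
        private final T value;
        Wrapper(T value) { this.value = value; }
        T get() { return value; }
    }

    // Casting the wrapper itself fails at runtime with ClassCastException.
    static Payload castDirectly(Object o) {
        return (Payload) o;
    }

    // Unwrapping first and then using the inner object works.
    @SuppressWarnings("unchecked")
    static Payload unwrap(Object o) {
        if (o instanceof Wrapper) {
            return ((Wrapper<Payload>) o).get();
        }
        return (Payload) o;
    }

    public static void main(String[] args) {
        Object fromReader = new Wrapper<Payload>(new Payload("VectorSequence"));
        System.out.println(unwrap(fromReader).name);
        try {
            castDirectly(fromReader);
        } catch (ClassCastException e) {
            System.out.println("direct cast failed: " + e.getMessage());
        }
    }
}
```

If the real loader and the real record reader disagree in this way, it could point to mismatched versions of the two classes on the classpath, which would be worth checking before patching further.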

I also tried the PigUtil patch:
https://github.com/kevinweil/elephant-bird/pull/36/files

but it made no difference :(

2011/3/16 Torben Brodt <[email protected]>

> Hello Dmitriy, thanks for your help.
> My Thrift classes should be available: I copied elephant-bird*.jar and
> example.jar (with com.example.thrift.VectorSequence and the Thrift base
> classes) into the Pig lib folder,
> and I also registered the jars within Pig.
>
> Here is my complete pig code:
>
> register dist/elephant-bird-2.0-SNAPSHOT.jar;
> register dist/example.jar;
> register lib/google-collect-1.0.jar
> raw_data = load '/tmp/thrift/vi_base64.txt.lzo' using
> com.twitter.elephantbird.pig8.load.LzoThriftB64LinePigLoader('com.example.thrift.VectorSequence');
>
> DUMP raw_data
>
> I tried the secretDebugCmd param to see if the jars got registered. Here is my
> complete call...
>
> pig -f -secretDebugCmd ./examples/src/pig/example.pig
>
> /usr/lib/jvm/java-6-sun/bin/java -Xmx1000m
> -Djava.library.path=/usr/lib/pig/lib/native/Linux-amd64-64
> -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl
> -Dpig.log.dir=/usr/lib/pig/bin/../logs -Dpig.log.file=pig.log
> -Dpig.home.dir=/usr/lib/pig/bin/.. -Dpig.root.logger=INFO,console,DRFA
> -classpath
> /usr/lib/pig/bin/../conf:/usr/lib/jvm/java-6-sun/lib/tools.jar:/usr/lib/pig/bin/../pig-0.8.0-CDH3B4-core.jar:/usr/lib/pig/bin/../build/pig-*-SNAPSHOT.jar:/usr/lib/pig/bin/../lib/automaton.jar:/usr/lib/pig/bin/../lib/elephant-bird-2.0-SNAPSHOT.jar:/usr/lib/pig/bin/../lib/hbase-0.20.6.jar:/usr/lib/pig/bin/../lib/hbase-0.20.6-test.jar:/usr/lib/pig/bin/../lib/example.jar:/usr/lib/pig/bin/../lib/zookeeper-hbase-1329.jar:/usr/lib/pig/bin/../build/ivy/lib/Pig/*.jar:/usr/lib/hadoop/hadoop-core-0.20.2-CDH3B4.jar:/usr/lib/hadoop/lib/aspectjrt-1.6.5.jar:/usr/lib/hadoop/lib/aspectjtools-1.6.5.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/usr/lib/hadoop/lib/commons-codec-1.4.jar:/usr/lib/hadoop/lib/commons-daemon-1.0.1.jar:/usr/lib/hadoop/lib/commons-el-1.0.jar:/usr/lib/hadoop/lib/commons-httpclient-3.0.1.jar:/usr/lib/hadoop/lib/commons-logging-1.0.4.jar:/usr/lib/hadoop/lib/commons-logging-api-1.0.4.jar:/usr/lib/hadoop/lib/commons-net-1.4.1.jar:/usr/lib/hadoop/lib/core-3.1.1.jar:/usr/lib/hadoop/lib/hadoop-fairscheduler-0.20.2-CDH3B4.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.4.9.jar:/usr/lib/hadoop/lib/hsqldb-1.8.0.10.jar:/usr/lib/hadoop/lib/hue-plugins-1.2.0.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.5.2.jar:/usr/lib/hadoop/lib/jackson-mapper-asl-1.5.2.jar:/usr/lib/hadoop/lib/jasper-compiler-5.5.12.jar:/usr/lib/hadoop/lib/jasper-runtime-5.5.12.jar:/usr/lib/hadoop/lib/jdiff:/usr/lib/hadoop/lib/jets3t-0.6.1.jar:/usr/lib/hadoop/lib/jetty-6.1.26.jar:/usr/lib/hadoop/lib/jetty-servlet-tester-6.1.26.jar:/usr/lib/hadoop/lib/jetty-util-6.1.26.jar:/usr/lib/hadoop/lib/jsch-0.1.42.jar:/usr/lib/hadoop/lib/jsp-2.1:/usr/lib/hadoop/lib/junit-4.5.jar:/usr/lib/hadoop/lib/kfs-0.2.2.jar:/usr/lib/hadoop/lib/log4j-1.2.15.jar:/usr/lib/hadoop/lib/mockito-all-1.8.2.jar:/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/oro-2.0.8.jar:/usr/lib/hadoop/lib/example.jar:/usr/lib/hadoop/lib/servlet-api-2.5-20081211.jar:/usr/lib/hadoop/lib/servlet-api-2.5-6.1.14.jar:/usr/lib/hadoop/lib/slf4j-api-1.4.3.jar:/usr/lib/hadoop/lib/slf4j-log4j12-1.4.3.jar:/usr/lib/hadoop/lib/xmlenc-0.52.jar:/usr/lib/hadoop/conf
> org.apache.pig.Main -f ./examples/src/pig/example.pig
>
> Best regards,
> Torben
>
>
> 2011/3/16 Dmitriy Ryaboy <[email protected]>
>
>> Is the jar that has com.example.thrift.VectorSequence both "register"ed
>> and on the pig classpath?
>>
>> D
>>
>>
>> On Wed, Mar 16, 2011 at 3:23 AM, Torben Brodt <[email protected]> wrote:
>>
>>> Hey folks,
>>> I am still trying to set up elephant-bird in Pig. I am using the pig-08
>>> branch of dvryaboy.
>>> I managed to create my example loader using pig8.util.ThriftToPig.
>>>
>>> My Pig code looks like this:
>>> raw_data = load '/tmp/thrift/vi_base64.txt.lzo' using
>>> com.twitter.elephantbird.pig8.load.LzoThriftB64LinePigLoader('com.example.thrift.VectorSequence');
>>>
>>> When I run it, I get the following exception after the map/reduce phase:
>>>
>>> elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat
>>> is not set
>>>
>>> It seems I am missing a classpath entry again? But the elephant-bird libs
>>> are clearly included, otherwise the script would fail much earlier with
>>> LzoThriftB64LinePigLoader itself missing.
>>> Could the data be corrupted?
>>>
>>> See the stack trace attached. I hope you have some idea.
>>>
>>> Best regards,
>>> Torben
>>>
>>> Backend error message
>>> ---------------------
>>> java.lang.RuntimeException: java.lang.RuntimeException: elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat is not set
>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:236)
>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.<init>(PigRecordReader.java:109)
>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigInputFormat.createRecordReader(PigInputFormat.java:118)
>>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:613)
>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
>>> at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
>>> at java.security.AccessController.doPrivileged(Native Method)
>>> at javax.security.auth.Subject.doAs(Subject.java:396)
>>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
>>> at org.apache.hadoop.mapred.Child.main(Child.java:234)
>>> Caused by: java.lang.RuntimeException: elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat is not set
>>> at com.twitter.elephantbird.util.ThriftUtils.getTypeRef(Unknown Source)
>>> at com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat.createRecordReader(Unknown Source)
>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initNextRecordReader(PigRecordReader.java:227)
>>> ... 9 more
>>>
>>> Pig Stack Trace
>>> ---------------
>>> ERROR 2997: Unable to recreate exception from backed error: java.lang.RuntimeException: java.lang.RuntimeException: elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat is not set
>>>
>>> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias raw_data. Backend error : Unable to recreate exception from backed error: java.lang.RuntimeException: java.lang.RuntimeException: elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat is not set
>>> at org.apache.pig.PigServer.openIterator(PigServer.java:742)
>>> at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:612)
>>> at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:303)
>>> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>>> at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>>> at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
>>> at org.apache.pig.Main.run(Main.java:406)
>>> at org.apache.pig.Main.main(Main.java:107)
>>> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2997: Unable to recreate exception from backed error: java.lang.RuntimeException: java.lang.RuntimeException: elephantbird.thrift.class.for.com.twitter.elephantbird.mapreduce.input.LzoThriftB64LineInputFormat is not set
>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getErrorMessages(Launcher.java:221)
>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:151)
>>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:337)
>>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:378)
>>> at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1198)
>>> at org.apache.pig.PigServer.storeEx(PigServer.java:874)
>>> at org.apache.pig.PigServer.store(PigServer.java:816)
>>> at org.apache.pig.PigServer.openIterator(PigServer.java:728)
>>> ... 7 more
>>>
>>> ================================================================================
>>>
>>
>>
>
