pig-user  

Re: pig using zebra, ClassNotFoundException on TableOutputFormat

Bennie Schut
Mon, 16 Nov 2009 00:02:25 -0800

Ah thanks. Working like a charm now. Now I can play with the
TableInserter part.
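
For anyone who hits the same error, the fix can be sketched roughly as below. This is only a minimal sketch, assuming the bin/pig launcher reads extra jars from the PIG_CLASSPATH environment variable; the jar location (ZEBRA_JAR) and the script name calendar.pig are placeholders, so adjust both to your setup.

```shell
# Placeholder location of the zebra jar; point this at your own build output.
ZEBRA_JAR="${ZEBRA_JAR:-$PWD/zebra-0.6.0-dev.jar}"

# Prepend the jar to PIG_CLASSPATH (assumed to be honoured by bin/pig),
# keeping any entries that were already set.
export PIG_CLASSPATH="$ZEBRA_JAR${PIG_CLASSPATH:+:$PIG_CLASSPATH}"

# Relaunch pig only if it is on the PATH; calendar.pig is a placeholder name.
if command -v pig >/dev/null 2>&1; then
    pig calendar.pig
else
    echo "pig not found; PIG_CLASSPATH=$PIG_CLASSPATH"
fi
```

The register line in the script stays as it is; the classpath entry is in addition to it, so that the JVM launching the job can also load the zebra classes.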

Santhosh Srinivasan wrote:
> Bennie,
>
> Include zebra-0.6.0-dev.jar in your classpath and then relaunch pig.
>
> Santhosh 
>
> -----Original Message-----
> From: Bennie Schut [mailto:bsc...@ebuddy.com] 
> Sent: Friday, November 13, 2009 3:03 AM
> To: pig-user@hadoop.apache.org
> Subject: pig using zebra, ClassNotFoundException on TableOutputFormat
>
> I'm looking into improving the performance of one of my pig jobs. I
> figured storing the data which I keep reusing in a binary/serialized
> format could help me a little with this and thus stumbled upon zebra.
> It seems like a nice abstraction and seems to do exactly what I want to
> achieve.
>
> I started with something simple but that doesn't work.
>
> register zebra-0.6.0-dev.jar;
> dim_calendar = load '/user/dwh/dim/calendar.csv' using PigStorage('\t')
>     as (cldr_id: long, iso_date: chararray);
> outfile = order dim_calendar by iso_date parallel 1;
> store outfile into '/user/dwh/calendar.zebra'
>     using org.apache.hadoop.zebra.pig.TableStorer('cldr_id: long, iso_date:string');
>
> On running this I get:
> ---------------
> ERROR 2117: Unexpected error when launching map reduce job.
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1002: Unable to store alias 97
>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1003)
>     at org.apache.pig.PigServer.registerQuery(PigServer.java:385)
>     at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:720)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:324)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:168)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:144)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
>     at org.apache.pig.Main.main(Main.java:352)
> Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 2117: Unexpected error when launching map reduce job.
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:194)
>     at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.execute(HExecutionEngine.java:249)
>     at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:780)
>     at org.apache.pig.PigServer.execute(PigServer.java:773)
>     at org.apache.pig.PigServer.access$100(PigServer.java:89)
>     at org.apache.pig.PigServer$Graph.execute(PigServer.java:951)
>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:998)
>     ... 7 more
> Caused by: java.lang.RuntimeException: Could not resolve error that occured when launching map reduce job: java.lang.RuntimeException: java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.zebra.pig.TableOutputFormat
>     at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher$JobControlThreadExceptionHandler.uncaughtException(MapReduceLauncher.java:428)
>     at java.lang.Thread.dispatchUncaughtException(Thread.java:1831)
> -----
>
> Any idea why?
> TableOutputFormat is an inner class of TableStorer, so I'm a little
> puzzled how it could find one but not the other.
> FYI, I'm using hadoop-0.20.1 and pig/zebra from trunk, but I haven't
> updated pig in a few weeks.
>
> Thanks,
> Bennie.
>