Hi,

Thanks Dmitriy. I wasn't aware of these changes; they would make life much easier, given that users do not always have permission to add jars to the $PIG_HOME/lib directories or to the classpath. In any case, I've been planning a major update of the hadoop-gpl-packaging RPMs to include all of the latest changes.

Cheers,
Gerrit

On Mon, May 2, 2011 at 2:27 AM, Dmitriy Ryaboy <[email protected]> wrote:

> Actually, Pig does some class loader magic on the front end for finding
> classes in registered jars.
> We recently added that to Elephant Bird so that it works when the proto or
> thrift classes aren't already on the classpath and are only registered. I
> believe I merged that into the 8 branch, so if Gerrit updates his packages
> with the most recent version it should "just work".
>
> D
>
> On Fri, Apr 29, 2011 at 5:11 PM, Gerrit Jansen van Vuuren <[email protected]> wrote:
>
> > Pig has a back end and a front end, i.e.:
> > Front end:
> >   the Pig JVM instance.
> > Back end:
> >   the Pig classes running your M/R job on Hadoop.
> >
> > Pig instantiates the same loader in the front end and the back end to
> > get different information about loading the job files. For example:
> > which files to load? That is decided in the front end. Reading the
> > actual file? That is done in the back end.
> >
> > The Java classes for your GPB message need to be present in both the
> > front end and the back end.
> >
> > How?
> > REGISTER <jar>  === Back End
> > $PIG_HOME/lib/  === Front End
> >
> > Cheers,
> >  Gerrit
> >
> > On Sat, Apr 30, 2011 at 2:02 AM, Kris Coward <[email protected]> wrote:
> >
> > > Here we go:
> > >
> > > META-INF/
> > > META-INF/MANIFEST.MF
> > > com/work/logs/LogFormat$1.class
> > > com/work/logs/LogFormat$Apa$Builder.class
> > > com/work/logs/LogFormat$Apa.class
> > > com/work/logs/LogFormat.class
> > > com/work/logs/LogFormat$Cpu$Builder.class
> > > com/work/logs/LogFormat$Cpu.class
> > > com/work/logs/LogFormat$Evt$Builder.class
> > > com/work/logs/LogFormat$Evt.class
> > > com/work/logs/LogFormat$FirstMsg$Builder.class
> > > com/work/logs/LogFormat$FirstMsg.class
> > > com/work/logs/LogFormat$Gci$Builder.class
> > > com/work/logs/LogFormat$Gci.class
> > > com/work/logs/LogFormat$Inr$Builder.class
> > > com/work/logs/LogFormat$Inr.class
> > > com/work/logs/LogFormat$Ins$Builder.class
> > > com/work/logs/LogFormat$Ins.class
> > > com/work/logs/LogFormat$Mer$Builder.class
> > > com/work/logs/LogFormat$Mer.class
> > > com/work/logs/LogFormat$Mes$Builder.class
> > > com/work/logs/LogFormat$Mes.class
> > > com/work/logs/LogFormat$Mtu$Builder.class
> > > com/work/logs/LogFormat$Mtu.class
> > > com/work/logs/LogFormat$Nei$Builder.class
> > > com/work/logs/LogFormat$Nei.class
> > > com/work/logs/LogFormat$Nes$Builder.class
> > > com/work/logs/LogFormat$Nes.class
> > > com/work/logs/LogFormat$Ntr$Builder.class
> > > com/work/logs/LogFormat$Ntr.class
> > > com/work/logs/LogFormat$Nts$Builder.class
> > > com/work/logs/LogFormat$Nts.class
> > > com/work/logs/LogFormat$Pgr$Builder.class
> > > com/work/logs/LogFormat$Pgr.class
> > > com/work/logs/LogFormat$Psr$Builder.class
> > > com/work/logs/LogFormat$Psr.class
> > > com/work/logs/LogFormat$Pst$Builder.class
> > > com/work/logs/LogFormat$Pst.class
> > > com/work/logs/LogFormat$Ucc$Builder.class
> > > com/work/logs/LogFormat$Ucc.class
> > >
> > > On Fri, Apr 29, 2011 at 04:16:05PM -0700, Dmitriy Ryaboy wrote:
> > > > and the contents of '/home/kris/swineflu/logformats-0.1.2.jar'
> > > > (jar -tf)
> > > >
> > > > D
> > > >
> > > > On Fri, Apr 29, 2011 at 1:15 PM, Kris Coward <[email protected]> wrote:
> > > >
> > > > > Well I'll send up to the point where it fails and exits, since the
> > > > > rest seems kinda superfluous.. here it is:
> > > > >
> > > > > REGISTER '/usr/local/hadoopgpl/lib/slf4j-api-1.5.8.jar'
> > > > > REGISTER '/usr/local/hadoopgpl/lib/slf4j-log4j12-1.5.10.jar'
> > > > > REGISTER '/usr/local/pig/lib/elephant-bird.jar'
> > > > > REGISTER '/usr/local/pig/lib/hadoop-lzo.jar'
> > > > > REGISTER '/usr/local/pig/lib/piggybank.jar'
> > > > > REGISTER '/usr/local/pig/lib/jackson-core-asl-1.0.1.jar'
> > > > > REGISTER '/usr/local/pig/lib/jackson-mapper-asl-1.0.1.jar'
> > > > > REGISTER '/usr/local/pig/lib/jsp-2.1-6.1.4.jar'
> > > > > REGISTER '/home/kris/swineflu/com.kontagent.swineflu.jar'
> > > > > REGISTER '/home/kris/swineflu/logformats-0.1.2.jar'
> > > > >
> > > > > %declare storage com.twitter.elephantbird.pig.proto.LzoProtobuffB64LinePigStore
> > > > > %declare loader com.twitter.elephantbird.pig.proto.LzoProtobuffB64LinePigStore
> > > > >
> > > > > -- load the raw data from HDFS
> > > > > apaNew = LOAD '$infile/apa' USING $loader('apa');
> > > > > apaTable = LOAD '$firstfile/apa' USING $loader('firstp');
> > > > >
> > > > > (where $infile and $firstfile are passed as parameters at runtime,
> > > > > and the files were tested as existing)
> > > > >
> > > > > Cheers,
> > > > > Kris
> > > > >
> > > > > On Fri, Apr 29, 2011 at 01:00:55PM -0700, Dmitriy Ryaboy wrote:
> > > > > > Odd.. can you send the full pig script including the register
> > > > > > statements?
> > > > > >
> > > > > > On Fri, Apr 29, 2011 at 11:38 AM, Kris Coward <[email protected]> wrote:
> > > > > >
> > > > > > > So I've recently added a protocol/schema to a collection I got
> > > > > > > from someone else, recompiled it, and added it to my scripts
> > > > > > > and am having problems.
> > > > > > >
> > > > > > > More specifically, it built just fine, and when REGISTERed in
> > > > > > > the script that uses it to store a relation, it seems to work
> > > > > > > fine, but when I try to use it to read that same relation back,
> > > > > > > I get the error:
> > > > > > >
> > > > > > > [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2999: Unexpected internal error. Error instantiating com.work.logs.LogFormat$FirstMsg referred to by firstp
> > > > > > >
> > > > > > > With a stack trace of:
> > > > > > >
> > > > > > > java.lang.RuntimeException: Error instantiating com.work.logs.LogFormat$FirstMsg referred to by firstp
> > > > > > >         at com.twitter.elephantbird.pig.proto.ProtobufClassUtil.loadProtoClass(Unknown Source)
> > > > > > >         at com.twitter.elephantbird.pig.proto.LzoProtobuffB64LinePigStore.getSchema(Unknown Source)
> > > > > > >         at org.apache.pig.impl.logicalLayer.LOLoad.determineSchema(LOLoad.java:186)
> > > > > > >         at org.apache.pig.impl.logicalLayer.LOLoad.getSchema(LOLoad.java:151)
> > > > > > >         at org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:851)
> > > > > > >         at org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
> > > > > > >         at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1601)
> > > > > > >         at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:1551)
> > > > > > >         at org.apache.pig.PigServer.registerQuery(PigServer.java:523)
> > > > > > >         at org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:868)
> > > > > > >         at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:388)
> > > > > > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
> > > > > > >         at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
> > > > > > >         at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:90)
> > > > > > >         at org.apache.pig.Main.run(Main.java:510)
> > > > > > >         at org.apache.pig.Main.main(Main.java:107)
> > > > > > > Caused by: java.lang.ClassNotFoundException: com.work.logs.LogFormat$FirstMsg
> > > > > > >         at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
> > > > > > >         at java.security.AccessController.doPrivileged(Native Method)
> > > > > > >         at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
> > > > > > >         at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
> > > > > > >         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
> > > > > > >         at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
> > > > > > >         ... 16 more
> > > > > > >
> > > > > > > The protocol itself is pretty simple, just:
> > > > > > >
> > > > > > > message FirstMsg{
> > > > > > >     optional string uid = 1;
> > > > > > >     optional int64 timestamp = 2;
> > > > > > >     optional string type = 3;
> > > > > > > }
> > > > > > >
> > > > > > > The other classes in the jar file seem to be loading just fine,
> > > > > > > producing notices along the lines of:
> > > > > > >
> > > > > > > [main] INFO com.twitter.elephantbird.pig.proto.ProtobufClassUtil - Using com.work.logs.LogFormat$Apa mapped by apa
> > > > > > >
> > > > > > > Any help figuring out why this is failing would be appreciated.
> > > > > > > I have a strong suspicion that it's something simple that I just
> > > > > > > keep looking past.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Kris
> > > > > > >
> > > > > > > --
> > > > > > > Kris Coward                     http://unripe.melon.org/
> > > > > > > GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3
> > > > > >
> > > > >
> > > > > --
> > > > > Kris Coward                     http://unripe.melon.org/
> > > > > GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3
> > > >
> > >
> > > --
> > > Kris Coward                     http://unripe.melon.org/
> > > GPG Fingerprint: 2BF3 957D 310A FEEC 4733 830E 21A4 05C7 1FEB 12B3
> >
>
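
A minimal sketch of the two checks discussed in this thread, for anyone hitting the same ClassNotFoundException. It is not Pig's or Elephant Bird's actual code: RegisteredJarLookup, jarContains and resolve are hypothetical names used only for illustration. jarContains mirrors the `jar -tf` check (is the class file in the jar at all?), and resolve assumes a front-end lookup that tries the normal JVM classpath first and then falls back to a URLClassLoader built from the registered jars.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper; names are illustrative, not part of Pig or Elephant Bird.
class RegisteredJarLookup {

    // True if the jar has a .class entry for the given binary class name,
    // e.g. "com.work.logs.LogFormat$FirstMsg".
    static boolean jarContains(Path jar, String className) throws IOException {
        String entry = className.replace('.', '/') + ".class";
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            return jarFile.getEntry(entry) != null;
        }
    }

    // Try the normal classpath first; if that fails, look in the given jars.
    static Class<?> resolve(String className, List<Path> registeredJars)
            throws ClassNotFoundException, IOException {
        try {
            return Class.forName(className);
        } catch (ClassNotFoundException notOnClasspath) {
            URL[] urls = new URL[registeredJars.size()];
            for (int i = 0; i < urls.length; i++) {
                urls[i] = registeredJars.get(i).toUri().toURL();
            }
            // Left open on purpose: classes loaded from it still need the loader.
            URLClassLoader loader =
                    new URLClassLoader(urls, RegisteredJarLookup.class.getClassLoader());
            return Class.forName(className, true, loader);
        }
    }

    // Usage: java RegisteredJarLookup <class name> <jar> [<jar> ...]
    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: RegisteredJarLookup <class name> <jar> [<jar> ...]");
            return;
        }
        List<Path> jars = new ArrayList<>();
        for (int i = 1; i < args.length; i++) {
            jars.add(Paths.get(args[i]));
        }
        for (Path jar : jars) {
            System.out.println(jar + " contains " + args[0] + ": " + jarContains(jar, args[0]));
        }
        System.out.println("resolved: " + resolve(args[0], jars));
    }
}
```

If jarContains reports the class but a plain Class.forName still fails, the jar is simply not visible to the front-end JVM, which is the situation described above for jars that are only REGISTERed.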
