"Illustrate" is more or less non-functional in 0.6 (it had a glorious return in 0.8).
Like I said... :-)

D

On Sat, Nov 12, 2011 at 5:03 PM, B M D Gill <[email protected]> wrote:
> Thanks Dimitry, will mention it to Amazon for sure.
>
> That was the first thing I tried and it didn't seem to make it work. Not
> sure what I could be doing wrong. I get an Index out of bound error where
> the index corresponds to the first instance of the optional field. Here is
> the stack trace:
>
> Pig Stack Trace
> ---------------
> ERROR 2999: Unexpected internal error. Index: 29, Size: 29
>
> java.lang.IndexOutOfBoundsException: Index: 29, Size: 29
>     at java.util.ArrayList.RangeCheck(ArrayList.java:547)
>     at java.util.ArrayList.get(ArrayList.java:322)
>     at org.apache.pig.data.DefaultTuple.get(DefaultTuple.java:143)
>     at org.apache.pig.pen.util.ExampleTuple.get(ExampleTuple.java:80)
>     at org.apache.pig.pen.AugmentBaseDataVisitor.visit(AugmentBaseDataVisitor.java:427)
>     at org.apache.pig.impl.logicalLayer.LOLoad.visit(LOLoad.java:210)
>     at org.apache.pig.impl.logicalLayer.LOLoad.visit(LOLoad.java:52)
>     at org.apache.pig.pen.util.PreOrderDepthFirstWalker.depthFirst(PreOrderDepthFirstWalker.java:70)
>     at org.apache.pig.pen.util.PreOrderDepthFirstWalker.depthFirst(PreOrderDepthFirstWalker.java:72)
>     at org.apache.pig.pen.util.PreOrderDepthFirstWalker.walk(PreOrderDepthFirstWalker.java:55)
>     at org.apache.pig.impl.plan.PlanVisitor.visit(PlanVisitor.java:51)
>     at org.apache.pig.pen.ExampleGenerator.getExamples(ExampleGenerator.java:121)
>     at org.apache.pig.PigServer.getExamples(PigServer.java:731)
>     at org.apache.pig.tools.grunt.GruntParser.processIllustrate(GruntParser.java:557)
>     at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:246)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
>     at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
>     at org.apache.pig.Main.main(Main.java:374)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> ================================================================================
>
> On Sun, Nov 13, 2011 at 12:30 AM, Dmitriy Ryaboy <[email protected]> wrote:
>
>> If you change the load statement to "load '$input' as (f1, f2, f3, f4,
>> f5)", f4 and f5 will be treated as null if they are absent in the raw
>> logs.
>>
>> If you start relying on Pig heavily, lobby Amazon to upgrade their
>> version of Pig (or at least provide both 0.6 and 0.9.1). At this
>> point, 0.6 is positively ancient. But the extra field behavior worked
>> that way then, too.
>>
>> D
>>
>> On Sat, Nov 12, 2011 at 4:08 PM, B M D Gill <[email protected]> wrote:
>> > I'm a newbie running Pig 0.6 on Amazon Elastic Map Reduce. I need to make
>> > a change to add additional fields to the log files that I run my pig jobs
>> > on and am wondering how do I handle this schema in pig.
>> >
>> > My current inputs are tab separated fields that I input using the standard
>> > pig storage function:
>> >
>> > LOAD '$INPUT' USING PigStorage('\t') as (f1, f2, f3);
>> >
>> > However some input files will now have additional fields f4, f5, f6 etc. at
>> > the trailing edge of each line. How do I set up the load function to
>> > handle these optional fields? Do I need to make changes to my logic to
>> > deal with these fields possibly being empty or will Pig simply record their
>> > value as null if they are absent?
>> >
>> > Thanks to anyone who can share some insight.
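
For readers following the thread, the padding behavior described above (declare the optional trailing fields in the schema, and rows that lack them get null) can be mimicked in a few lines of Python. This is only a rough sketch of the semantics as stated in the reply, not Pig's implementation; the function name `load_with_schema` and the sample rows are made up for illustration, and dropping surplus columns is a choice of this sketch rather than a claim about Pig 0.6.

```python
from typing import Dict, List, Optional


def load_with_schema(
    line: str, field_names: List[str], delimiter: str = "\t"
) -> Dict[str, Optional[str]]:
    """Split one log line and pad absent trailing fields with None.

    Hypothetical helper: it imitates the "missing fields become null"
    behavior described in the thread, nothing more.
    """
    values: List[Optional[str]] = list(line.rstrip("\n").split(delimiter))
    # Pad short rows so every declared field has a slot.
    missing = len(field_names) - len(values)
    if missing > 0:
        values.extend([None] * missing)
    # zip() stops at the shorter sequence, so surplus columns are dropped
    # here (an assumption of this sketch).
    return dict(zip(field_names, values))


schema = ["f1", "f2", "f3", "f4", "f5"]
old_row = load_with_schema("a\tb\tc", schema)        # older three-field log line
new_row = load_with_schema("a\tb\tc\td\te", schema)  # newer five-field log line
```

Downstream logic then only has to check the optional fields for None (null in Pig) rather than branch on which kind of file a row came from.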
