Ignore the antlr runtime thing; I simply forgot to remove it. It's a weird hack that was necessary on my system to get pig trunk withouthadoop to work.
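
For reference, here is a rough sketch of the same script with that register line dropped. It assumes the piggybank path from my machine (adjust it for your install), and the DUMP at the end is an addition that was not in the script I originally posted:

```pig
-- piggybank path is specific to my machine; point it at your own build
register /home/jcoveney/pig/contrib/piggybank/java/piggybank.jar;

A = LOAD 'thing.txt' USING PigStorage(',') AS (rank1, rank2);

-- drop rows where both fields are null
B = FILTER A BY rank1 is not null OR rank2 is not null;

-- fill a missing rank with the other one
C = FOREACH B GENERATE (rank1 is null ? rank2 : rank1) AS rank1,
                       (rank2 is null ? rank1 : rank2) AS rank2;

D = FOREACH C GENERATE org.apache.pig.piggybank.evaluation.math.MAX(rank1, rank2);

-- added here (not in the original script) so that Pig actually executes the pipeline
DUMP D;
```

I kept each step on its own alias on purpose, so you can DUMP any intermediate relation when something looks off.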

2011/6/18 Lakshminarayana Motamarri <[email protected]>

> Hi all,
>
> Thanks Jonathan, once again for ur response.
>
> First of all:
> 1) what is *antlr-runtime-3.2.jar*
> I don't find in my PIG installation path: /*/*/pig/ivy/*
>
> 2) Coming to the prev problem context of NULL:
> You are right.. it would have worked..
> Later I also realized that, not just my rank columns, but the initial ID
> column is also null in one of the case... i.e. the last line of the file..
>
> so I am suppose to handle even that case...
> i.e by *A2 = FILTER A BY appID is not null;*
>
> anyways it worked out great, got the results. thanks for ur help...
>
> Thanks & Regards,
> Narayan.
>
>
> On Fri, Jun 17, 2011 at 6:55 AM, Jonathan Coveney <[email protected]> wrote:
>
>> First, when troubleshooting (and just in general), I prefer to break steps
>> out into multiple lines instead of trying to be overly expressive in one
>> line. Pig scripts in general aren't so large that breaking it out doesn't
>> aid a lot in debugging, but this is of course personal style.
>>
>> I create a file thing.txt, whose contents are as follows:
>>
>> 1,1
>> 1,2
>> 1,3
>> 1,4
>> ,
>> ,
>> 1,
>> 2,
>> ,3
>> 4,
>> 6,6
>> 4,1
>> 2,3
>>
>>
>> 8,
>> 9
>> 9
>>
>>
>> So there are some null lines, some lines with only one, the other, etc.
>> Here is the script I ran. Caveat: I'm running pig trunk.
>>
>> register /home/jcoveney/pig/build/ivy/lib/Pig/antlr-runtime-3.2.jar;
>> register /home/jcoveney/pig/contrib/piggybank/java/piggybank.jar;
>>
>> A = LOAD 'thing.txt' USING PigStorage(',') AS (rank1,rank2);
>> B = FILTER A BY rank1 is not null OR rank2 is not null;
>> C = FOREACH B GENERATE ( rank1 is null ? rank2 : rank1 ) as rank1, ( rank2
>> is null ? rank1 : rank2 ) as rank2;
>> D = FOREACH C GENERATE
>> org.apache.pig.piggybank.evaluation.math.MAX(rank1,rank2);
>>
>> This worked fine.
>>
>
