This worked for me:

joined = JOIN Bytes1 BY Group1, Bytes2 BY Group2;
end = FOREACH joined GENERATE $0 AS something, $1/$3 AS ratio;

Looks similar to your situation.

-----Original Message-----
From: Brian Adams [mailto:[email protected]]
Sent: Tuesday, July 06, 2010 6:20 PM
To: [email protected]
Subject: RE: Sum Null Column

Matt,

Haven't tried that route yet. I can give that one a go.

On Tue, 2010-07-06 at 18:23 -0400, Matthew Smith wrote:
> Shouldn't you have to do it in another table? Add one more step?
>
> fullprep = FOREACH joined GENERATE $0 AS smsgram,$1 AS smscount,$2 AS
> webgram,$3 AS webcount;
>
> addedStep = FOREACH fullprep GENERATE $0,$1,$2,$3, SUM(fullprep.$1 + fullprep.$3) as
> sumCount;
>
> Just my 2c
>
>
> -----Original Message-----
> From: Brian Adams [mailto:[email protected]]
> Sent: Tuesday, July 06, 2010 6:11 PM
> To: [email protected]
> Subject: Re: Sum Null Column
>
> The current line I have is this:
> fullprep = FOREACH joined GENERATE $0 AS smsgram,$1 AS smscount,$2 AS
> webgram,$3 AS webcount,SUM((int)$1,(int)$3) AS sumcount;
>
> I get the error:
> ERROR 1045: Could not infer the matching function for
> org.apache.pig.builtin.SUM as multiple or none of them fit. Please use
> an explicit cast.
>
> I tried the conditional earlier and it didn't seem to work.
>
> On Tue, 2010-07-06 at 15:04 -0700, Dmitriy Ryaboy wrote:
> > you can run a replacement loop if you want:
> >
> > zeroes = foreach foo generate ( x is null ? 0 : x) as x;
> >
> > or if you are dealing with empties instead of nulls
> > zeroes = foreach foo generate (IsEmpty(x) ? 0 : x) as x;
> >
> > On Tue, Jul 6, 2010 at 2:57 PM, Wilkes, Chris <[email protected]> wrote:
> >
> > > SUM($1,$3) doesn't work? If the column is a non-double then it is
> > > converted to a 0.
> > >
> > >
> > > On Jul 6, 2010, at 2:47 PM, Brian Adams wrote:
> > >
> > > So I asked a question earlier, but figured it wasn't very clear and thus
> > >> less likely to get answered, so here goes.
> > >>
> > >> I have 2 sources with separate ngrams and counts, and after doing a full
> > >> outer join I get:
> > >>
> > >> Example with numbers:
> > >> dog,10,dog,20
> > >> cat,5,null/empty,null/empty
> > >> null/empty,null/empty,mouse,10
> > >>
> > >> If I want to sum up $1 with $3, the first row would give me
> > >> dog,10,dog,20,30
> > >>
> > >> However, when I cat the results out, the second row looks like
> > >> cat,5,empty,empty,empty
> > >>
> > >> And lastly, the 3rd appears like
> > >> empty,empty,mouse,10,empty
> > >>
> > >> So if there is not a count in position $1 or $3, then I cannot add them
> > >> together.
> > >>
> > >> How do I replace the empty with a 0 so I can perform the summation in
> > >> the last column?
> > >>
> > >> Desired Output:
> > >> dog,10,dog,20,30
> > >> cat,5,empty,0,5
> > >> empty,0,mouse,10,10
> > >>
> > >> Thanks guys.
> > >>
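
For reference, the null-replacement loop suggested in the thread can be combined with the full outer join and the per-row addition in one script. This is only a minimal sketch, not something posted in the thread: the input paths `sms_ngrams` and `web_ngrams`, the comma delimiter, and the relation names `sms`, `web`, and `summed` are assumptions made for illustration. It adds the two counts with plain `+` instead of SUM, because Pig's built-in SUM aggregates the values in a bag and does not take two scalar columns, which is consistent with the ERROR 1045 quoted in the thread.

```pig
-- Hypothetical inputs: one "ngram,count" pair per line (paths are assumptions).
sms = LOAD 'sms_ngrams' USING PigStorage(',') AS (smsgram:chararray, smscount:int);
web = LOAD 'web_ngrams' USING PigStorage(',') AS (webgram:chararray, webcount:int);

-- Full outer join: rows missing on either side come back with nulls.
joined = JOIN sms BY smsgram FULL OUTER, web BY webgram;

-- Replace null counts with 0 using the bincond operator from the thread.
fullprep = FOREACH joined GENERATE
    $0 AS smsgram,
    ($1 IS NULL ? 0 : $1) AS smscount,
    $2 AS webgram,
    ($3 IS NULL ? 0 : $3) AS webcount;

-- Per-row addition in a separate step; no aggregate function is needed.
summed = FOREACH fullprep GENERATE
    smsgram, smscount, webgram, webcount,
    smscount + webcount AS sumcount;

DUMP summed;
```

The zero replacement is done before the addition because arithmetic on a null operand yields null in Pig, which would explain the empty last column described in the original question.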
