Dmitriy, I appreciate the help. I tried it without the "if", however, and I
still get a parser error: invalid alias: SUM

It's quite odd... does anyone have conditional-sum code along these lines
that they know works?

2010/11/29 Dmitriy Ryaboy <[email protected]>

> It should work, but there is a syntax error that's causing the parser to
> get confused. You don't want the "if" in there -- just
>
> counted = foreach grouped generate group, SUM( limited.number2 is null? 0 : 1);
>
> On Mon, Nov 29, 2010 at 2:58 PM, Jonathan Coveney <[email protected]> wrote:
>
> > I realize this may be a lowly question, but I've searched around and
> > couldn't find anything definitive. I am also quite new to Pig and am
> > trying to get my head around the Pig-esque way of doing things.
> >
> > I am trying to sum conditionally, and am not sure how to make this
> > work. My system uses Pig 0.6, if that is relevant.
> >
> > counted = foreach grouped generate group, SUM(if limited.number2 is null? 0 : 1);
> >
> > grouped is a group of type {group: chararray,limited: {number1: chararray,number2: chararray}
> >
> > number1 isn't really relevant here. number2
> >
> > The error I get is:
> >
> > [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: SUM in {group: chararray,limited: {number1: chararray,number2: chararray}}
> >
> > But if I were to simply do SUM(limited.number2) it would work fine.
> >
> > My goal is to have a set of outputs that are group, and then the
> > corresponding number of non-null characters in that group. I could of
> > course do this in a much more roundabout way, but I want to know why
> > this or something like it doesn't work. Reading through the
> > documentation, I see things like this
> >
> > D = FOREACH C GENERATE FLATTEN((IsEmpty(A) ? null : A)), FLATTEN((IsEmpty(B) ? null : B))
> >
> > which seem to imply that you can work on that level for functions, but
> > maybe not!
> >
> > Either way, I'd like to understand why it does or doesn't work, and the
> > better paradigm for thinking about this sort of thing.
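
For reference, a minimal sketch of how the per-group non-null count might be
written with a nested FOREACH, modelled on the FILTER/COUNT pattern shown in
the Pig Latin documentation. It assumes the grouped relation and its limited
bag from the schema quoted above; the aliases non_null and vals are made up
for illustration, and the snippet is untested against Pig 0.6.

```pig
-- Assumes: grouped = {group: chararray, limited: {number1, number2}}
-- non_null and vals are hypothetical aliases, not from the original script.
counted = FOREACH grouped {
    -- keep only the tuples in this group's bag whose number2 is set
    non_null = FILTER limited BY number2 IS NOT NULL;
    -- project the single column so COUNT sees a bag of number2 values
    vals = non_null.number2;
    GENERATE group, COUNT(vals);
};
```

The reasoning behind this shape, as far as I can tell: a bincond such as
(number2 is null ? 0 : 1) is evaluated once per tuple, whereas
limited.number2 in the grouped relation is a whole bag, so the null test has
to happen either inside a nested block like this one or in a FOREACH that
runs before the GROUP.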
