It should work, but there is a syntax error that's causing the parser to get confused. You don't want the "if" in there -- just

counted = foreach grouped generate group, SUM(limited.number2 is null ? 0 : 1);

On Mon, Nov 29, 2010 at 2:58 PM, Jonathan Coveney <[email protected]> wrote:

> I realize this may be a lowly question, but I've searched around and
> couldn't find anything definitive. I am also quite new to Pig and am trying
> to get my head around the pig-esque way of doing things.
>
> I am trying to sum based on conditionality, and am not sure how to make this
> work. My system uses pig .6, if that is relevant.
>
> counted = foreach grouped generate group, SUM(if limited.number2 is null? 0 : 1);
>
> grouped is a group of type {group: chararray,limited: {number1:
> chararray,number2: chararray}
>
> number1 isn't really relevant here. number2
>
> The error I get is:
>
> [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during
> parsing. Invalid alias: SUM in {group: chararray,limited: {number1:
> chararray,number2: chararray}}
>
> But if I were to simply do SUM(limited.number2) it would work fine.
>
> My goal is to have a set of outputs that are group, and then the
> corresponding number of non-null characters in that group. I could of course
> do this in a much more roundabout way, but I want to know why this or
> something like it doesn't work...reading through the documentation, I see
> things like this
>
> D = FOREACH C GENERATE FLATTEN((IsEmpty(A) ? null : A)),
> FLATTEN((IsEmpty(B) ? null : B))
>
> which seem to imply that you can work on that level for functions, but maybe
> not! Either way, I'd like to understand why it does or doesn't work, and the
> better paradigm for thinking about this sort of thing.
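
For the stated goal (the number of non-null number2 values per group), one possible alternative is a nested FOREACH that filters out the nulls and counts what remains. This is only a sketch: it reuses the grouped, limited, and number2 names from the quoted message, the nonnull alias is made up for illustration, and it has not been run against Pig 0.6.

```pig
-- Sketch only: relation and field names come from the quoted message;
-- "nonnull" is a hypothetical alias introduced for this example.
counted = FOREACH grouped {
    -- keep only the tuples in each group's bag where number2 is set
    nonnull = FILTER limited BY number2 IS NOT NULL;
    -- emit the group key and the size of the filtered bag
    GENERATE group, COUNT(nonnull);
};
```

The filter runs per group inside the nested block, so the aggregate is handed a bag, which is the kind of argument functions like SUM and COUNT expect.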
