What about trying something with SPLIT and UNION:

SPLIT EXAMPLE_SOURCE INTO GOOD IF number>5, BETTER IF (number>=2 AND number<=4), BEST IF (number>=5);
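
For reference, the labeling step after the SPLIT might look roughly like the sketch below. The relation names GOOD_L, BETTER_L, BEST_L and LABELED, and the field name label, are placeholders assumed here for illustration; item and number come from the EXAMPLE_SOURCE schema in the quoted script.

```pig
-- Tag every tuple in each SPLIT branch with a constant label.
-- (GOOD_L, BETTER_L, BEST_L, LABELED and "label" are assumed names.)
GOOD_L   = FOREACH GOOD   GENERATE item, number, 'good'   AS label;
BETTER_L = FOREACH BETTER GENERATE item, number, 'better' AS label;
BEST_L   = FOREACH BEST   GENERATE item, number, 'best'   AS label;

-- Recombine the labeled branches into a single relation.
LABELED = UNION BEST_L, GOOD_L, BETTER_L;
DUMP LABELED;
```

Note that SPLIT conditions are not mutually exclusive: a tuple that satisfies more than one condition lands in every matching relation, so a number greater than 5 ends up in both GOOD and BEST.
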
I did a few FOREACH and a UNION, and got this:

(a,6,best)
(b,5,best)
(d,8,best)
(a,6,good)
(d,8,good)
(a,2,better)
(b,2,better)
(c,3,better)
(d,3,better)
(d,4,better)

--
Ryan Hoegg

On Wed, Sep 14, 2011 at 4:24 PM, Eli Finkelshteyn <[email protected]> wrote:

> Sorry, bad example, I guess. I want something I can do case statements
> with. In this case I could map instead, but if I wanted to use less
> straight-forward cases (i.e. one case where number == 1, another where
> number between 2 and 4, another where number greater than 5, etc...), it
> would be much more difficult to do with mapping.
>
> Again, I know this is something I can do with udfs, but it seemed like
> something light enough to be built into PIG itself, so I was hoping there
> was a way to do it without needing to write a udf every time I have a new
> transformation to make.
>
> Eli
>
> On 9/14/11 5:07 PM, Ryan Hoegg wrote:
>
>> What about putting the mappings into their own relation? I tried this with
>> 0.9.0:
>>
>> example.txt:
>> a,1
>> a,2
>> b,2
>> c,1
>> d,3
>> d,4
>>
>> mapping.txt:
>> 1,one
>> 2,two
>> 3,three
>> 4,four
>>
>> MAPPINGS = LOAD 'mapping.txt' USING PigStorage(',') AS
>> (number:int,name:chararray);
>> EXAMPLE_SOURCE = LOAD 'example.txt' USING PigStorage(',') AS
>> (item:chararray,number:int);
>> MAPPED = JOIN EXAMPLE_SOURCE BY number LEFT OUTER, MAPPINGS BY number;
>> PRETTY = FOREACH MAPPED GENERATE item, name;
>> DUMP PRETTY;
>> (a,one)
>> (c,one)
>> (a,two)
>> (b,two)
>> (d,three)
>> (d,four)
>>
>> --
>> Ryan Hoegg
>>
>> On Wed, Sep 14, 2011 at 3:27 PM, Eli Finkelshteyn <[email protected]> wrote:
>>
>>> Hi,
>>> I'd like to generate based on exclusive conditions (something like the CASE
>>> statement in SQL).
>>> An example:
>>>
>>> Say I have data that looks like:
>>>
>>> (a, 1)
>>> (a, 2)
>>> (b, 2)
>>> (c, 1)
>>> (d, 3)
>>> (d, 4)
>>>
>>> And I want to just convert each of the numbers to their written forms to
>>> get:
>>>
>>> (a, one)
>>> (a, two)
>>> (b, two)
>>> (c, one)
>>> (d, three)
>>> (d, four)
>>>
>>> Would I need to write a udf for that, or is there some simple way to do it
>>> using cases? I know I can do a bunch of bidirectional generates one on top
>>> of the other to achieve this, like:
>>>
>>> FOREACH rel GENERATE $0, (($1==1) ? 'one' : (($1 == 2) ? 'two' : (($1 == 3)
>>> ? 'three' : 'four')));
>>>
>>> but that seems too messy. I'd appreciate any advice.
>>>
>>> Thanks!
>>> Eli
