What about trying something with SPLIT and UNION:

SPLIT EXAMPLE_SOURCE INTO
    GOOD IF number > 5,
    BETTER IF (number >= 2 AND number <= 4),
    BEST IF (number >= 5);

I then tagged each relation with a FOREACH, combined them with a UNION, and
got this:
(a,6,best)
(b,5,best)
(d,8,best)
(a,6,good)
(d,8,good)
(a,2,better)
(b,2,better)
(c,3,better)
(d,3,better)
(d,4,better)
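
The FOREACH and UNION steps mentioned above are not shown in the message. A
minimal sketch of how they might look follows. The input file name
example2.txt, the *_TAGGED relation names, and the rating field are
assumptions for illustration, not the original script.

```pig
-- Hypothetical input file; the schema follows the earlier message in this thread.
EXAMPLE_SOURCE = LOAD 'example2.txt' USING PigStorage(',') AS (item:chararray, number:int);

-- Each SPLIT condition is evaluated independently, so a row can land in
-- more than one relation (or in none).
SPLIT EXAMPLE_SOURCE INTO
    GOOD IF number > 5,
    BETTER IF (number >= 2 AND number <= 4),
    BEST IF (number >= 5);

-- Tag each relation with a constant (the "rating" field name is an assumption).
GOOD_TAGGED = FOREACH GOOD GENERATE item, number, 'good' AS rating;
BETTER_TAGGED = FOREACH BETTER GENERATE item, number, 'better' AS rating;
BEST_TAGGED = FOREACH BEST GENERATE item, number, 'best' AS rating;

-- Recombine the tagged relations; UNION does not guarantee any row order.
RATED = UNION BEST_TAGGED, GOOD_TAGGED, BETTER_TAGGED;
DUMP RATED;
```

Because GOOD (number > 5) and BEST (number >= 5) overlap, rows such as (a,6)
and (d,8) appear under both labels, as in the output above. For exclusive
cases, the conditions would need to be written so that they do not overlap.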

--
Ryan Hoegg

On Wed, Sep 14, 2011 at 4:24 PM, Eli Finkelshteyn <[email protected]> wrote:

> Sorry, bad example, I guess. I want something I can do case statements
> with. In this case I could map instead, but if I wanted to use less
> straightforward cases (e.g. one case where number == 1, another where
> number is between 2 and 4, another where number is greater than 5, etc.),
> it would be much more difficult to do with mapping.
>
> Again, I know this is something I can do with UDFs, but it seemed like
> something light enough to be built into Pig itself, so I was hoping there
> was a way to do it without needing to write a UDF every time I have a new
> transformation to make.
>
> Eli
>
> On 9/14/11 5:07 PM, Ryan Hoegg wrote:
>
>> What about putting the mappings into their own relation? I tried this
>> with 0.9.0:
>>
>> example.txt:
>> a,1
>> a,2
>> b,2
>> c,1
>> d,3
>> d,4
>>
>> mapping.txt:
>> 1,one
>> 2,two
>> 3,three
>> 4,four
>>
>> MAPPINGS = LOAD 'mapping.txt' USING PigStorage(',') AS (number:int, name:chararray);
>> EXAMPLE_SOURCE = LOAD 'example.txt' USING PigStorage(',') AS (item:chararray, number:int);
>> MAPPED = JOIN EXAMPLE_SOURCE BY number LEFT OUTER, MAPPINGS BY number;
>> PRETTY = FOREACH MAPPED GENERATE item, name;
>> DUMP PRETTY;
>> (a,one)
>> (c,one)
>> (a,two)
>> (b,two)
>> (d,three)
>> (d,four)
>>
>> --
>> Ryan Hoegg
>>
>> On Wed, Sep 14, 2011 at 3:27 PM, Eli Finkelshteyn <iefinkel@gmail.com> wrote:
>>
>>> Hi,
>>> I'd like to generate based on exclusive conditions (something like the
>>> CASE statement in SQL). An example:
>>>
>>> Say I have data that looks like:
>>>
>>> (a, 1)
>>> (a, 2)
>>> (b, 2)
>>> (c, 1)
>>> (d, 3)
>>> (d, 4)
>>>
>>> And I want to just convert each of the numbers to their written forms to
>>> get:
>>>
>>> (a, one)
>>> (a, two)
>>> (b, two)
>>> (c, one)
>>> (d, three)
>>> (d, four)
>>>
>>> Would I need to write a UDF for that, or is there some simple way to do
>>> it using cases? I know I can nest a bunch of bincond (?:) expressions
>>> inside one GENERATE to achieve this, like:
>>>
>>> FOREACH rel GENERATE $0, (($1 == 1) ? 'one' : (($1 == 2) ? 'two' : (($1 == 3) ? 'three' : 'four')));
>>>
>>> but that seems too messy. I'd appreciate any advice.
>>>
>>> Thanks!
>>> Eli
>>>
>>>
>>>
>>>
>
