How do I use this bag? Is there a way for me to specify it in Grunt?

BagFactory.getInstance().newSortedBag(comparator)

?
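
A sorted bag only helps if each element carries something to sort on. One possible approach, sketched below, is to have the tuple-to-bag step emit (position, value) pairs and have the bag-to-tuple step sort on position before dropping it. This is plain Java with lists standing in for Pig's Tuple and DataBag, not a real UDF; the names OrderedExplode, Indexed, explode and implode are made up for illustration. In an actual UDF the same position comparison is presumably what would be passed as the comparator to newSortedBag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical model of an order-preserving t2b/b2t pair (no Pig classes used).
public class OrderedExplode {

    /** One exploded row: the field's original position plus its value. */
    public static final class Indexed {
        final int position;
        final String value;

        Indexed(int position, String value) {
            this.position = position;
            this.value = value;
        }
    }

    /** Tuple-to-bag step: tag each field with its position in the tuple. */
    public static List<Indexed> explode(List<String> fields) {
        List<Indexed> bag = new ArrayList<>();
        for (int i = 0; i < fields.size(); i++) {
            bag.add(new Indexed(i, fields.get(i)));
        }
        return bag;
    }

    /** Bag-to-tuple step: sort a copy on position, then keep only the values. */
    public static List<String> implode(List<Indexed> bag) {
        List<Indexed> sorted = new ArrayList<>(bag);
        sorted.sort(Comparator.comparingInt((Indexed e) -> e.position));
        List<String> fields = new ArrayList<>();
        for (Indexed e : sorted) {
            fields.add(e.value);
        }
        return fields;
    }

    public static void main(String[] args) {
        List<Indexed> bag = explode(Arrays.asList("a", "b", "c", "d"));
        Collections.reverse(bag); // simulate a bag that comes back in another order
        System.out.println(implode(bag)); // prints [a, b, c, d]
    }
}
```

The point of carrying the position along is that nothing here depends on the order in which the grouped bag happens to arrive, which matches the advice quoted below not to rely on bag order.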

On Mon, Feb 22, 2010 at 10:34 AM, hc busy <[email protected]> wrote:

> OK, it sounds like I have a plan. So I need to write a UDF from tuple to
> bag (t2b) and one from bag to tuple (b2t), and then I do
>
> exploded = foreach foo generate id, FLATTEN(t2b(field1, field2, field3));
> grouped = group exploded by id;
> imploded = foreach grouped generate group as id, FLATTEN(b2t(exploded));
>
> to (almost) recover the original table, except that the field order may be
> messed up. Is there a way to write a UDF like flatten that preserves order?
>
>
> Thanks!
>
>
>
>
> On Mon, Feb 22, 2010 at 9:57 AM, Dmitriy Ryaboy <[email protected]> wrote:
>
>> Same thing -- a udf to convert a tuple into a bag, then flatten.
>> Don't rely on any order you see in bags during testing -- there is
>> explicitly no guarantee there, it may change on you version to version and
>> execution to execution.
>>
>> -D
>>
>> On Mon, Feb 22, 2010 at 9:45 AM, hc busy <[email protected]> wrote:
>>
>> > Thanks, Dmitriy and Rekha. So I understand now that flatten on a bag
>> > explodes it into multiple rows.
>> >
>> > The BagConcat seems to work. Actually, in a simple example using
>> > group by, it appears that the bag contains the results in the order
>> > they were in before entering the group by. (So if I group after an
>> > order by x desc, then when I dump the table it prints the bag, but the
>> > contents are reversed.) So, actually, for my purposes, not having the
>> > results in order is okay.
>> >
>> > What if, instead of charsplit, the data I have is this:
>> >
>> > 1,a,b,c,d
>> > 2,a,s,d,f
>> >
>> > and I want to explode it into
>> > 1,a
>> > 1,b
>> > 1,c
>> > 1,d
>> > 2,a
>> > 2,s
>> > 2,d
>> > 2,f
>> >
>> > (Sorry, I made a mistake in the original question: the string is not a
>> > string but a tuple.) I think I may be able to get it into:
>> >
>> > 1, (a,b,c,d)
>> > 2, (a,s,d,f)
>> >
>> > but still, I need to explode it into several rows to operate on them
>> > separately.
>> >
>> >
>> >
>> > On Sun, Feb 21, 2010 at 8:03 PM, Rekha Joshi <[email protected]>
>> > wrote:
>> >
>> > > You would require a UDF for this. Please check whether you already
>> > > have an existing one in the latest pig-udf.jar.
>> > > Or, since this is a pretty simple one, you can write one yourself:
>> > > take the tuple, assess the type, append the strings, and return the
>> > > result from your exec() method.
>> > >
>> > > Cheers,
>> > > /R
>> > >
>> > >
>> > > On 2/19/10 11:51 PM, "hc busy" <[email protected]> wrote:
>> > >
>> > > Guys, I know this must be a common use case, but how do you explode
>> > > and implode in Pig?
>> > >
>> > > so, I have a file like this...
>> > >
>> > > 1, asdf
>> > > 2, qewrty
>> > > 3, zcxvb
>> > >
>> > >
>> > > and I want to apply an explode operation to it:
>> > >
>> > > 1, a
>> > > 1, s
>> > > 1, d
>> > > 1, f
>> > > 2, q
>> > > 2, e
>> > > 2, w
>> > > 2, r
>> > > 2, t
>> > > 2, y
>> > > 3, z
>> > > 3, c
>> > > 3, x
>> > > 3, v
>> > > 3, b
>> > >
>> > > and after some work... I have this file:
>> > >
>> > > 1, aa
>> > > 1, ss
>> > > 1, dd
>> > > 1, ff
>> > > 2, qq
>> > > 2, ee
>> > > 2, ww
>> > > 2, rr
>> > > 2, tt
>> > > 2, yy
>> > > 3, zz
>> > > 3, cc
>> > > 3, xx
>> > > 3, vv
>> > > 3, bb
>> > >
>> > >
>> > > and I want to perform an implode:
>> > >
>> > > 1, aassddff
>> > > 2, qqeewwrrttyy
>> > > 3, zzccxxvvbb
>> > >
>> > >
>> > > Well, obviously this is a dumb example, but I'd like to do those
>> > > things. Can somebody help me with this? I looked in the piggy bank
>> > > and didn't see anything that would do this for me.
>> > >
>> > > Thanks!
>> > >
>> > >
>> >
>>
>
>
