Oh, sure. UDF stands for user-defined function. You can find more information about UDFs here:
http://pig.apache.org/docs/r0.10.0/udf.html
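
For a rough idea of the arithmetic a standard-deviation UDF would carry out, here is a minimal plain-Java sketch. It deliberately avoids the Pig API: the class and method names (StdDevSketch, populationStdDev) are hypothetical, and in a real UDF this logic would sit inside the exec() method of a class extending EvalFunc, reading the amounts from the input bag. It computes the population standard deviation; divide by n - 1 instead of n for the sample version.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Pig: it shows only the arithmetic
// that a standard-deviation UDF would perform on one group's amounts.
final class StdDevSketch {

    // Population standard deviation of the given amounts.
    // Returns NaN for an empty list (an assumption made for this sketch;
    // a Pig UDF would more likely return null in that case).
    static double populationStdDev(List<Double> amounts) {
        if (amounts.isEmpty()) {
            return Double.NaN;
        }

        // First pass: compute the mean.
        double sum = 0.0;
        for (double amount : amounts) {
            sum += amount;
        }
        double mean = sum / amounts.size();

        // Second pass: accumulate squared deviations from the mean.
        double squaredDeviations = 0.0;
        for (double amount : amounts) {
            double deviation = amount - mean;
            squaredDeviations += deviation * deviation;
        }

        return Math.sqrt(squaredDeviations / amounts.size());
    }

    public static void main(String[] args) {
        // The two amounts listed for merchant 1234 in the sample data.
        List<Double> amounts = Arrays.asList(299.2, 230.0);
        System.out.println(populationStdDev(amounts));
    }
}
```

The two-pass form is chosen here for readability. A UDF that has to scale to very large groups would probably keep running sums instead, the way AVG keeps a sum and a count.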

On Tue, Sep 25, 2012 at 8:16 PM, jamal sasha <[email protected]> wrote:

> Hi,
>   Thanks for replying.
> Err, I am new here.
> What is a UDF? I am trying to find information about it.
>
>
> On Tue, Sep 25, 2012 at 10:41 PM, Cheolsoo Park <[email protected]> wrote:
>
> > Hi,
> >
> > in = load 'in.txt' using PigStorage(',') as (merchant:int, customer:int,
> > amount:float);
> > perMerchant = group in by merchant;
> > avg = foreach perMerchant generate group, AVG(in.amount);
> > dump avg;
> >
> > This returns (merchant_id, avg of amount) as follows:
> >
> > (1233,203.1999969482422)
> > (1234,264.6000061035156)
> >
> > Regarding standard deviation, you can write your own UDF that computes
> > it. Please take a look at AVG.java to see how it computes the average.
> > Basically, you need to modify the exec() method to compute the standard
> > deviation instead of the average.
> >
> > Thanks,
> > Cheolsoo
> >
> > On Tue, Sep 25, 2012 at 6:36 PM, jamal sasha <[email protected]>
> > wrote:
> >
> > > Hi,
> > >    I have a huge text file of the following form (the data is saved
> > > in a directory as data/data1.txt, data2.txt, and so on):
> > >  merchant_id, user_id, amount
> > >   1234, 9123, 299.2
> > >   1233, 9199, 203.2
> > >   1234, 0124, 230
> > >   and so on..
> > >
> > > What I want to do is find the average amount for each merchant,
> > > and in the end save the output to a file, something like
> > > merchant_id, average_amount
> > >  1234, avg_amt_1234 a
> > >   and so on.
> > > How do I calculate the standard deviation as well?
> > >
> > > Sorry for asking such a basic question. :(
> > > Any help would be appreciated. :)
> > > Jamal
> > >
> >
>
