Hi Russell,

I know what Johnny wrote is correct. But out of curiosity, why would you
need to sort the input? Thanks!

Houssam

On Wed, Mar 27, 2013 at 2:04 AM, Russell Jurney <[email protected]> wrote:

> Beware: you must first sort the input.
>
> D = foreach B { sorted = order A by $0; generate group, COR(sorted.$0,
> sorted.$1, ... ); };
>
> On Tue, Mar 26, 2013 at 5:11 PM, Johnny Zhang <[email protected]>
> wrote:
>
> > Hi, Renato:
> > For CORRELATION, I guess you can do something like
> > A = load 'random.txt' using PigStorage(':') as
> > (f1:double,f2:double,.........,f500:double);
> > B = group A all;
> > D = foreach B generate group,COR(A.$0,A.$1,A.$2,A.$3,.......A.$499);
> >
> > For COVARIANCE, I guess the UDF is COV.
> >
> > Johnny
> >
> >
> > On Tue, Mar 26, 2013 at 3:28 PM, Renato Marroquín Mogrovejo <
> > [email protected]> wrote:
> >
> > > Hi all,
> > >
> > > Could anyone be kind enough to point me to some examples on using the
> > > COVARIANCE and the CORRELATION UDFs described here? [1]
> > >
> > >
> > > Renato M.
> > >
> > >
> > > [1] https://issues.apache.org/jira/browse/PIG-277
> > >
> >
>
>
>
> --
> Russell Jurney twitter.com/rjurney [email protected]
> datasyndrome.com
>
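
For reference, the two suggestions quoted above can be combined into one
script. This is only a minimal sketch: it assumes three columns instead of
the 500 in Johnny's example, reuses his file name and delimiter, and the
field names f1..f3 are placeholders. It follows Russell's advice to sort the
grouped bag first; the thread does not establish whether that step is
strictly required.

```pig
-- Sketch only: three columns stand in for the 500 in the quoted example.
A = load 'random.txt' using PigStorage(':') as (f1:double, f2:double, f3:double);
B = group A all;
-- Nested foreach: order the grouped bag A, then pass its columns to COR.
D = foreach B {
    sorted = order A by $0;
    generate group, COR(sorted.$0, sorted.$1, sorted.$2);
};
dump D;
```

Swapping COR for COV should give the covariance counterpart, per Johnny's
note, though that is untested here.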
