Beware: you must first sort the input.

D = foreach B { sorted = order A by $0; generate group, COR(sorted.$0,
sorted.$1, ... ); };
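
For reference, the same idea cut down to two columns might look like the
sketch below. It is untested and rests on assumptions: a hypothetical
'random.txt' holding just two colon-separated double columns, and field
names f1 and f2 chosen purely for illustration.

```pig
-- Assumed input: 'random.txt' with two ':'-separated double columns.
A = load 'random.txt' using PigStorage(':') as (f1:double, f2:double);

-- Collect every row into a single bag so COR sees the whole data set.
B = group A all;

-- Sort the bag inside the nested foreach, then pass the column
-- projections of the sorted bag to COR.
D = foreach B {
    sorted = order A by f1;
    generate group, COR(sorted.f1, sorted.f2);
};

dump D;
```

Ordering whole tuples keeps each f1 value paired with its f2 value, so
the sort only changes the order in which the rows reach COR.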


On Tue, Mar 26, 2013 at 5:11 PM, Johnny Zhang <[email protected]> wrote:

> Hi, Renato:
> For CORRELATION, I guess you can do something like
> A = load 'random.txt' using PigStorage(':') as
> (f1:double,f2:double,.........,f500:double);
> B = group A all;
> D = foreach B generate group,COR(A.$0,A.$1,A.$2,A.$3,.......A.$499);
>
> For COVARIANCE, I guess the UDF is COV.
>
> Johnny
>
>
> On Tue, Mar 26, 2013 at 3:28 PM, Renato Marroquín Mogrovejo <
> [email protected]> wrote:
>
> > Hi all,
> >
> > Could anyone be kind enough to point me to some examples of using the
> > COVARIANCE and CORRELATION UDFs described here? [1]
> >
> >
> > Renato M.
> >
> >
> > [1] https://issues.apache.org/jira/browse/PIG-277
> >
>



-- 
Russell Jurney twitter.com/rjurney [email protected] datasyndrome.com
