I want to compute the average of a one-column dataset:
1
2
3
4
5

and I am not able to do it without grouping.

However, I did get an average with

avg = foreach (group dividends all) generate AVG(dividends);

But 

avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);

says to use an explicit cast.

My script is very small:

dividends = load 'myfile.txt' as (A:double);
dump dividends;
--grouped   = filter dividends by A>-10000000.0;
avg       = foreach (filter dividends by A>-10000000.0) generate AVG(A);



<file try.pig, line 5, column 65> Multiple matching functions for 
org.apache.pig.builtin.AVG with input schema: ({{(bytearray)}}, {{(double)}}). 
Please use an explicit cast.
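
A possible explanation, offered as a sketch rather than a confirmed diagnosis: AVG expects a bag, but inside a foreach over the filtered relation, A is a single double value per row, so Pig cannot pick a matching AVG implementation. Assuming the same myfile.txt and schema as above, the filtered rows can still be collected into one bag with GROUP ... ALL before averaging (the aliases "filtered" and "grouped" are illustrative names only):

```pig
-- Load the single-column file with an explicit double schema.
dividends = load 'myfile.txt' as (A:double);

-- Keep only the rows of interest (same condition as in the script above).
filtered  = filter dividends by A > -10000000.0;

-- GROUP ... ALL uses no key; it gathers every row into one bag.
grouped   = group filtered all;

-- filtered.A projects column A out of that bag, which is what AVG expects.
avg       = foreach grouped generate AVG(filtered.A);
dump avg;
```

For the five sample values above this should give an average of 3.0. The GROUP ... ALL step does not roll up on any key; it only turns the relation into a single bag for AVG to consume.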


On Mar 4, 2013, at 8:30 PM, Prashant Kommireddi <[email protected]> wrote:

> Hi Preeti,
> 
> Using FILTER or not depends on your requirements and has nothing to do with
> SUM or AVG.
> 
> SUM, AVG accept bags as input, so as long as you are able to provide that
> it should be fine. (Though it's very common for users to use GROUP BY to
> roll up on a key before using these UDFs.)
> 
> For example:
> 
> grunt> cat data
> 1    5
> 5    8
> 
> grunt> A = load 'data';
> grunt> B = foreach A generate TOBAG($0, $1) as bagg;
> grunt> dump B;
> ({(1),(5)})
> ({(5),(8)})
> 
> grunt> C = foreach B generate AVG(bagg);
> grunt> dump C;
> (3.0)
> (6.5)
> 
> -Prashant
> 
> 
> On Mon, Mar 4, 2013 at 3:50 PM, Preeti Gupta <[email protected]> wrote:
> 
>> Hello,
>> 
>> Can I compute SUM or AVG without using GROUP BY or FILTER?
>> 
