I want to compute the average of a single-column dataset:
1
2
3
4
5
and I am not able to do so without grouping.
However, I did get an average with
avg = foreach (group dividends all) generate AVG(dividends);
But
avg = foreach (filter dividends by A>-10000000.0) generate AVG(A);
fails and asks me to use an explicit cast.
My script is very small:
dividends = load 'myfile.txt' as (A:double);
dump dividends;
--grouped = filter dividends by A>-10000000.0;
avg = foreach (filter dividends by A>-10000000.0) generate AVG(A);
<file try.pig, line 5, column 65> Multiple matching functions for
org.apache.pig.builtin.AVG with input schema: ({{(bytearray)}}, {{(double)}}).
Please use an explicit cast.
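
A note on the error: AVG expects a bag, but inside a FOREACH over the filtered relation, A is a single double per tuple, which is most likely why Pig cannot pick a matching AVG. Below is a minimal sketch of one way around it, assuming the same myfile.txt and schema as above. The aliases filtered, grouped, and avg_all are illustrative and not from the original script. The idea is to filter first, then use GROUP ... ALL to collect the surviving rows into one bag, and average that bag.

```pig
-- Load the single-column file; schema as in the original script.
dividends = LOAD 'myfile.txt' AS (A:double);

-- Apply the filter on its own; this still yields one tuple per row.
filtered = FILTER dividends BY A > -10000000.0;

-- GROUP ... ALL puts every remaining tuple into a single bag named 'filtered'.
grouped = GROUP filtered ALL;

-- filtered.A projects the bag down to the A column, which is what AVG accepts.
avg_all = FOREACH grouped GENERATE AVG(filtered.A);

DUMP avg_all;
```

GROUP ... ALL is still a grouping step, but it needs no key, and FILTER stays independent of the aggregation. For the five values 1 through 5, the mean is 3.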
On Mar 4, 2013, at 8:30 PM, Prashant Kommireddi <[email protected]> wrote:
> Hi Preeti,
>
> Using FILTER or not depends on your requirements and has nothing to do with
> SUM or AVG.
>
> SUM and AVG accept bags as input, so as long as you are able to provide one,
> it should be fine (though it's very common for users to use GROUP BY to
> roll up on a key before using these UDFs).
>
> For example:
>
> grunt> cat data
> 1 5
> 5 8
>
> grunt> A = load 'data';
> grunt> B = foreach A generate TOBAG($0, $1) as bagg;
> grunt> dump B;
> ({(1),(5)})
> ({(5),(8)})
>
> grunt> C = foreach B generate AVG(bagg);
> grunt> dump C;
> (3.0)
> (6.5)
>
> -Prashant
>
>
> On Mon, Mar 4, 2013 at 3:50 PM, Preeti Gupta <[email protected]> wrote:
>
>> Hello,
>>
>> Can I compute SUM or AVG without using GROUP BY or FILTER?
>>