Hive does not support that right now, but Hive supports subquery so we can
do:

SELECT SQRT( (n*totalsqr - tot * tot) / (n*(n-1)) )
FROM (SELECT COUNT(1) as n, SUM(col) as total, SUM( col*col ) AS totalsqr
FROM t) t2;

Zheng

On Sat, May 30, 2009 at 4:08 PM, Matt Pestritto <[email protected]> wrote:

> Ah - Thanks so much. I didn't know you could reference a column that has an
> aggregate function call again in the same select statement.  This is much
> cleaner that the approach that I took.  I'll give it a shot.
>
> Thanks again.
>
>
> On Wed, May 27, 2009 at 4:24 AM, Amr Awadallah <[email protected]> wrote:
>
>> I agree that a builtin for std dev is a good idea.
>>
>> that said, you can achieve this easy in one pass, just use:
>>
>> select sum( pow(col,2) ) as totsqr, sum( col ) as tot, count(1) as n, pow(
>> (n*totsqr - pow(tot,2) )/(n*(n-1)), 0.5) as stddev
>> from ....
>>
>>
>> Matt Pestritto wrote:
>>
>>> Hi.
>>>
>>> Are there plans to write a standard deviation aggregate function ?  I had
>>> to build my own which translated into multiple hive queries.  While it
>>> works, a build-in function would have been much easier.
>>>
>>> Thanks
>>> -Matt
>>>
>>
>


-- 
Yours,
Zheng

Reply via email to