Hi Helen,

Assuming you want to calculate stddev_samp,  Spark correctly points  STDDEV
to STDDEV_SAMP.

In below replace sales with your table name and AMOUNT_SOLD with the column
you want to do the calculation

SELECT
        
SQRT((SUM(POWER(AMOUNT_SOLD,2))-(COUNT(1)*POWER(AVG(AMOUNT_SOLD),2)))/(COUNT(1)-1))
AS MYSTDDEV,
        STDDEV(amount_sold) AS STDDEV,
        STDDEV_SAMP(amount_sold) AS STDDEV_SAMP,
        STDDEV_POP(amount_sold) AS STDDEV_POP
from    sales;

for me it returned

+--------------------+---------------------+--------------------+---------------------+--+
|      mystddev      |       stddev        |    stddev_samp     |
stddev_pop      |
+--------------------+---------------------+--------------------+---------------------+--+
| 260.7270919450411  | 260.7270722861637   | 260.7270722861637  |
260.72704617042166  |
+--------------------+---------------------+--------------------+---------------------+--+

HTH

Mich Talebzadeh,
Distinguished Technologist, Solutions Architect & Engineer
London
United Kingdom


   view my Linkedin profile
<https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>


 https://en.everybodywiki.com/Mich_Talebzadeh



*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




On Tue, 19 Sept 2023 at 13:14, Helene Bøe <helene.b...@hydro.com.invalid>
wrote:

> Hi!
>
>
>
> I am applying the stddev function (so actually stddev_samp), however when
> comparing with the sample standard deviation in Excel the resuls do not
> match.
>
> I cannot find in your documentation any more specifics on how the sample
> standard deviation is calculated, so I cannot compare the difference toward
> excel, which uses
>
> .
>
> I am trying to avoid using Excel at all costs, but if the stddev_samp
> function is not calculating the standard deviation correctly I have a
> problem.
>
> I hope you can help me resolve this issue.
>
>
>
> Kindest regards,
>
>
>
> *Helene Bøe*
> *Graduate Project Engineer*
> Recycling Process & Support
>
> M: +47 980 00 887
> helene.b...@hydro.com
> <https://intra.hydro.com/EPiServer/CMS/Content/en/%2c%2c9/?epieditmode=False>
>
> Norsk Hydro ASA
> Drammensveien 264
> NO-0283 Oslo, Norway
> www.hydro.com
> <https://intra.hydro.com/EPiServer/CMS/Content/en/%2c%2c9/?epieditmode=False>
>
>
> NOTICE: This e-mail transmission, and any documents, files or previous
> e-mail messages attached to it, may contain confidential or privileged
> information. If you are not the intended recipient, or a person responsible
> for delivering it to the intended recipient, you are hereby notified that
> any disclosure, copying, distribution or use of any of the information
> contained in or attached to this message is STRICTLY PROHIBITED. If you
> have received this transmission in error, please immediately notify the
> sender and delete the e-mail and attached documents. Thank you.
>
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org

Reply via email to