[ https://issues.apache.org/jira/browse/KYLIN-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Yanghong updated KYLIN-3361:
----------------------------------
    Description: 
(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + ... + (x_n - \bar{x})^2 = x_1^2 + x_2^2 + ... + x_n^2 - n \bar{x}^2,
where \bar{x} is the average of x_1, x_2, ..., x_n. Therefore, to compute
stddev, Kylin only needs to pre-calculate sum(x_i^2), sum(x_i) and count.
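
The identity above can be sketched in Java as follows. This is a minimal illustration, not Kylin's actual measure code: the class and method names are hypothetical, and it assumes the population standard deviation (dividing by n) is wanted; a sample standard deviation would divide by n - 1 instead.

```java
// Hypothetical helper for illustration only; not part of Kylin.
final class StddevFromAggregates {

    /** Population variance from the three pre-calculated aggregates. */
    static double variance(long count, double sum, double sumOfSquares) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        // Sum of squared deviations: sum(x_i^2) - sum(x_i)^2 / n.
        double squaredDeviations = sumOfSquares - sum * sum / count;
        // Rounding can push a tiny true value slightly below zero; clamp it.
        return Math.max(squaredDeviations, 0.0) / count;
    }

    /** Population standard deviation from the same aggregates. */
    static double stddev(long count, double sum, double sumOfSquares) {
        return Math.sqrt(variance(count, sum, sumOfSquares));
    }

    public static void main(String[] args) {
        double[] xs = {2, 4, 4, 4, 5, 5, 7, 9};
        long count = 0;
        double sum = 0.0;
        double sumOfSquares = 0.0;
        // Accumulate the three aggregates in a single pass.
        for (double x : xs) {
            count++;
            sum += x;
            sumOfSquares += x * x;
        }
        System.out.println(stddev(count, sum, sumOfSquares));
    }
}
```

Subtracting two large, nearly equal quantities can lose precision in floating point, so the clamp to zero only guards against a negative result; it does not recover the lost digits.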

 

var(X) = E(X^2) - E(X)^2

var'(X) = n * var(X)
        = n * (E(X^2) - E(X)^2)
        = S(X^2) - S(X)^2 / n
        = S(X_1^2) + S(X_2^2) - S(X_1 + X_2)^2 / (n_1 + n_2)
        = S(X_1^2) - S(X_1)^2 / n_1 + S(X_2^2) - S(X_2)^2 / n_2
          + S(X_1)^2 / n_1 + S(X_2)^2 / n_2 - S(X_1 + X_2)^2 / (n_1 + n_2)
        = var'(X_1) + var'(X_2) + (n_1 S(X_2) - n_2 S(X_1))^2 / (n_1 n_2 (n_1 + n_2))
        = var'(X_1) + var'(X_2) + (S(X_2) - S(X_1) n_2 / n_1)^2 n_1 / (n_2 (n_1 + n_2))
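
The merge step implied by the derivation above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class `VarianceState` and its members are hypothetical names, not Kylin's API. S(X) is read as the sum of the values in a partition, var'(X) as its sum of squared deviations, and the population standard deviation (dividing by n) is assumed.

```java
// Hypothetical partial-aggregate state for illustration only; not part of Kylin.
final class VarianceState {
    final long count;   // n
    final double sum;   // S(X)
    final double m2;    // var'(X) = S(X^2) - S(X)^2 / n

    VarianceState(long count, double sum, double m2) {
        this.count = count;
        this.sum = sum;
        this.m2 = m2;
    }

    /** Builds the state for one partition directly from its values. */
    static VarianceState of(double... values) {
        long n = values.length;
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double m2 = 0.0;
        if (n > 0) {
            double mean = sum / n;
            for (double v : values) {
                m2 += (v - mean) * (v - mean);
            }
        }
        return new VarianceState(n, sum, m2);
    }

    /** Combines two partitions using only their counts, sums and var' values. */
    VarianceState merge(VarianceState other) {
        if (count == 0) {
            return other;
        }
        if (other.count == 0) {
            return this;
        }
        double n1 = count;
        double n2 = other.count;
        // Cross term: (n_1 S(X_2) - n_2 S(X_1))^2 / (n_1 n_2 (n_1 + n_2)).
        double d = n1 * other.sum - n2 * sum;
        double cross = d * d / (n1 * n2 * (n1 + n2));
        return new VarianceState(count + other.count, sum + other.sum,
                m2 + other.m2 + cross);
    }

    /** Population standard deviation; NaN when the state is empty. */
    double stddev() {
        return Math.sqrt(m2 / count);
    }
}
```

For example, `VarianceState.of(2, 4, 4, 4).merge(VarianceState.of(5, 5, 7, 9))` should describe the same eight values as building a single state from all of them, without revisiting the raw rows.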

  was:
(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + ... + (x_n - \bar{x})^2 = x_1^2 + x_2^2 + ... + x_n^2 - n \bar{x}^2,
where \bar{x} is the average of x_1, x_2, ..., x_n. Therefore, to compute
stddev, Kylin only needs to pre-calculate sum(x_i^2), sum(x_i) and count.

 

var(X) = E(X^2) - E(X)^2

var'(X) = n * var(X)
        = n * (E(X^2) - E(X)^2)
        = S(X^2) - S(X)^2 / n
        = S(X_1^2) + S(X_2^2) - S(X_1 + X_2)^2 / (n_1 + n_2)
        = S(X_1^2) - S(X_1)^2 / n_1 + S(X_2^2) - S(X_2)^2 / n_2
          + S(X_1)^2 / n_1 + S(X_2)^2 / n_2 - S(X_1 + X_2)^2 / (n_1 + n_2)
        = var'(X_1) + var'(X_2) + (n_1 X_2 - n_2 X_1)^2 / (n_1 n_2 (n_1 + n_2))
        = var'(X_1) + var'(X_2) + (X_2 - X_1 n_2 / n_1)^2 n_1 / (n_2 (n_1 + n_2))


> Support stddev as a new measure
> -------------------------------
>
>                 Key: KYLIN-3361
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3361
>             Project: Kylin
>          Issue Type: New Feature
>            Reporter: Zhong Yanghong
>            Assignee: Zhong Yanghong
>            Priority: Major
>             Fix For: v3.1.0
>
>
> (x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + ... + (x_n - \bar{x})^2 = x_1^2 + x_2^2 + ... + x_n^2 - n \bar{x}^2,
> where \bar{x} is the average of x_1, x_2, ..., x_n. Therefore, to compute
> stddev, Kylin only needs to pre-calculate sum(x_i^2), sum(x_i) and count.
>
> var(X) = E(X^2) - E(X)^2
> var'(X) = n * var(X)
>         = n * (E(X^2) - E(X)^2)
>         = S(X^2) - S(X)^2 / n
>         = S(X_1^2) + S(X_2^2) - S(X_1 + X_2)^2 / (n_1 + n_2)
>         = S(X_1^2) - S(X_1)^2 / n_1 + S(X_2^2) - S(X_2)^2 / n_2
>           + S(X_1)^2 / n_1 + S(X_2)^2 / n_2 - S(X_1 + X_2)^2 / (n_1 + n_2)
>         = var'(X_1) + var'(X_2) + (n_1 S(X_2) - n_2 S(X_1))^2 / (n_1 n_2 (n_1 + n_2))
>         = var'(X_1) + var'(X_2) + (S(X_2) - S(X_1) n_2 / n_1)^2 n_1 / (n_2 (n_1 + n_2))



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
