[ https://issues.apache.org/jira/browse/KYLIN-3361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhong Yanghong updated KYLIN-3361:
----------------------------------
    Description:
(x ~1~ - +x+) ^2^ + (x ~2~ - +x+) ^2^ + ... + (x ~n~ - +x+) ^2^ = x ~1~ ^2^ + x ~2~ ^2^ + ... + x ~n~ ^2^ - n +x+ ^2^, where +x+ is the average of x ~1~, x ~2~, ..., x ~n~.
Therefore, to compute stddev, Kylin needs to pre-calculate sum(x ~i~ ^2^), sum(x ~i~) and count.

var(X) = E(X ^2^) - E(X) ^2^
var ^'^(X) = n * var(X)
 = n*(E(X ^2^) - E(X) ^2^)
 = S(X ^2^) - S(X) ^2^/n
 = S(X ~1~ ^2^) + S(X ~2~ ^2^) - S(X ~1~ + X ~2~) ^2^/(n ~1~ + n ~2~)
 = S(X ~1~ ^2^) - S(X ~1~) ^2^/(n ~1~) + S(X ~2~ ^2^) - S(X ~2~) ^2^/(n ~2~) + S(X ~1~) ^2^/(n ~1~) + S(X ~2~) ^2^/(n ~2~) - S(X ~1~ + X ~2~) ^2^/(n ~1~ + n ~2~)
 = var ^'^(X ~1~) + var ^'^(X ~2~) + (n ~1~ S(X ~2~) - n ~2~ S(X ~1~)) ^2^ / (n ~1~ n ~2~ (n ~1~ + n ~2~))
 = var ^'^(X ~1~) + var ^'^(X ~2~) + (S(X ~2~) - S(X ~1~) n ~2~ / n ~1~) ^2^ n ~1~ / (n ~2~ (n ~1~ + n ~2~))

  was:
(x ~1~ - +x+) ^2^ + (x ~2~ - +x+) ^2^ + ... + (x ~n~ - +x+) ^2^ = x ~1~ ^2^ + x ~2~ ^2^ + ... + x ~n~ ^2^ - n +x+ ^2^, where +x+ is the average of x ~1~, x ~2~, ..., x ~n~.
Therefore, to compute stddev, Kylin needs to pre-calculate sum(x ~i~ ^2^), sum(x ~i~) and count.

var(X) = E(X ^2^) - E(X) ^2^
var ^'^(X) = n * var(X)
 = n*(E(X ^2^) - E(X) ^2^)
 = S(X ^2^) - S(X) ^2^/n
 = S(X ~1~ ^2^) + S(X ~2~ ^2^) - S(X ~1~ + X ~2~) ^2^/(n ~1~ + n ~2~)
 = S(X ~1~ ^2^) - S(X ~1~) ^2^/(n ~1~) + S(X ~2~ ^2^) - S(X ~2~) ^2^/(n ~2~) + S(X ~1~) ^2^/(n ~1~) + S(X ~2~) ^2^/(n ~2~) - S(X ~1~ + X ~2~) ^2^/(n ~1~ + n ~2~)
 = var ^'^(X ~1~) + var ^'^(X ~2~) + (n ~1~ X ~2~ - n ~2~ X ~1~) ^2^ / (n ~1~ n ~2~ (n ~1~ + n ~2~))
 = var ^'^(X ~1~) + var ^'^(X ~2~) + (X ~2~ - X ~1~ n ~2~ / n ~1~) ^2^ n ~1~ / (n ~2~ (n ~1~ + n ~2~))

> Support stddev as a new measure
> -------------------------------
>
>                 Key: KYLIN-3361
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3361
>             Project: Kylin
>          Issue Type: New Feature
>            Reporter: Zhong Yanghong
>            Assignee: Zhong Yanghong
>            Priority: Major
>             Fix For: v3.1.0
>
>
> (x ~1~ - +x+) ^2^ + (x ~2~ - +x+) ^2^ + ... + (x ~n~ - +x+) ^2^ = x ~1~ ^2^ + x ~2~ ^2^ + ... + x ~n~ ^2^ - n +x+ ^2^, where +x+ is the average of x ~1~, x ~2~, ..., x ~n~.
> Therefore, to compute stddev, Kylin needs to pre-calculate sum(x ~i~ ^2^), sum(x ~i~) and count.
>
> var(X) = E(X ^2^) - E(X) ^2^
> var ^'^(X) = n * var(X)
>  = n*(E(X ^2^) - E(X) ^2^)
>  = S(X ^2^) - S(X) ^2^/n
>  = S(X ~1~ ^2^) + S(X ~2~ ^2^) - S(X ~1~ + X ~2~) ^2^/(n ~1~ + n ~2~)
>  = S(X ~1~ ^2^) - S(X ~1~) ^2^/(n ~1~) + S(X ~2~ ^2^) - S(X ~2~) ^2^/(n ~2~) + S(X ~1~) ^2^/(n ~1~) + S(X ~2~) ^2^/(n ~2~) - S(X ~1~ + X ~2~) ^2^/(n ~1~ + n ~2~)
>  = var ^'^(X ~1~) + var ^'^(X ~2~) + (n ~1~ S(X ~2~) - n ~2~ S(X ~1~)) ^2^ / (n ~1~ n ~2~ (n ~1~ + n ~2~))
>  = var ^'^(X ~1~) + var ^'^(X ~2~) + (S(X ~2~) - S(X ~1~) n ~2~ / n ~1~) ^2^ n ~1~ / (n ~2~ (n ~1~ + n ~2~))

--
This message was sent by Atlassian Jira (v8.3.4#803005)
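The merge identity in the updated description can be checked numerically. The following is a minimal Python sketch, not Kylin's implementation: the names VarState, from_values, merge and stddev_pop are hypothetical and chosen only for illustration. It assumes each partition keeps the count n, the sum S(X) and var'(X) = S(X^2) - S(X)^2/n, and that two partitions are combined with the cross term (n1 S(X2) - n2 S(X1))^2 / (n1 n2 (n1 + n2)) from the derivation. It computes the population standard deviation, matching var(X) = E(X^2) - E(X)^2.

```python
from dataclasses import dataclass
import math


@dataclass
class VarState:
    """Partial aggregate for one partition (hypothetical helper, not a Kylin class)."""
    n: int       # count
    s: float     # S(X), the sum of the values
    m2: float    # var'(X) = n * var(X) = S(X^2) - S(X)^2 / n

    @classmethod
    def from_values(cls, values):
        n = len(values)
        s = sum(values)
        sq = sum(v * v for v in values)
        # For an empty partition var' is defined as 0 to avoid dividing by zero.
        return cls(n, s, sq - s * s / n if n else 0.0)

    def merge(self, other):
        # An empty side contributes nothing, so return the other side unchanged.
        if self.n == 0:
            return other
        if other.n == 0:
            return self
        n1, n2 = self.n, other.n
        # Cross term from the derivation:
        # (n1 * S(X2) - n2 * S(X1))^2 / (n1 * n2 * (n1 + n2))
        cross = (n1 * other.s - n2 * self.s) ** 2 / (n1 * n2 * (n1 + n2))
        return VarState(n1 + n2, self.s + other.s, self.m2 + other.m2 + cross)

    def stddev_pop(self):
        if self.n == 0:
            raise ValueError("stddev is undefined for an empty partition")
        # Clamp at zero in case floating-point rounding makes var' slightly negative.
        return math.sqrt(max(self.m2, 0.0) / self.n)


# Merging two partitions should agree with aggregating all values at once.
a = VarState.from_values([1.0, 2.0, 3.0])
b = VarState.from_values([10.0, 20.0])
merged = a.merge(b)
direct = VarState.from_values([1.0, 2.0, 3.0, 10.0, 20.0])
assert math.isclose(merged.m2, direct.m2)
```

Keeping var' rather than S(X^2) per partition is one possible design choice here; the first form in the description (pre-calculating sum(x^2), sum(x) and count) is equally mergeable, since all three quantities simply add across partitions.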