Jian Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18413 )
Change subject: IMPALA-11205: Implement Statistical functions : CORR(), COVAR_SAMP() and COVAR_POP() ...................................................................... Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@296 PS18, Line 296: // r = covar / (√(xvar * yvar) could you explain more about how the `CorrState` is updated and used to calculate the final correlation of two variables? it would be helpful for readers unfamiliar with the algorithms if you could add links about the "Pearson's correlation" and "Welford's online algorithm". -- To view, visit http://gerrit.cloudera.org:8080/18413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06 Gerrit-Change-Number: 18413 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jian Zhang <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Fri, 03 Jun 2022 01:38:45 +0000 Gerrit-HasComments: Yes
