Gergely Fürnstáhl has posted comments on this change. ( http://gerrit.cloudera.org:8080/18413 )
Change subject: IMPALA-11205: Implement Statistical functions : CORR(), COVAR_SAMP() and COVAR_POP() ...................................................................... Patch Set 25: Code-Review+1 (3 comments) Math seems good to me http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc@300 PS25, Line 300: struct CorrState { : int64_t count; // number of elements : double xavg; // average of x elements : double yavg; // average of y elements : double xvar; // n times the variance of x elements : double yvar; // n times the variance of y elements : double covar; // n times the covariance : }; As I understand, both CorrState and CovarState calculates xavg,yavg and covar the same way. I think we could restructure this a bit to spare code duplication. E.g.: CovarState::update, remove and merge could encapsulate the logic regarding the common members, and CorrState could have a CovarState member, and call its functions meanwhile calculating xvar and yvar too http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc@324 PS25, Line 324: if (state->count > 1 nit: do we need this? for first items state->xavg = deltaX = x, so we will add 0 to state->covar, does not change anything. This way we pay a branching for every count unnecessarily http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc@331 PS25, Line 331: // vx_n = vx_(n - 1) + (x_n - mx_n) * (x_n - mx_(n - 1)) nit: the order is a bit misleading as deltax=(x_n - mx_(n - 1)) -- To view, visit http://gerrit.cloudera.org:8080/18413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06 Gerrit-Change-Number: 18413 Gerrit-PatchSet: 25 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jian Zhang <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Wed, 15 Jun 2022 14:57:23 +0000 Gerrit-HasComments: Yes
