Gergely Fürnstáhl has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18413 )

Change subject: IMPALA-11205: Implement Statistical functions : CORR(), 
COVAR_SAMP()  and COVAR_POP()
......................................................................


Patch Set 25: Code-Review+1

(3 comments)

Math seems good to me

http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc@300
PS25, Line 300: struct CorrState {
              :   int64_t count; // number of elements
              :   double xavg; // average of x elements
              :   double yavg; // average of y elements
              :   double xvar; // n times the variance of x elements
              :   double yvar; // n times the variance of y elements
              :   double covar; // n times the covariance
              : };
As I understand, both CorrState and CovarState calculates xavg,yavg and covar 
the same way. I think we could restructure this a bit to spare code duplication.

E.g.:
CovarState::update, remove and merge could encapsulate the logic regarding the 
common members, and CorrState could have a CovarState member, and call its 
functions meanwhile calculating xvar and yvar too


http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc@324
PS25, Line 324: if (state->count > 1
nit: do we need this? for first items state->xavg = deltaX = x, so we will add 
0 to state->covar, does not change anything. This way we pay a branching for 
every count unnecessarily


http://gerrit.cloudera.org:8080/#/c/18413/25/be/src/exprs/aggregate-functions-ir.cc@331
PS25, Line 331: // vx_n = vx_(n - 1) + (x_n - mx_n) * (x_n - mx_(n - 1))
nit: the order is a bit misleading as deltax=(x_n - mx_(n - 1))



--
To view, visit http://gerrit.cloudera.org:8080/18413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06
Gerrit-Change-Number: 18413
Gerrit-PatchSet: 25
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Gergely Fürnstáhl <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jian Zhang <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 15 Jun 2022 14:57:23 +0000
Gerrit-HasComments: Yes

Reply via email to