Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18413 )
Change subject: IMPALA-11205: Implement Statistical functions : CORR(), COVAR_SAMP() and COVAR_POP() ...................................................................... Patch Set 17: (9 comments) http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@293 PS17, Line 293: // using a stable one-pass algorithm nit: please also mention "Welford's online algorithm". http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@355 PS17, Line 355: NULL nit: we prefer nullptr to NULL. http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@358 PS17, Line 358: if (!isnan(src1.val) && !isnan(src2.val)) { nit: we can merge this check to line 354, i.e. if (src1.is_null || isnan(src1.val) || src2.is_null || isnan(src2.val)) return; http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@371 PS17, Line 371: if (!isnan(src1.val) && !isnan(src2.val)) { nit: same as above. We can merge the check to line 367. http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@412 PS17, Line 412: if(src.ptr != NULL) { nit: add a space after "if" and replace NULL with nullptr. http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@416 PS17, Line 416: dst_state->count = src_state->count; : dst_state->xavg = src_state->xavg; : dst_state->yavg = src_state->yavg; : dst_state->xvar = src_state->xvar; : dst_state->yvar = src_state->yvar; : dst_state->covar = src_state->covar; nit: Can we simplify this by using memcpy? i.e. memcpy(dst, &src, sizeof(CorrState) http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@423 PS17, Line 423: if (nA != 0 && nB != 0) { nit: replace "if" with "else if" to save one check, or add a "return" at the end of the above if-branch. http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@453 PS17, Line 453: sqrt(state->xvar) / sqrt(state->yvar) sqrt() is expensive. Let's change this to sqrt(state->xvar * state->yvar). http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@493 PS17, Line 493: both terms nit: both terms? Maybe you want to add another equation after line 491: // c_n = c_(n - 1) + (x_n - mx_n) * (y_n - my_(n - 1)) -- To view, visit http://gerrit.cloudera.org:8080/18413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06 Gerrit-Change-Number: 18413 Gerrit-PatchSet: 17 Gerrit-Owner: Anonymous Coward <pranav.lo...@cloudera.com> Gerrit-Reviewer: Anonymous Coward <pranav.lo...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Comment-Date: Tue, 31 May 2022 06:57:09 +0000 Gerrit-HasComments: Yes