pranav.lo...@cloudera.com has posted comments on this change. ( http://gerrit.cloudera.org:8080/18413 )
Change subject: IMPALA-11205: Implement Statistical functions : CORR(), COVAR_SAMP() and COVAR_POP() ...................................................................... Patch Set 18: (11 comments) > Patch Set 17: > > (9 comments) http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@293 PS17, Line 293: // online algorithm. This is calcula > nit: please also mention "Welford's online algorithm". Done http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@337 PS17, Line 337: double deltaY = y - state->yavg; > If state->count is 1, it will be decreased to 0. We will hit divide-by-zero Yeah, it's been taken care of in the new patch. http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@342 PS17, Line 342: state->xvar = 0.0; > If the original count is 2, it will be decreased to 1 at line 337. Then we Sure, updated in the latest patch http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@355 PS17, Line 355: vx_ > nit: we prefer nullptr to NULL. Done http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@358 PS17, Line 358: state->yvar -= deltaY * (y - state->yavg); > nit: we can merge this check to line 354, i.e. Done http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@371 PS17, Line 371: > nit: same as above. We can merge the check to line 367. Done http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@412 PS17, Line 412: > nit: add a space after "if" and replace NULL with nullptr. Done http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@416 PS17, Line 416: DCHECK(dst->ptr != NULL); : DCHECK_EQ(sizeof(CorrState), dst->len); : CorrState* dst_state = reinterpret_cast<CorrState*>(dst->ptr); : if (src.ptr != nullptr) { : long nA = dst_state->count; : long nB = src_state->count; > nit: Can we simplify this by using memcpy? i.e. This was causing impala crash so didn't add this as of now http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@423 PS17, Line 423: dst_state->count = src_state->count; > nit: replace "if" with "else if" to save one check, or add a "return" at th Done http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@453 PS17, Line 453: > sqrt() is expensive. Let's change this to sqrt(state->xvar * state->yvar). Done http://gerrit.cloudera.org:8080/#/c/18413/17/be/src/exprs/aggregate-functions-ir.cc@493 PS17, Line 493: > nit: both terms? Maybe you want to add another equation after line 491: Done -- To view, visit http://gerrit.cloudera.org:8080/18413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06 Gerrit-Change-Number: 18413 Gerrit-PatchSet: 18 Gerrit-Owner: Anonymous Coward <pranav.lo...@cloudera.com> Gerrit-Reviewer: Anonymous Coward <pranav.lo...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com> Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com> Gerrit-Comment-Date: Wed, 01 Jun 2022 07:05:59 +0000 Gerrit-HasComments: Yes