Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/18413 )
Change subject: IMPALA-11205: Implement Statistical functions : CORR(), COVAR_SAMP() and COVAR_POP() ...................................................................... Patch Set 20: (19 comments) http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc File be/src/exprs/aggregate-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@341 PS20, Line 341: nit: remove the space before ")" http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@342 PS20, Line 342: state->count = 0; : state->xavg = 0.0; : state->yavg = 0.0; : state->xvar = 0.0; : state->yvar = 0.0; : state->covar = 0.0; nit: these can be simplified by memset(), just like what we do at line 313. memset(state, 0, sizeof(CorrState)) http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@348 PS20, Line 348: } : else nit: put "}" and "else" at the same line, i.e. "} else {" http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@355 PS20, Line 355: (x_n - mx_(n - 1)) * (y_n - my_n) This comment seems incorrect. 'deltaX' is actually (x_n - mx_n). 'y - state->yavg' is (y_n - my_(n-1)). BTW, please fix the idention of line 355 ~ 360. http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@375 PS20, Line 375: the number of calls to Update() or Remove() This is not clear to me. Do you mean 'count'? http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@393 PS20, Line 393: CorrUpdateState(val1, val2, state); nit: please add brackets {} for multi-line if-statement. http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@398 PS20, Line 398: the number of calls to Update() or Remove() Could you explain more? Why do we need to check the number of Update/Remove calls? http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@407 PS20, Line 407: CorrRemoveState(val1, val2, state); nit: please add brackets {} for multi-line if-statement. http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@451 PS20, Line 451: sqrt((state->xvar * state->yvar)) We should check whether 'state->xvar * state->yvar' become negative before calling sqrt(). Please split the if-statement at line 455 into two: One for checking counts and vars before this line, and the other one for checking 'r == 0.0' after this line. BTW, don't need duplicated parentheses here. http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@452 PS20, Line 452: // xvar and yvar become negative in certain cases due to floating point rounding error. : // Since these values are very small, they can be ignored and rounded to 0. : DCHECK(state->xvar >= -1E-8 && state->yvar >= -1E-8); move this to the begining of this method, and split it into two DCHECKs so we know exactly which one breaks if it happens. DCHECK(state->xvar >= -1E-8); DCHECK(state->yvar >= -1E-8); http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@467 PS20, Line 467: ctx->Free(src.ptr); : return CorrGetValue(ctx, src); This is a use-after-free pattern. We should not free the resource if we still want to use it. Chang it to DoubleVal r = CorrGetValue(ctx, src); ctx->Free(src.ptr); return r; http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@507 PS20, Line 507: nit: remove the space before ")". http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@508 PS20, Line 508: state->count = 0; : state->xavg = 0.0; : state->yavg = 0.0; : state->covar = 0.0; nit: use memset instead. memset(state, 0, sizeof(CovarState)) http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@512 PS20, Line 512: } : else { nit: put them into one line, i.e. "} else {" http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@535 PS20, Line 535: the number of calls to Update() or Remove() Need explanation here too http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@553 PS20, Line 553: CovarUpdateState(val1, val2, state); nit: add brackets {} for multi-line if-statement. http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@567 PS20, Line 567: CovarRemoveState(val1, val2, state); nit: add brackets {} for multi-line if-statement. http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@625 PS20, Line 625: ctx->Free(src.ptr); : return CovarSampleGetValue(ctx, src); This is a use-after-free pattern too. Please fix it. http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@636 PS20, Line 636: ctx->Free(src.ptr); : return CovarPopulationGetValue(ctx, src); This is a use-after-free pattern too. Please fix it. -- To view, visit http://gerrit.cloudera.org:8080/18413 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06 Gerrit-Change-Number: 18413 Gerrit-PatchSet: 20 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Jian Zhang <[email protected]> Gerrit-Reviewer: Kurt Deschler <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Wed, 08 Jun 2022 03:15:25 +0000 Gerrit-HasComments: Yes
