Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18413 )

Change subject: IMPALA-11205: Implement Statistical functions : CORR(), 
COVAR_SAMP()  and COVAR_POP()
......................................................................


Patch Set 20:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@341
PS20, Line 341:
nit: remove the space before ")"


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@342
PS20, Line 342:     state->count = 0;
              :     state->xavg = 0.0;
              :     state->yavg = 0.0;
              :     state->xvar = 0.0;
              :     state->yvar = 0.0;
              :     state->covar = 0.0;
nit: these can be simplified by memset(), just like what we do at line 313.

 memset(state, 0, sizeof(CorrState))


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@348
PS20, Line 348:   }
              :   else
nit: put "}" and "else" at the same line, i.e. "} else {"


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@355
PS20, Line 355: (x_n - mx_(n - 1)) * (y_n - my_n)
This comment seems incorrect. 'deltaX' is actually (x_n - mx_n). 'y - 
state->yavg' is (y_n - my_(n-1)).

BTW, please fix the idention of line 355 ~ 360.


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@375
PS20, Line 375: the number of calls to Update() or Remove()
This is not clear to me. Do you mean 'count'?


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@393
PS20, Line 393:     CorrUpdateState(val1, val2, state);
nit: please add brackets {} for multi-line if-statement.


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@398
PS20, Line 398: the number of calls to Update() or Remove()
Could you explain more? Why do we need to check the number of Update/Remove 
calls?


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@407
PS20, Line 407:     CorrRemoveState(val1, val2, state);
nit: please add brackets {} for multi-line if-statement.


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@451
PS20, Line 451: sqrt((state->xvar * state->yvar))
We should check whether 'state->xvar * state->yvar' become negative before 
calling sqrt(). Please split the if-statement at line 455 into two: One for 
checking counts and vars before this line, and the other one for checking 'r == 
0.0' after this line.

BTW, don't need duplicated parentheses here.


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@452
PS20, Line 452:   // xvar and yvar become negative in certain cases due to 
floating point rounding error.
              :   // Since these values are very small, they can be ignored and 
rounded to 0.
              :   DCHECK(state->xvar >= -1E-8 && state->yvar >= -1E-8);
move this to the begining of this method, and split it into two DCHECKs so we 
know exactly which one breaks if it happens.

 DCHECK(state->xvar >= -1E-8);
 DCHECK(state->yvar >= -1E-8);


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@467
PS20, Line 467:   ctx->Free(src.ptr);
              :   return CorrGetValue(ctx, src);
This is a use-after-free pattern. We should not free the resource if we still 
want to use it. Chang it to

 DoubleVal r = CorrGetValue(ctx, src);
 ctx->Free(src.ptr);
 return r;


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@507
PS20, Line 507:
nit: remove the space before ")".


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@508
PS20, Line 508:     state->count = 0;
              :     state->xavg = 0.0;
              :     state->yavg = 0.0;
              :     state->covar = 0.0;
nit: use memset instead.

 memset(state, 0, sizeof(CovarState))


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@512
PS20, Line 512:   }
              :   else {
nit: put them into one line, i.e. "} else {"


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@535
PS20, Line 535: the number of calls to Update() or Remove()
Need explanation here too


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@553
PS20, Line 553:     CovarUpdateState(val1, val2, state);
nit: add brackets {} for multi-line if-statement.


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@567
PS20, Line 567:     CovarRemoveState(val1, val2, state);
nit: add brackets {} for multi-line if-statement.


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@625
PS20, Line 625:   ctx->Free(src.ptr);
              :   return CovarSampleGetValue(ctx, src);
This is a use-after-free pattern too. Please fix it.


http://gerrit.cloudera.org:8080/#/c/18413/20/be/src/exprs/aggregate-functions-ir.cc@636
PS20, Line 636:   ctx->Free(src.ptr);
              :   return CovarPopulationGetValue(ctx, src);
This is a use-after-free pattern too. Please fix it.



--
To view, visit http://gerrit.cloudera.org:8080/18413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06
Gerrit-Change-Number: 18413
Gerrit-PatchSet: 20
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jian Zhang <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Wed, 08 Jun 2022 03:15:25 +0000
Gerrit-HasComments: Yes

Reply via email to