[email protected] has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18413 )

Change subject: IMPALA-11205: Implement Statistical functions : CORR(), 
COVAR_SAMP()  and COVAR_POP()
......................................................................


Patch Set 19:

(10 comments)

> Patch Set 18:
>
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@296
PS18, Line 296: // 
https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Online
> could you explain more about how the `CorrState` is updated and used to cal
Added a few links


http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@408
PS18, Line 408:       tm_src2.ToSubsecondUnixTime(UTCPTR, &val2)) {
> Skipping NaN here seems questionable? Please add comments or handle appropr
Yeah, its fixed now.


http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@422
PS18, Line 422:     long nB = src_state->count;
> Does (nB == 0) need handling?
nB is the count of src->state. nB == 0 has been accounted for in CorrInit() and 
CorrRemove(). We alse make sure that src.ptr is not equal to nullptr.


http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@462
PS18, Line 462:   return DoubleVal(corr);
> Add an assert that negative values are small and comment that this can happ
Sure,added, I've taken the tolerance value to be around -1E-8 as it seemed safe 
to me. We can probably go for smaller values like -1E-10 as well. Please let me 
know what would be a suitable tolerance value.


http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@543
PS18, Line 543:   DCHECK_EQ(sizeof(CovarState), dst->len);
> Skipping Nan?
Updated in the latest patch.


http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@575
PS18, Line 575:   }
> Skipping NaN
Updated in the latest patch.


http://gerrit.cloudera.org:8080/#/c/18413/18/be/src/exprs/aggregate-functions-ir.cc@589
PS18, Line 589:       return;
> handle (nB == 0) ?
nB is the count of src->state. nB == 0 has been accounted for in CorrInit() and 
CorrRemove(). We alse make sure that src.ptr is not equal to nullptr.


http://gerrit.cloudera.org:8080/#/c/18413/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test
File testdata/workloads/functional-query/queries/QueryTest/aggregation.test:

http://gerrit.cloudera.org:8080/#/c/18413/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1526
PS18, Line 1526:  from tpcds.store;
> This may be non-deterministic (flaky) without an order by on the window cla
Yeah, its fixed in the latest patch


http://gerrit.cloudera.org:8080/#/c/18413/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1544
PS18, Line 1544: select id, double_col, corr(double_col, int_col) over 
(partition by month order by id) from functional.alltypes
> This may be non-deterministic (flaky) without an order by on the window cla
Yeah, its fixed in the latest patch


http://gerrit.cloudera.org:8080/#/c/18413/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1578
PS18, Line 1578: select s_store_sk, corr(s_number_employees, s_floor_space) 
over (partition by s_city order by s_store_sk
> Check if order by city is strong ordering.
It wasn't strong ordering, I've changed it to s_store_sk which contains unique 
values.



--
To view, visit http://gerrit.cloudera.org:8080/18413
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I32ad627c953ba24d9cde2d5549bdd0d27a9c0d06
Gerrit-Change-Number: 18413
Gerrit-PatchSet: 19
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Jian Zhang <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Mon, 06 Jun 2022 17:23:00 +0000
Gerrit-HasComments: Yes

Reply via email to