Dmitry Lychagin has posted comments on this change.

Change subject: WIP: stddev
......................................................................


Patch Set 4:

(18 comments)

Looks good. My comments are relatively minor.

https://asterix-gerrit.ics.uci.edu/#/c/2990/4//COMMIT_MSG
Commit Message:

Line 7: WIP: stddev
Follow the commit message template. So it should be something like:

[ASTERIXDB-2459][FUN] Add sttdev() aggregate function

- user model changes: yes
- storage format changes: no
- interface changes: no

Details:
copy your details here


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate-sql/serial_stddev_double_null/serial_stddev_double_null.2.update.sqlpp
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate-sql/serial_stddev_double_null/serial_stddev_double_null.2.update.sqlpp:

Line 25: select element {'id':1,'gid':1,'val':double(5.32)};
It looks like the intent was to test cases when valplus is null, but in this 
line the 'valplus' value is 'missing'. Let's add another record with 'valplus': 
null, so we test both cases of null and missing. This also applies to other 
testcases (*_null), we should add 'valplus': null to all of them


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/agg_null_rec/agg_null_rec.3.query.sqlpp
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/agg_null_rec/agg_null_rec.3.query.sqlpp:

Line 43: )), 'stddev':test.coll_stddev((
should be "strict_stddev". "coll_" prefix still works but is deprecated


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/agg_null_rec_1/agg_null_rec_1.3.query.sqlpp
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/agg_null_rec_1/agg_null_rec_1.3.query.sqlpp:

Line 56:     select element i.val
should it be i.val (like in avg/sum above) or i.valplus (like in  min/max 
above)? Could you double check?


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/agg_number_rec/agg_number_rec.3.query.sqlpp
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/agg_number_rec/agg_number_rec.3.query.sqlpp:

Line 43: )),'stddev':test.coll_stddev((
should be "strict_stddev"


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/stddev_float_null/stddev_float_nu.1.ddl.sqlpp
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/stddev_float_null/stddev_float_nu.1.ddl.sqlpp:

Line 1: /*
type in the name of this file. should be '_null', not '_nu'


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/stddev_float_null/stddev_float_nu.3.query.sqlpp
File 
asterixdb/asterix-app/src/test/resources/runtimets/queries_sqlpp/aggregate/stddev_float_null/stddev_float_nu.3.query.sqlpp:

Line 1: /*
type in the name of this file. should be '_null', not '_nu'


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/results/aggregate-sql/serial_stddev_float/serial_stddev_float.1.adm
File 
asterixdb/asterix-app/src/test/resources/runtimets/results/aggregate-sql/serial_stddev_float/serial_stddev_float.1.adm:

Line 1: { "gid": 1, "stddev": 00.9574271077563381 }
why two 0s before the decimal point?


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-app/src/test/resources/runtimets/results/aggregate-sql/sum_int8_null/sum_int8_null.1.adm
File 
asterixdb/asterix-app/src/test/resources/runtimets/results/aggregate-sql/sum_int8_null/sum_int8_null.1.adm:

Line 1: -112
this seems to be wrong. please file an issue for this and meanwhile change the 
input data to avoid overflow


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-om/src/main/java/org/apache/asterix/om/functions/BuiltinFunctions.java
File 
asterixdb/asterix-om/src/main/java/org/apache/asterix/om/functions/BuiltinFunctions.java:

Line 1406:         addPrivateFunction(SERIAL_LOCAL_SQL_STDDEV, 
LocalAvgTypeComputer.INSTANCE, true);
should be LocalSingleVarStatisticsTypeComputer.INSTANCE


Line 1407:         addPrivateFunction(SERIAL_INTERMEDIATE_SQL_STDDEV, 
LocalAvgTypeComputer.INSTANCE, true);
should be LocalSingleVarStatisticsTypeComputer.INSTANCE


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/serializable/std/AbstractSerializableSingleVariableStatisticsAggregateFunction.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/serializable/std/AbstractSerializableSingleVariableStatisticsAggregateFunction.java:

Line 173:         moments m = new moments(m1, m2, count);
let's not create a new object for each tuple. We need to figure out a way to 
avoid it. One possible solution is to maintain each of 3 values in its own 
AMutableDouble/Int64 and have a method that'd update them for each tuple. 
Perhaps you could reuse aDouble, aInt64 instance you already have, just need to 
create another one for the 2nd double. 
It'd be nice if we could share the updateStepMoments() and combineMoments() 
with AbstractSingleVarStatisticsAggregateFunction instead of duplicating it 
there. 
Creating a helper class could be ok as long as we do not create its new 
instance for every tuple we process. (so once in init() is fine).


Line 194:             case TINYINT: {
> MAJOR SonarQube violation:
Remove { } from case blocks to fix this SonarQube complaint


Line 279:         moments m = new moments(m1, m2, count);
same comment. avoid object creation


Line 303:                 int offset5 = 
ARecordSerializerDeserializer.getFieldOffsetById(serBytes, offset, 
COUNT_FIELD_ID,
should be "offset3"


https://asterix-gerrit.ics.uci.edu/#/c/2990/4/asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/AbstractSingleVarStatisticsAggregateFunction.java
File 
asterixdb/asterix-runtime/src/main/java/org/apache/asterix/runtime/aggregates/std/AbstractSingleVarStatisticsAggregateFunction.java:

Line 96:                 new IAType[] { BuiltinType.ADOUBLE, 
BuiltinType.ADOUBLE, BuiltinType.AINT64 }, false);
LocalSingleVarStatisticsTypeComputer says that "count"'s type in AINT32, but 
this runtime code produces AINT64. We need to fix 
LocalSingleVarStatisticsTypeComputer to return AINT64 for the "count" field. It 
looks like similar problem exists in LocalAvgTypeComputer. It needs to be fixed 
there too. Can you file a bug and fix that in a separate change after this one?


Line 121:         double delta = val - m1;
Need to figure out how to reuse this code together with 
AbstractSerializableSingleVariableStatisticsAggregateFunction instead of having 
a duplicate code here.


PS4, Line 258: offset5
should be "offset3", not "5"?


-- 
To view, visit https://asterix-gerrit.ics.uci.edu/2990
To unsubscribe, visit https://asterix-gerrit.ics.uci.edu/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia709669a9d20358f11ad28f453ae8ad8551f6334
Gerrit-PatchSet: 4
Gerrit-Project: asterixdb
Gerrit-Branch: master
Gerrit-Owner: James Fang <[email protected]>
Gerrit-Reviewer: Anon. E. Moose #1000171
Gerrit-Reviewer: Dmitry Lychagin <[email protected]>
Gerrit-Reviewer: Jenkins <[email protected]>
Gerrit-HasComments: Yes

Reply via email to