Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17153 )
Change subject: IMPALA-10558: Implement ds_theta_exclude() function ...................................................................... Patch Set 2: (8 comments) Thanks for these changes! I had some comments mostly nits and around test coverage. http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc File be/src/exprs/datasketches-functions-ir.cc: http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@128 PS2, Line 128: // a_not_b nit: this comment is not needed as doesn't give extra info http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@131 PS2, Line 131: datasketches::theta_sketch::unique_ptr first_sketch_ptr; : if (!first_serialized_sketch.is_null && first_serialized_sketch.len > 0) { : try { : first_sketch_ptr = datasketches::theta_sketch::deserialize( : (void*)first_serialized_sketch.ptr, first_serialized_sketch.len); : } catch (const std::exception&) { : LogSketchDeserializationError(ctx); : return StringVal::null(); : } : } This part seems pretty identical to the section L141-150. Can you move it to a function to avoid code repetition? http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions-ir.cc@155 PS2, Line 155: first_sketch_ptr.operator bool() I'm not sure I understand the condition in this format :) Could you please explain what goes on here? http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions.h File be/src/exprs/datasketches-functions.h: http://gerrit.cloudera.org:8080/#/c/17153/2/be/src/exprs/datasketches-functions.h@73 PS2, Line 73: 'serialized_sketch' Could you mention both sketch params? http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test File testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test: http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@330 PS2, Line 330: for A is an empty sketch. When A is empty and B is null. http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@331 PS2, Line 331: select ds_theta_estimate(ds_theta_exclude(ds_theta_sketch(f2), null)) Could you please add another test where A is null and B is empty? (the opposite of this one) http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@332 PS2, Line 332: from functional_parquet.emptytable; Another test would be where A and B are both empty. http://gerrit.cloudera.org:8080/#/c/17153/2/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@379 PS2, Line 379: ==== I miss a test where the result of an a-not-b is a non-empty sketch (where the estimate is greater than zero). -- To view, visit http://gerrit.cloudera.org:8080/17153 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I05119fd8c652c07ff248a99e44b0da3541e46ca3 Gerrit-Change-Number: 17153 Gerrit-PatchSet: 2 Gerrit-Owner: Fucun Chu <chufu...@hotmail.com> Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Comment-Date: Fri, 12 Mar 2021 12:12:42 +0000 Gerrit-HasComments: Yes