Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/16645 )
Change subject: IMPALA-10279: Import CPC functionality from DataSketches ...................................................................... Patch Set 6: (5 comments) http://gerrit.cloudera.org:8080/#/c/16645/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/16645/6//COMMIT_MSG@11 PS6, Line 11: First, I updated our existing snapshot of DataSketches to the : following commit:8ae0ab5703783c8c360d48e577a6d450d3514ece : "https://github.com/apache/incubator-datasketches-cpp/pull/172" I see that your patch doesn't change any of the existing files in be/src/thirdparty/datasketches. Does this mean that none of the files in hll/include/ kll/include/ common/include/ directories change in Apache DataCketches between the commits of c1a6f8edb49699520f248d3d02019b87429b4241 and 8ae0ab5703783c8c360d48e577a6d450d3514ece (I just want to be 100% sure that we don't miss any updates on these files) http://gerrit.cloudera.org:8080/#/c/16645/6/be/src/exprs/datasketches-test.cc File be/src/exprs/datasketches-test.cc: http://gerrit.cloudera.org:8080/#/c/16645/6/be/src/exprs/datasketches-test.cc@82 PS6, Line 82: get an estimate for quantiles I think here you don't get the quantiles estimates but the cardinality estimates. http://gerrit.cloudera.org:8080/#/c/16645/6/be/src/exprs/datasketches-test.cc@101 PS6, Line 101: and prints the result This code doesn't print the results. (and shouldn;t. I guess this is a copy-paste comment from the official site) http://gerrit.cloudera.org:8080/#/c/16645/6/be/src/exprs/datasketches-test.cc@114 PS6, Line 114: EXPECT_EQ(149796, (int)sketch.get_estimate()); I'm not 100% familiar with the characteristics of the CPC algorithm. Are you sure that this result is deterministic? http://gerrit.cloudera.org:8080/#/c/16645/6/be/src/thirdparty/datasketches/README.md File be/src/thirdparty/datasketches/README.md: http://gerrit.cloudera.org:8080/#/c/16645/6/be/src/thirdparty/datasketches/README.md@12 PS6, Line 12: The git branch of the snapshot I used as a source for the files: 2.1.0-incubating : The hash: c1a6f8edb49699520f248d3d02019b87429b4241 : : Browse the source files here: : https://github.com/apache/incubator-datasketches-cpp/tree/2.1.0-incubating In case you updated the content of this folder to a newer version of Apache DataSketches then please update this section too. -- To view, visit http://gerrit.cloudera.org:8080/16645 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1e8d0c2e80df95fa84af82c64d493df9bbb34a8c Gerrit-Change-Number: 16645 Gerrit-PatchSet: 6 Gerrit-Owner: Fucun Chu <[email protected]> Gerrit-Reviewer: Gabor Kaszab <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Comment-Date: Thu, 29 Oct 2020 07:35:48 +0000 Gerrit-HasComments: Yes
