asolimando commented on code in PR #5444:
URL: https://github.com/apache/hive/pull/5444#discussion_r1877445584
##########
ql/src/test/org/apache/hadoop/hive/ql/optimizer/calcite/stats/TestFilterSelectivityEstimator.java:
##########
@@ -159,7 +159,7 @@ public void testIsHistogramAvailableWhenEmptyArray() {
@Test
public void testLessThanSelectivity() {
- Assert.assertEquals(0.6153846153846154, lessThanSelectivity(KLL, 3),
DELTA);
Review Comment:
> > What about moving the evaluate method into a map which is filled when
invoking register for each sketch function we support?
>
> @asolimando But we don't know which `evaluate` methods of a sketch function
we should support. In fact, we can run all `evaluate` methods of each function,
but we just pick the first one, and we cannot determine which `evaluate` method
is better or is the one we need.
>
I think we do, and if we pick the wrong one, we will notice because the unit
test results won't match.
What I propose is to manually fix which signature we want for the overloaded
function, and then wrap it into another function that provides values for the
extra arguments, if any (a sort of function composition, if you like).
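To make the idea concrete, here is a minimal sketch of what I mean. It is not
the actual Hive code: `FakeSketch`, `SketchEvaluators`, `register` and the
registered names are all hypothetical stand-ins, and the two-argument `rank`
overload only mimics an "extremes included/excluded" switch. The point is just
that the overload is chosen explicitly at registration time, with the extra
argument fixed by the wrapping lambda.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical registry: one explicitly chosen evaluator per sketch function name.
final class SketchEvaluators {

  // Hypothetical stand-in for a sketch exposing an overloaded method.
  static final class FakeSketch {
    private final double[] values;

    FakeSketch(double... values) {
      this.values = values;
    }

    // Overload with an extra argument controlling whether the bound is included.
    double rank(double value, boolean inclusive) {
      int count = 0;
      for (double v : values) {
        if (v < value || (inclusive && v == value)) {
          count++;
        }
      }
      return (double) count / values.length;
    }

    // Overload without the extra argument (excludes the bound in this stand-in).
    double rank(double value) {
      return rank(value, false);
    }
  }

  private final Map<String, BiFunction<FakeSketch, Double, Double>> evaluators =
      new HashMap<>();

  // Called once per supported sketch function, with the signature we decided on.
  void register(String name, BiFunction<FakeSketch, Double, Double> evaluator) {
    evaluators.put(name, evaluator);
  }

  double evaluate(String name, FakeSketch sketch, double value) {
    BiFunction<FakeSketch, Double, Double> evaluator = evaluators.get(name);
    if (evaluator == null) {
      throw new IllegalArgumentException("No evaluator registered for " + name);
    }
    return evaluator.apply(sketch, value);
  }

  static SketchEvaluators withDefaults() {
    SketchEvaluators registry = new SketchEvaluators();
    // The wrapper pins the extra argument, so the choice is visible and testable.
    registry.register("rank_inclusive", (sketch, v) -> sketch.rank(v, true));
    registry.register("rank_exclusive", (sketch, v) -> sketch.rank(v, false));
    return registry;
  }
}
```

With something like this, a unit test on a known input fails as soon as the
wrong overload is registered, instead of depending on which method reflection
happens to return first.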
@okumin it's true that those data structures are approximate anyway, but
moving from "extremes included" to "extremes excluded" without any way to
control this could have significant effects depending on the dataset, and it
might affect end-users' results.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]