At Tue, 08 Jan 2019 16:26:38 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyot...@lab.ntt.co.jp> wrote in <20190108.162638.106314087.horiguchi.kyot...@lab.ntt.co.jp> > Hello. > > At Fri, 21 Dec 2018 11:50:28 -0500, Tom Lane <t...@sss.pgh.pa.us> wrote in > <28533.1545411...@sss.pgh.pa.us> > > seem that that's just moving the problem around, but I think it > > might be possible to show that such a value couldn't be computed > > by scalarltsel given a histogram with no more than 10000 members. > > (I haven't tried to actually prove that, but it seems intuitive > > that the set of possible results would be quantized with no more > > than about 5 digits precision.)
I think we don't need a perfect proof for that. The fact that exactly 1/3 is quite natural and common but 1/3 + ε is not would be enough. > FWIW, I got the following result on my environment. It seems > different enough if this holds on all supported platforms, though > there still is a case where the result of a sequence of > arithmetics makes false match. Simple selectivity of a relation theoretically cannot match with the epsilon. (Of couse on *my* environment.) (0.333..) binary format: 3f d5 55 55 55 55 55 55 x = 0.333333333333333315 231 matches, 79 no_matches (0.3{13}42..) binary format: 3f d5 55 55 55 55 55 f1 x = 0.333333333333341975 0 matches, 310 no_matches (0.3{15}42..) binary format: 3f d5 55 55 55 55 55 57 x = 0.333333333333333426 0 matches, 310 no_matches It seems that 0.3{13}42 is correctly 0.3{15}42, which makes just two LSBs difference from 1/3. I believe C is well standardized on the translation. Other DEFAULT_*_SELs are not compared in this way. The attached small patch fixes the case mentioned in this thread, but I'm not sure where to put a test. Static assertion is not usable. Assertion in somewhere perhaps in clauselist_selectivity seems somewhat overdone.. I don't find a suitable place in regression test.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
#include <stdio.h> #include <math.h> int test(double x) { double d = 1.0; double d0 = 0; unsigned char *c_x = (unsigned char *) &x; int nmatches = 0; int nnomatches = 0; int i; fprintf(stderr, "binary format: "); for (i = 7 ; i >= 0 ; i--) fprintf(stderr, "%s%02x", i < 7 ? " " : "", c_x[i]); fprintf(stderr, "\n"); fprintf(stderr, "x = %20.18f\n", x); while (d != d0) { double z = floor(d * 3); double z1 = z + 1.0; double y = d / z; double y1 = d / z1; /* Check if both sides of d * 3 doesn't make match */ if (y == x || y1 == x) nmatches++; else nnomatches++; d0 = d; d = d * 10; } fprintf(stderr, " %d matches, %d no_matches\n", nmatches, nnomatches); } int main(void) { test(0.3333333333333333); test(0.333333333333342); test(0.33333333333333342); return 0; }
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c index 3739b9817a..cdeaac22c8 100644 --- a/src/backend/optimizer/path/clausesel.c +++ b/src/backend/optimizer/path/clausesel.c @@ -109,7 +109,7 @@ clauselist_selectivity(PlannerInfo *root, ListCell *l; int listidx; - /* + /* * If there's exactly one clause, just go directly to * clause_selectivity(). None of what we might do below is relevant. */ diff --git a/src/include/utils/selfuncs.h b/src/include/utils/selfuncs.h index 5cc4cf15e2..15a8d2402a 100644 --- a/src/include/utils/selfuncs.h +++ b/src/include/utils/selfuncs.h @@ -33,8 +33,12 @@ /* default selectivity estimate for equalities such as "A = b" */ #define DEFAULT_EQ_SEL 0.005 -/* default selectivity estimate for inequalities such as "A < b" */ -#define DEFAULT_INEQ_SEL 0.3333333333333333 +/* + * default selectivity estimate for inequalities such as "A < b" + * The last two digits prevent it from making a false match with 1/3 computed + * from histogram and/or MCV. + */ +#define DEFAULT_INEQ_SEL 0.33333333333333342 /* default selectivity estimate for range inequalities "A > b AND A < c" */ #define DEFAULT_RANGE_INEQ_SEL 0.005