Hello.

At Fri, 21 Dec 2018 11:50:28 -0500, Tom Lane <t...@sss.pgh.pa.us> wrote in <28533.1545411...@sss.pgh.pa.us>
> "Yuzuko Hosoya" <hosoya.yuz...@lab.ntt.co.jp> writes:
> > From: Kyotaro HORIGUCHI [mailto:horiguchi.kyot...@lab.ntt.co.jp]
> >> At Thu, 20 Dec 2018 17:21:29 +0900, "Yuzuko Hosoya"
> >> <hosoya.yuz...@lab.ntt.co.jp> wrote in
> >> <008701d4983d$02e731c0$08b59540$@lab.ntt.co.jp>
> >>> To handle such cases I've thought of an idea based on a previous
> >>> commit [1] which solved a similar problem related to
> >>> DEFAULT_NUM_DISTINCT.  Just like that modification, I added a flag
> >>> which shows whether the selectivity
>
> >> The commit [1] added almost no complexity, but this does.  It affects
> >> many functions, introducing tighter coupling between operator-selectivity
> >> functions and clause-selectivity functions.  isdefault travels alone
> >> too long just to bind remote functions.  We might need a more
> >> principled way to handle that.
>
> Yeah, I don't think that this approach is a reasonable tradeoff to
> eliminate a narrow case.  In particular, what's been done here to
> clause_selectivity is totally unprincipled and poorly thought out:
> you can't add an output parameter that's set in only some cases,
> and there's no good way to decide how to set it in many cases.
> For instance, what would you do for an AND or OR where some of
> the branches are default estimates and some not?
>
> >> Around [1], there was a discussion about introducing the notion of
> >> uncertainty to selectivity.  isdefault is a kind of uncertainty
> >> value indicating '0% or 100% uncertain'.  So my feeling is that it
> >> would be better for Selectivity to have a certainty component, and
> >> then to build selectivity arithmetic involving uncertainty on top
> >> of it, though I don't have a clear picture.
>
> > I see.  Indeed, if we changed Selectivity so that it includes
> > information about default/non-default, the problems I mentioned
> > would be solved.  But I'm afraid that such a modification would have
> > a big impact, since there are lots of functions that calculate
> > selectivity.  I also don't have a concrete idea, so I will think
> > about it.  Besides that, I'd like to ask the community whether such
> > a modification would be acceptable or not.
>
> The planner definitely has a lot of problems that could be addressed
> with some sort of uncertainty framework ... but it'd be a huge
> undertaking, which is why nobody's tried yet.

Sure.
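To make the "certainty component" idea above a bit more concrete, here is a
purely hypothetical sketch.  The USelectivity type and the default_sel/
stats_sel/and_sel helpers below do not exist anywhere in the tree; they are
made up only to illustrate how an uncertainty-carrying selectivity might
combine.

====
uncertainty_sketch.c (hypothetical, for illustration only)
#include <stdio.h>

/*
 * Hypothetical: a selectivity that also records how certain we are
 * about it (0.0 = pure default guess, 1.0 = derived from statistics).
 */
typedef struct USelectivity
{
	double		sel;		/* estimated fraction of rows */
	double		certainty;	/* 0.0 (default guess) .. 1.0 (from stats) */
} USelectivity;

/* A default estimate such as DEFAULT_INEQ_SEL would be fully uncertain. */
static USelectivity
default_sel(double sel)
{
	USelectivity r = {sel, 0.0};

	return r;
}

/* An estimate backed by a histogram or MCV list would be certain. */
static USelectivity
stats_sel(double sel)
{
	USelectivity r = {sel, 1.0};

	return r;
}

/*
 * One arbitrary combining rule: multiply selectivities as today, and
 * let the result be only as certain as the least certain input.
 */
static USelectivity
and_sel(USelectivity a, USelectivity b)
{
	USelectivity r;

	r.sel = a.sel * b.sel;
	r.certainty = (a.certainty < b.certainty) ? a.certainty : b.certainty;
	return r;
}

int
main(void)
{
	USelectivity s = and_sel(stats_sel(0.01),
							 default_sel(0.3333333333333333));

	printf("sel = %g, certainty = %g\n", s.sel, s.certainty);
	return 0;
}
====

Of course, taking min() of the certainties in and_sel() is just one
arbitrary choice; deciding how certainty should really propagate through
AND/OR is exactly the hard part pointed out above.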
> A smaller-footprint way to fix the immediate problem might be to
> change the values of DEFAULT_INEQ_SEL and friends so that they're
> even less likely to be matched by accident.  That is, instead of
> 0.3333333333333333, use 0.333333333333342 or some such.

Sounds reasonable.

> It might seem that that's just moving the problem around, but I think
> it might be possible to show that such a value couldn't be computed
> by scalarltsel given a histogram with no more than 10000 members.
> (I haven't tried to actually prove that, but it seems intuitive
> that the set of possible results would be quantized with no more
> than about 5 digits precision.)

FWIW, I got the following result on my environment.  The two values seem
different enough, if this holds on all supported platforms, though there
is still a case where the result of a sequence of arithmetic operations
produces a false match.

x = 0.33333333333333331
  d = 1.000000e+30 match
  d = 1.000000e+31: 0.000000 match
x = 0.33333333333334197
  d = 0.000000e+00 match
  d = 1.000000e+00: 0.000000 match

====
t.c
#include <stdio.h>
#include <math.h>

int
test(double x)
{
	double		d = 1.0;
	double		d0 = 0;

	fprintf(stderr, "x = %19.17f\n", x);

	while (d != d0)
	{
		double		z = floor(d * 3);
		double		z1 = z + 1.0;
		double		y = d / z;
		double		y1 = d / z1;

		/* Stop at the first d where neither d/z nor d/z1 reproduces x */
		if (y != x && y1 != x)
		{
			fprintf(stderr, "  d = %e match\n", d0);
			/* also show how far d/z is from x at the first non-matching d */
			fprintf(stderr, "  d = %e: %f match\n", d, fabs(y - x));
			return 0;
		}
		d0 = d;
		d = d * 10;
	}
	return 1;
}

int
main(void)
{
	test(0.3333333333333333);
	test(0.333333333333342);
	return 0;
}
====

-- 
Kyotaro Horiguchi
NTT Open Source Software Center