On Tue, Apr 4, 2017 at 1:00 PM, Claudio Freire <klaussfre...@gmail.com> wrote:
>>> If you ask me, I'd just leave:
>>>
>>> + if (rqlist->hibound == DEFAULT_INEQ_SEL ||
>>> +     rqlist->lobound == DEFAULT_INEQ_SEL)
>>> + {
>>> +     /* No point in checking null selectivity; DEFAULT_INEQ_SEL
>>> +      * implies we have no stats */
>>> +     s2 = DEFAULT_RANGE_INEQ_SEL;
>>> + }
>>> + else
>>> + {
>>> ...
>>> +     s2 = rqlist->hibound + rqlist->lobound - 1.0;
>>> +     nullsel = nulltestsel(root, IS_NULL, rqlist->var, varRelid,
>>> +                           jointype, sjinfo);
>>> +     notnullsel = 1.0 - nullsel;
>>> +
>>> +     /* Adjust for double-exclusion of NULLs */
>>> +     s2 += nullsel;
>>> +     if (s2 <= 0.0)
>>> +     {
>>> +         if (s2 <= (-1.0e-4 * notnullsel - 1.0e-6))
>>> +         {
>>> +             /* Most likely contradictory clauses found */
>>> +             s2 = (1.0e-10 < notnullsel) ? 1.0e-10 : notnullsel;
>>> +         }
>>> +         else
>>> +         {
>>> +             /* Could be a rounding error */
>>> +             s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
>>> +         }
>>> +     }
>>> + }
>>>
>>> where (-1.0e-4 * notnullsel - 1.0e-6) is just a very rough (and
>>> overly cautious) estimate of the amount of rounding error that
>>> could be present with 32-bit floats.
>>>
>>> The above assumes that a non-DEFAULT_INEQ_SEL value in
>>> lobound/hibound is an estimate based on stats: maybe approximate,
>>> but one that makes sense (i.e., preserves the monotonicity of the
>>> CDF), and as such negative values are either a sign of a
>>> contradiction or of rounding error. The previous code left open
>>> the possibility of negatives coming out of default selectivities
>>> creeping into non-DEFAULT_INEQ_SEL estimates, but that doesn't
>>> seem possible coming out of scalarineqsel.
>>>
>>> But I'd like it if Tom could comment on that, since he's the one
>>> who did that in commit 547bb4a7f2bdccad9253a99211ce84b3f9de485a
>>> (way back in 2004).
>>>
>>> Barring that, I'd just change:
>>>
>>> s2 = DEFAULT_RANGE_INEQ_SEL;
>>>
>>> to:
>>>
>>> s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
>>>
>>> which makes a lot more sense.
>>
>> I think that to fix this properly, the selfuncs would have to
>> return the estimate and the nullfrac separately, so that the
>> nullfrac could be applied just once per expression. There are
>> already hacks in there to get rid of the double null filtering.
>> I'm not proposing that as something for this patch; it would be
>> pretty invasive to change this.
>
> There's no need: you can compute the nullfrac with nulltestsel().
> While there's some rework involved, it's not a big deal (it just
> reads the stats tuple), and it's a clean solution.
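To spell out the double-exclusion arithmetic that the "s2 += nullsel"
line corrects for, here's a toy standalone program (the numbers are
made up for illustration; this isn't PostgreSQL code):

#include <stdio.h>

/*
 * With null fraction n, sel(x > lo) and sel(x < hi) each count only
 * non-null rows, so each already excludes the NULLs once.  Combining
 * them as hibound + lobound - 1.0 therefore subtracts the NULLs twice,
 * and adding the nullfrac back once restores the correct figure.
 */
int
main(void)
{
    double nullfrac = 0.2;  /* assume 20% of rows are NULL */
    double sel_lo = 0.5;    /* assume P(x > lo), NULLs excluded */
    double sel_hi = 0.6;    /* assume P(x < hi), NULLs excluded */

    double naive = sel_hi + sel_lo - 1.0;             /* 0.1: too low */
    double fixed = sel_hi + sel_lo - 1.0 + nullfrac;  /* 0.3 */

    /*
     * Cross-check by direct counting: non-null rows total 0.8; those
     * failing x > lo are 0.8 - 0.5 = 0.3, those failing x < hi are
     * 0.8 - 0.6 = 0.2, and (for a non-contradictory range) no non-null
     * row fails both, so rows inside the range are
     * 0.8 - 0.3 - 0.2 = 0.3.
     */
    printf("naive = %g, fixed = %g\n", naive, fixed);
    return 0;
}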
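And for the archives, a rough, untested sketch of the nulltestsel()-
based approach described above. It's a fragment, not compilable on its
own: it assumes the six-argument nulltestsel() declared in
utils/selfuncs.h, plus the rqlist/varRelid/jointype/sjinfo variables
in scope where clauselist_selectivity() processes range pairs. The
local variable names are mine, not the patch's:

/* Sketch of a fragment inside clauselist_selectivity(),
 * src/backend/optimizer/path/clausesel.c */
Selectivity nullsel;
Selectivity notnullsel;
Selectivity s2;

/* Read the nullfrac once from the stats tuple for this expression */
nullsel = nulltestsel(root, IS_NULL, rqlist->var, varRelid,
                      jointype, sjinfo);
notnullsel = 1.0 - nullsel;

if (rqlist->hibound == DEFAULT_INEQ_SEL ||
    rqlist->lobound == DEFAULT_INEQ_SEL)
{
    /* No stats: scale the default so NULLs don't count as matches */
    s2 = DEFAULT_RANGE_INEQ_SEL * notnullsel;
}
else
{
    /* Add the nullfrac back once to undo the double exclusion */
    s2 = rqlist->hibound + rqlist->lobound - 1.0 + nullsel;
}

That keeps all the NULL handling for a range pair in one place, and
scales the no-stats default the same way as the stats-based estimate.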
I'm marking this Waiting on Author until we figure out what to do
about those issues (the NULL handling issue quoted above and the
quadratic behavior).