OK, time to revive this old thread ... On 09/23/2017 05:27 PM, Tom Lane wrote: > Tomas Vondra <tomas.von...@2ndquadrant.com> writes: >>>> [ scalarineqsel may fall over when used by extension operators ] > >> What about using two-pronged approach: > >> 1) fall back to mid bucket in back branches (9.3 - 10) >> 2) do something smarter (along the lines you outlined) in PG11 > > Sure. We need to test the fallback case anyway. >
Attached is a minimal fix adding a flag to convert_numeric_to_scalar, tracking when it fails because of unsupported data type. If any of the 3 calls (value + lo/hi boundaries) sets it to 'true' we simply fall back to the default estimate (0.5) within the bucket. >>> [ sketch of a more extensible design ] > >> Sounds reasonable to me, I guess - I can't really think about anything >> simpler giving us the same flexibility. > > Actually, on further thought, that's still too simple. If you look > at convert_string_to_scalar() you'll see it's examining all three > values concurrently (the probe value, of one datatype, and two bin > boundary values of possibly a different type). The reason is that > it's stripping off whatever common prefix those strings have before > trying to form a numeric equivalent. While certainly > convert_string_to_scalar is pretty stupid in the face of non-ASCII > sort orders, the prefix-stripping is something I really don't want > to give up. So the design I sketched of considering each value > totally independently isn't good enough. > > We could, perhaps, provide an API that lets an operator estimation > function replace convert_to_scalar in toto, but that seems like > you'd still end up duplicating code in many cases. Not sure about > how to find a happy medium. > I plan to work on this improvement next, once I polish a couple of other patches for the upcoming commit fest. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c index fcc8323..5f6019a 100644 --- a/src/backend/utils/adt/selfuncs.c +++ b/src/backend/utils/adt/selfuncs.c @@ -176,7 +176,7 @@ static bool estimate_multivariate_ndistinct(PlannerInfo *root, static bool convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue, Datum lobound, Datum hibound, Oid boundstypid, double *scaledlobound, double *scaledhibound); -static double convert_numeric_to_scalar(Datum value, Oid typid); +static double convert_numeric_to_scalar(Datum value, Oid typid, bool *unknown_type); static void convert_string_to_scalar(char *value, double *scaledvalue, char *lobound, @@ -4071,10 +4071,15 @@ convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue, case REGDICTIONARYOID: case REGROLEOID: case REGNAMESPACEOID: - *scaledvalue = convert_numeric_to_scalar(value, valuetypid); - *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid); - *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid); - return true; + { + bool unknown_type = false; + + *scaledvalue = convert_numeric_to_scalar(value, valuetypid, &unknown_type); + *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid, &unknown_type); + *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid, &unknown_type); + + return (!unknown_type); + } /* * Built-in string types @@ -4147,7 +4152,7 @@ convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue, * Do convert_to_scalar()'s work for any numeric data type. */ static double -convert_numeric_to_scalar(Datum value, Oid typid) +convert_numeric_to_scalar(Datum value, Oid typid, bool *unknown_type) { switch (typid) { @@ -4185,9 +4190,11 @@ convert_numeric_to_scalar(Datum value, Oid typid) /* * Can't get here unless someone tries to use scalarineqsel() on an - * operator with one numeric and one non-numeric operand. + * operator with one numeric and one non-numeric operand. Or more + * precisely, operand that we don't know how to transform to scalar. */ - elog(ERROR, "unsupported type: %u", typid); + *unknown_type = true; + return 0; }