OK, time to revive this old thread ...
On 09/23/2017 05:27 PM, Tom Lane wrote:
> Tomas Vondra <[email protected]> writes:
>>>> [ scalarineqsel may fall over when used by extension operators ]
>
>> What about using two-pronged approach:
>
>> 1) fall back to mid bucket in back branches (9.3 - 10)
>> 2) do something smarter (along the lines you outlined) in PG11
>
> Sure. We need to test the fallback case anyway.
>
Attached is a minimal fix adding a flag to convert_numeric_to_scalar,
tracking when it fails because of unsupported data type. If any of the 3
calls (value + lo/hi boundaries) sets it to 'true' we simply fall back
to the default estimate (0.5) within the bucket.
>>> [ sketch of a more extensible design ]
>
>> Sounds reasonable to me, I guess - I can't really think about anything
>> simpler giving us the same flexibility.
>
> Actually, on further thought, that's still too simple. If you look
> at convert_string_to_scalar() you'll see it's examining all three
> values concurrently (the probe value, of one datatype, and two bin
> boundary values of possibly a different type). The reason is that
> it's stripping off whatever common prefix those strings have before
> trying to form a numeric equivalent. While certainly
> convert_string_to_scalar is pretty stupid in the face of non-ASCII
> sort orders, the prefix-stripping is something I really don't want
> to give up. So the design I sketched of considering each value
> totally independently isn't good enough.
>
> We could, perhaps, provide an API that lets an operator estimation
> function replace convert_to_scalar in toto, but that seems like
> you'd still end up duplicating code in many cases. Not sure about
> how to find a happy medium.
>
I plan to work on this improvement next, once I polish a couple of other
patches for the upcoming commit fest.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fcc8323..5f6019a 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -176,7 +176,7 @@ static bool estimate_multivariate_ndistinct(PlannerInfo *root,
static bool convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue,
Datum lobound, Datum hibound, Oid boundstypid,
double *scaledlobound, double *scaledhibound);
-static double convert_numeric_to_scalar(Datum value, Oid typid);
+static double convert_numeric_to_scalar(Datum value, Oid typid, bool *unknown_type);
static void convert_string_to_scalar(char *value,
double *scaledvalue,
char *lobound,
@@ -4071,10 +4071,15 @@ convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue,
case REGDICTIONARYOID:
case REGROLEOID:
case REGNAMESPACEOID:
- *scaledvalue = convert_numeric_to_scalar(value, valuetypid);
- *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid);
- *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid);
- return true;
+ {
+ bool unknown_type = false;
+
+ *scaledvalue = convert_numeric_to_scalar(value, valuetypid, &unknown_type);
+ *scaledlobound = convert_numeric_to_scalar(lobound, boundstypid, &unknown_type);
+ *scaledhibound = convert_numeric_to_scalar(hibound, boundstypid, &unknown_type);
+
+ return (!unknown_type);
+ }
/*
* Built-in string types
@@ -4147,7 +4152,7 @@ convert_to_scalar(Datum value, Oid valuetypid, double *scaledvalue,
* Do convert_to_scalar()'s work for any numeric data type.
*/
static double
-convert_numeric_to_scalar(Datum value, Oid typid)
+convert_numeric_to_scalar(Datum value, Oid typid, bool *unknown_type)
{
switch (typid)
{
@@ -4185,9 +4190,11 @@ convert_numeric_to_scalar(Datum value, Oid typid)
/*
* Can't get here unless someone tries to use scalarineqsel() on an
- * operator with one numeric and one non-numeric operand.
+ * operator with one numeric and one non-numeric operand. Or more
+ * precisely, operand that we don't know how to transform to scalar.
*/
- elog(ERROR, "unsupported type: %u", typid);
+ *unknown_type = true;
+
return 0;
}