Over in the "abbreviated keys for numeric" thread, Tomas Vondra reports
a case where the ad-hoc cost model of abbreviated keys/sortsupport for
text is far too conservative, and aborts a case where abbreviated keys
could still greatly help.

Previously, I proposed additions to the cost model that dealt with all
of this by passing down a reltuples hint from the optimizer or a
relcache entry to tuplesort. Robert seemed to think that this was a
little much, and that it could be discussed at a later point. It's
clear that we need to do something about cases like the one reported
by Tomas, though.

Attached patch adds a decay to the threshold that (estimated)
abbreviated key cardinality must cross, as a proportion of the
(estimated) cardinality of the original set of strings to be sorted.
It also quadruples the initial required proportional cardinality to
20% of full key cardinality (but only for the first 10,000 strings,
after which the threshold is decayed aggressively). This seems
significantly less invasive than what I previously proposed, while
still being effective. I suggest that this be treated as a bugfix,
since Tomas reports a 5%-7% regression with his case. OTOH, with this
patch, I saw the same case have a ~300% improvement in performance.

The adjusted cost model strongly prefers to abort early, which seems
desirable. In one way this makes it a little more vulnerable to skewed
sets, since early data matters more; but in another, more important
way it's less vulnerable, since skew observed early may well indicate
general skew in the data, and that kind of skew makes abbreviated key
cardinality a less useful proxy for full key cardinality anyway.
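For reference, the shape of the decision the patch makes can be
sketched as follows (my own hypothetical standalone function, with the
two HyperLogLog estimates passed in as plain doubles rather than
computed from tuplesort state):

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of the patched abort test.  Abbreviation
 * survives while abbrev_distinct > key_distinct * (*prop_card);
 * a passing check past 10,000 tuples decays the requirement for
 * the next call.  Returns true to abort abbreviation.
 */
static bool
should_abort_abbreviation(int memtupcount,
						  double abbrev_distinct,
						  double key_distinct,
						  double *prop_card)
{
	/* Have a little patience */
	if (memtupcount < 100)
		return false;

	if (abbrev_distinct > key_distinct * (*prop_card))
	{
		/* Still looks worthwhile; decay requirement for next call */
		if (memtupcount > 10000)
			*prop_card *= 0.65;
		return false;
	}

	/* Too few distinct abbreviated keys per distinct full key: abort */
	return true;
}
```

The point is that a marginal case must keep re-justifying itself, but
against a steadily lower bar as more has been invested in string
transformation.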

Thoughts?
-- 
Peter Geoghegan
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index 3edd283..d7ff7ad 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -66,6 +66,7 @@ typedef struct
 	bool				collate_c;
 	hyperLogLogState	abbr_card;	/* Abbreviated key cardinality state */
 	hyperLogLogState	full_card;	/* Full key cardinality state */
+	double				prop_card;	/* Required cardinality proportion */
 #ifdef HAVE_LOCALE_T
 	pg_locale_t locale;
 #endif
@@ -1845,6 +1846,7 @@ btsortsupport_worker(SortSupport ssup, Oid collid)
 		 */
 		if (abbreviate)
 		{
+			tss->prop_card = 0.20;
 			initHyperLogLog(&tss->abbr_card, 10);
 			initHyperLogLog(&tss->full_card, 10);
 			ssup->abbrev_full_comparator = ssup->comparator;
@@ -2125,7 +2127,7 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
 	Assert(ssup->abbreviate);
 
 	/* Have a little patience */
-	if (memtupcount < 20)
+	if (memtupcount < 100)
 		return false;
 
 	abbrev_distinct = estimateHyperLogLog(&tss->abbr_card);
@@ -2151,8 +2153,9 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
 	{
 		double norm_abbrev_card = abbrev_distinct / (double) memtupcount;
 
-		elog(DEBUG_elog_output, "abbrev_distinct after %d: %f (key_distinct: %f, norm_abbrev_card: %f)",
-			 memtupcount, abbrev_distinct, key_distinct, norm_abbrev_card);
+		elog(DEBUG_elog_output, "abbrev_distinct after %d: %f (key_distinct: %f, norm_abbrev_card: %f, prop_card: %f)",
+			 memtupcount, abbrev_distinct, key_distinct, norm_abbrev_card,
+			 tss->prop_card);
 	}
 #endif
 
@@ -2172,8 +2175,38 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
 	 * abbreviated comparison with a cheap memcmp()-based authoritative
 	 * resolution are equivalent.
 	 */
-	if (abbrev_distinct > key_distinct * 0.05)
+	if (abbrev_distinct > key_distinct * tss->prop_card)
+	{
+		/*
+		 * When we have exceeded 10,000 tuples, decay required cardinality
+		 * aggressively for next call.
+		 *
+		 * This is useful because the number of comparisons required on average
+		 * increases at a linearithmic rate, and at roughly 10,000 tuples that
+		 * factor will start to dominate over the linear costs of string
+		 * transformation (this is a conservative estimate).  The decay rate is
+		 * chosen to be a little less aggressive than halving -- which (since
+		 * we're called at points at which memtupcount has doubled) would never
+		 * see the cost model actually abort past the first call following a
+		 * decay.  This decay rate is mostly a precaution against a sudden,
+		 * violent swing in how well abbreviated cardinality tracks full key
+		 * cardinality.  The decay also serves to prevent a marginal case from
+		 * being aborted too late, when too much has already been invested in
+		 * string transformation.
+		 *
+		 * It's possible for sets of several million distinct strings with mere
+		 * tens of thousands of distinct abbreviated keys to still benefit very
+		 * significantly.  This will generally occur provided each abbreviated
+		 * key is a proxy for a roughly uniform number of the set's full keys.
+		 * If it isn't so, we hope to catch that early and abort.  If it isn't
+		 * caught early, by the time the problem is apparent it's probably not
+		 * worth aborting.
+		 */
+		if (memtupcount > 10000)
+			tss->prop_card *= 0.65;
+
 		return false;
+	}
 
 	/*
 	 * Abort abbreviation strategy.
@@ -2187,8 +2220,8 @@ bttext_abbrev_abort(int memtupcount, SortSupport ssup)
 	 * lose but much to gain, which our strategy reflects.
 	 */
 #ifdef DEBUG_ABBREV_KEYS
-	elog(DEBUG_elog_output, "would have aborted abbreviation due to worst-case at %d. abbrev_distinct: %f, key_distinct: %f",
-		 memtupcount, abbrev_distinct, key_distinct);
+	elog(DEBUG_elog_output, "would have aborted abbreviation due to worst-case at %d. abbrev_distinct: %f, key_distinct: %f, prop_card: %f",
+		 memtupcount, abbrev_distinct, key_distinct, tss->prop_card);
 	/* Actually abort only when debugging is disabled */
 	return false;
 #endif
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
