On Thu, Dec 4, 2025 at 1:14 AM Sami Imseih <[email protected]> wrote:

> Can we drive the decision for what to do based on optimizer
> stats, i.e. n_distinct and row counts? Not sure what the calculation would
> be specifically, but something else to consider.
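
For concreteness, a minimal sketch of such a heuristic (with made-up names
and an arbitrary cutoff; not anything that exists in the planner) might
look like:

    #include <stdbool.h>

    /*
     * Purely illustrative sketch, not actual planner code: decide the sort
     * method from the estimated number of distinct values relative to the
     * estimated row count.  Names, parameters, and cutoff are hypothetical.
     */
    static bool
    choose_new_sort_method(double est_rows, double est_ndistinct)
    {
        if (est_rows <= 0)
            return false;

        /*
         * Low estimated cardinality keeps the existing heap method.  If
         * ndistinct is underestimated, this ratio shrinks and we fall back
         * to the heap method even where the new method would have won.
         */
        return (est_ndistinct / est_rows) > 0.01;   /* arbitrary cutoff */
    }
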
It's happened multiple times before that someone proposes a change that
makes sorting faster on some inputs, but turns out to regress on low
cardinality (I've done it myself). It seems to be pretty hard not to
regress that case. Occasionally the author proposes to take optimizer
stats into account, and that was rejected because cardinality stats are
often wildly wrong. Further, underestimation is far more common than
overestimation, in which case IIUC the planner would just continue to
choose the existing heap method.

> We can still provide the GUC to override the optimizer decisions,
> but at least the optimizer, given up-to-date stats, may get it right most
> of the time.

I don't have much faith that people will properly set a GUC whose
effects depend on the input characteristics and memory settings.

The new method might be a better overall trade-off, but we'd need some
more comprehensive measurements to know what we're dealing with.

--
John Naylor
Amazon Web Services
