On 3/6/2025 15:38, Alexander Korotkov wrote:
On Tue, Jun 3, 2025 at 4:23 PM Andrei Lepikhov <lepi...@gmail.com> wrote:
To establish a stable foundation for discussion, I conducted simple
tests - see, for example, a couple of queries in the attachment. As I
see it, Sort->Append works faster: on my test bench it takes 1250ms on
average versus 1430ms, and it also has lower costs - the same holds for
data with and without massive numbers of duplicates. Varying the input
sizes, I see the same behaviour.
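
To give the idea without opening the attachment, a hypothetical sketch
of this kind of test could look like the following (table and column
names are invented; the real queries are in the attachment):

CREATE TABLE t (id int, payload int) PARTITION BY HASH (id);
CREATE TABLE t_0 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE t_1 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE t_2 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE t_3 PARTITION OF t FOR VALUES WITH (MODULUS 4, REMAINDER 3);
INSERT INTO t SELECT i, i % 100 FROM generate_series(1, 1000000) i;
VACUUM ANALYZE t;
-- Sorting on a non-partitioning column: the planner can pick either
-- Sort over Append or MergeAppend over per-partition Sorts.
EXPLAIN (ANALYZE, COSTS) SELECT * FROM t ORDER BY payload;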

I ran your tests.  For the Sort(Append()) case I got actual
time=811.047..842.473.  For the MergeAppend case I got actual
time=723.678..967.004.  That looks interesting.  At some point
we probably should teach our Sort node to start returning tuples before
finishing the last merge stage.

However, I think the costs are not adequate to the timings.  Our cost
model predicts that the startup cost of MergeAppend is less than the
startup cost of Sort(Append()).  And that's correct.  However, in fact
the total time of MergeAppend is bigger than the total time of
Sort(Append()).  The differences in these two cases are comparable.  I
think we need to adjust our cost_sort() to reflect that.
Could you explain your idea? As I see it (and have shown in the previous
message), the total cost of Sort->Append is lower than that of
MergeAppend->Sort. Additionally, as I mentioned earlier, the primary
reason for choosing MergeAppend in the regression test was a slight
total-cost difference that triggered the startup-cost comparison. Could
you show the query and its EXPLAIN output that concerns you?
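
For reference, my rough reading of the relevant formulas in costsize.c
(simplified to the in-memory case, so the exact terms may be off) is:

  Sort (T input tuples):
    startup ~ input_total_cost + 2 * cpu_operator_cost * T * log2(T)
    total   ~ startup + cpu_operator_cost * T

  MergeAppend (N sorted input streams, T tuples in total):
    startup ~ input_startup_cost + 2 * cpu_operator_cost * N * log2(N)
    total   ~ input_total_cost + 2 * cpu_operator_cost * N * log2(N)
              + 2 * cpu_operator_cost * T * log2(N)
              + 0.5 * cpu_tuple_cost * T

With these formulas, MergeAppend's startup cost depends only on N while
Sort's grows with T * log2(T), so whenever the total costs land within
the planner's fuzz factor, the startup-cost comparison favours
MergeAppend; that seems consistent with the regression-test behaviour
described above.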

--
regards, Andrei Lepikhov

