So, let me sum up the current status as I understand it.
1) overall, the patch seems to be a clear performance improvement
There are far more "green" cells than "red" ones in the spreadsheets, and
the patch often shaves off 30-75% of the sort duration. Improvements are
pretty much all over the board, for all data sets (low/high/unique
cardinality, initial ordering) and data types.
2) it's unlikely we can (or need to) fix the remaining regressions
The regressions are limited to low work_mem settings, which we believe
are not representative (or at least not as much as the higher work_mem
values), for two main reasons.
Firstly, if you need to sort a lot of data (e.g. 10M, as benchmarked),
it's quite reasonable to use larger work_mem values. It'd be a bit
backwards to reject a patch that gets you 2-4x speedup with enough
memory, on the grounds that it may have negative impact with
unreasonably small work_mem values.
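For example (a hypothetical session - table name and the 256MB value are
made up, the point is simply that work_mem can be raised just for the
expensive sort rather than globally):

```sql
-- Give the big sort enough memory for this session only, instead of
-- judging the patch by artificially tiny global work_mem settings.
SET work_mem = '256MB';
SELECT * FROM events ORDER BY created_at;  -- e.g. the 10M-row sort
RESET work_mem;
```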
Secondly, master is faster only if there's enough CPU cache for the
replacement sort (for the memtuples heap), but the benchmark is not
realistic in this respect, as it only ran one query at a time, so that
query had the whole cache to itself (6MB on the i5, 12MB on the Xeon).
In reality there will be multiple processes running at the same time
(e.g. backends when running parallel queries), significantly reducing the
amount of cache per process, making the replacement sort inefficient and
thus eliminating the regressions (by making master slower).
3) replacement_sort_mem GUC
I'm not quite sure what the plan with this GUC is. It was useful for
development, but it seems pretty difficult to tune in practice
(especially if you don't know the internals, which users generally don't).
The current patch includes the new GUC right next to work_mem, which
seems rather unfortunate - I do expect users to simply mess with it,
assuming "more is better", which seems to be a rather poor idea.
So I think we should either remove the GUC entirely, or move it to the
developer section next to trace_sort (and remove it from the conf).
I'm also wondering whether the 16MB default isn't a bit too high,
actually. As explained above, that's not the amount of cache we should
expect per process, so maybe ~2-4MB would be a better default value?
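To illustrate the suggestion (a hypothetical session; trace_sort is the
existing developer GUC mentioned above, and the 4MB value is just the
ballpark proposed here, not a tested recommendation):

```sql
-- Treat replacement_sort_mem as a developer knob, set per session
-- only when experimenting - much like the other developer options.
SET replacement_sort_mem = '4MB';  -- below a realistic per-process cache share
SET trace_sort = on;               -- the developer GUC it would sit next to
```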
Also, now that I'm re-reading the docs for the GUC, I realize it also
depends on how correlated the input data is - that seems like a rather
useless criterion for tuning, though, because it varies per sort node,
so using it for a GUC value set in postgresql.conf does not seem very
wise. Actually, even on a per-query basis it's rather dubious, as it
depends on how the sort node receives the data (some nodes preserve
ordering, some don't).
BTW couldn't we tune the value automatically for each sort, using
pg_stats.correlation for the sort keys, when available (increasing
replacement_sort_mem when the correlation is close to 1.0)? Wouldn't
that fix at least some of the regressions?
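Roughly what such a heuristic (or a curious DBA) would look at - the
correlation column of pg_stats does exist, but the table/column names
below are made up for illustration:

```sql
-- Physical vs. logical ordering of a prospective sort key.
-- Values close to 1.0 (or -1.0) mean the input is nearly ordered,
-- which is the case where a larger replacement_sort_mem might help.
SELECT tablename, attname, correlation
  FROM pg_stats
 WHERE tablename = 'events'       -- hypothetical table
   AND attname   = 'created_at';  -- hypothetical sort key
```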
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Sent via pgsql-hackers mailing list (email@example.com)