Heikki Linnakangas <hlinn...@iki.fi> writes:
> I'm talking about the code that reads a bunch of tuples from each tape,
> loading them into the memtuples array. That code was added by Tom Lane,
> back in 1999:

> commit cf627ab41ab9f6038a29ddd04dd0ff0ccdca714e
> Author: Tom Lane <t...@sss.pgh.pa.us>
> Date: Sat Oct 30 17:27:15 1999 +0000
>
>     Further performance improvements in sorting: reduce number of comparisons
>     during initial run formation by keeping both current run and next-run
>     tuples in the same heap (yup, Knuth is smarter than I am). And, during
>     merge passes, make use of available sort memory to load multiple tuples
>     from any one input 'tape' at a time, thereby improving locality of
>     access to the temp file.

> So apparently there was a benefit back then, but is it still worthwhile?

I'm fairly sure that the point was exactly what it said, ie improve
locality of access within the temp file by sequentially reading as many
tuples in a row as we could, rather than grabbing one here and one there.
It may be that the work you and Peter G. have been doing has rendered
that question moot. But I'm a bit worried that the reason you're not
seeing any effect is that you're only testing situations with zero seek
penalty (ie your laptop's disk is an SSD). Back then I would certainly
have been testing with temp files on spinning rust, and I fear that this
may still be an issue in that sort of environment.

The relevant mailing list thread seems to be "sort on huge table" in
pgsql-hackers in October/November 1999. The archives don't seem to have
threaded that too successfully, but here's a message specifically
describing the commit you mention:

https://www.postgresql.org/message-id/2726.941493808%40sss.pgh.pa.us

and you can find the rest by looking through the archive summary pages
for that interval.

The larger picture to be drawn from that thread is that we were seeing
very different performance characteristics on different platforms. The
specific issue that Tatsuo-san reported seemed like it might be down to
weird read-ahead behavior in a 90s-vintage Linux kernel ... but the point
that this stuff can be environment-dependent is still something to take
to heart.
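
To make the locality argument concrete, here is a toy model in Python.
It is only a sketch, not the tuplesort.c logic: the function name
count_seeks, the back-to-back layout of the runs in one simulated temp
file, and the rule that a read counts as a seek whenever it does not
start where the previous read ended are all assumptions made for
illustration. It merges sorted runs while refilling each tape's buffer
"batch" tuples at a time and counts the non-sequential reads.

```python
import heapq
import random


def count_seeks(runs, batch):
    """Merge sorted runs, reading `batch` tuples per tape at a time.

    Returns (merged_output, seeks), where a seek is any read that does
    not start at the file position where the previous read ended.
    """
    # Assumption: the runs sit back to back in one simulated temp file.
    starts = []
    pos = 0
    for run in runs:
        starts.append(pos)
        pos += len(run)

    next_unread = [0] * len(runs)    # per-tape index of the next unread tuple
    buffers = [[] for _ in runs]     # per-tape in-memory buffer
    head = None                      # file position just past the previous read
    seeks = 0

    def refill(t):
        nonlocal head, seeks
        lo = next_unread[t]
        hi = min(lo + batch, len(runs[t]))
        if lo == hi:
            return                   # tape exhausted
        if starts[t] + lo != head:
            seeks += 1               # read is not contiguous with the last one
        head = starts[t] + hi
        # Store reversed so that pop() hands back the smallest tuple first.
        buffers[t] = runs[t][lo:hi][::-1]
        next_unread[t] = hi

    heap = []
    for t in range(len(runs)):
        refill(t)
        if buffers[t]:
            heapq.heappush(heap, (buffers[t].pop(), t))

    out = []
    while heap:
        value, t = heapq.heappop(heap)
        out.append(value)
        if not buffers[t]:
            refill(t)
        if buffers[t]:
            heapq.heappush(heap, (buffers[t].pop(), t))
    return out, seeks


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [sorted(rng.randrange(10**6) for _ in range(1000)) for _ in range(4)]
    for batch in (1, 64):
        _, seeks = count_seeks(runs, batch)
        print(f"batch={batch}: {seeks} seeks")
```

With four runs of 1000 tuples, a batch of 64 needs at most 64 reads in
total, so it cannot seek more than 64 times, whereas a batch of 1 hops
between tapes on most reads. Whether those hops cost anything is the
environment-dependent part: the model charges nothing per seek, which is
roughly the SSD case, and says nothing about kernel read-ahead.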

			regards, tom lane