On Thu, Apr 13, 2017 at 9:51 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> I'm fairly sure that the point was exactly what it said, ie improve
> locality of access within the temp file by sequentially reading as many
> tuples in a row as we could, rather than grabbing one here and one there.
>
> It may be that the work you and Peter G. have been doing have rendered
> that question moot. But I'm a bit worried that the reason you're not
> seeing any effect is that you're only testing situations with zero seek
> penalty (ie your laptop's disk is an SSD). Back then I would certainly
> have been testing with temp files on spinning rust, and I fear that this
> may still be an issue in that sort of environment.
I actually think Heikki's work here would particularly help on spinning
rust, especially when less memory is available. He specifically justified
it on the basis that it results in a more sequential read pattern,
particularly when multiple passes are required.

> The larger picture to be drawn from that thread is that we were seeing
> very different performance characteristics on different platforms.
> The specific issue that Tatsuo-san reported seemed like it might be
> down to weird read-ahead behavior in a 90s-vintage Linux kernel ...
> but the point that this stuff can be environment-dependent is still
> something to take to heart.

BTW, I'm skeptical of Heikki's idea of killing polyphase merge itself at
this point. I think that keeping most tapes active per pass is useful now
that our memory accounting involves handing over an even share to each
maybe-active tape for every merge pass, something established by Heikki's
work on external sorting.

Interestingly enough, I think that Knuth was pretty much spot on with his
"sweet spot" of 7 tapes, even if you have modern hardware. Commit df700e6
(where the sweet spot of merge order 7 was no longer always used) was
effective because it masked certain overheads that we experience when
doing multiple passes, overheads that Heikki and I mostly removed. This
was confirmed by Robert's testing of my merge order cap work for commit
fc19c18, where he found that using 7 tapes was only slightly worse than
using many hundreds of tapes. If we could somehow be completely effective
in making access to logical tapes perfectly sequential, then 7 tapes would
probably be noticeably *faster*, due to CPU caching effects. Knuth was
completely correct to say that it basically made no difference once more
than 7 tapes are used to merge, because he didn't have logtape.c
fragmentation to worry about.
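
To make the shape of those diminishing returns concrete, here is a toy,
standalone C sketch. Nothing in it comes from tuplesort.c or logtape.c;
the names and the numbers (1000 initial runs, a 4MB workMem budget) are
made up, and it models a plain balanced k-way merge rather than polyphase,
just to show the logarithmic fall-off. It prints, for a few merge orders,
how many passes are needed and how large each tape's even share of workMem
ends up being:

    #include <math.h>
    #include <stdio.h>

    /* Even share of workMem handed to each maybe-active tape's buffer. */
    static double
    per_tape_share_kb(double work_mem_kb, int ntapes)
    {
        return work_mem_kb / ntapes;
    }

    /*
     * Merge passes needed to combine nruns initial runs with a given merge
     * order, assuming a simple balanced k-way merge (each pass reads and
     * writes the whole data set).
     */
    static int
    merge_passes(int nruns, int merge_order)
    {
        return (int) ceil(log((double) nruns) / log((double) merge_order));
    }

    int
    main(void)
    {
        const double work_mem_kb = 4 * 1024;    /* made-up 4MB workMem */
        const int    nruns = 1000;              /* made-up run count */
        const int    orders[] = {7, 32, 128, 512};

        for (int i = 0; i < (int) (sizeof(orders) / sizeof(orders[0])); i++)
        {
            int order = orders[i];

            printf("merge order %3d: %d pass(es), %.0f kB per tape buffer\n",
                   order,
                   merge_passes(nruns, order),
                   per_tape_share_kb(work_mem_kb, order));
        }
        return 0;
    }

The toy arithmetic only illustrates the point already made above: past a
handful of tapes, the number of passes barely drops, while each additional
tape keeps shrinking the per-tape buffers that make tape access sequential.

--
Peter Geoghegan
VMware vCenter Server
https://www.vmware.com/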