On Tue, Aug 5, 2014 at 8:55 PM, Noah Misch <n...@leadboat.com> wrote: >> However, with work_mem set low enough to get an external sort, the >> difference is more interesting. If I set work_mem to 10 MB, then the >> query takes about 10.7 seconds to execute with a suitably patched >> Postgres. Whereas on master, it consistently takes a full 69 seconds. >> That's the largest improvement I've seen so far, for any case. > > Comparator cost affects external sorts more than it affects internal sorts. > When I profiled internal and external int4 sorting, btint4cmp() was 0.37% of > the internal sort profile and 10.26% of the external sort profile.
I took another look at this. If I run "dd if=/dev/zero of=/home/pg/test", I can see with iotop that that has about 45 M/s "total disk write" fairly sustainably, with occasional mild blips during write-back. This is a Crucial mobile SSD, with an ext4/lvm file system, and happens to be what is close at hand. If I run the same external sorting query with a patched Postgres, I see 24 M/s total disk write throughout. With master, it's about 6 M/s, and falls to 0 during the final 36-way merge. I'm not sure if the same thing occurs with patched during the final merge, because the available resolution isn't good enough to be able to tell. Anyway, it's pretty clear that when patched, the external sort on text is, if not totally I/O bound, much closer to being I/O bound. A good external sort algorithm should be at least close to totally I/O bound. This makes I/O parallelism a viable strategy for speeding up sorts, where it might not otherwise be. I've heard of people using a dedicated temp tablespace disk with Postgres to speed up sorting, but that always seemed to be about reducing the impact on the heap filesystem, or vice versa. I've never heard of anyone using multiple disks to speed up sorting with Postgres (which I don't presume means it hasn't been done at least somewhat effectively). However, with external sort benchmarks (like http://sortbenchmark.org), using I/O parallelism strategically seems to be table stakes for external sort entrants. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers