On Tue, Jan 21, 2020 at 8:11 AM Andres Freund <and...@anarazel.de> wrote:
> FWIW, I think we should just flat out delete all this logic, and replace
> it with a few explicit PrefetchBuffer() calls. Just by chance I
> literally just now sped up a VACUUM by more than a factor of 10, by
> manually prefetching buffers. At least the linux kernel readahead logic
> doesn't deal well with reading and writing to different locations in the
> same file, and that's what the ringbuffer pretty invariably leads to for
> workloads that aren't cached.

Interesting. Andrew Gierth made a similar observation on FreeBSD, and showed that by patching his kernel to track sequential writes and sequential reads separately he could improve performance. I reproduced the same speedup in a patch of my own based on his description (that, erm, I've lost).

It's not only VACUUM, it's anything that is writing to a lot of sequential blocks, since the writeback trails along behind the reads by some distance (maybe a ring buffer, maybe all of shared buffers, whatever). The OS sees you flipping back and forth between single-block reads and writes and thinks the access pattern is random. I didn't investigate this much, but it seemed that ZFS was somehow smart enough to understand what was happening at some level, while other filesystems were not.
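
To make the access pattern concrete, here is a rough standalone sketch (not PostgreSQL code) of a sequential scan through a small ring of buffers, where each block is written back only when its slot is reused, so the writes trail the reads by ring_size blocks. The explicit posix_fadvise(POSIX_FADV_WILLNEED) call stands in for what a manual prefetch would do; the function name scan_with_ring and its parameters are made up for illustration, and BLCKSZ is assumed to be the default 8 kB. It assumes a platform that provides posix_fadvise(), such as Linux or FreeBSD.

```c
#define _XOPEN_SOURCE 700       /* pread, pwrite, posix_fadvise */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192             /* assumed block size (PostgreSQL's default) */

/*
 * Hypothetical illustration: read nblocks blocks of fd in order through a
 * ring of ring_size buffers. A block is written back only when its slot is
 * about to be reused, so the kernel sees reads at block N interleaved with
 * writes at block N - ring_size in the same file.
 *
 * If prefetch_distance > 0, hint the kernel about the block that many
 * positions ahead instead of relying on its readahead heuristics.
 *
 * Returns the number of blocks written back, or -1 on error.
 */
long
scan_with_ring(int fd, long nblocks, long ring_size, long prefetch_distance)
{
    char   *ring;
    long    blk;
    long    written = 0;

    if (ring_size <= 0)
        return -1;
    ring = malloc((size_t) ring_size * BLCKSZ);
    if (ring == NULL)
        return -1;

    for (blk = 0; blk < nblocks; blk++)
    {
        char   *slot = ring + (size_t) (blk % ring_size) * BLCKSZ;

        /* Reusing a slot: write the old block back first (trailing write). */
        if (blk >= ring_size)
        {
            if (pwrite(fd, slot, BLCKSZ,
                       (off_t) (blk - ring_size) * BLCKSZ) != BLCKSZ)
                goto fail;
            written++;
        }

        /* Explicit prefetch hint; it is advisory, so failure is ignored. */
        if (prefetch_distance > 0 && blk + prefetch_distance < nblocks)
            (void) posix_fadvise(fd,
                                 (off_t) (blk + prefetch_distance) * BLCKSZ,
                                 BLCKSZ, POSIX_FADV_WILLNEED);

        /* The read the kernel would otherwise classify as random. */
        if (pread(fd, slot, BLCKSZ, (off_t) blk * BLCKSZ) != BLCKSZ)
            goto fail;
    }

    /* Flush the blocks still sitting in the ring. */
    for (blk = (nblocks > ring_size ? nblocks - ring_size : 0);
         blk < nblocks; blk++)
    {
        char   *slot = ring + (size_t) (blk % ring_size) * BLCKSZ;

        if (pwrite(fd, slot, BLCKSZ, (off_t) blk * BLCKSZ) != BLCKSZ)
            goto fail;
        written++;
    }

    free(ring);
    return written;

fail:
    free(ring);
    return -1;
}
```

Calling this with prefetch_distance set to 0 versus some positive value on a file that isn't cached would be one crude way to compare kernel readahead against explicit hints; the right distance is workload-dependent and not something this sketch tries to answer.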