On Sun, Dec 15, 2024 at 10:10 AM Tomas Vondra <to...@vondra.me> wrote:
>
> I've been looking at some other vacuum-related patches, so I took a look
> at these remaining bits too. I don't have much to say about the code
> (seems perfectly fine to me), so I decided to do a bit of testing.
Thanks for doing this!

> I did a simple test (see the attached .sh script) that runs vacuum on a
> table with different fractions of rows updated. The table has ~100 rows
> per page, with 50% fill factor, and the updates touch ~1%, 0.1%, 0.05%
> of rows, and then even smaller fractions down to 0.0001%. This
> determines how many pages get touched. With the 1% fraction almost
> every page gets modified, then the fraction quickly drops. With 0.0001%
> only about 0.01% of pages get modified.
>
> Attached is a CSV with raw results from two machines, and also a PDF
> with a comparison of the two builds (master vs. patched). In the vast
> majority of cases, the patched build is much faster, usually 2-3x.
>
> There are a couple of cases where it regresses by ~30%, but only on one
> of the machines with older storage (RAID on SATA SSDs), with 1% of rows
> updated (which means almost all pages get modified). So this is about
> sequential access. It's a bit weird, but probably not a fatal issue.

Actually, while rebasing these with the intent of starting to
investigate the regressions you reported, I noticed something quite
wrong with my code. In lazy_scan_heap(), I had put
read_stream_next_buffer() before a few expensive operations (like a
round of index vacuuming and dead item reaping if the TID store is
full). It returns the pinned buffer, so this could mean a buffer
remaining pinned for a whole round of vacuuming of items from the TID
store. Not good. Anyway, this version has fixed that (a rough sketch of
the before/after ordering is below, after my sign-off). I do wonder if
there is any chance that this affected your benchmarks.

I've attached a new v13. Perhaps you could give it another go and see
whether the regressions are still there?

> There are also a couple of smaller regressions on the "xeon" machine
> with the M.2 SSD, for lower update fractions - 0.05% of rows updated
> means 5% of pages need vacuum. But I think this is more a sign of us
> being too aggressive in detecting (and forcing) sequential patterns -
> on master, we end up scanning 50% of pages, thanks to
> SKIP_PAGES_THRESHOLD. I'll start a new thread about that ...

Hmm. If only 5% of pages need vacuum, then you are right, readahead
seems like it would often be wasted (depending on _which_ 5% needs
vacuuming). But the read stream API will only prefetch and build up
larger I/Os for blocks we actually need. So it seems like this would
behave the same on master and with the patch. That is, both would do
extra unneeded I/O because of SKIP_PAGES_THRESHOLD.

Is the 5% of the table that needs vacuuming dispersed randomly
throughout, or concentrated? If it is concentrated and readahead would
be useful, then maybe we need to increase read_ahead_kb. You mentioned
off-list that read_ahead_kb on this machine for this SSD was 128 kB --
the same as io_combine_limit. If we want to read ahead, issuing 128 kB
I/Os ourselves might be thwarting that and increasing latency.

- Melanie
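P.S. For anyone skimming the thread, here is a very rough sketch of the
ordering problem described above and how v13 reorders it. This is
simplified pseudo-structure, not the actual patch: dead_items_full() and
scan_heap_page() are made-up placeholders for the real checks and
per-page work, while read_stream_next_buffer() and lazy_vacuum() refer
to the existing functions.

    /* v12-style ordering: pin the next page first */
    for (;;)
    {
        /* pins the next heap page */
        buf = read_stream_next_buffer(stream, NULL);
        if (!BufferIsValid(buf))
            break;

        /*
         * If the dead-item TID store has filled up, this triggers a
         * whole round of index vacuuming and heap vacuuming -- while
         * still holding the pin acquired just above.
         */
        if (dead_items_full(vacrel))     /* placeholder check */
            lazy_vacuum(vacrel);

        scan_heap_page(vacrel, buf);     /* placeholder; releases buf */
    }

    /* v13 ordering: do the expensive maintenance before pinning */
    for (;;)
    {
        if (dead_items_full(vacrel))     /* placeholder check */
            lazy_vacuum(vacrel);

        buf = read_stream_next_buffer(stream, NULL);
        if (!BufferIsValid(buf))
            break;

        scan_heap_page(vacrel, buf);     /* placeholder; releases buf */
    }

The point is just that with the first ordering a heap page can stay
pinned across an entire lazy_vacuum() pass; moving the check ahead of
read_stream_next_buffer() avoids that.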
v13-0001-Use-streaming-I-O-in-VACUUM-s-first-phase.patch
Description: Binary data
v13-0002-Use-streaming-I-O-in-VACUUM-s-third-phase.patch
Description: Binary data