On Mon, Sep 16, 2024 at 6:05 PM Tomas Vondra <to...@vondra.me> wrote: > For example, one of the slowed down queries is query 702 (top of page 8 > in the PDF). The query is pretty simple: > > explain (analyze, timing off, buffers off) > select id1,id2 from t_1000000_1000_1_2 > where NOT (id1 in (:list)) AND (id2 = :value); > > and it was executed on a table with random data in two columns, each > with 1000 distinct values.
I cannot recreate this problem using the q702.sql repro you provided. Feels like I'm missing a step, because I find that skip scan wins nicely here. > This is perfectly random data, so a great > match for the assumptions in costing etc. FWIW, I wouldn't say that this is a particularly sympathetic case for skip scan. It's definitely still a win, but less than other cases I can imagine. This is due to the relatively large number of rows returned by the scan. Plus 1000 distinct leading values for a skip array isn't all that low, so we end up scanning over 1/3 of all of the leaf pages in the index. BTW, be careful to distinguish between leaf pages and internal pages when interpreting "Buffers:" output with the patch. Generally speaking, the patch repeats many internal page accesses, which needs to be taken into account when compare "Buffers:" counts against master. It's not uncommon for 3/4 or even 4/5 of all index page hits to be for internal pages with the patch. Whereas on master the number of internal page hits is usually tiny. This is one reason why the additional context provided by "Index Searches:" can be helpful. > But with uncached data, this runs in ~50 ms on master, but takes almost > 200 ms with skipscan (these timings are from my laptop, but similar to > the results). Even 50ms seems really slow for your test case -- with or without my patch applied. Are you sure that this wasn't an assert-enabled build? There's lots of extra assertions for the code paths used by skip scan for this, which could explain the apparent regression. I find that this same query takes only ~2.056 ms with the patch. When I disabled skip scan locally via "set skipscan_prefix_cols = 0" (which should give me behavior that's pretty well representative of master), it takes ~12.039 ms. That's exactly what I'd expect for this query: a solid improvement, though not the really enormous ones that you'll see when skip scan is able to avoid reading many of the index pages that master reads. -- Peter Geoghegan