Thank you for taking the time to look through this. After reading your answer it obviously makes perfect sense; I was just thrown off by the fact that when it started happening, it happened on every execution of the same test query I used. But I failed to think of trying different search keywords to see whether that made any difference in the query plan / estimates.
I'll investigate this further and will try to clean up the indexes and make sure the filter has a correct index to use. And yes, the schema attachment does seem a bit off; I think the DDL copy didn't include the ordering of the fields inside the composite indexes, which is why they appear the same.

Thank you very much for your time.

Best,
Henrik

On Wed, May 31, 2023 at 6:31 PM Tom Lane <t...@sss.pgh.pa.us> wrote:

> "Henrik Peinar (nodeSWAT.com)" <hen...@nodeswat.com> writes:
> > *Quick description: *After upgrading our Aurora PostgreSQL engine from v11
> > to v15, one of our often-run SQLs started taking tens of seconds instead
> > of running sub 100ms. Explain analyze showed that the query planner had
> > switched to using incremental sort. Running ANALYZE on the table fixed the
> > issue temporarily, but it came back after a few days.
>
> Hmm. I think it's quite accidental that you get one plan over the other,
> because it looks like the key difference is something the planner doesn't
> account for. In the fast case you have
>
>   -> Index Scan using products_pinned_at_listed_at_ix on products
>        (cost=0.43..3929423.12 rows=26523 width=1463)
>        (actual time=2.325..32.872 rows=61 loops=1)
>        Filter: ((desc_tsv @@ '''nike'':*'::tsquery) AND (status = 'published'::prod_status))
>        Rows Removed by Filter: 3376
>
> The index is only being used to produce ordered output here: the filter
> condition isn't related to the index. And what we see is that the
> query is fast because the desired rows are found in the first 3376+61
> rows visited in this direction.
> Meanwhile, in the slow case you have
>
>   -> Index Scan Backward using products_pinned_at_ix on products
>        (cost=0.43..1172249.47 rows=26139 width=1460)
>        (actual time=5.263..5203.180 rows=18411 loops=1)
>        Filter: ((desc_tsv @@ '''nike'':*'::tsquery) AND (status = 'published'::prod_status))
>        Rows Removed by Filter: 5415895
>
> Again, the filter condition isn't exploiting the index; we're just using
> the index to (partially) satisfy the ORDER BY. This one takes a long time
> because it has to trawl through 5415895+61 rows before reaching the LIMIT.
>
> So AFAICS, the runtime differential is mostly/entirely because the rows
> satisfying the filter condition are located near one end of the range of
> pinned_at. That is not a correlation that the planner knows anything
> about, so it's unable to see that these two ways of scanning the table
> will have significantly different costs. Moreover, I think you'd be
> mistaken to draw any great conclusions from this specific example,
> because with some other search term(s) the results might be totally
> different due to the required rows not falling in the same place.
>
> What I'd think about, if this type of query is important, is to set up
> an index that can actually be used to satisfy the filter condition,
> so that you're not forcing it into "scan the whole table till you
> find the rows you want". It looks like you already have such an
> index, i.e. a GIN index on the desc_tsv column, although I don't trust
> that your schema attachment is actually accurate, because if it is
> then you have a bunch of duplicative indexes. You might try
> dropping the other indexes to see if you can coerce the planner
> into using that one, and then seeing what the cost estimate is.
>
> regards, tom lane
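For anyone following along, Tom's suggestion above can be sketched in SQL roughly like this. The table name (products) and the desc_tsv, status, and pinned_at columns come from the quoted plans; the index name, the SELECT list, the ORDER BY direction, and the LIMIT value are assumptions, since the original query isn't shown in full:

```sql
-- Hypothetical sketch: a GIN index that can satisfy the filter condition
-- directly, instead of scanning a btree on pinned_at and filtering rows.
-- (Index name is made up; the column names are from the EXPLAIN output.)
CREATE INDEX products_desc_tsv_gin_ix ON products USING gin (desc_tsv);

-- Approximate shape of the problematic query, reconstructed from the
-- plans. The exact ORDER BY / LIMIT are assumptions for illustration.
EXPLAIN ANALYZE
SELECT *
FROM products
WHERE desc_tsv @@ to_tsquery('nike:*')
  AND status = 'published'
ORDER BY pinned_at DESC
LIMIT 61;
```

With the GIN index available (and the duplicative indexes dropped, as Tom suggests), the planner can fetch only the matching rows via a bitmap scan and then sort that small result set, rather than walking a large fraction of the table in pinned_at order.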