Re: Inconsistent query performance based on relation hit frequency

Achilleas Mantzios - cloud Thu, 27 Jun 2024 06:28:15 -0700


On 6/27/24 03:50, Laura Hausmann wrote:

Heya, I hope the title is somewhat descriptive. I'm working on adecentralized social media platform and have encountered the followingperformance issue/quirk, and would like to ask for input, since I'mnot sure I missed anything.
I'm running PostgreSQL 16.2 on x86_64-pc-linux-gnu, compiled by gcc(GCC) 13.2.1 20230801, 64-bit, running on an Arch Linux box with 128GBof RAM & an 8c16t Ryzen 3700X CPU. Disk is a NVME RAID0.
Postgres configuration: https://paste.depesz.com/s/iTv
I'm using autovacuum defaults & am running a manual VACUUM ANALYZE onthe entire database nightly.
The relevant database parts consist of a table with posts (note), atable with users (user), and a table with follow relationships(following). The query in question takes the most recent n (e.g. 50)posts, filtered by the users follow relations.
The note table on my main production instance grows by about 200kentries per week.
Schema & tuple counts: https://paste.depesz.com/s/cfI
Here's the shortest query I can reproduce the issue with:https://paste.depesz.com/s/RoCSpecifically, it works well for users that follow a relatively largeamount of users (https://explain.depesz.com/s/tJnB), and is very slowfor users that follow a low amount of users / users that postinfrequently (https://explain.depesz.com/s/Mtyr).
From what I can tell, this is because this query causes postgres toscan the note table from the bottom (most recent posts first),discarding anything by users that are not followed.
Curiously, rewriting the query like this(https://paste.depesz.com/s/8rN) causes the opposite problem, thisquery is fast for users with a low following count(https://explain.depesz.com/s/yHAz#query), and slow for users with ahigh following count (https://explain.depesz.com/s/1v6L,https://explain.depesz.com/s/yg3N).
These numbers are even further apart (to the point of 10-30s querytimeouts) in the most extreme outlier cases I've observed, and onlower-end hardware.
I've sidestepped the issue by running either of these queries based ona heuristic that checks whether there are more than 250 matching postsin the past 7 days, recomputed once per day for every user, but itfeels more like a hack than a proper solution.
I'm able to make the planner make a sensible decision in both cases bysetting enable_sort = off, but that tanks performance for the rest ofmy application, is even more of a hack, and doesn't seem to work inall cases.
I've been able to reproduce this issue with mock data(https://paste.depesz.com/s/CnY), though it's not generating quite thesame query plans and is behaving a bit differently.

Before deep dive into everybody's favorite topic you may simplify yourquery :

select o.* from objects o where o."userId" = :userid UNION select o.*from objects o where o."userId" IN


(SELECT r."followeeId" FROM relationships r WHERE r."followerId"= :userid)

postgres@[local]/laura=# explain (analyze, buffers) select o.* fromobjects o where o."userId" = 1 UNION select o.* from objects o whereo."userId" IN (SELECT r."followeeId" FROM relati

onships r WHERE r."followerId"=1) ORDER BY id DESC ;
                                                                                
        QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---

Sort (cost=8622.04..8767.98 rows=58376 width=40) (actualtime=1.041..1.053 rows=314 loops=1)

  Sort Key: o.id DESC
  Sort Method: quicksort  Memory: 39kB
  Buffers: shared hit=1265

-> HashAggregate (cost=3416.92..4000.68 rows=58376 width=40)(actual time=0.900..1.006 rows=314 loops=1)

        Group Key: o.id, o."userId", o.data
        Batches: 1  Memory Usage: 1585kB
        Buffers: shared hit=1265

-> Append (cost=0.42..2979.10 rows=58376 width=40) (actualtime=0.024..0.816 rows=314 loops=1)

              Buffers: shared hit=1265

-> Index Scan using "objects_userId_idx" on objects o (cost=0.42..3.10 rows=17 width=21) (actual time=0.003..0.003 rows=0loops=1)

                    Index Cond: ("userId" = 1)
                    Buffers: shared hit=3

-> Nested Loop (cost=0.70..2684.12 rows=58359 width=21)(actual time=0.020..0.794 rows=314 loops=1)

                    Buffers: shared hit=1262

-> Index Only Scan using"relationships_followerId_followeeId_idx" on relationships r (cost=0.28..7.99 rows=315 width=4) (actual time=0.011..0.030 rows=315loops=

1)
                          Index Cond: ("followerId" = 1)
                          Heap Fetches: 0
                          Buffers: shared hit=3

-> Index Scan using "objects_userId_idx" onobjects o_1 (cost=0.42..6.65 rows=185 width=21) (actualtime=0.002..0.002 rows=1 loops=315)

                          Index Cond: ("userId" = r."followeeId")
                          Buffers: shared hit=1259
Planning:
  Buffers: shared hit=8
Planning Time: 0.190 ms
Execution Time: 1.184 ms
(26 rows)

Time: 1.612 ms

postgres@[local]/laura=# explain (analyze, buffers) select o.* fromobjects o where o."userId" = 4 UNION select o.* from objects o whereo."userId" IN (SELECT r."followeeId" FROM relati

onships r WHERE r."followerId"=4) ORDER BY id DESC ;
                                                                                
      QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------Sort (cost=27.53..28.03 rows=202 width=40) (actual time=0.015..0.016rows=0 loops=1)

  Sort Key: o.id DESC
  Sort Method: quicksort  Memory: 25kB
  Buffers: shared hit=5

-> HashAggregate (cost=17.77..19.79 rows=202 width=40) (actualtime=0.013..0.013 rows=0 loops=1)

        Group Key: o.id, o."userId", o.data
        Batches: 1  Memory Usage: 40kB
        Buffers: shared hit=5

-> Append (cost=0.42..16.26 rows=202 width=40) (actualtime=0.011..0.012 rows=0 loops=1)

              Buffers: shared hit=5

-> Index Scan using "objects_userId_idx" on objects o (cost=0.42..3.10 rows=17 width=21) (actual time=0.005..0.005 rows=0loops=1)

                    Index Cond: ("userId" = 4)
                    Buffers: shared hit=3

-> Nested Loop (cost=0.70..12.14 rows=185 width=21)(actual time=0.005..0.005 rows=0 loops=1)

                    Buffers: shared hit=2

-> Index Only Scan using"relationships_followerId_followeeId_idx" on relationships r (cost=0.28..1.39 rows=1 width=4) (actual time=0.005..0.005 rows=0loops=1)

                          Index Cond: ("followerId" = 4)
                          Heap Fetches: 0
                          Buffers: shared hit=2

-> Index Scan using "objects_userId_idx" onobjects o_1 (cost=0.42..8.90 rows=185 width=21) (never executed)

                          Index Cond: ("userId" = r."followeeId")
Planning:
  Buffers: shared hit=8
Planning Time: 0.201 ms
Execution Time: 0.048 ms
(25 rows)

Time: 0.490 ms

I'd appreciate any and all input on the situation. If I've left outany information that would be useful in figuring this out, please tell me.
Thanks in advance,
Laura Hausmann

Re: Inconsistent query performance based on relation hit frequency

Reply via email to