Hi
so 17. 1. 2026 v 15:47 odesÃlatel Lillian Berry <[email protected]>
napsal:
> Logs on our PostgreSQL cluster show frequent slow reads on a simple query
> by unique index.
>
> This issue has been present for some years, and survived multiple server
> moves as well as PostgreSQL version upgrades.
> Vacuuming or reindexing the table does not seem to help.
>
> Jan 17 13:38:13 database-node1-bd9aa148
> pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[172632]:
> pluralkit-db-dataLOG: duration: 2322.842 ms execute <unnamed>: SELECT *
> FROM "members" WHERE "hid" = $1
> Jan 17 13:38:15 database-node1-bd9aa148
> pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[157893]:
> pluralkit-db-dataLOG: duration: 2155.033 ms execute _auto8: SELECT * FROM
> "members" WHERE "hid" = $1
> Jan 17 13:38:16 database-node1-bd9aa148
> pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[157826]:
> pluralkit-db-dataLOG: duration: 2164.851 ms execute _auto10: SELECT *
> FROM "members" WHERE "hid" = $1
> Jan 17 13:38:19 database-node1-bd9aa148
> pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[171076]:
> pluralkit-db-dataLOG: duration: 2039.040 ms execute _auto3: SELECT * FROM
> "members" WHERE "hid" = $1
> Jan 17 13:38:20 database-node1-bd9aa148
> pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[162682]:
> pluralkit-db-dataLOG: duration: 2189.700 ms execute _auto8: SELECT * FROM
> "members" WHERE "hid" = $1
> Jan 17 13:38:22 database-node1-bd9aa148
> pzqzs8zyvqrq1rq25z4drws8dhnbxa59-run-postgres[159824]:
> pluralkit-db-dataLOG: duration: 2137.666 ms execute _auto10: SELECT *
> FROM "members" WHERE "hid" = $1
>
> Parameter detail has been redacted for privacy.
>
> This is not reproducible on manual queries, whether using the same values
> or other random ones.e
There can be lot of possible reasons:
1. virtualization
2. network issues
3. application server lag
4. maybe by using transparent huge pages
It is pretty hard to identify these issues, because source of this issue
can be everything in every layer
At the end I executed every second just simple select SELECT 1; and I
checked the time
It can help with removing some hypotheses like application server issues or
network issues.
Regards
Pavel
> pluralkit=# explain (analyze, buffers, settings) select * from members
> where hid = 'exampl';
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------
> Index Scan using members_hid_key on members (cost=0.43..8.45 rows=1
> width=838) (actual time=0.098..0.098 rows=0.00 loops=1)
> Index Cond: (hid = 'exampl'::bpchar)
> Index Searches: 1
> Buffers: shared hit=3
> Settings: effective_cache_size = '40GB'
> Planning Time: 0.161 ms
> Execution Time: 0.124 ms
> (7 rows)
>
> Table info:
>
> pluralkit=# \d+ members
>
> Table "public.members"
> Column | Type | Collation |
> Nullable | Default | Storage |
> Compression | Stats target | Description
>
> ------------------------+-----------------------------+-----------+----------+------------------------------------------+----------+-------------+--------------+-------------
> id | integer | | not
> null | nextval('members_id_seq'::regclass) | plain | |
> |
> hid | character(6) | | not
> null | | extended | |
> |
> system | integer | | not
> null | nextval('members_system_seq'::regclass) | plain | |
> |
> color | character(6) | |
> | | extended | |
> |
> avatar_url | text | |
> | | extended | |
> |
> name | text | | not
> null | | extended | |
> |
> birthday | date | |
> | | plain | |
> |
> pronouns | text | |
> | | extended | |
> |
> description | text | |
> | | extended | |
> |
> created | timestamp without time zone | | not
> null | timezone('utc'::text, CURRENT_TIMESTAMP) | plain | |
> |
> display_name | text | |
> | | extended | |
> |
> proxy_tags | proxy_tag[] | | not
> null | ARRAY[]::proxy_tag[] | extended | |
> |
> keep_proxy | boolean | | not
> null | false | plain | |
> |
> member_visibility | integer | | not
> null | 1 | plain | |
> |
> message_count | integer | | not
> null | 0 | plain | |
> |
> description_privacy | integer | | not
> null | 1 | plain | |
> |
> name_privacy | integer | | not
> null | 1 | plain | |
> |
> avatar_privacy | integer | | not
> null | 1 | plain | |
> |
> birthday_privacy | integer | | not
> null | 1 | plain | |
> |
> pronoun_privacy | integer | | not
> null | 1 | plain | |
> |
> metadata_privacy | integer | | not
> null | 1 | plain | |
> |
> allow_autoproxy | boolean | | not
> null | true | plain | |
> |
> banner_image | text | |
> | | extended | |
> |
> uuid | uuid | |
> | public.gen_random_uuid() | plain | |
> |
> last_message_timestamp | timestamp without time zone | |
> | | plain | |
> |
> webhook_avatar_url | text | |
> | | extended | |
> |
> proxy_privacy | integer | | not
> null | 1 | plain | |
> |
> tts | boolean | | not
> null | false | plain | |
> |
> banner_privacy | integer | | not
> null | 1 | plain | |
> |
> Indexes:
> "members_pkey" PRIMARY KEY, btree (id)
> "members_by_system" btree (system)
> "members_hid_key" UNIQUE CONSTRAINT, btree (hid)
> "members_uuid_idx" btree (uuid)
> Check constraints:
> "members_avatar_privacy_check" CHECK (avatar_privacy = ANY (ARRAY[1,
> 2]))
> "members_banner_privacy_check" CHECK (banner_privacy = ANY (ARRAY[1,
> 2]))
> "members_birthday_privacy_check" CHECK (birthday_privacy = ANY
> (ARRAY[1, 2]))
> "members_description_privacy_check" CHECK (description_privacy = ANY
> (ARRAY[1, 2]))
> "members_member_privacy_check" CHECK (member_visibility = ANY
> (ARRAY[1, 2]))
> "members_metadata_privacy_check" CHECK (metadata_privacy = ANY
> (ARRAY[1, 2]))
> "members_name_privacy_check" CHECK (name_privacy = ANY (ARRAY[1, 2]))
> "members_pronoun_privacy_check" CHECK (pronoun_privacy = ANY (ARRAY[1,
> 2]))
> "members_proxy_privacy_check" CHECK (proxy_privacy = ANY (ARRAY[1, 2]))
> Foreign-key constraints:
> "members_system_fkey" FOREIGN KEY (system) REFERENCES systems(id) ON
> DELETE CASCADE
> Referenced by:
> TABLE "autoproxy" CONSTRAINT "autoproxy_autoproxy_member_fkey" FOREIGN
> KEY (autoproxy_member) REFERENCES members(id) ON DELETE SET NULL
> TABLE "group_members" CONSTRAINT "group_members_member_id_fkey"
> FOREIGN KEY (member_id) REFERENCES members(id) ON DELETE CASCADE
> TABLE "member_guild" CONSTRAINT "member_guild_member_fkey" FOREIGN KEY
> (member) REFERENCES members(id) ON DELETE CASCADE
> TABLE "switch_members" CONSTRAINT "switch_members_member_fkey" FOREIGN
> KEY (member) REFERENCES members(id) ON DELETE CASCADE
> Access method: heap
>
> pluralkit=# SELECT relname, relpages, reltuples, relallvisible, relkind,
> relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class
> WHERE relname='members';
> -[ RECORD 1 ]--+--------------
> relname | members
> relpages | 816042
> reltuples | 1.3412575e+07
> relallvisible | 297904
> relkind | r
> relnatts | 29
> relhassubclass | f
> reloptions |
> pg_table_size | 6687031296
>
> postgresql.conf:
>
> archive_command =
> '/nix/store/dz9vwg5grhvpz4v7rscdl95n7s5z8c3x-pluralkit-scripts-1/bin/pk-walg
> data 5432 pluralkit wal-push %p'
> archive_mode = 'yes'
> archive_timeout = 60
> effective_cache_size = '40GB'
> hba_file = '/nix/store/0pihm0qzk2pdn1d344p8kihpa58hd7s6-pg_hba.conf'
> listen_addresses = 'localhost, 100.123.150.13,
> fdaa:9:e856:a7b:910f:0:a:102'
> log_destination = 'stderr'
> log_line_prefix = 'pluralkit-db-data'
> log_min_duration_statement = 1000
> max_connections = 2000
> max_wal_size = '1GB'
> min_wal_size = '80MB'
> port = 5432
> shared_buffers = '50GB'
>
> Environment:
>
> PostgreSQL 18.1 on x86_64-pc-linux-gnu, compiled by clang version 20.1.8,
> 64-bit
> installed from nixpkgs
> NixOS 25.11 on Linux 6.12.61 on AMD EPYC 7351P
> 2x Intel SSDPELKX020T7 in RAID1 (md-raid), with LUKS, ext4 filesystem
>
> Same issue previously appeared on
> PostgreSQL 14, both self-compiled and from nixpkgs
> NixOS 24.10 on AMD Ryzen 7950X
> ext4 filesystem directly on single Kioxia enterprise SSD (sorry I don't
> remember the model)
>
> Kindly,
> Lillian
>
>
>