Re: Properly handle OOM death?
On Mon, Mar 13, 2023 at 06:43:01PM +0100, Peter J. Holzer wrote:
> On 2023-03-13 09:21:18 -0800, Israel Brewster wrote:
> > I'm running a postgresql 13 database on an Ubuntu 20.04 VM that is a bit
> > more memory constrained than I would like, such that every week or so the
> > various processes running on the machine will align badly and the OOM
> > killer will kick in, killing off postgresql, as per the following
> > journalctl output:
> >
> > Mar 12 04:04:23 novarupta systemd[1]: postgresql@13-main.service: A process of this unit has been killed by the OOM killer.
> > Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Failed with result 'oom-kill'.
> > Mar 12 04:04:32 novarupta systemd[1]: postgresql@13-main.service: Consumed 5d 17h 48min 24.509s CPU time.
> >
> > And the service is no longer running.
>
> I might be misreading this, but it looks to me that systemd detects that
> *some* process in the group was killed by the oom killer and stops the
> service.

Yeah.  I found this old message on google.  I'm surprised there aren't more
similar complaints about this.

It's as Peter said: it (sometimes) causes systemd to actively *stop* the
cluster after OOM, when it would've come back online on its own if the init
(supervisor) process didn't interfere.

My solution was to set:
/usr/lib/systemd/system/postgresql@.service
OOMPolicy=continue

I suggest that the default unit files should do likewise.

--
Justin
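Rather than editing the packaged unit file (which a package upgrade would overwrite), the same setting can be applied with a drop-in.  This is a sketch; paths assume the Debian/Ubuntu packaging, and OOMPolicy= requires systemd 243 or newer:

```shell
# Sketch: set OOMPolicy=continue via a drop-in instead of editing the
# packaged unit.  Paths assume the Debian/Ubuntu postgresql@.service layout.
sudo mkdir -p /etc/systemd/system/postgresql@.service.d
cat <<'EOF' | sudo tee /etc/systemd/system/postgresql@.service.d/oom.conf
[Service]
OOMPolicy=continue
EOF
sudo systemctl daemon-reload
```

With this in place, systemd leaves the service running when a single backend is OOM-killed, letting the postmaster do its normal crash recovery.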
Re: Supported RHEL version for PostgreSQL 11
On Sat, Aug 28, 2021 at 04:08:33PM +0800, Sudhir Guna wrote:
> The RHEL upgrade from 7.5 to 8.4 is not done as an incremental version
> upgrade rather directly from 7.5.

I think the process should be:
1) upgrade from RH7 to RH8;
2) reindex relevant indexes (see wiki link below);
3) upgrade from postgres 9.6 to 11;

You may need to install updated PGDG packages for RH8.

How are you planning to upgrade postgres ?  Using pg_upgrade, or pg_dump, or ??

> Postgres is installed using PGDG packages.
>
> On Thu, Aug 26, 2021 at 1:15 AM Justin Pryzby wrote:
> > On Wed, Aug 25, 2021 at 11:47:57PM +0800, Sudhir Guna wrote:
> > > Hi All,
> > >
> > > I would like to clarify on OS upgrade for PostgreSQL.
> > >
> > > Currently we are on PostgreSQL 9.6 on RHEL 7.5. We would like to do a
> > > direct upgrade from RHEL 7.5 to RHEL 8.4 and then upgrade PostgreSQL 9.6
> > > to PostgreSQL 11.
> > >
> > > Is it ok to do a direct upgrade from RHEL 7.5 to RHEL 8.4
> >
> > Please describe what you mean by a "direct upgrade" ?
> >
> > You should be careful, since the sort order of indexes can change:
> > https://wiki.postgresql.org/wiki/Locale_data_changes
> > | Version 8 uses the new locale data. Therefore, caution will be necessary
> > | when upgrading.
> >
> > > Is PostgreSQL 11 compatible with RHEL 8.4
> >
> > How did you install postgres ?
> > Compiled from source ?
> > Using RHEL packages ?
> > Using PGDG packages ?
> >
> > PGDG has packages for PG11/RH8
> > https://yum.postgresql.org/rpmchart/#pg11
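The three steps might look roughly like this in practice.  This is a sketch, not from the thread; paths and package layout assume PGDG RPMs, and the OS-upgrade mechanics themselves are out of scope:

```shell
# Sketch of the suggested order (illustrative paths; see the pg_upgrade docs).
# 1) The RH7 -> RH8 OS upgrade happens outside postgres.
# 2) After the OS upgrade, rebuild indexes whose sort order depends on the
#    OS collation data.  reindexdb --all is heavy-handed but safe:
reindexdb --all
# 3) Then upgrade the cluster using pg_upgrade from the new PGDG packages:
/usr/pgsql-11/bin/pg_upgrade \
    -b /usr/pgsql-9.6/bin  -B /usr/pgsql-11/bin \
    -d /var/lib/pgsql/9.6/data  -D /var/lib/pgsql/11/data
```

Reindexing before the major-version upgrade avoids running queries against indexes whose glibc sort order silently changed, per the wiki page linked above.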
Re: Supported RHEL version for PostgreSQL 11
On Wed, Aug 25, 2021 at 11:47:57PM +0800, Sudhir Guna wrote:
> Hi All,
>
> I would like to clarify on OS upgrade for PostgreSQL.
>
> Currently we are on PostgreSQL 9.6 on RHEL 7.5. We would like to do a
> direct upgrade from RHEL 7.5 to RHEL 8.4 and then upgrade PostgreSQL 9.6 to
> PostgreSQL 11.
>
> Is it ok to do a direct upgrade from RHEL 7.5 to RHEL 8.4

Please describe what you mean by a "direct upgrade" ?

You should be careful, since the sort order of indexes can change:
https://wiki.postgresql.org/wiki/Locale_data_changes
| Version 8 uses the new locale data. Therefore, caution will be necessary
| when upgrading.

> Is PostgreSQL 11 compatible with RHEL 8.4

How did you install postgres ?
Compiled from source ?
Using RHEL packages ?
Using PGDG packages ?

PGDG has packages for PG11/RH8
https://yum.postgresql.org/rpmchart/#pg11

--
Justin
Re: Row estimates for empty tables
On Fri, Jul 24, 2020 at 09:14:04PM +0200, Pavel Stehule wrote:
> pá 24. 7. 2020 v 16:38 odesílatel Christophe Pettus napsal:
> > Since we already special-case parent tables for partition sets, would a
> > storage parameter that lets you either tell the planner "no, really, zero
> > is reasonable here" or sets a minimum number of rows to plan for be
> > reasonable?  I happened to get bit by this tracking down an issue where
> > several tables in a large query had zero rows, and the planner's
> > assumption of a few pages worth caused some sub-optimal plans.  The
> > performance hit wasn't huge, but they were being joined to some *very*
> > large tables, and the differences added up.
>
> I did this patch ten years ago.  GoodData application
> https://www.gooddata.com/ uses Postgres a lot, and this application stores
> some results in tables (as a guard against repeated calculations).  Lots of
> these tables have zero or one row.
>
> Although we ran an ANALYZE over all tables - the queries on empty tables
> had very bad plans, and I had to fix it by this patch.  Another company
> uses a fake one row in table - so there is no possibility to have a really
> empty table.
>
> It is an issue for special, not typical applications (this situation is
> typical for some OLAP patterns) - it is not too often - but some clean
> solution (instead of hacking postgres) can be nice.

On Mon, Aug 24, 2020 at 09:43:49PM +0200, Pavel Stehule wrote:
> This patch is just a workaround that works well 10 years (but for one
> special use case) - nothing more.  Without this patch that application
> cannot work ever.

My own workaround was here:
https://www.postgresql.org/message-id/20200427181034.ga28...@telsasoft.com
|... 1) create a child table: CREATE TABLE x_child() INHERITS(x)
|and, 2) change the query to use "select from ONLY".
|
|(1) allows the planner to believe that the table really is empty, a conclusion
|it otherwise avoids, and (2) avoids descending into the child (for which the
|planner would likewise avoid the conclusion that it's actually empty).

--
Justin
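The two-part workaround can be sketched as follows.  The table name x follows the linked message; the rest is illustrative:

```sql
-- Workaround sketch from the linked message: an (always-empty) child table
-- lets the planner believe the parent is really empty, and "FROM ONLY"
-- keeps queries from descending into the child.
CREATE TABLE x (i int);
CREATE TABLE x_child () INHERITS (x);
ANALYZE x;
-- The planner now trusts reltuples=0 for x itself:
SELECT * FROM ONLY x;
```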
v12 and pg_restore -f-
This is a "heads up" for others upgrading to v12.  I found a solution for our
use case, but it'd be easy to miss this, even if you read the release notes.

I saw this and updated our scripts with pg_restore -f-

https://www.postgresql.org/docs/12/release-12.html
|In pg_restore, require specification of -f - to send the dump contents to standard output (Euler Taveira)
|Previously, this happened by default if no destination was specified, but that was deemed to be unfriendly.

What I didn't realize at first is that -f- has no special meaning in v11 - it
just writes a file called ./-
And it's considered untenable to change the behavior of v11.

In our use, that was being piped to sed, which then saw nothing on its stdin
and just exited.

I changed our script to use pg_restore -f /dev/stdout, which seems to be
portable across postgres versions for the OS distribution we're running.
Unfortunately, I can't think of anything portable across *OS* or useful to
include in documentation.  In the worst case, someone might need to call
pg_restore differently based on its version.

Justin
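For scripts that must support both major versions, one option (an untested sketch, not from the original mail) is to branch on the pg_restore version:

```shell
# Sketch: choose pg_restore output arguments based on its major version,
# for scripts that must run against both v11 and v12+.
version_string=$(pg_restore --version 2>/dev/null || echo "pg_restore (PostgreSQL) 11.5")
major=$(echo "$version_string" | grep -oE '[0-9]+' | head -n1)
if [ "$major" -ge 12 ]; then
    out_args="-f -"   # v12+ requires -f - to write to stdout
else
    out_args=""       # v11 and earlier write to stdout by default
fi
# usage (illustrative): pg_restore $out_args dump.custom | sed -e ...
```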
Re: postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_toast_2619
Moving this old thread to -hackers
https://www.postgresql.org/message-id/flat/20180519142603.GA30060%40telsasoft.com

I wanted to mention that this seems to still be an issue, now running pg11.5.

log_time           | 2019-08-30 23:20:00.118+10
user_name          | postgres
database           | ts
session_id         | 5d69227e.235
session_line       | 1
command_tag        | CLUSTER
session_start_time | 2019-08-30 23:19:58+10
error_severity     | ERROR
sql_state_code     | XX000
message            | unexpected chunk number 1 (expected 0) for toast value 2369261203 in pg_toast_2619
query              | CLUSTER pg_statistic USING pg_statistic_relid_att_inh_index
application_name   | psql

Note that my original report was for a "missing" chunk during "VACUUM FULL",
and the current error is an "unexpected chunk" during CLUSTER.  I imagine
that's a related issue.

I haven't seen this in awhile (but stopped trying to reproduce it long ago).
A recently-deployed update to this maintenance script is probably why it's
now doing CLUSTER.

On Fri, May 25, 2018 at 08:49:50AM -0500, Justin Pryzby wrote:
> On Sat, May 19, 2018 at 02:39:26PM -0400, Tom Lane wrote:
> > Justin Pryzby writes:
> > > [pryzbyj@database ~]$ while :; do for db in `psql postgres -Atc "SELECT
> > > datname FROM pg_database WHERE datallowconn"`; do for t in pg_statistic
> > > pg_attrdef pg_constraint; do echo "$db.$t...";
> > > PGOPTIONS=-cstatement_timeout='9s' psql $db -qc "VACUUM FULL $t"; done;
> > > done; done
> >
> > > ...
> > > postgres.pg_statistic...
> > > postgres.pg_attrdef...
> > > postgres.pg_constraint...
> > > template1.pg_statistic...
> > > template1.pg_attrdef...
> > > template1.pg_constraint...
> > > ts.pg_statistic...
> > > ERROR: canceling statement due to statement timeout
> > > ts.pg_attrdef...
> > > ts.pg_constraint...
> > > postgres.pg_statistic...
> > > ERROR: missing chunk number 0 for toast value 3372855171 in pg_toast_2619
> >
> > Hm, so was the timeout error happening every time through on that table,
> > or just occasionally, or did you provoke it somehow?  I'm wondering how
> > your 9s timeout relates to the expected completion time.
>
> Actually statement_timeout isn't essential for this (maybe it helps to
> trigger it more often - not sure).
>
> Could you try:
> time sh -ec 'while :; do time psql postgres -c "VACUUM FULL VERBOSE pg_toast.pg_toast_2619"; psql postgres -c "VACUUM FULL VERBOSE pg_statistic"; done'; date
>
> Three servers experienced the error within 30min, but one server didn't fail
> until 12h later, and a handful of others still haven't failed.
>
> Does this help at all ?
> 2018-05-24 21:57:49.98-03  | 5b075f8d.1ad1 | LOG   | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619
> 2018-05-24 21:57:50.067-03 | 5b075f8d.1ad1 | INFO  | pryzbyj | postgres | vacuuming "pg_toast.pg_toast_2619"
> 2018-05-24 21:57:50.09-03  | 5b075f8d.1ad1 | INFO  | pryzbyj | postgres | "pg_toast_2619": found 0 removable, 408 nonremovable row versions in 99 pages
> 2018-05-24 21:57:50.12-03  | 5b075f8e.1ada | LOG   | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_statistic
> 2018-05-24 21:57:50.129-03 | 5b075f8e.1ada | INFO  | pryzbyj | postgres | vacuuming "pg_catalog.pg_statistic"
> 2018-05-24 21:57:50.185-03 | 5b075f8e.1ada | ERROR | pryzbyj | postgres | missing chunk number 0 for toast value 3382957233 in pg_toast_2619
>
> Same thing; this server has autovacuum logging, although it's not clear to me
> if that's an essential component of the problem, either:
> 2018-05-24 21:16:39.856-06 | LOG  | 5b078017.7b99 | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619
> 2018-05-24 21:16:39.876-06 | LOG  | 5b078010.7968 |         |          | automatic vacuum of table "postgres.pg_toast.pg_toast_2619": index scans: 1 +
>                            |      |               |         |          | pages: 0 removed, 117 r
> 2018-05-24 21:16:39.909-06 | INFO | 5b078017.7b99 | pryzbyj | postgres | vacuuming "pg_toast.pg_toast_2619"
> 2018-05-24 21:16:39.962-06 | INFO | 5b078017.7b99 | pryzbyj | postgres | "pg_toast_2619": found 0 removable, 492 nonremovable row versions in 117 pages
> 2018-05-24 21:16:40.025-06 | LOG  | 5b078018.7b9b | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_statistic
>
OOM with many sorts
One of our instances was killed:
| Out of memory: Kill process 27670 (postmaster) score 381 or sacrifice child
| Killed process 27670 (postmaster) total-vm:13207796kB, anon-rss:5736kB, file-rss:0kB, shmem-rss:12421696kB

The VM has: 32GB RAM, shared_buffers=12GB, work_mem=128MB
Running only pg11.3

Trying to diagnose the issue by running a query on a narrower range of dates,
it looks to me that's due to many sort nodes per worker.  Part of the query
plan looks like this:

[...]
-> Finalize GroupAggregate (actual rows=1788 loops=1)
     Group Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, '-MM'::text))
     -> Gather Merge (actual rows=56921 loops=1)
          Workers Planned: 2
          Workers Launched: 2
          -> Sort (actual rows=18974 loops=3)
               Sort Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, '-MM'::text))
               Sort Method: quicksort Memory: 36499kB
               Worker 0: Sort Method: quicksort Memory: 40275kB
               Worker 1: Sort Method: quicksort Memory: 40263kB
               -> Parallel Append (actual rows=18974 loops=3)
                    -> Partial GroupAggregate (actual rows=1783 loops=1)
                         Group Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, '-MM'::text))
                         -> Sort (actual rows=42792 loops=1)
                              Sort Key: huawei_umts_ucell_20181217.sect_id, (to_char(huawei_umts_ucell_20181217.start_time, '-MM'::text))
                              Worker 0: Sort Method: quicksort Memory: 44997kB
                              -> Parallel Seq Scan on huawei_umts_ucell_20181217 (actual rows=42792 loops=1)
                                   Filter: ((start_time >= '2018-12-01 00:00:00'::timestamp without time zone) AND (start_time < '2019-01-01 00:00:00'::timestamp without time zone))
                    -> Partial GroupAggregate (actual rows=1783 loops=1)
                         Group Key: huawei_umts_ucell_20181216.sect_id, (to_char(huawei_umts_ucell_20181216.start_time, '-MM'::text))
                         -> Sort (actual rows=42792 loops=1)
                              Sort Key: huawei_umts_ucell_20181216.sect_id, (to_char(huawei_umts_ucell_20181216.start_time, '-MM'::text))
                              Worker 1: Sort Method: quicksort Memory: 44997kB
                              ->
Parallel Seq Scan on huawei_umts_ucell_20181216 (actual rows=42792 loops=1)
                                   Filter: ((start_time >= '2018-12-01 00:00:00'::timestamp without time zone) AND (start_time < '2019-01-01 00:00:00'::timestamp without time zone))
[...many partial GroupAggregate/Sort repeated here for ~200 child tables...]

This isn't the first time I've had to reduce work_mem on a parallel query to
avoid OOM, but it seems unreasonable to be concerned with 50MB sorts.  It
looks like the results of each Sort node stay in RAM during processing of
additional sort nodes (is that required?)

I SET max_parallel_workers_per_gather=0 and dumped the memory contexts:
...
PortalContext: 203948152 total in 25756 blocks; 176600 free (319 chunks); 203771552 used:
...
Grand total: 1918118632 bytes in 30716 blocks; 192472208 free (3685 chunks); 1725646424 used

The total shows ~1700MB but the largest context is only 200MB.  "Caller
tuples" accounts for 1400MB:

awk '/Caller tuples/{s+=$3}END{print s/1024^3}' /tmp/mem-context
1.44043

I'm attaching the full plan and memory-context dump.  It doesn't seem to have
changed under pg12b2, which is concerning since PG12 advertises that it
gracefully supports "thousands" of partitions.  It seems to me that would
require tiny work_mem, which would be devastating to some workloads.

This is not a contrived test case; it's one of our reports run across ~10
months of history with weekly granularity.

Thanks in advance for any advice.

Justin

QUERY PLAN
--
GroupAggregate (actual rows=277 loops=1)
  Group Key: sites.site_office, sites.site_location, (to_char(huawei_umts_ucell_20181217.start_time, '-MM'::text)), sites.site_name
  -> Sort (actual rows=1783 loops=1)
       Sort Key: sites.site_office, sites.site_location,
distinguish update from insert (on conflict)
Is it still impossible to distinguish whether a row was inserted vs. updated ?

The latest I can see is here:
https://wiki.postgresql.org/wiki/UPSERT#RETURNING_behavior
...but I'm hopeful that the 4-year-old wiki page is out of date.

Justin
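One workaround that is often suggested (a heuristic relying on implementation details, not a documented API): in the RETURNING clause, xmax is 0 for a freshly inserted row, and nonzero for a conflicting row that ON CONFLICT DO UPDATE rewrote.  The schema here is illustrative:

```sql
-- Heuristic sketch: (xmax = 0) in RETURNING distinguishes insert from
-- update for ON CONFLICT DO UPDATE.  Relies on MVCC internals, so it may
-- not be guaranteed across versions.
CREATE TABLE kv (k text PRIMARY KEY, v int);

INSERT INTO kv VALUES ('a', 1)
ON CONFLICT (k) DO UPDATE SET v = EXCLUDED.v
RETURNING k, (xmax = 0) AS inserted;
```

On the first run the row reports inserted = true; repeating the statement reports inserted = false, since the conflicting row was updated.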
both force order of evaluation and hit index
I wrote a script to import CSV query logs as described here:
https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-CSVLOG

I also created indices for searching:
|CREATE INDEX ON postgres_log_2019_04_09_2300 ((regexp_replace(message, '^duration: ([.0-9]+) ms.*', '\1')::float)) WHERE message~'^duration:'

I'm having an issue that queries on long durations 1) need to specify "WHERE
message~'^duration'" to hit the index; and 2) need to *evaluate* that
condition before the next condition:
|regexp_replace(message, '^duration: ([.0-9]+) ms.*', '\1')::float))>

I've tried using CASE and OFFSET 0 and WITH; it seems like any effort I make
to force evaluation of "duration>" to happen before "message~'^duration'"
also causes a seq scan rather than an index scan on query duration.  I'm
hesitant to use ORDER BY just to encourage an index scan, since it doesn't
even guarantee that it's going to avoid the error.

This is doing an index scan on the other/"wrong" index:
|postgres=# explain SELECT date_trunc('minute',log_time)log_time, error_severity, session_id, user_name, database, left(message,99), left(query,99)
|FROM postgres_log_2019_04_10_1000 WHERE log_time>'2019-04-10 10:51' AND
|CASE WHEN message~'^duration:' THEN regexp_replace(message, '^duration: ([.0-9]+) ms.*', '\1')::float ELSE NULL END > AND
|(application_name!='psql' OR user_name='postgres') ORDER BY 1 DESC;
| Sort (cost=74908.17..74955.87 rows=19080 width=101)
|   Sort Key: (date_trunc('minute'::text, log_time)) DESC
|   -> Bitmap Heap Scan on postgres_log_2019_04_10_1000 (cost=22.24..73551.61 rows=19080 width=101)
|        Recheck Cond: (log_time > '2019-04-10 10:51:00-04'::timestamp with time zone)
|        Filter: (((application_name <> 'psql'::text) OR (user_name = 'postgres'::text)) AND (CASE WHEN (message ~ '^duration:'::text) THEN (regexp_replace(message, '^duration: ([.0-9]+) ms.*'::text, '\1'::text))::double precision ELSE NULL::double precision END > ''::double precision))
|        -> Bitmap Index Scan on 
postgres_log_2019_04_10_1000_log_time_idx (cost=0.00..17.47 rows=58197 width=0) | Index Cond: (log_time > '2019-04-10 10:51:00-04'::timestamp with time zone) This gives a nice plan, but doesn't avoid errors due to casting to float lines which don't match the duration regex. postgres=# explain SELECT date_trunc('minute',log_time)log_time, dur, error_severity, session_id, user_name, database, left(message,99), left(query,99) FROM (SELECT *,regexp_replace(message, '^duration: ([.0-9]+) ms.*', '\1')::float AS dur FROM postgres_log_2019_04_10_1000 WHERE log_time>'2019-04-10 10:50' AND message~'^duration:')x WHERE dur> AND (application_name!='psql' OR user_name='postgres') ; | Bitmap Heap Scan on postgres_log_2019_04_10_1000 (cost=5214.86..50141.68 rows=19742 width=109) | Recheck Cond: ((log_time > '2019-04-10 10:50:00-04'::timestamp with time zone) AND ((regexp_replace(message, '^duration: ([.0-9]+) ms.*'::text, '\1'::text))::double precision > ''::double precision) AND (message ~ '^duration:'::text)) | Filter: ((application_name <> 'psql'::text) OR (user_name = 'postgres'::text)) | -> BitmapAnd (cost=5214.86..5214.86 rows=19819 width=0) | -> Bitmap Index Scan on postgres_log_2019_04_10_1000_log_time_idx (cost=0.00..17.50 rows=59462 width=0) | Index Cond: (log_time > '2019-04-10 10:50:00-04'::timestamp with time zone) | -> Bitmap Index Scan on postgres_log_2019_04_10_1000_duration_idx (cost=0.00..5187.23 rows=238241 width=0) | Index Cond: ((regexp_replace(message, '^duration: ([.0-9]+) ms.*'::text, '\1'::text))::double precision > ''::double precision) |(8 rows) |ERROR: invalid input syntax for type double precision: "temporary file: path "pg_tblspc/3292386340/PG_11_201809051/pgsql_tmp/pgsql_tmp16493.0", size 603619328" If I write as CTE or OFFSET 0 subq, that doesn't allow scanning on "duration" index, and scans on time index instead: |postgres=# explain analyze WITH x AS (SELECT *,regexp_replace(message, '^duration: ([.0-9]+) ms.*', '\1')::float AS dur FROM 
postgres_log_2019_04_10_1000 WHERE log_time>'2019-04-10 10:50' AND message~'^duration:') SELECT date_trunc('minute',log_time)log_time, dur, error_severity, session_id, user_name, database, left(message,99), left(query,99) FROM x WHERE dur> AND (application_name!='psql' OR user_name='postgres') ; |postgres=# explain analyze SELECT date_trunc('minute',log_time)log_time, dur, error_severity, session_id, user_name, database, left(message,99), left(query,99) FROM (SELECT *,regexp_replace(message, '^duration: ([.0-9]+) ms.*', '\1')::float AS dur FROM postgres_log_2019_04_10_1000 WHERE log_time>'2019-04-10 10:50' AND message~'^duration:' OFFSET 0)x WHERE dur> AND (application_name!='psql' OR user_name='postgres') ; Thanks in advance for any advice. Justin
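One escape hatch (my sketch, not something tried in the exchanges above) is to move the regex guard into an immutable helper function and index that, so the cast can never run on a non-matching line no matter how the planner orders the quals:

```sql
-- Sketch: an immutable helper that returns NULL for non-duration lines,
-- so the float cast can never raise an error, plus a functional index so
-- duration searches can still use an index scan.  Names are illustrative.
CREATE FUNCTION duration_ms(message text) RETURNS float8
LANGUAGE sql IMMUTABLE AS $$
  SELECT CASE WHEN message ~ '^duration: '
         THEN regexp_replace(message, '^duration: ([.0-9]+) ms.*', '\1')::float8
         END
$$;

CREATE INDEX ON postgres_log_2019_04_09_2300 (duration_ms(message));

-- Queries then filter on duration_ms(message) > some_threshold, with no
-- separate "message ~ '^duration:'" clause whose evaluation order matters.
```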
Re: Question about pg_upgrade from 9.2 to X.X
On Wed, Mar 06, 2019 at 09:44:16PM -0800, Perumal Raj wrote:
> Any pointers for pg_repack schema creation ?

With recent postgres, you should use just: "CREATE EXTENSION pg_repack",
which does all that for you.

> Will there be any impact in the future , Since i used --link option ?

You probably have an old DB directory lying around which is (at least
partially) hardlinks.  You should remove it .. but be careful to remove the
correct dir.  My scripts always rename the old dir before running pg_upgrade,
so it's less scary to rm -fr it later.

Justin
Re: Question about pg_upgrade from 9.2 to X.X
On Tue, Mar 05, 2019 at 08:09:12AM -0800, Perumal Raj wrote: > Thanks Sergei/Justin for the continues update. > > So reorg Schema might be created as part of some scripts prior to 9.2 > Version ? I'm guessing they were probably created in 9.2. > These are the functions in DB not the Extension. However these functions > will not run as the associated libraries are not exists in System now (9.2) > and I hope no impact to system. I guess someone installed pgreorg, ran its scripts to install its functions into the DB, and then removed pgreorg without removing its scripts. > One Question need your address, > > Prior to 9.2 to 9.6 upgrade , I have tried 9.2 10.7 upgrade and failed > similar error(you can refer beginning o the post ). > > > could not load library "$libdir/hstore": ERROR: could not access file > > "$libdir/hstore": No such file or directory > > could not load library "$libdir/adminpack": ERROR: could not access file > > "$libdir/adminpack": No such file or directory > > could not load library "$libdir/uuid-ossp": ERROR: could not access file > > "$libdir/uuid-ossp": No such file or directory > > These Extension seems to be standard. What is the use of these function and > do we have any alternative in Higher version or Enhanced object if i drop > it in 9.2 and continue upgrade to 10.7 Version. See Sergei's response: https://www.postgresql.org/message-id/7164691551378448%40myt3-1179f584969c.qloud-c.yandex.net You probably want to install this package for the new version (9.6 or 10 or 11). [pryzbyj@TS-DB ~]$ rpm -ql postgresql11-contrib |grep -E '(uuid-ossp|adminpack|hstore)\.control' /usr/pgsql-11/share/extension/adminpack.control /usr/pgsql-11/share/extension/hstore.control /usr/pgsql-11/share/extension/uuid-ossp.control Justin
Re: Question about pg_upgrade from 9.2 to X.X
On Mon, Mar 04, 2019 at 02:21:11PM -0800, Perumal Raj wrote: > Does it mean that these functions are default and came with 9.2 ? > I am wondering how these functions are created in the DB as the > library($libdir/pg_reorg) is not exists in system I don't think it's default. But was probably installed by running some SQL script. It tentatively sounds safe to me to drop, but you should take a backup and inspect and double check your pg_dump output and output of "begin; drop schema pgreorg cascade". Justin
Re: Question about pg_upgrade from 9.2 to X.X
On Mon, Mar 04, 2019 at 01:37:30PM -0800, Perumal Raj wrote: > I could see bunch of functions under reorg schema. Those functions are the ones preventing you from upgrading. You should drop schema pg_reorg cascade. You can run it in a transaction first to see what it will drop. But after the upgrade, you can CREATE EXTENSION pg_repack, which is a fork of pg_reorg, which is itself no longer maintained. Justin
Re: Question about pg_upgrade from 9.2 to X.X
Moving to -general list (-hackers is for development topics like proposed patches and patch reviews and beta testing and crash reports). On Thu, Feb 28, 2019 at 10:13:58AM -0800, Perumal Raj wrote: > could not load library "$libdir/pg_reorg": > ERROR: could not access file "$libdir/pg_reorg": No such file or directory As Sergei said, you can run pg_dump -s and look for references to reorg, and drop them. Or, you could try this: CREATE EXTENSION pg_reorg FROM unpackaged; Or maybe this: CREATE EXTENSION pg_repack FROM unpackaged; If that works, you can DROP EXTENSION pg_repack; Otherwise, I think you can maybe do something like: DROP SCHEMA pg_repack CASCADE; -- or, DROP SCHEMA pg_reorg CASCADE; Please send output of: \dn
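The pg_dump -s inspection can be as simple as the following sketch ("mydb" is a placeholder; the `|| true` keeps the pipeline from failing when no server is reachable):

```shell
# Sketch: dump just the schema and search it for leftover pg_reorg/pg_repack
# objects before attempting pg_upgrade.  "mydb" is illustrative.
pg_dump -s mydb 2>/dev/null | grep -in 'reorg\|repack' || true
```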
Re: postgresql v11.1 Segmentation fault: signal 11: by running SELECT... JIT Issue?
On Mon, Nov 26, 2018 at 07:00:35PM -0800, Andres Freund wrote: > The fix is easy enough, just adding a > v_hoff = LLVMBuildZExt(b, v_hoff, LLVMInt32Type(), ""); > fixes the issue for me. On Tue, Jan 29, 2019 at 12:38:38AM -0800, pabloa98 wrote: > And perhaps should I modify this too? > If that is the case, I am not sure what kind of modification we should do. Andres commited the fix in November, and it's included in postgres11.2, which is scheduled to be released Thursday. So we'll both be able to re-enable JIT on our wide tables again. https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b23852766 Justin
Re: query logging of prepared statements
On Fri, Feb 08, 2019 at 07:29:53AM -0600, Justin Pryzby wrote:
> A couple months ago, I implemented prepared statements for PyGreSQL.  While
> updating our application in advance of their release with that feature, I
> noticed that our query logs were several times larger.

Previously sent to -general (and quoted fully below), resending to -hackers
with patch.
https://www.postgresql.org/message-id/20190208132953.GF29720%40telsasoft.com

I propose that the prepared statement associated with an EXECUTE should only
be included in the log "DETAIL" when log_error_verbosity=VERBOSE, for both
SQL EXECUTE and PQexecPrepared.  I'd like to be able to continue logging
DETAIL (including bind parameters), so I want to avoid setting "TERSE" just
to avoid redundant messages.

With the attached patch, I'm not sure if !*portal_name &&
!portal->prepStmtName would catch cases other than PQexecParams (?)

Compare an unpatched server to a patched server, and to a patched server with
log_error_verbosity=verbose:

|$ psql postgres -xtc "SET log_error_verbosity=default;SET log_statement='all'; SET client_min_messages=log" -c "PREPARE q AS SELECT 2" -c "EXECUTE q"
|SET
|LOG: statement: PREPARE q AS SELECT 2
|PREPARE
|LOG: statement: EXECUTE q
|DETAIL: prepare: PREPARE q AS SELECT 2
|?column?
| 2
|$ PGHOST=/tmp PGPORT=5678 psql postgres -xtc "SET log_error_verbosity=default;SET log_statement='all'; SET client_min_messages=log" -c "PREPARE q AS SELECT 2" -c "EXECUTE q"
|SET
|LOG: statement: PREPARE q AS SELECT 2
|PREPARE
|LOG: statement: EXECUTE q
|?column?
| 2
|$ PGHOST=/tmp PGPORT=5678 psql postgres -xtc "SET log_error_verbosity=verbose;SET log_statement='all'; SET client_min_messages=log" -c "PREPARE q AS SELECT 2" -c "EXECUTE q"
|SET
|LOG: statement: PREPARE q AS SELECT 2
|PREPARE
|LOG: statement: EXECUTE q
|DETAIL: prepare: PREPARE q AS SELECT 2
|?column?
| 2 For PQexecPrepared library call: |$ xPGPORT=5678 xPGHOST=/tmp PYTHONPATH=../PyGreSQL/build/lib.linux-x86_64-2.7/ python2.7 -c "import pg; d=pg.DB('postgres'); d.query('SET client_min_messages=log; SET log_error_verbosity=default; SET log_statement=\"all\"'); d.query('SELECT 1; PREPARE q AS SELECT \$1'); d.query_prepared('q',[1]); d.query_formatted('SELECT %s', [2])" |LOG: statement: SELECT 1; PREPARE q AS SELECT $1 |LOG: execute q: SELECT 1; PREPARE q AS SELECT $1 |DETAIL: parameters: $1 = '1' |LOG: execute : SELECT $1 |DETAIL: parameters: $1 = '2' |$ PGPORT=5678 PGHOST=/tmp PYTHONPATH=../PyGreSQL/build/lib.linux-x86_64-2.7/ python2.7 -c "import pg; d=pg.DB('postgres'); d.query('SET client_min_messages=log; SET log_error_verbosity=default; SET log_statement=\"all\"'); d.query('SELECT 1; PREPARE q AS SELECT \$1'); d.query_prepared('q',[1]); d.query_formatted('SELECT %s', [2])" |LOG: statement: SELECT 1; PREPARE q AS SELECT $1 |LOG: execute q |DETAIL: parameters: $1 = '1' |LOG: execute : SELECT $1 |DETAIL: parameters: $1 = '2' |$ PGPORT=5678 PGHOST=/tmp PYTHONPATH=../PyGreSQL/build/lib.linux-x86_64-2.7/ python2.7 -c "import pg; d=pg.DB('postgres'); d.query('SET client_min_messages=log; SET log_error_verbosity=verbose; SET log_statement=\"all\"'); d.query('SELECT 1; PREPARE q AS SELECT \$1'); d.query_prepared('q',[1]); d.query_formatted('SELECT %s', [2])" |LOG: statement: SELECT 1; PREPARE q AS SELECT $1 |LOG: execute q: SELECT 1; PREPARE q AS SELECT $1 |DETAIL: parameters: $1 = '1' |LOG: execute : SELECT $1 |DETAIL: parameters: $1 = '2' Also, I noticed that if the statement was prepared using SQL PREPARE (rather than PQprepare), and if it used simple query with multiple commands, they're all included in the log, like this when executed with PQexecPrepared: |LOG: execute q: SET log_error_verbosity=verbose; SET client_min_messages=log; PREPARE q AS SELECT $1 And looks like this for SQL EXECUTE: |[pryzbyj@telsasoft-db postgresql]$ psql postgres -txc "SET 
client_min_messages=log;SELECT 1;PREPARE q AS SELECT 2" -c "EXECUTE q" |PREPARE |LOG: statement: EXECUTE q |DETAIL: prepare: SET client_min_messages=log;SELECT 1;PREPARE q AS SELECT 2 |?column? | 2 On Fri, Feb 08, 2019 at 07:29:53AM -0600, Justin Pryzby wrote: > A couple months ago, I implemented prepared statements for PyGreSQL. While > updating our application in advance of their release with that feature, I > noticed that our query logs were several times larger. > > With non-prepared statements, we logged to CSV: > |message| SELECT 1 > > With SQL EXECUTE, we log: > |message| statement: EXECUTE sqlex(2); > |detail | prepare: prepare sqlex AS SELECT $1; > > With PQexecPrepared, we would log: > |message| execute qq: PREPARE qq AS SELECT $1 > |detail |
Re: Server goes to Recovery Mode when run a SQL
Hi,

On Fri, Feb 08, 2019 at 02:11:33PM -0700, PegoraroF10 wrote:
> *Well, now we have two queries which stops completelly our postgres server.
> That problem occurs on 10.6 and 11.1 versions.
> On both server the problem is the same.
> Linux logs of old crash are:*
> Feb 1 18:39:53 fx-cloudserver kernel: [ 502.405788] show_signal_msg: 5 callbacks suppressed
> Feb 1 18:39:53 fx-cloudserver kernel: [ 502.405791] postgres[10195]: segfault at 24 ip 555dc6a71cb0 sp 7ffc5f91db38 error 4 in postgres[555dc69b4000+6db000]

"segfault" seems to mean you hit a bug, which we'll want more information to
diagnose.

Could you install debugging symbols ?  Ubuntu calls their package
postgresql-10-dbg or similar.  And start the server with coredumps enabled,
using pg_ctl -c -D /var/lib/postgresql/10/main (or similar).  Then trigger
the query and hope to find a core dump in the data directory.  Or possibly
it'll be processed into /var/crash by the apport daemon, depending on whether
that's running and enabled (check /proc/sys/kernel/core_pattern).

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD#Getting_a_trace_from_a_randomly_crashing_backend

> *Linux Log of new crash, which takes several minutes to stop:*
> Feb 8 15:06:40 fxReplicationServer kernel: [1363901.643121] postgres invoked oom-killer: gfp_mask=0x24280ca, order=0, oom_score_adj=0
> Feb 8 fxReplicationServer kernel: [1363901.643368] Killed process 9399 (postgres) total-vm:16518496kB, anon-rss:11997448kB, file-rss:38096kB
> Feb 8 17:21:16 fxReplicationServer kernel: [1371977.845728] postgres[10321]: segfault at 10 ip 5567a6069752 sp 7ffed70be970 error 4 in

In this case, you ran out of RAM, as you noticed.  You should make sure
ANALYZE statistics are up to date (including manually ANALYZEing any parent
tables).
On Sun, Feb 03, 2019 at 09:05:42AM -0700, PegoraroF10 wrote: > I´m using Postgres 10 on ubuntu in a Google VM (8 cores, 32Gb RAM, 250Gb SSD) > and DB has 70GB What is work_mem setting ? You could try running the query with lower work_mem, or check EXPLAIN ANALYZE and try to resolve any bad rowcount estimates. > *This query runs for aproximately 5 minutes. See link above with images and > logs and you´ll see how memory will grow. Memory use starts with 8gb e grows > until use them all. When all memory is in use then it starts to swap. When > all swap is allocated then it gets the "out of memory" and stops > completelly. > on: https://drive.google.com/open?id=18zIvkV3ew4aZ1_cxI-EmIPVql7ydvEwi* It says "Access Denied", perhaps you could send a link to https://explain.depesz.com/ ? Justin
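The core-dump procedure described above might look like this end to end.  This is an illustrative sketch; paths assume the Ubuntu packaging, and the linked wiki page is the authoritative reference:

```shell
# Sketch of the core-dump workflow (illustrative paths/core file name,
# assuming Ubuntu packaging; see the linked wiki page for details).
ulimit -c unlimited                              # allow core files in this shell
pg_ctl -c -D /var/lib/postgresql/10/main start   # -c enables core dumps
# ...trigger the crashing query, then look for a core file:
ls /var/lib/postgresql/10/main/core*
# Print the backtrace from the core (core.12345 is a placeholder):
gdb -batch -ex bt /usr/lib/postgresql/10/bin/postgres \
    /var/lib/postgresql/10/main/core.12345
```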
query logging of prepared statements
A couple months ago, I implemented prepared statements for PyGreSQL.  While
updating our application in advance of their release with that feature, I
noticed that our query logs were several times larger.

With non-prepared statements, we logged to CSV:
|message| SELECT 1

With SQL EXECUTE, we log:
|message| statement: EXECUTE sqlex(2);
|detail | prepare: prepare sqlex AS SELECT $1;

With PQexecPrepared, we would log:
|message| execute qq: PREPARE qq AS SELECT $1
|detail | parameters: $1 = '3'

For comparison, with PQexecParams, the logs I see look like this (apparently
the "unnamed" prepared statement is used behind the scenes):
|message| execute : SELECT [...]

It's not clear to me why it'd be desirable for the previous PREPARE to be
additionally logged during every execution, instead of just its name (in
"message") and params (in "detail").  (Actually, I had to triple check that it
wasn't somehow executing a prepared statement which itself created a prepared
statement...)

For us, the performance benefit is to minimize the overhead (mostly in pygres)
of many INSERTs into append-only tables.  It's not great that a feature
intended to improve performance is causing 2x more log volume to be written.

Also, it seems odd that for SQL EXECUTE, the PREPARE is shown in "detail", but
for the library call, it's shown in "message".

I found:

|commit bc24d5b97673c365f19be21f83acca3c184cf1a7
|Author: Bruce Momjian
|Date: Tue Aug 29 02:11:30 2006 +
|
|Now bind displays prepare as detail, and execute displays prepare and
|optionally bind. I re-added the "statement:" label so people will
|understand why the line is being printed (it is log_*statement
|behavior).
|
|Use single quotes for bind values, instead of double quotes, and double
|literal single quotes in bind values (and document that). I also made
|use of the DETAIL line to have much cleaner output. 
and:

|commit c8961bf1ce0b51019db31c5572dac18b664e02f1
|Author: Bruce Momjian
|Date: Fri Aug 4 18:53:46 2006 +
|
|Improve logging of protocol-level prepared statements.

Justin
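[For reference, the SQL-level path whose log output is quoted above ("statement: EXECUTE sqlex(2)") comes from a session like the following sketch; exactly what reaches the log depends on log_statement and related settings:]

```sql
PREPARE sqlex(int) AS SELECT $1;
EXECUTE sqlex(2);   -- logged with the full PREPARE text repeated in "detail"
DEALLOCATE sqlex;
```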
pg_stat_activity.query_start in the future?
I'm buried and not having any good ideas how to diagnose this or what else to
send, so here it is.  Feel free to tell me why I'm the one who's confused..

postgres=# SELECT pid, now(), query_start, state, query FROM pg_stat_activity WHERE pid=27757;
  pid  |              now              |          query_start          | state  |                           query
-------+-------------------------------+-------------------------------+--------+-------------------------------------------------------------
 27757 | 2018-08-17 11:10:16.568429-04 | 2018-08-17 10:17:52.814282-04 | active | autovacuum: VACUUM ANALYZE public.eric_enodeb_cell_20180815
(1 row)

postgres=# \! ps -O lstart 27757
  PID                  STARTED S TTY      TIME COMMAND
27757 Fri Aug 17 08:53:20 2018 S ?    00:11:56 postgres: autovacuum worker process   xton

postgres=# \! date
Fri Aug 17 11:10:58 EDT 2018

Justin
Re: ALTER TABLE SET (toast.asdf) is not rejected ... if a table has no toast
On Mon, Jun 11, 2018 at 11:47:59AM -0400, Alvaro Herrera wrote:
> On 2018-Jun-11, Justin Pryzby wrote:
>
> > I noticed that this is accepted:
> >
> > postgres=# ALTER TABLE t SET (toast.asdf=128);
> > ALTER TABLE
> >
> > I thought since "toast" was a core namespace, it would've been rejected?
> >
> > I recall having read a discussion about verifying these ... I wasn't able
> > to find what I was thinking of, but found this one.
> > https://www.postgresql.org/message-id/flat/20090114144332.GF24156%40alvh.no-ip.org
>
> Uh.  ISTM that this was lost during the 9.4 cycle, because that command *is*
> rejected in 9.3.
> alvherre=# ALTER TABLE t SET (toast.asdf=128);
> ERROR:  unrecognized parameter "asdf"

I'm assuming your "t" has a toast table.  I think this is properly rejected
when toast exists.

It looks like we were maybe both (partially?) fooled by behavior that's been
discussed multiple times over the years: SET (toast.*) is ignored for tables
which don't currently have any TOAST.
https://www.postgresql.org/message-id/flat/20090211194311.GP8924%40alvh.no-ip.org

Maybe all's well, but I suggest there's a question which hasn't previously
been disambiguated and raised: if a table has no toast, should SET
(toast.JUNK) be ignored, or rejected ?

BTW maybe I was originally thinking of this:
https://www.postgresql.org/message-id/flat/c4d71df2-9e0e-3912-dc81-9a72e080c238%40lab.ntt.co.jp#c4d71df2-9e0e-3912-dc81-9a72e080c...@lab.ntt.co.jp

Justin
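[The behavior under discussion can be seen with a pair of toy tables; this is a sketch based on the thread, and the exact error text may vary by version:]

```sql
CREATE TABLE no_toast (i int);    -- no toastable columns, so no TOAST table
ALTER TABLE no_toast SET (toast.asdf = 128);    -- silently accepted

CREATE TABLE has_toast (t text);  -- text column, so a TOAST table exists
ALTER TABLE has_toast SET (toast.asdf = 128);   -- rejected: unrecognized parameter "asdf"
```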
ALTER TABLE SET (toast.asdf) is not rejected
I noticed that this is accepted:

postgres=# ALTER TABLE t SET (toast.asdf=128);
ALTER TABLE

I thought since "toast" was a core namespace, it would've been rejected?

I recall having read a discussion about verifying these ... I wasn't able to
find what I was thinking of, but found this one.
https://www.postgresql.org/message-id/flat/20090114144332.GF24156%40alvh.no-ip.org

Justin
Re: notes from transition to relkind='p'
On Sun, Jun 03, 2018 at 12:13:49PM -0700, Adrian Klaver wrote:
> On 06/01/2018 03:14 PM, Justin Pryzby wrote:
> >Before I forget any more, this is a brain of issues/considerations/concerns

Should have said brain DUMP

Keep in mind, I've phrased these as notes, maybe useful to someone else, maybe
not an issue to most people, and maybe not possible or worth changing behavior
for, since these are already in last year's published release.

> >with our (partial) transition to partitioned tables over the last ~9 months.  I
> >believe these are all documented behaviors, but could be seen by users as a
> >gratuitious/unexpected change or rough edge and the differences could perhaps
> >be mitigated.  I realize there's maybe no good time or way to change most of
> >these, but maybe the list will be helpful to somebody in avoiding unexpected
> >transitional issues.
>
> It is not clear from above what you are transitioning from.

Transitioning from inheritance-based partitioning with empty relkind='r'
parents to PG10 declarative partitioning.

> > . ANALYZE relkind_p also updates stats of child (whereas ANALYZE relkind_r
> >   only updates stats for parent); it's unclear if there's any reason why it
> >   wasn't always done this way(?).  I end up having to conditionize
> >   processing based on relkind.  3c3bb99330aa9b4c2f6258bfa0265d806bf365c3
>
> Because there is a clear partition dependency in the declarative
> partitioning scheme. In the INHERIT scheme(which I assume is what you are
> talking about) the partitioning is optional.

I see your point - but AFAICT there's no reason why ANALYZE relkind_r couldn't
have always acted "recursively" on any children, or why it couldn't
conceivably have been changed in v10, for consistency, if that was determined
to outweigh the disruptiveness/inconvenience of changing an existing behavior.

> > . The docs say: if detaching/re-attach a partition, should first ADD CHECK
> >   to avoid a slow ATTACH operation. 
> >   Perhaps DETACHing a partition could
> >   implicitly CREATE a constraint which is usable when reATTACHing?
>
> I personally would not like that. If I wanted to maintain the partition
> constraint I would keep the table in the partition. If I was DETACHing it
> then it would be to get rid of it or have it exist as a stand alone table.
> If I were to keep it, deciding what constraints to maintain should be up to
> me.

Noted; for us, the reasons we uninherit/detach (and then inherit/attach) are:

 - to avoid massive space use and a long, disruptive table lock while
   executing a type-promoting ALTER;
 - when ALTERing to a new tablespace, to avoid locking the parent table, which
   is more disruptive than losing visibility of a partition's tuples;

> > . relkind_p has no entry in pg_stat_user_tables (last_analyze, etc).  Maybe
> >   the view could do the needed CASE ... (SELECT min() FROM pg_inherits
> >   JOIN psut)
>
> > . ALTER TABLE ATTACH requires specifying bounds: Maybe it sounds naive to
> >   suggest one would want to avoid that; but consider: we ended up adding
> >   both shell and python logic to parse the table name to allow detaching
> >   and reattaching partitions.  I think it'd be nice if the bounds were
> >   inferred if there was a single constraint on the partition key.
>
> The above I am not sure how you envision that working, especially the
> inferring the key part. Having the program guess at what I want a partition
> to be constrained by is something I would need fully explained to me.

I mean the partition bounds of a partition of a relkind='p' table could
(perhaps) be changed into a CHECK constraint at DETACH time, avoiding a full
table scan and long lock if it's later re-ATTACHED.  If implemented, it could
be requested rather than strictly implied..

ALTER TABLE p DETACH r ADD CONSTRAINT c;
...
ALTER TABLE p ATTACH r FOR VALUES INFERRED BY CONSTRAINT c;
-- must be a check constraint, must be a "simple" list or range with a single
-- column, etc. 
Maybe not feasible for hash partitioning? Justin
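[For comparison, the workaround the docs already describe for declarative partitions looks like the following sketch; the table names, column, and bounds are invented for illustration:]

```sql
ALTER TABLE p DETACH PARTITION r;

-- ... work on r as a standalone table ...

-- A CHECK constraint matching the bounds lets ATTACH PARTITION skip the
-- full-table validation scan:
ALTER TABLE r ADD CONSTRAINT r_bounds
    CHECK (start_time >= '2018-03-01' AND start_time < '2018-04-01');
ALTER TABLE p ATTACH PARTITION r
    FOR VALUES FROM ('2018-03-01') TO ('2018-04-01');
ALTER TABLE r DROP CONSTRAINT r_bounds;  -- redundant once attached
```

The proposal in the message above would, in effect, have DETACH produce and ATTACH consume that constraint automatically.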
notes from transition to relkind='p'
Before I forget any more, this is a brain of issues/considerations/concerns
with our (partial) transition to partitioned tables over the last ~9 months.
I believe these are all documented behaviors, but could be seen by users as a
gratuitous/unexpected change or rough edge and the differences could perhaps
be mitigated.  I realize there's maybe no good time or way to change most of
these, but maybe the list will be helpful to somebody in avoiding unexpected
transitional issues.

. DROP TABLE on a parent no longer fails without CASCADE (dependencies of
  relkind='p' are deptype='a' "soft" dependencies)
  8b4d582d279d784616c228be58af1e39aa430402

. ANALYZE relkind_p also updates stats of child (whereas ANALYZE relkind_r
  only updates stats for parent); it's unclear if there's any reason why it
  wasn't always done this way(?).  I end up having to conditionize processing
  based on relkind.  3c3bb99330aa9b4c2f6258bfa0265d806bf365c3

. The docs say: if detaching/re-attaching a partition, one should first ADD
  CHECK to avoid a slow ATTACH operation.  Perhaps DETACHing a partition could
  implicitly CREATE a constraint which is usable when reATTACHing?

. relkind_p has no entry in pg_stat_user_tables (last_analyze, etc).  Maybe
  the view could do the needed CASE ... (SELECT min() FROM pg_inherits JOIN
  psut)

. ALTER TABLE ATTACH requires specifying bounds: Maybe it sounds naive to
  suggest one would want to avoid that; but consider: we ended up adding both
  shell and python logic to parse the table name to allow detaching and
  reattaching partitions.  I think it'd be nice if the bounds were inferred
  if there was a single constraint on the partition key.

. ALTER TABLE ATTACH has reversed order of child vs parent relative to
  NO/INHERIT.

. And actually, having both ALTER TABLE DE/TACH vs NO/INHERIT is itself messy:
  we ended up having branches (both shell and python) to handle both cases (at
  least for a transitional period, but probably we'll need to continue
  handling both into the indeterminate future). 
Cheers, Justin
\d t: ERROR: XX000: cache lookup failed for relation
Is that considered an actionable problem?  Encountered while trying to
reproduce the vacuum full pg_statistic/toast_2619 bug; while running a loop
around VAC FULL and more in another session:

[1]-  Running  { time sh -ec 'while :; do psql --port 5678 postgres -qc "VACUUM FULL pg_toast.pg_toast_2619"; psql --port 5678 postgres -qc "VACUUM FULL pg_statistic"; done'; date; } &
[2]+  Running  time while :; do psql postgres --port 5678 -c "INSERT INTO t SELECT i FROM generate_series(1,99) i"; sleep 1; for a in `seq 999`; do psql postgres --port 5678 -c "ALTER TABLE t ALTER i TYPE int USING i::int"; sleep 1; psql postgres --port 5678 -c "ALTER TABLE t ALTER i TYPE bigint"; sleep 1; done; psql postgres --port 5678 -c "TRUNCATE t"; sleep 1; done &

[pryzbyj@kgfs66telsadb ~]$ psql --port 5678 postgres -x
psql (11beta1)
Type "help" for help.
...
postgres=# \set VERBOSITY verbose
postgres=# \d t
ERROR:  XX000: cache lookup failed for relation 8096742
LOCATION:  flatten_reloptions, ruleutils.c:11065

Justin
Re: postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_toast_2619
On Sat, May 19, 2018 at 02:39:26PM -0400, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > [pryzbyj@database ~]$ while :; do for db in `psql postgres -Atc "SELECT datname FROM pg_database WHERE datallowconn"`; do for t in pg_statistic pg_attrdef pg_constraint; do echo "$db.$t..."; PGOPTIONS=-cstatement_timeout='9s' psql $db -qc "VACUUM FULL $t"; done; done; done
>
> > ...
> > postgres.pg_statistic...
> > postgres.pg_attrdef...
> > postgres.pg_constraint...
> > template1.pg_statistic...
> > template1.pg_attrdef...
> > template1.pg_constraint...
> > ts.pg_statistic...
> > ERROR: canceling statement due to statement timeout
> > ts.pg_attrdef...
> > ts.pg_constraint...
> > postgres.pg_statistic...
> > ERROR: missing chunk number 0 for toast value 3372855171 in pg_toast_2619
>
> Hm, so was the timeout error happening every time through on that table,
> or just occasionally, or did you provoke it somehow?  I'm wondering how
> your 9s timeout relates to the expected completion time.

Actually statement_timeout isn't essential for this (maybe it helps to trigger
it more often - not sure).  Could you try:

time sh -ec 'while :; do time psql postgres -c "VACUUM FULL VERBOSE pg_toast.pg_toast_2619"; psql postgres -c "VACUUM FULL VERBOSE pg_statistic"; done'; date

Three servers experienced the error within 30min, but one server didn't fail
until 12h later, and a handful of others still haven't failed..

Does this help at all ? 
2018-05-24 21:57:49.98-03  | 5b075f8d.1ad1 | LOG   | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619
2018-05-24 21:57:50.067-03 | 5b075f8d.1ad1 | INFO  | pryzbyj | postgres | vacuuming "pg_toast.pg_toast_2619"
2018-05-24 21:57:50.09-03  | 5b075f8d.1ad1 | INFO  | pryzbyj | postgres | "pg_toast_2619": found 0 removable, 408 nonremovable row versions in 99 pages
2018-05-24 21:57:50.12-03  | 5b075f8e.1ada | LOG   | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_statistic
2018-05-24 21:57:50.129-03 | 5b075f8e.1ada | INFO  | pryzbyj | postgres | vacuuming "pg_catalog.pg_statistic"
2018-05-24 21:57:50.185-03 | 5b075f8e.1ada | ERROR | pryzbyj | postgres | missing chunk number 0 for toast value 3382957233 in pg_toast_2619

Same thing; this server has autovacuum logging, although it's not clear to me
if that's an essential component of the problem, either:

2018-05-24 21:16:39.856-06 | LOG   | 5b078017.7b99 | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_toast.pg_toast_2619
2018-05-24 21:16:39.876-06 | LOG   | 5b078010.7968 |         |          | automatic vacuum of table "postgres.pg_toast.pg_toast_2619": index scans: 1
                           |       |               |         |          | pages: 0 removed, 117 r
2018-05-24 21:16:39.909-06 | INFO  | 5b078017.7b99 | pryzbyj | postgres | vacuuming "pg_toast.pg_toast_2619"
2018-05-24 21:16:39.962-06 | INFO  | 5b078017.7b99 | pryzbyj | postgres | "pg_toast_2619": found 0 removable, 492 nonremovable row versions in 117 pages
2018-05-24 21:16:40.025-06 | LOG   | 5b078018.7b9b | pryzbyj | postgres | statement: VACUUM FULL VERBOSE pg_statistic
2018-05-24 21:16:40.064-06 | INFO  | 5b078018.7b9b | pryzbyj | postgres | vacuuming "pg_catalog.pg_statistic"
2018-05-24 21:16:40.145-06 | ERROR | 5b078018.7b9b | pryzbyj | postgres | missing chunk number 0 for toast value 765874692 in pg_toast_2619

Or this one? 
postgres=# SELECT log_time, database, user_name, session_id, left(message,999) FROM postgres_log WHERE (log_time>='2018-05-24 19:56' AND log_time<'2018-05-24 19:58') AND (database='postgres' OR database IS NULL OR user_name IS NULL OR user_name='pryzbyj') AND message NOT LIKE 'statement:%' ORDER BY 1;

log_time   | 2018-05-24 19:56:35.396-04
database   |
user_name  |
session_id | 5b075131.3ec0
left       | skipping vacuum of "pg_toast_2619" --- lock not available
...
log_time   | 2018-05-24 19:57:35.78-04
database   |
user_name  |
session_id | 5b07516d.445e
left       | automatic vacuum of table "postgres.pg_toast.pg_toast_2619": index scans: 1
           : pages: 0 removed, 85 remain, 0 skipped due to pins, 0 skipped frozen
           : tuples: 1 removed, 348 remain, 0 are dead but not yet removable, oldest xmin: 63803106
           : buffer usage: 179 hits, 4 misses, 87 dirtied
           : avg read rate: 1.450 MB/s, avg write rate: 31.531 MB/s
           : system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.02 s

log_time   | 2018-05-24 19:57:35.879-04
database   | postgres
user_name  | pryzbyj
session_id | 5b07516f.447f
left       | missing chunk number 0 for toast value 624341680 in pg_toast_2619

log_time   | 2018-05-24 19:57:44.332-04
database   |
user_name  |
session_id | 5af9fda3.70d5
left       | checkpoint starting: time

Justin
Re: postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_toast_2619
On Sat, May 19, 2018 at 11:08:23AM -0400, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > I'll defer fixing this for awhile in case someone wants me to save a copy
> > of the relation/toast/index.  From last time, I recall this just needs the
> > right combination of REINDEX/VACUUM/ANALYZE, and the only complication was
> > me needing to realize the right combination of affected DB(s).
>
> If you could come up with such a sequence that causes the problem
> reproducibly, that would be of huge interest, and probably lead to
> a fix promptly.  But I don't think that we can do much by looking
> at the post-mortem state --- once the toast rows are gone, they're
> gone, especially if the table's been vacuumed since.

This is unlikely to allow reproducing it, but for sake of completeness here's
a fuller log.  I'll try to trigger it on another DB.

postgres=# SELECT log_time, database, session_id, left(message,99) FROM postgres_log WHERE log_time BETWEEN '2018-05-19 07:49:01' AND '2018-05-19 07:50' AND (database IS NULL OR database='postgres') ORDER BY 1;

2018-05-19 07:49:02.232-06 |          | 5afbc238.382f | checkpoint complete: wrote 32175 buffers (6.1%); 0 WAL file(s) added, 0 removed, 8 recycled; write=
2018-05-19 07:49:02.261-06 | postgres | 5b002b4e.65f2 | statement: SHOW server_version
2018-05-19 07:49:02.278-06 | postgres | 5b002b4e.65f7 | statement: SELECT pg_get_indexdef('jrn_postgres_log_log_time_idx'::regclass)
2018-05-19 07:49:02.29-06  | postgres | 5b002b4e.65f9 | statement: SELECT 1 FROM information_schema.tables WHERE table_name='postgres_log' LIMIT 1
2018-05-19 07:49:02.311-06 | postgres | 5b002b4e.65fb | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log'
2018-05-19 07:49:02.324-06 | postgres | 5b002b4e.65fd | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log_unique_idx'
2018-05-19 07:49:02.338-06 | postgres | 5b002b4e.65ff | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log_log_time_idx'
2018-05-19 07:49:02.353-06 | postgres | 5b002b4e.6601 | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log_error_severity_idx'
2018-05-19 07:49:02.37-06  | postgres | 5b002b4e.6603 | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log_message_system_idx'
2018-05-19 07:49:02.39-06  | postgres | 5b002b4e.6605 | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log_error_message_idx'
2018-05-19 07:49:02.405-06 | postgres | 5b002b4e.6607 | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log_duration_idx'
2018-05-19 07:49:02.422-06 | postgres | 5b002b4e.6609 | statement: SELECT 1 FROM pg_class WHERE relname='jrn_postgres_log_quotedquoted_idx'
2018-05-19 07:49:02.464-06 | postgres | 5b002b4e.6619 | statement: SELECT 1 FROM pg_class WHERE relname='postgres_log_2018_05_19_0700'
2018-05-19 07:49:02.482-06 | postgres | 5b002b4e.661c | statement: COPY postgres_log_2018_05_19_0700 FROM '/var/log/postgresql/postgresql-2018-05-19_074617
2018-05-19 07:49:04.711-06 | postgres | 5b002b50.6627 | statement: SELECT 1 FROM pg_class WHERE relname='postgres_log_2018_05_19_0700'
2018-05-19 07:49:04.724-06 | postgres | 5b002b50.662a | statement: COPY postgres_log_2018_05_19_0700 FROM '/var/log/postgresql/postgresql-2018-05-19_074643
2018-05-19 07:49:06.803-06 | postgres | 5b002b52.6637 | statement: SELECT pg_get_indexdef('jrn_postgres_log_duration_idx'::regclass)
2018-05-19 07:49:06.837-06 | postgres | 5b002b52.6639 | statement: SELECT inhrelid::regclass::text FROM pg_inherits i LEFT JOIN pg_constraint c ON i.inhrel
2018-05-19 07:49:06.867-06 | postgres | 5b002b52.663b | statement: SELECT inhrelid::regclass::text FROM pg_inherits WHERE inhparent='postgres_log'::regclas
2018-05-19 07:49:06.918-06 | postgres | 5b002b52.6641 | statement: SELECT log_time<now()-'25 hours'::interval FROM postgres_log_2018_05_18_0700 LIMIT 1
2018-05-19 07:49:14.126-06 | postgres | 5b002b5a.66c9 | statement: SELECT DISTINCT ON (session_id) log_time, session_id, replace(regexp_replace(detail,'^(.
2018-05-19 07:49:32.264-06 |          | 5afbc238.382f | checkpoint starting: time
2018-05-19 07:49:33.972-06 |          | 5b002b59.66c1 | automatic analyze of table "ts.public.cdrs_huawei_sgwrecord_2018_05_19" system usage: CPU: user: 6.
2018-05-19 07:49:38.192-06 | postgres | 5b002b72.69d5 | statement: SELECT starelid::regclass, attname FROM pg_statistic s JOIN pg_attribute a ON a.attrel
2018-05-19 07:49:38.232-06 | postgres | 5b002b72.69d8 | statement: DELETE FROM pg_statistic s USING pg_attribute a WHERE a.attrelid=s.starelid AND a.attn
2018-05-19 07:49:38.266-0
postgres=# VACUUM FULL pg_statistic => ERROR: missing chunk number 0 for toast value .. in pg_toast_2619
I recall seeing various discussions hoping that it'd been finally fixed - just
wanted to report that this has happened now under postgres 10.4.  It looks
like this is not related to: 0408e1ed599b06d9bca2927a50a4be52c9e74bb9 which is
for "unexpected chunk number" (?)

Note that this is on the postgres database, which I think is where I saw it on
one of our internal VMs in the past (although my memory indicates that may
have affected multiple DBs).  In the immediate case, this is a customer's
centos6 VM running under qemu/KVM: the same configuration as our internal VM
which had this issue (I just found a ticket dated 2016-10-06).

In case it helps:

 - the postgres database has a few things in it, primarily imported CSV logs.
   On this particular server, there's actually a 150GB table with old CSV logs
   from a script I fixed recently to avoid saving more lines than intended
   (something like: for each session_id, every session_line following an
   error_severity!='LOG')
 - I also have copies of pg_stat_bgwriter, pg_settings, and an aggregated copy
   of pg_buffercache here.
 - nagios: some scripts loop around all DBs; some maybe connect directly to
   postgres (for example, to list DBs).  However, I think check_postgres
   probably doesn't connect to the postgres DB.

I'll defer fixing this for awhile in case someone wants me to save a copy of
the relation/toast/index.  From last time, I recall this just needs the right
combination of REINDEX/VACUUM/ANALYZE, and the only complication was me
needing to realize the right combination of affected DB(s).

Thanks,
Justin
Re: relkind='p' has no pg_stat_user_tables
On Thu, May 03, 2018 at 11:15:19AM -0700, Adrian Klaver wrote:
> On 05/03/2018 10:38 AM, Justin Pryzby wrote:
> >On Thu, May 03, 2018 at 09:31:12AM -0700, Adrian Klaver wrote:
> >>On 05/03/2018 09:20 AM, Alvaro Herrera wrote:
> >>>>https://www.postgresql.org/docs/10/static/sql-createtable.html
> >>>>
> >>>>"A partitioned table is divided into sub-tables (called partitions), which
> >>>>are created using separate CREATE TABLE commands. The partitioned table is
> >>>>itself empty. A data row inserted into the table is routed to a partition
> >>>>based on the value of columns or expressions in the partition key. ... "
> >>>
> >>>Yeah, but I think Justin has a valid question from the POV of the user:
> >>>how can we figure out if we need to re-run analyze on a partitioned
> >>>table, if the time of last analyze is not stored anywhere?
> >>
> >>I agree. The only thing I can think of is, that knowing :
> >>
> >>ANALYZE VERBOSE t;
> >>
> >>walks the inheritance tree, look at the pg_stat_user_tables for one of the
> >>children for the last time analyzed.
> >
> >I think I can make this work for my purposes:
> >
> >SELECT MIN(GREATEST(last_analyze,last_autoanalyze))
> >FROM pg_stat_user_tables psut
> >JOIN pg_inherits i
> >ON i.inhrelid=psut.relid
> >WHERE i.inhparent=...
> >
> >I was about to say that it's perhaps more correct for relkind='r' parents,
> >too.
> >
> >But actually, it looks like for relkind='p', ANALYZE populates stats on child
> >tables in addition to the parent.  For relkind='r', the behavior (introduced
> >in PG9.0 as I recall) is that ANALYZE on parent creates stats only for parent
> >(both "inherited" stats including children, and "ONLY" stats for the
> >potentially-nonempty parent).
> >
> >I guess ability to update child tables' stats is a nice feature, but I'm
> >surprised.  I wonder if that was a deliberate/documented change ?
>
> I was with you until I got to the above. You seem to be comparing apples and
> oranges unless I am missing something. 
Yes, I was surprised about the difference between ANALYZE relkind_p and
relkind_r.  But I see that's a documented behavior I'd missed until now:
https://www.postgresql.org/docs/current/static/sql-analyze.html

|If the specified table is a partitioned table, both the inheritance statistics
|of the partitioned table as a whole and statistics of the individual partitions
|are updated.

Thanks,
Justin
Re: relkind='p' has no pg_stat_user_tables
On Thu, May 03, 2018 at 09:31:12AM -0700, Adrian Klaver wrote:
> On 05/03/2018 09:20 AM, Alvaro Herrera wrote:
> >>https://www.postgresql.org/docs/10/static/sql-createtable.html
> >>
> >>"A partitioned table is divided into sub-tables (called partitions), which
> >>are created using separate CREATE TABLE commands. The partitioned table is
> >>itself empty. A data row inserted into the table is routed to a partition
> >>based on the value of columns or expressions in the partition key. ... "
> >
> >Yeah, but I think Justin has a valid question from the POV of the user:
> >how can we figure out if we need to re-run analyze on a partitioned
> >table, if the time of last analyze is not stored anywhere?
>
> I agree. The only thing I can think of is, that knowing :
>
> ANALYZE VERBOSE t;
>
> walks the inheritance tree, look at the pg_stat_user_tables for one of the
> children for the last time analyzed.

I think I can make this work for my purposes:

SELECT MIN(GREATEST(last_analyze,last_autoanalyze))
FROM pg_stat_user_tables psut
JOIN pg_inherits i
ON i.inhrelid=psut.relid
WHERE i.inhparent=...

I was about to say that it's perhaps more correct for relkind='r' parents,
too.

But actually, it looks like for relkind='p', ANALYZE populates stats on child
tables in addition to the parent.  For relkind='r', the behavior (introduced
in PG9.0 as I recall) is that ANALYZE on parent creates stats only for parent
(both "inherited" stats including children, and "ONLY" stats for the
potentially-nonempty parent).

I guess ability to update child tables' stats is a nice feature, but I'm
surprised.  I wonder if that was a deliberate/documented change ?

Justin
relkind='p' has no pg_stat_user_tables
I (finally) realized that my script for ANALYZEing parents of table
hierarchies every month or so was looping around the same parent tables every
night due to no stats for the date of last analysis.

I guess that's deliberate/known and maybe related to relkind='p' having no
relfilenode.

Is there any good workaround other than making stampfiles or making my own
"last analyzed" table?

Thanks,
Justin
Re: circular wait not triggering deadlock ?
On Thu, Mar 08, 2018 at 03:05:36PM -0500, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > On Thu, Mar 08, 2018 at 01:57:06PM -0500, Tom Lane wrote:
> >> PID 20488 is evidently waiting for PID 6471 to finish its transaction.
> >> What's that one doing?
>
> > Um, I thought I had kept track of all two pids but looks not..
> > query| SELECT pg_export_snapshot();
>
> pg_export_snapshot doesn't block for other transactions, though.
> Further down, you have output that confirms that:
>
> > 6471 | idle in transaction | psql | SELECT pg_export_snapshot();
>
> That SELECT is the most recently completed command, not the current one.
> So now the question is what the connected application is waiting for
> before committing.

I believe it does an idle loop around sleep(1), until all the pg_dump
--snapshot children are done.

Justin
Re: circular wait not triggering deadlock ?
On Thu, Mar 08, 2018 at 01:57:06PM -0500, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > Running pg10.2, I have a handful of maintenance jobs run in the middle of
> > the night, which appear to have gotten stuck waiting on each other..
>
> > ts=# SELECT granted, pid, mode, virtualtransaction, virtualxid FROM pg_locks WHERE virtualxid='22/4317099' ORDER BY 1,2,3;
> >  granted |  pid  |     mode      | virtualtransaction | virtualxid
> > ---------+-------+---------------+--------------------+------------
> >  f       | 20488 | ShareLock     | 19/28401734        | 22/4317099
> >  t       |  6471 | ExclusiveLock | 22/4317099         | 22/4317099
>
> PID 20488 is evidently waiting for PID 6471 to finish its transaction.
> What's that one doing?

Um, I thought I had kept track of all two pids but looks not..
query | SELECT pg_export_snapshot();
So that explains that.  I already killed one proc, but..

ts=# SELECT granted, relation::regclass, pid, mode, virtualtransaction, virtualxid FROM pg_locks WHERE pid=17248 ORDER BY 1;
 granted |         relation         |  pid  |      mode       | virtualtransaction | virtualxid
---------+--------------------------+-------+-----------------+--------------------+------------
 f       | eric_enodeb_cell_metrics | 17248 | AccessShareLock | 27/1755026         |

ts=# SELECT granted, relation::regclass, pid, mode, virtualtransaction, virtualxid FROM pg_locks WHERE relation='eric_enodeb_cell_metrics'::regclass ORDER BY 1;
 granted |         relation         |  pid  |           mode           | virtualtransaction | virtualxid
---------+--------------------------+-------+--------------------------+--------------------+------------
 f       | eric_enodeb_cell_metrics | 22961 | AccessShareLock          | 31/1337307         |
 f       | eric_enodeb_cell_metrics | 17248 | AccessShareLock          | 27/1755026         |
 f       | eric_enodeb_cell_metrics | 28357 | AccessShareLock          | 28/1118276         |
 f       | eric_enodeb_cell_metrics | 21846 | AccessExclusiveLock      | 29/830016          |
 f       | eric_enodeb_cell_metrics | 16592 | AccessExclusiveLock      | 18/38156962        |
 f       | eric_enodeb_cell_metrics | 18941 | AccessShareLock          | 32/838769          |
 t       | eric_enodeb_cell_metrics |   530 | ShareUpdateExclusiveLock | 16/38159763        |
(7 rows)

ts=# SELECT pid, state, left(query,199) FROM pg_stat_activity WHERE pid IN(530,17248);
   530 | active | ALTER TABLE eric_enodeb_cell_metrics* ALTER start_time SET STATISTICS 400
 17248 | active | SELECT c.tableoid, c.oid, c.relname, (SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM (SELECT acl, row_n FROM pg_catalog.unnest(coalesce(c.relacl,pg_catalog.acldefault( CASE WHEN c.relkind = 'S'

ts=# SELECT granted, pid, mode, virtualtransaction, virtualxid FROM pg_locks WHERE relation='eric_enodeb_cell_20180304'::regclass;
 granted |  pid  |           mode           | virtualtransaction | virtualxid
---------+-------+--------------------------+--------------------+------------
 t       | 20488 | ShareUpdateExclusiveLock | 19/28401734        |
 f       |   530 | ShareUpdateExclusiveLock | 16/38159763        |

ts=# SELECT pid, state, left(query,199) FROM pg_stat_activity WHERE pid=20488;
  pid  | state  | left
-------+--------+------
 20488 | active | CREATE INDEX CONCURRENTLY index_1113966210 ON eric_enodeb_cell_20180304 USING btree (site_id) WITH (fillfactor='100') TABLESPACE oldindex

ts=# SELECT granted, pid, relation::regclass, locktype, mode virtualxid, virtualxid, virtualtransaction FROM pg_locks WHERE pid=20488 ORDER BY 1;
 granted |  pid  | relation |  locktype  | virtualxid | virtualxid | virtualtransaction
---------+-------+----------+------------+------------+------------+--------------------
 f       | 20488 |          | virtualxid | ShareLock  | 22/4317099 | 19/28401734

ts=# SELECT granted, pid, relation::regclass, locktype, mode virtualxid, virtualtransaction FROM pg_locks WHERE virtualxid='22/4317099' ORDER BY 1;
 granted |  pid  | relation |  locktype  |  virtualxid   | virtualtransaction
---------+-------+----------+------------+---------------+--------------------
 f       | 20488 |          | virtualxid | ShareLock     | 19/28401734
 t       |  6471 |          | virtualxid | ExclusiveLock | 22/4317099
(2 rows)

So...I gather ALTER SET STATS is waiting on pg_dump which is waiti
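[One way to map wait chains like the one above without joining pg_locks by hand is pg_blocking_pids(), available since 9.6; a sketch (note it reports heavyweight-lock waits only):]

```sql
-- For each waiting backend, list the PIDs it is blocked behind.
SELECT a.pid,
       a.state,
       pg_blocking_pids(a.pid) AS blocked_by,
       left(a.query, 60) AS query
FROM pg_stat_activity a
WHERE cardinality(pg_blocking_pids(a.pid)) > 0;
```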
Re: backends stuck in "startup"
On Wed, Nov 22, 2017 at 07:43:50PM -0500, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > For starters, I found that PID 27427 has:
>
> > (gdb) p proc->lwWaiting
> > $1 = 0 '\000'
> > (gdb) p proc->lwWaitMode
> > $2 = 1 '\001'
>
> To confirm, this is LWLockAcquire's "proc", equal to MyProc?
> If so, and if LWLockAcquire is blocked at PGSemaphoreLock,
> that sure seems like a smoking gun.

Right:

(gdb) bt
#0  0x003f6ee0d930 in ?? ()
#1  0x7f19f3400d38 in ?? ()
#2  0x006a4440 in PGSemaphoreLock (sema=0x7f19f3400d38) at pg_sema.c:310
#3  0x00714495 in LWLockAcquire (lock=0x7f19f3404f80, mode=LW_SHARED) at lwlock.c:1233
#4  0x006f75e5 in BufferAlloc (smgr=0x1f3c368, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=1, mode=RBM_NORMAL, strategy=0x0, hit=0x7fff6ef1c5af "") at bufmgr.c:1012
#5  ReadBuffer_common (smgr=0x1f3c368, relpersistence=112 'p', forkNum=MAIN_FORKNUM, blockNum=1, mode=RBM_NORMAL, strategy=0x0, hit=0x7fff6ef1c5af "") at bufmgr.c:745
[...]
(gdb) up
#1  0x7f19f3400d38 in ?? ()
(gdb) up
#2  0x006a4440 in PGSemaphoreLock (sema=0x7f19f3400d38) at pg_sema.c:310
310             errStatus = sem_wait(PG_SEM_REF(sema));
(gdb) up
#3  0x00714495 in LWLockAcquire (lock=0x7f19f3404f80, mode=LW_SHARED) at lwlock.c:1233
1233                    PGSemaphoreLock(proc->sem);
(gdb) p proc->lwWaiting
$1 = 0 '\000'
(gdb) p proc->lwWaitMode
$2 = 1 '\001'
(gdb) p proc
$3 = (PGPROC *) 0x7f1a77dba500
(gdb) p MyProc
$4 = (PGPROC *) 0x7f1a77dba500
(gdb) p MyProc==proc
$5 = 1

I suppose one also needs to show that this backend was stuck, and didn't "just happen to start 1ms before gdb connected", which I actually have:

postgres 27427 17665  0 16:57 ?        00:00:00 postgres: indigo indigo 207.241.169.248(37226) startup
...
postgres 30298 17665  0 17:01 ?        00:00:00 postgres: main main 207.241.169.100(43088) startup

..so that process was in "startup" 3-4 minutes after being forked.

> My hypothesis about a missed memory barrier would imply that there's (at
> least) one process that's waiting but is not in the lock's wait queue and

Do I have to also check the wait queue to verify?  Give a hint/pointer, please?

> > Would you suggest how I can maximize the likelihood/speed of triggering that?
> > Five years ago, with a report of similar symptoms, you said "You need to hack
> > pgbench to suppress the single initialization connection it normally likes to
> > make, else the test degenerates to the one-incoming-connection case"
> > https://www.postgresql.org/message-id/8896.1337998337%40sss.pgh.pa.us
>
> I don't think that case was related at all.
>
> My theory suggests that any contended use of an LWLock is at risk,
> in which case just running pgbench with about as many sessions as
> you have in the live server ought to be able to trigger it.  However,
> that doesn't really account for your having observed the problem
> only during session startup,

Remember, this issue breaks existing sessions, too.  After it'd happened a couple of times, I made sure to leave a session open to allow collecting diagnostics (and to try to un-stick it), but when it recurred, I was unable to even SELECT * FROM pg_locks.

Am I right that this won't help for LWLocks?
ALTER SYSTEM SET log_lock_waits=yes;

Justin
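[Editorial aside: checking proc->lwWaiting across dozens of cores by hand, as described above, is tedious.  A hypothetical helper, not from the thread, could script it; the gdb commands below are the ones from the session above, the PIDs are illustrative, and the script only prints the gdb invocations rather than running them, since in a real session you would also need to select the LWLockAcquire frame first so that "proc" and "lock" are in scope.]

```shell
# Write the per-backend inspection commands into a gdb batch file, then
# print (not run) one "gdb -batch" command line per stuck backend.
cat > /tmp/lwlock.gdb <<'EOF'
p proc->lwWaiting
p proc->lwWaitMode
p proc == MyProc
p *lock
EOF
for pid in 27427 21784 21768; do    # hypothetical stuck backend PIDs
    echo "gdb -batch -x /tmp/lwlock.gdb -p $pid"
done
```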
Re: backends stuck in "startup"
On Wed, Nov 22, 2017 at 01:27:12PM -0500, Tom Lane wrote:
> Justin Pryzby <pry...@telsasoft.com> writes:
> > On Tue, Nov 21, 2017 at 03:40:27PM -0800, Andres Freund wrote:
> >> Could you try stracing next time?
>
> > I straced all the "startup" PIDs, which were all in futex, without exception:
>
> If you've got debug symbols installed, could you investigate the states
> of the LWLocks the processes are stuck on?
>
> My hypothesis about a missed memory barrier would imply that there's (at
> least) one process that's waiting but is not in the lock's wait queue and
> has MyProc->lwWaiting == false, while the rest are in the wait queue and
> have MyProc->lwWaiting == true.  Actually chasing through the list
> pointers would be slightly tedious, but checking MyProc->lwWaiting,
> and maybe MyProc->lwWaitMode, in each process shouldn't be too hard.
> Also verify that they're all waiting for the same LWLock (by address).

I believe my ~40 cores are actually for backends from two separate instances of this issue on the VM, as evidenced by different argv pointers.  And for each instance, I have cores for only a fraction of the backends (max_connections=400).

For starters, I found that PID 27427 has:

(gdb) p proc->lwWaiting
$1 = 0 '\000'
(gdb) p proc->lwWaitMode
$2 = 1 '\001'

..where all the others have lwWaiting=1.

For #27427:

(gdb) p *lock
$27 = {tranche = 59, state = {value = 1627389952}, waiters = {head = 147, tail = 308}}
(gdb) info locals
mustwait = 1 '\001'
proc = 0x7f1a77dba500
result = 1 '\001'
extraWaits = 0
__func__ = "LWLockAcquire"

And at this point I have to ask for help on how to finish traversing these structures.  I could upload cores for someone (I don't think there's anything too private), but so far I have 16GB of gz-compressed cores.

Note: I've locally compiled PG 10.1 with PREFERRED_SEMAPHORES=SYSV to keep the service up (and to the degree that serves to verify that this avoids the issue, great).

But I could start an instance running pgbench to try to trigger this on the VM, with smaller shared_buffers and fewer backends/clients, to allow full cores of every backend (I don't think I'll be able to dump all 400 cores, each up to 2GB, from the production instance).

Would you suggest how I can maximize the likelihood/speed of triggering that?

Five years ago, with a report of similar symptoms, you said "You need to hack pgbench to suppress the single initialization connection it normally likes to make, else the test degenerates to the one-incoming-connection case"
https://www.postgresql.org/message-id/8896.1337998337%40sss.pgh.pa.us

..but pgbench --connect seems to do what's needed(?)  (I see that dates back to 2001, having been added at ba708ea3.)

(I don't know of any suggestion or reason to believe the bug is specific to the connection/startup phase, or that that phase is necessary or sufficient to hit the bug, but it's at least known to be impacted, and it's all I have to go on for now.)

Justin
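[Editorial aside: the sort of pgbench invocation discussed above might look like the sketch below.  -C/--connect makes pgbench open a fresh connection per transaction (stressing connection startup) and -S runs the built-in select-only script; both are real pgbench options, but the client/thread/duration numbers and the database name are illustrative, so the command is echoed rather than run.]

```shell
# Sketch only: stress connection startup with one fresh connection per
# transaction.  Numbers and "testdb" are placeholders, not from the thread.
cmd="pgbench -C -S -c 400 -j 8 -T 600 testdb"
echo "$cmd"
```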
Re: backends stuck in "startup"
On Tue, Nov 21, 2017 at 03:40:27PM -0800, Andres Freund wrote:
> Hi,
>
> On 2017-11-21 17:09:26 -0600, Justin Pryzby wrote:
> > I'm sorry to report this previously reported problem is happening again,
> > starting shortly after pg_upgrading a customer to PG10.1 from 9.6.5.
> >
> > As $subject: backends are stuck in startup for minutes at a time.  I didn't
> > strace this time, but I believe last time I saw one was waiting in a futex.
>
> > - I'm planning to disable sync seqscan for the night (although as I understand
> >   that may not matter (?))
>
> I doubt it matters here.

The system still got wedged/stuck after a few hours, even with: synchronize_seqscans=off; use of pg_buffercache disabled (though the extension was not actually dropped); and our maintenance (vacuum and reindex) scripts disabled.  (These normally run only at night, so at least the REINDEX script wasn't involved in several earlier incidents of this issue; but I was still running the vacuum ANALYZE script post-upgrade.)

Justin
Re: backends stuck in "startup"
On Tue, Nov 21, 2017 at 03:40:27PM -0800, Andres Freund wrote:
> Hi,
>
> On 2017-11-21 17:09:26 -0600, Justin Pryzby wrote:
> > I'm sorry to report this previously reported problem is happening again,
> > starting shortly after pg_upgrading a customer to PG10.1 from 9.6.5.
> >
> > As $subject: backends are stuck in startup for minutes at a time.  I didn't
> > strace this time, but I believe last time I saw one was waiting in a futex.
>
> Could you try stracing next time?

I straced all the "startup" PIDs, which were all in futex, without exception:

[pryzbyj@TS-DB ~]$ sudo strace `ps h --sort start_time -o pid,args -u postgres |awk '/startup$/{print "-p"$1}'` 2>&1 |tee -a /tmp/strace-pg-backends
Process 3611 attached
Process 6081 attached
[pid  6081] futex(0x7ff4314000f8, FUTEX_WAIT, 0, NULL
[pid  3611] futex(0x7ff431400138, FUTEX_WAIT, 0, NULL
[pid  3354] futex(0x7ff431400258, FUTEX_WAIT, 0, NULL
[pid  3117] futex(0x7ff4314002b8, FUTEX_WAIT, 0, NULL
[pid  2948] futex(0x7ff431400378, FUTEX_WAIT, 0, NULL
[pid  2925] futex(0x7ff4314003d8, FUTEX_WAIT, 0, NULL
[pid  2789] futex(0x7ff431400438, FUTEX_WAIT, 0, NULL
[pid  2752] futex(0x7ff4314004b8, FUTEX_WAIT, 0, NULL
[pid  2663] futex(0x7ff431400618, FUTEX_WAIT, 0, NULL
[pid  2661] futex(0x7ff431400638, FUTEX_WAIT, 0, NULL
[pid  2624] futex(0x7ff431400678, FUTEX_WAIT, 0, NULL
[pid  2576] futex(0x7ff4314006f8, FUTEX_WAIT, 0, NULL
[pid   365] futex(0x7ff431400718, FUTEX_WAIT, 0, NULL
[pid   326] futex(0x7ff431400798, FUTEX_WAIT, 0, NULL
[pid 32704] futex(0x7ff431400898, FUTEX_WAIT, 0, NULL
[pid 32309] futex(0x7ff431400938, FUTEX_WAIT, 0, NULL
[pid 32292] futex(0x7ff431400a18, FUTEX_WAIT, 0, NULL
[pid 32170] futex(0x7ff431400ad8, FUTEX_WAIT, 0, NULL
[pid 32134] futex(0x7ff431400b58, FUTEX_WAIT, 0, NULL
[pid 31987] futex(0x7ff431400bf8, FUTEX_WAIT, 0, NULL
[pid 31962] futex(0x7ff431400c58, FUTEX_WAIT, 0, NULL
[pid 31817] futex(0x7ff431400d38, FUTEX_WAIT, 0, NULL
[pid 31807] futex(0x7ff431400d98, FUTEX_WAIT, 0, NULL
[pid 29664] futex(0x7ff431400e98, FUTEX_WAIT, 0, NULL
[pid 29462] futex(0x7ff431400ef8, FUTEX_WAIT, 0, NULL
[pid 29426] futex(0x7ff431400f38, FUTEX_WAIT, 0, NULL
[pid 29258] futex(0x7ff431401058, FUTEX_WAIT, 0, NULL
[pid 29227] futex(0x7ff431402f58, FUTEX_WAIT, 0, NULL
[pid 29107] futex(0x7ff431401138, FUTEX_WAIT, 0, NULL
[pid 29071] futex(0x7ff431401198, FUTEX_WAIT, 0, NULL
[pid 28942] futex(0x7ff431401258, FUTEX_WAIT, 0, NULL
[pid 28924] futex(0x7ff431401298, FUTEX_WAIT, 0, NULL
[pid 26942] futex(0x7ff431401418, FUTEX_WAIT, 0, NULL
[pid 25278] futex(0x7ff431401438, FUTEX_WAIT, 0, NULL
[pid 24909] futex(0x7ff431401598, FUTEX_WAIT, 0, NULL
[pid 24908] futex(0x7ff4314015b8, FUTEX_WAIT, 0, NULL
[pid 24852] futex(0x7ff4314015f8, FUTEX_WAIT, 0, NULL
[pid 24851] futex(0x7ff431401618, FUTEX_WAIT, 0, NULL
[pid 24585] futex(0x7ff4314016f8, FUTEX_WAIT, 0, NULL
[pid 24584] futex(0x7ff431401718, FUTEX_WAIT, 0, NULL
[pid 24478] futex(0x7ff431401918, FUTEX_WAIT, 0, NULL
[pid 24476] futex(0x7ff431401958, FUTEX_WAIT, 0, NULL
[pid 24470] futex(0x7ff4314019d8, FUTEX_WAIT, 0, NULL
[pid 24464] futex(0x7ff431401ad8, FUTEX_WAIT, 0, NULL
[pid 24458] futex(0x7ff431401bb8, FUTEX_WAIT, 0, NULL
[pid 24456] futex(0x7ff431401b98, FUTEX_WAIT, 0, NULL
[pid 24450] futex(0x7ff431401c58, FUTEX_WAIT, 0, NULL
[pid 24448] futex(0x7ff431401c98, FUTEX_WAIT, 0, NULL
[pid 24439] futex(0x7ff431401cf8, FUTEX_WAIT, 0, NULL
[pid 24323] futex(0x7ff431401e38, FUTEX_WAIT, 0, NULL
[pid 24322] futex(0x7ff431401e58, FUTEX_WAIT, 0, NULL
[pid 23057] futex(0x7ff431401fd8, FUTEX_WAIT, 0, NULL
[pid 23055] futex(0x7ff431402038, FUTEX_WAIT, 0, NULL
[pid 21826] futex(0x7ff4314021f8, FUTEX_WAIT, 0, NULL
[pid 21824] futex(0x7ff431402238, FUTEX_WAIT, 0, NULL
[pid 21794] futex(0x7ff431402298, FUTEX_WAIT, 0, NULL
[pid 21792] futex(0x7ff431402638, FUTEX_WAIT, 0, NULL
[pid 21791] futex(0x7ff431402af8, FUTEX_WAIT, 0, NULL
[pid 21790] futex(0x7ff431402c98, FUTEX_WAIT, 0, NULL
[pid 21786] futex(0x7ff431402478, FUTEX_WAIT, 0, NULL
[pid 21785] futex(0x7ff431402758, FUTEX_WAIT, 0, NULL
[pid 21784] futex(0x7ff431403098, FUTEX_WAIT, 0, NULL
[pid 21768] futex(0x7ff431402718, FUTEX_WAIT, 0, NULL
[pid 21662] futex(0x7ff431402338, FUTEX_WAIT, 0, NULL
[pid 21648] futex(0x7ff431402c38, FUTEX_WAIT, 0, NULL
[pid 21630] futex(0x7ff4314024d8, FUTEX_WAIT, 0, NULL
[pid 21591] futex(0x7ff431402278, FUTEX_WAIT, 0, NULL
[pid 21590] futex(0x7ff431402bf8, FUTEX_WAIT, 0, NULL

Also:

[pryzbyj@TS-DB ~]$ date; ps -O lstart,wchan=99 -u postgres --sort start_time |grep startup$
Wed Nov 22 00:47:41 EST 2017
 1990 Wed Nov 22 00:36:44 2017 futex_wait_queue_me  S ?        00:00:00 postgres: xx xx xx.247(53934) startup
 2007 Wed Nov 22 00:36:44 2017 futex_wait_queue_me  S ?        00:00:00 postgres: xx xx xx.247(53945) startup
 2014 Wed Nov 22 00:36:44 2017 futex_wait_queue_me  S ?
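[Editorial aside: the awk filter in the strace command line above selects the PIDs of backends whose argv ends in "startup".  Here is a runnable sketch of that PID selection applied to canned ps-style output, so it can be tried without a live cluster; the real command feeds `ps h --sort start_time -o pid,args -u postgres` into the same awk.]

```shell
# Canned stand-in for "ps h -o pid,args -u postgres" output; PIDs and
# addresses are made up for illustration.
ps_output='12345 postgres: main main 10.0.0.1(4321) startup
12346 postgres: main main 10.0.0.2(4322) idle
12347 postgres: xx xx 10.0.0.3(4323) startup'

# Emit one "-pPID" argument (strace's attach flag) per backend stuck in startup.
printf '%s\n' "$ps_output" | awk '/startup$/{print "-p"$1}'
```

With the sample input this prints -p12345 and -p12347, skipping the idle backend.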