Hi hackers,

In another thread, "[HACKERS] Block level parallel vacuum" [1], Prabhat Kumar Sahu reported a random assert failure, but he hit it only once and was not able to reproduce it. In that thread [2], Amit Kapila suggested some ways to reproduce the assert. I tried them and was able to reproduce it consistently.

Below are the steps to reproduce the assert:

*Configure setting:*
log_min_messages=debug1
autovacuum_naptime = 5s
autovacuum = on

postgres=# create temporary table temp1(c1 int);
CREATE TABLE
postgres=# \d+
                          List of relations
  Schema   | Name  | Type  |  Owner   | Persistence |  Size   | Description
-----------+-------+-------+----------+-------------+---------+-------------
 pg_temp_3 | temp1 | table | mahendra | temporary   | 0 bytes |
(1 row)

postgres=# drop schema pg_temp_3 cascade;
NOTICE:  drop cascades to table temp1
DROP SCHEMA
postgres=# \d+
Did not find any relations.
postgres=# create temporary table temp2(c1 int);
CREATE TABLE
postgres=# \d+
Did not find any relations.
postgres=# select pg_sleep(6);
WARNING:  terminating connection because of crash of another server process
DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
postgres=#

*Stack Trace:*
elinux-2.5-12.el7.x86_64 libxml2-2.9.1-6.el7_2.3.x86_64 openssl-libs-1.0.2k-12.el7.x86_64 pcre-8.32-17.el7.x86_64 xz-libs-5.2.2-1.el7.x86_64 zlib-1.2.7-17.el7.x86_64
(gdb) bt
#0  0x00007f80b2ef9277 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f80b2efa968 in __GI_abort () at abort.c:90
#2  0x0000000000ecdd4e in ExceptionalCondition (conditionName=0x11a9bcb "strvalue != NULL", errorType=0x11a9bbb "FailedAssertion", fileName=0x11a9bb0 "snprintf.c", lineNumber=442) at assert.c:67
#3  0x0000000000f80122 in dopr (target=0x7ffe902e44d0, format=0x10e8fe5 ".%s\"", args=0x7ffe902e45b8) at snprintf.c:442
#4  0x0000000000f7f821 in pg_vsnprintf (str=0x18cd480 "autovacuum: dropping orphan temp table \"postgres.", '\177' <repeats 151 times>..., count=1024, fmt=0x10e8fb8 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffe902e45b8) at snprintf.c:195
#5  0x0000000000f74cb3 in pvsnprintf (buf=0x18cd480 "autovacuum: dropping orphan temp table \"postgres.", '\177' <repeats 151 times>..., len=1024, fmt=0x10e8fb8 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffe902e45b8) at psprintf.c:110
#6  0x0000000000f7775b in appendStringInfoVA (str=0x7ffe902e45d0, fmt=0x10e8fb8 "autovacuum: dropping orphan temp table \"%s.%s.%s\"", args=0x7ffe902e45b8) at stringinfo.c:149
#7  0x0000000000ecf5de in errmsg (fmt=0x10e8fb8 "autovacuum: dropping orphan temp table \"%s.%s.%s\"") at elog.c:832
#8  0x0000000000aef625 in do_autovacuum () at autovacuum.c:2253
#9  0x0000000000aedfae in AutoVacWorkerMain (argc=0, argv=0x0) at autovacuum.c:1693
#10 0x0000000000aed82f in StartAutoVacWorker () at autovacuum.c:1487
#11 0x0000000000b1773a in StartAutovacuumWorker () at postmaster.c:5562
#12 0x0000000000b16c13 in sigusr1_handler (postgres_signal_arg=10) at postmaster.c:5279
#13 <signal handler called>
#14 0x00007f80b2fb8c53 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:81
#15 0x0000000000b0da27 in ServerLoop () at postmaster.c:1691
#16 0x0000000000b0cfa2 in PostmasterMain (argc=3, argv=0x18cb290) at postmaster.c:1400
#17 0x000000000097868a in main (argc=3, argv=0x18cb290) at main.c:210

The assert is raised from this ereport() call in do_autovacuum():

ereport(LOG,
        (errmsg("autovacuum: dropping orphan temp table \"%s.%s.%s\"",
                get_database_name(MyDatabaseId),
                get_namespace_name(classForm->relnamespace),
                NameStr(classForm->relname))));

I debugged and found that "get_namespace_name(classForm->relnamespace)" returned NULL, so formatting the message tripped the "strvalue != NULL" assert in snprintf.c.

This bug was introduced or exposed by the following commit:

*commit 246a6c8f7b237cc1943efbbb8a7417da9288f5c4*
Author: Michael Paquier <mich...@paquier.xyz>
Date:   Mon Aug 13 11:49:04 2018 +0200

    Make autovacuum more aggressive to remove orphaned temp tables

    Commit dafa084, added in 10, made the removal of temporary orphaned
    tables more aggressive.  This commit makes an extra step into the

Before the above commit, we were not getting any assert failure, but \d+ was not showing any temp table info after "drop schema pg_temp_3 cascade" (for tables created after dropping the schema).

As per my analysis, while dropping the schema of a temporary table we do not reset myTempNamespace to invalid, so when a temporary table is created again in the same session, we do not create a proper schema for it.

We can fix this problem in either of two ways:
1) reset myTempNamespace to invalid while dropping the schema of a temp table
2) do not allow dropping the temporary table schema

Please let me know your thoughts on how to fix this problem.

[1]: https://www.postgresql.org/message-id/CANEvxPorfG2Ck3kuDkm5tWpK%2B3uCzRiibOJ-Lk4ZJ6wHP4KJfA%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/CAA4eK1L-Y7vyo%2BypH55kFHy1HS%3D4h1ZWQ%2B5fthKBgOdQzz4hOw%40mail.gmail.com

Thanks and Regards
Mahendra Singh Thalor
EnterpriseDB: http://www.enterprisedb.com
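
PS: To illustrate why the NULL matters, here is a minimal standalone C sketch of a defensive check made before the log message is formatted. It is separate from the two options above and is not PostgreSQL code: lookup_namespace_name() and format_orphan_msg() are hypothetical names, and the OID 16385 is made up for the example. Only the message format string is taken from the ereport() call in this mail.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for a catalog lookup such as get_namespace_name():
 * returns NULL when the namespace OID no longer exists. The OID value is
 * invented for this sketch.
 */
static const char *
lookup_namespace_name(unsigned int nspoid)
{
    return (nspoid == 16385) ? "pg_temp_3" : NULL;
}

/*
 * Format the log message only if every name resolved.
 * Returns 0 on success, -1 if any name is NULL or the buffer is too small.
 * The caller can then skip logging (or log the OID instead) rather than
 * passing NULL to a "%s" conversion.
 */
static int
format_orphan_msg(char *buf, size_t len, const char *dbname,
                  const char *nspname, const char *relname)
{
    int n;

    assert(buf != NULL && len > 0);

    if (dbname == NULL || nspname == NULL || relname == NULL)
        return -1;

    n = snprintf(buf, len,
                 "autovacuum: dropping orphan temp table \"%s.%s.%s\"",
                 dbname, nspname, relname);

    /* snprintf returns the untruncated length; treat truncation as failure */
    return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

Calling format_orphan_msg() with the result of lookup_namespace_name() for a dropped namespace returns -1 instead of reaching the "%s" conversion with a NULL pointer. Note that this only avoids the crash in the logging path; it does not address the stale myTempNamespace described above.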