Hi, On 2019-06-15 12:09:50 -0400, Alvaro Herrera wrote: > Once in a blue moon I get this assertion failure on server start: > > 2019-06-15 12:00:29.650 -04 [30080] LOG: iniciando PostgreSQL 12beta1 on > x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.4.0-1ubuntu1~18.04) 7.4.0, > 64-bit > 2019-06-15 12:00:29.650 -04 [30080] LOG: escuchando en la dirección IPv4 > «127.0.0.1», port 55432 > 2019-06-15 12:00:29.650 -04 [30080] LOG: escuchando en el socket Unix > «/tmp/.s.PGSQL.55432» > 2019-06-15 12:00:29.658 -04 [30956] LOG: el sistema de bases de datos fue > apagado en 2019-06-15 12:00:24 -04 > 2019-06-15 12:00:29.659 -04 [30080] LOG: proceso de servidor (PID 30107) > terminó con código de salida 15 > 2019-06-15 12:00:29.659 -04 [30080] LOG: terminando todos los otros procesos > de servidor activos > TRAP: FailedAssertion(«!(AbortStartTime == 0)», Archivo: > «/pgsql/source/master/src/backend/postmaster/postmaster.c», Línea: 2957) > Aborted (core dumped) > > Apologies for the Spanish -- I cannot readily reproduce this. In > essence, this shows a normal startup, until suddenly process 30107 > terminates with exit code 15, and then while shutting everything down, > postmaster hits the aforementioned assertion and terminates.
I assume this is on master as of a few days ago? This doesn't even look to be *after* a crash-restart? And I assume core files weren't enabled? > One problem with debugging this is that I don't know what process 30107 > is, since the logs don't mention it. Hm - it probably can't be that many processes, it looks like 30107 has to have started pretty soon after the startup process (which IIRC is the one emitting "el sistema de bases de datos fue apagado en"), and as soon as that's done 30107 is noticed as having crashed. Unfortunately, as this appears to be a start in a clean database, we don't really know which phase of startup this is. There's IIRC no messages to be expected before "database system is ready to accept connections" in a clean start. What is a bit odd is that: > 2019-06-15 12:00:29.659 -04 [30080] LOG: proceso de servidor (PID 30107) > terminó con código de salida 15 comes from: #. translator: %s is a noun phrase describing a child process, such as #. "server process" #: postmaster/postmaster.c:3656 #, c-format msgid "%s (PID %d) exited with exit code %d" msgstr "%s (PID %d) terminó con código de salida %d" #: postmaster/postmaster.c:3301 postmaster/postmaster.c:3321 #: postmaster/postmaster.c:3328 postmaster/postmaster.c:3346 msgid "server process" msgstr "proceso de servidor" And "server process" is afaict only used for actual backends, not other types of processes. But we've not yet seen "database system is ready to accept accept connections", so IIRC it could only be a "dead_end" type backend? But we didn't yet see an error from that... > No idea what is going on. Seems to indicate a logic error in postmaster's state machine. Perhaps something related to dead_end processes? Greetings, Andres Freund