Hi,

Stefan Kaltenbrunner reported a problem in postmaster to me via IM. I thought I had nailed down the bug, but after a more careful reading of the code, it turns out I was wrong.

The reported problem is that postmaster shuts itself down with this error message:

    2010-11-12 10:19:05 CET FATAL: no free slots in PMChildFlags array

I thought that canAcceptConnections() was confused about what the result of CountChildren() meant, but apparently not.

This is a change from the 8.3 code, which didn't have the ChildSlots stuff. Previously, if canAcceptConnections() failed to report CAC_TOOMANY, it would just fail later when trying to add the backend to the shared-inval queue, as stated in the comment therein. In the new code, however, failure to keep an accurate count means that we fail later in AssignPostmasterChildSlot() with a FATAL error, leading to overall shutdown.

In postmaster.c, this all happens before forking, so I see no way for the system to be confused due to multiple processes starting in parallel.

If you suspect that this may have to do with some race condition on starting many backends quickly, you would probably be right. The evidence from the log (which thankfully is set to DEBUG3, though most other settings about it seem to be rather broken) says that there were many backends starting just before the FATAL message:

    2010-11-12 10:18:55 CET DEBUG: forked new backend, pid=2632 socket=348
    2010-11-12 10:18:55 CET DEBUG: forked new backend, pid=840 socket=348
    2010-11-12 10:18:55 CET DEBUG: forked new backend, pid=2972 socket=348
    2010-11-12 10:18:55 CET DEBUG: forked new backend, pid=2724 socket=348
    2010-11-12 10:18:57 CET DEBUG: forked new backend, pid=840 socket=348
    2010-11-12 10:18:57 CET DEBUG: forked new backend, pid=2724 socket=348
    2010-11-12 10:18:57 CET DEBUG: forked new backend, pid=2632 socket=348
    2010-11-12 10:19:00 CET DEBUG: forked new backend, pid=2724 socket=348
    2010-11-12 10:19:01 CET DEBUG: forked new backend, pid=2972 socket=348
    2010-11-12 10:19:01 CET DEBUG: forked new backend, pid=2724 socket=348
    2010-11-12 10:19:02 CET DEBUG: forked new backend, pid=2984 socket=348
    2010-11-12 10:19:02 CET DEBUG: forked new backend, pid=840 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2984 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=840 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2984 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2972 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=840 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2724 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2972 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2904 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=840 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=1280 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2984 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=2904 socket=348
    2010-11-12 10:19:04 CET DEBUG: forked new backend, pid=840 socket=348
    2010-11-12 10:19:05 CET DEBUG: forked new backend, pid=2724 socket=348

This is Windows 2000 Server, so I guess the PIDs being reused rather quickly is not something to worry particularly about. (Also note that log_line_prefix does not include the PID, so it's not easy to learn much more from the log, according to Stefan.)

--
Álvaro Herrera <alvhe...@alvh.no-ip.org>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
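
P.S. To make the failure mode described above concrete, here is a toy model in C. It is not PostgreSQL source: NUM_SLOTS, counted_children, can_accept(), assign_slot(), start_backend() and finish_backend() are all hypothetical names invented for this sketch, and the miscount is injected by hand. It only shows why an admission check that trusts a count, followed by a slot search that treats "no free slot" as fatal, turns any drift between the two into a shutdown. It says nothing about how the real count goes wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical size; stands in for the length of the child-slot array. */
#define NUM_SLOTS 4

enum { START_REJECTED = -1, START_FATAL = -2 };

static bool slot_in_use[NUM_SLOTS];   /* the slot array itself */
static int  counted_children = 0;     /* what the admission check believes */

/* Admission check: the role the post gives to canAcceptConnections(). */
static bool can_accept(void)
{
    return counted_children < NUM_SLOTS;
}

/* Slot search: returns a free slot index, or -1 if none is left. */
static int assign_slot(void)
{
    for (int i = 0; i < NUM_SLOTS; i++)
    {
        if (!slot_in_use[i])
        {
            slot_in_use[i] = true;
            return i;
        }
    }
    return -1;
}

/*
 * Start one backend. A full count is a soft rejection ("too many
 * connections"); a failed slot search after the check passed is the
 * case that the real server reports as FATAL.
 */
static int start_backend(void)
{
    if (!can_accept())
        return START_REJECTED;

    int slot = assign_slot();
    if (slot < 0)
    {
        fprintf(stderr, "model: no free slots although count says %d of %d\n",
                counted_children, NUM_SLOTS);
        return START_FATAL;
    }
    counted_children++;
    return slot;
}

/* Correct exit path: release the slot and the count together. */
static void finish_backend(int slot)
{
    assert(slot >= 0 && slot < NUM_SLOTS && slot_in_use[slot]);
    slot_in_use[slot] = false;
    counted_children--;
}
```

As long as every exit goes through finish_backend(), a full system only ever produces START_REJECTED. If the count is decremented once without the matching slot being freed (or a slot is taken without being counted), the next start_backend() passes the check and then returns START_FATAL.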