From: filip at hajny dot net Operating system: SmartOS (possibly SunOS) PHP version: Irrelevant Package: FPM related Bug Type: Bug Bug description:FPM won't listen to signals when event ports and catch_workers_output are used
Description: ------------ When "events.mechanism = port" and "catch_workers_output = yes" are in effect on SmartOS (and possibly Illumos/SunOS in general), FPM won't terminate the process stack on a signal received (includes attempts to shut down the stack with TERM, or reload with USR2). There seems to be no other way to stop the stack except for SIGKILL to the master. Looking at the debug log, we see SIGTERM sent to the children processes, which terminate, but stay as zombies. # ./sapi/fpm/php-fpm -F [01-Oct-2013 11:59:28.382567] DEBUG: pid 96294, fpm_scoreboard_init_main(), line 40: got clock tick '100' [01-Oct-2013 11:59:28.382842] DEBUG: pid 96294, fpm_event_init_main(), line 337: event module is port and 5 fds have been reserved [01-Oct-2013 11:59:28.382863] NOTICE: pid 96294, fpm_init(), line 83: fpm is running, pid 96294 [01-Oct-2013 11:59:28.383274] DEBUG: pid 96294, fpm_children_make(), line 421: [pool www] child 96295 started [01-Oct-2013 11:59:28.383626] DEBUG: pid 96294, fpm_children_make(), line 421: [pool www] child 96296 started [01-Oct-2013 11:59:28.383658] DEBUG: pid 96294, fpm_event_loop(), line 366: 2232 bytes have been reserved in SHM [01-Oct-2013 11:59:28.383665] NOTICE: pid 96294, fpm_event_loop(), line 367: ready to handle connections [01-Oct-2013 11:59:42.417811] DEBUG: pid 96294, fpm_event_loop(), line 419: event module triggered 1 events [01-Oct-2013 11:59:42.417854] DEBUG: pid 96294, fpm_got_signal(), line 85: received SIGTERM [01-Oct-2013 11:59:42.417864] NOTICE: pid 96294, fpm_got_signal(), line 86: Terminating ... [01-Oct-2013 11:59:42.417873] DEBUG: pid 96294, fpm_pctl(), line 231: switching to 'terminating' state [01-Oct-2013 11:59:42.417913] DEBUG: pid 96294, fpm_pctl_kill_all(), line 159: [pool www] sending signal 15 SIGTERM to child 96296 [01-Oct-2013 11:59:42.417933] DEBUG: pid 96294, fpm_pctl_kill_all(), line 159: [pool www] sending signal 15 SIGTERM to child 96295 [01-Oct-2013 11:59:42.417942] DEBUG: pid 96294, fpm_pctl_kill_all(), line 168: 2 child(ren) still alive [01-Oct-2013 11:59:42.417959] DEBUG: pid 96294, fpm_event_loop(), line 419: event module triggered 1 events [01-Oct-2013 11:59:42.418045] DEBUG: pid 96294, fpm_event_loop(), line 419: event module triggered 2 events [01-Oct-2013 11:59:42.418079] DEBUG: pid 96294, fpm_event_loop(), line 419: event module triggered 2 events [01-Oct-2013 11:59:42.418855] ERROR: pid 96294, fpm_stdio_child_said(), line 120: unable to read what child say: Bad file number (9) [01-Oct-2013 11:59:42.418923] DEBUG: pid 96294, fpm_event_loop(), line 419: event module triggered 1 events Killed DTracing sig_handler, we see one SIGCLD per child received. # dtrace -n 'pid$target::sig_handler:entry{ trace(arg0); }' -p $(pgrep php|sort - n|head -1) dtrace: description 'pid$target::sig_handler:entry' matched 1 probe CPU ID FUNCTION:NAME 2 483 sig_handler:entry 15 2 483 sig_handler:entry 18 2 483 sig_handler:entry 18 Oddly enough, the problem is only manifest as long as at least one child process exists from the flock launched at the start. If we manually terminate all the children processes first and let the master re-spawn them all, the problem goes away entirely. # ./sapi/fpm/php-fpm -F [01-Oct-2013 12:20:01.656647] DEBUG: pid 96910, fpm_scoreboard_init_main(), line 40: got clock tick '100' [01-Oct-2013 12:20:01.656920] DEBUG: pid 96910, fpm_event_init_main(), line 337: event module is port and 5 fds have been reserved [01-Oct-2013 12:20:01.656940] NOTICE: pid 96910, fpm_init(), line 83: fpm is running, pid 96910 [01-Oct-2013 12:20:01.657345] DEBUG: pid 96910, fpm_children_make(), line 421: [pool www] child 96911 started [01-Oct-2013 12:20:01.657696] DEBUG: pid 96910, fpm_children_make(), line 421: [pool www] child 96912 started [01-Oct-2013 12:20:01.657730] DEBUG: pid 96910, fpm_event_loop(), line 366: 2232 bytes have been reserved in SHM [01-Oct-2013 12:20:01.657740] NOTICE: pid 96910, fpm_event_loop(), line 367: ready to handle connections [01-Oct-2013 12:20:12.049038] DEBUG: pid 96910, fpm_event_loop(), line 419: event module triggered 4 events [01-Oct-2013 12:20:12.049745] ERROR: pid 96910, fpm_stdio_child_said(), line 120: unable to read what child say: Bad file number (9) [01-Oct-2013 12:20:12.049811] DEBUG: pid 96910, fpm_event_loop(), line 419: event module triggered 1 events [01-Oct-2013 12:20:12.049844] DEBUG: pid 96910, fpm_got_signal(), line 76: received SIGCHLD [01-Oct-2013 12:20:12.049880] WARNING: pid 96910, fpm_children_bury(), line 252: [pool www] child 96911 exited on signal 15 (SIGTERM) after 10.392237 seconds from start [01-Oct-2013 12:20:12.050366] NOTICE: pid 96910, fpm_children_make(), line 421: [pool www] child 96920 started [01-Oct-2013 12:20:12.050404] WARNING: pid 96910, fpm_children_bury(), line 252: [pool www] child 96912 exited on signal 15 (SIGTERM) after 10.392396 seconds from start [01-Oct-2013 12:20:12.050735] NOTICE: pid 96910, fpm_children_make(), line 421: [pool www] child 96921 started [01-Oct-2013 12:20:12.050770] DEBUG: pid 96910, fpm_got_signal(), line 76: received SIGCHLD [01-Oct-2013 12:20:12.050790] DEBUG: pid 96910, fpm_event_loop(), line 419: event module triggered 1 events ^C[01-Oct-2013 12:20:17.138831] DEBUG: pid 96910, fpm_got_signal(), line 80: received SIGINT [01-Oct-2013 12:20:17.138894] NOTICE: pid 96910, fpm_got_signal(), line 81: Terminating ... [01-Oct-2013 12:20:17.138927] DEBUG: pid 96910, fpm_pctl(), line 231: switching to 'terminating' state [01-Oct-2013 12:20:17.139001] DEBUG: pid 96910, fpm_pctl_kill_all(), line 159: [pool www] sending signal 15 SIGTERM to child 96921 [01-Oct-2013 12:20:17.139031] DEBUG: pid 96910, fpm_pctl_kill_all(), line 159: [pool www] sending signal 15 SIGTERM to child 96920 [01-Oct-2013 12:20:17.139043] DEBUG: pid 96910, fpm_pctl_kill_all(), line 168: 2 child(ren) still alive [01-Oct-2013 12:20:17.139069] DEBUG: pid 96910, fpm_event_loop(), line 419: event module triggered 1 events [01-Oct-2013 12:20:17.139796] DEBUG: pid 96910, fpm_got_signal(), line 76: received SIGCHLD [01-Oct-2013 12:20:17.139834] DEBUG: pid 96910, fpm_children_bury(), line 252: [pool www] child 96920 exited on signal 2 (SIGINT) after 5.089324 seconds from start [01-Oct-2013 12:20:17.139876] DEBUG: pid 96910, fpm_children_bury(), line 252: [pool www] child 96921 exited on signal 2 (SIGINT) after 5.088997 seconds from start [01-Oct-2013 12:20:17.139906] NOTICE: pid 96910, fpm_pctl_exit(), line 70: exiting, bye-bye! This has been a problem on all FPM-capable PHP branches (including current trunk) for at least a year now. Configure phase seems irrelevant, but for the record: # ./configure --enable-fpm --with-fpm-user=www --with-fpm-group=www --with-config- file-path=/opt/local/etc --with- config-file-scan-dir=/opt/local/etc/php.d --sysconfdir=/opt/local/etc -- localstatedir=/var --disable-all -- prefix=/opt/local -- Edit bug report at https://bugs.php.net/bug.php?id=65800&edit=1 -- Try a snapshot (PHP 5.4): https://bugs.php.net/fix.php?id=65800&r=trysnapshot54 Try a snapshot (PHP 5.5): https://bugs.php.net/fix.php?id=65800&r=trysnapshot55 Try a snapshot (trunk): https://bugs.php.net/fix.php?id=65800&r=trysnapshottrunk Fixed in SVN: https://bugs.php.net/fix.php?id=65800&r=fixed Fixed in release: https://bugs.php.net/fix.php?id=65800&r=alreadyfixed Need backtrace: https://bugs.php.net/fix.php?id=65800&r=needtrace Need Reproduce Script: https://bugs.php.net/fix.php?id=65800&r=needscript Try newer version: https://bugs.php.net/fix.php?id=65800&r=oldversion Not developer issue: https://bugs.php.net/fix.php?id=65800&r=support Expected behavior: https://bugs.php.net/fix.php?id=65800&r=notwrong Not enough info: https://bugs.php.net/fix.php?id=65800&r=notenoughinfo Submitted twice: https://bugs.php.net/fix.php?id=65800&r=submittedtwice register_globals: https://bugs.php.net/fix.php?id=65800&r=globals PHP 4 support discontinued: https://bugs.php.net/fix.php?id=65800&r=php4 Daylight Savings: https://bugs.php.net/fix.php?id=65800&r=dst IIS Stability: https://bugs.php.net/fix.php?id=65800&r=isapi Install GNU Sed: https://bugs.php.net/fix.php?id=65800&r=gnused Floating point limitations: https://bugs.php.net/fix.php?id=65800&r=float No Zend Extensions: https://bugs.php.net/fix.php?id=65800&r=nozend MySQL Configuration Error: https://bugs.php.net/fix.php?id=65800&r=mysqlcfg