From:             filip at hajny dot net
Operating system: SmartOS (possibly SunOS)
PHP version:      Irrelevant
Package:          FPM related
Bug Type:         Bug
Bug description:FPM won't listen to signals when event ports and 
catch_workers_output are used

Description:
------------
When "events.mechanism = port" and "catch_workers_output = yes" are in
effect on 
SmartOS (and possibly 
Illumos/SunOS in general), FPM won't terminate the process stack on a
signal 
received (includes attempts to shut 
down the stack with TERM, or reload with USR2). There seems to be no other
way to 
stop the stack except for 
SIGKILL to the master.

Looking at the debug log, we see SIGTERM sent to the children processes,
which 
terminate, but stay as zombies.

# ./sapi/fpm/php-fpm -F
[01-Oct-2013 11:59:28.382567] DEBUG: pid 96294, fpm_scoreboard_init_main(),
line 
40: got clock tick '100'
[01-Oct-2013 11:59:28.382842] DEBUG: pid 96294, fpm_event_init_main(), line
337: 
event module is port and 5 fds 
have been reserved
[01-Oct-2013 11:59:28.382863] NOTICE: pid 96294, fpm_init(), line 83: fpm
is 
running, pid 96294
[01-Oct-2013 11:59:28.383274] DEBUG: pid 96294, fpm_children_make(), line
421: 
[pool www] child 96295 started
[01-Oct-2013 11:59:28.383626] DEBUG: pid 96294, fpm_children_make(), line
421: 
[pool www] child 96296 started
[01-Oct-2013 11:59:28.383658] DEBUG: pid 96294, fpm_event_loop(), line 366:
2232 
bytes have been reserved in SHM
[01-Oct-2013 11:59:28.383665] NOTICE: pid 96294, fpm_event_loop(), line
367: ready 
to handle connections
[01-Oct-2013 11:59:42.417811] DEBUG: pid 96294, fpm_event_loop(), line 419:
event 
module triggered 1 events
[01-Oct-2013 11:59:42.417854] DEBUG: pid 96294, fpm_got_signal(), line 85:

received SIGTERM
[01-Oct-2013 11:59:42.417864] NOTICE: pid 96294, fpm_got_signal(), line 86:

Terminating ...
[01-Oct-2013 11:59:42.417873] DEBUG: pid 96294, fpm_pctl(), line 231:
switching to 
'terminating' state
[01-Oct-2013 11:59:42.417913] DEBUG: pid 96294, fpm_pctl_kill_all(), line
159: 
[pool www] sending signal 15 
SIGTERM to child 96296
[01-Oct-2013 11:59:42.417933] DEBUG: pid 96294, fpm_pctl_kill_all(), line
159: 
[pool www] sending signal 15 
SIGTERM to child 96295
[01-Oct-2013 11:59:42.417942] DEBUG: pid 96294, fpm_pctl_kill_all(), line
168: 2 
child(ren) still alive
[01-Oct-2013 11:59:42.417959] DEBUG: pid 96294, fpm_event_loop(), line 419:
event 
module triggered 1 events
[01-Oct-2013 11:59:42.418045] DEBUG: pid 96294, fpm_event_loop(), line 419:
event 
module triggered 2 events
[01-Oct-2013 11:59:42.418079] DEBUG: pid 96294, fpm_event_loop(), line 419:
event 
module triggered 2 events
[01-Oct-2013 11:59:42.418855] ERROR: pid 96294, fpm_stdio_child_said(),
line 120: 
unable to read what child say: 
Bad file number (9)
[01-Oct-2013 11:59:42.418923] DEBUG: pid 96294, fpm_event_loop(), line 419:
event 
module triggered 1 events
Killed

DTracing sig_handler, we see one SIGCLD per child received.

# dtrace -n 'pid$target::sig_handler:entry{ trace(arg0); }' -p $(pgrep
php|sort -
n|head -1)
dtrace: description 'pid$target::sig_handler:entry' matched 1 probe
CPU     ID                    FUNCTION:NAME
  2    483                sig_handler:entry                15
  2    483                sig_handler:entry                18
  2    483                sig_handler:entry                18

Oddly enough, the problem is only manifest as long as at least one child
process 
exists from the flock launched at 
the start. If we manually terminate all the children processes first and
let the 
master re-spawn them all, the 
problem goes away entirely.

# ./sapi/fpm/php-fpm -F
[01-Oct-2013 12:20:01.656647] DEBUG: pid 96910, fpm_scoreboard_init_main(),
line 
40: got clock tick '100'
[01-Oct-2013 12:20:01.656920] DEBUG: pid 96910, fpm_event_init_main(), line
337: 
event module is port and 5 fds 
have been reserved
[01-Oct-2013 12:20:01.656940] NOTICE: pid 96910, fpm_init(), line 83: fpm
is 
running, pid 96910
[01-Oct-2013 12:20:01.657345] DEBUG: pid 96910, fpm_children_make(), line
421: 
[pool www] child 96911 started
[01-Oct-2013 12:20:01.657696] DEBUG: pid 96910, fpm_children_make(), line
421: 
[pool www] child 96912 started
[01-Oct-2013 12:20:01.657730] DEBUG: pid 96910, fpm_event_loop(), line 366:
2232 
bytes have been reserved in SHM
[01-Oct-2013 12:20:01.657740] NOTICE: pid 96910, fpm_event_loop(), line
367: ready 
to handle connections
[01-Oct-2013 12:20:12.049038] DEBUG: pid 96910, fpm_event_loop(), line 419:
event 
module triggered 4 events
[01-Oct-2013 12:20:12.049745] ERROR: pid 96910, fpm_stdio_child_said(),
line 120: 
unable to read what child say: 
Bad file number (9)
[01-Oct-2013 12:20:12.049811] DEBUG: pid 96910, fpm_event_loop(), line 419:
event 
module triggered 1 events
[01-Oct-2013 12:20:12.049844] DEBUG: pid 96910, fpm_got_signal(), line 76:

received SIGCHLD
[01-Oct-2013 12:20:12.049880] WARNING: pid 96910, fpm_children_bury(), line
252: 
[pool www] child 96911 exited on 
signal 15 (SIGTERM) after 10.392237 seconds from start
[01-Oct-2013 12:20:12.050366] NOTICE: pid 96910, fpm_children_make(), line
421: 
[pool www] child 96920 started
[01-Oct-2013 12:20:12.050404] WARNING: pid 96910, fpm_children_bury(), line
252: 
[pool www] child 96912 exited on 
signal 15 (SIGTERM) after 10.392396 seconds from start
[01-Oct-2013 12:20:12.050735] NOTICE: pid 96910, fpm_children_make(), line
421: 
[pool www] child 96921 started
[01-Oct-2013 12:20:12.050770] DEBUG: pid 96910, fpm_got_signal(), line 76:

received SIGCHLD
[01-Oct-2013 12:20:12.050790] DEBUG: pid 96910, fpm_event_loop(), line 419:
event 
module triggered 1 events
^C[01-Oct-2013 12:20:17.138831] DEBUG: pid 96910, fpm_got_signal(), line
80: 
received SIGINT
[01-Oct-2013 12:20:17.138894] NOTICE: pid 96910, fpm_got_signal(), line 81:

Terminating ...
[01-Oct-2013 12:20:17.138927] DEBUG: pid 96910, fpm_pctl(), line 231:
switching to 
'terminating' state
[01-Oct-2013 12:20:17.139001] DEBUG: pid 96910, fpm_pctl_kill_all(), line
159: 
[pool www] sending signal 15 
SIGTERM to child 96921
[01-Oct-2013 12:20:17.139031] DEBUG: pid 96910, fpm_pctl_kill_all(), line
159: 
[pool www] sending signal 15 
SIGTERM to child 96920
[01-Oct-2013 12:20:17.139043] DEBUG: pid 96910, fpm_pctl_kill_all(), line
168: 2 
child(ren) still alive
[01-Oct-2013 12:20:17.139069] DEBUG: pid 96910, fpm_event_loop(), line 419:
event 
module triggered 1 events
[01-Oct-2013 12:20:17.139796] DEBUG: pid 96910, fpm_got_signal(), line 76:

received SIGCHLD
[01-Oct-2013 12:20:17.139834] DEBUG: pid 96910, fpm_children_bury(), line
252: 
[pool www] child 96920 exited on 
signal 2 (SIGINT) after 5.089324 seconds from start
[01-Oct-2013 12:20:17.139876] DEBUG: pid 96910, fpm_children_bury(), line
252: 
[pool www] child 96921 exited on 
signal 2 (SIGINT) after 5.088997 seconds from start
[01-Oct-2013 12:20:17.139906] NOTICE: pid 96910, fpm_pctl_exit(), line 70:

exiting, bye-bye!

This has been a problem on all FPM-capable PHP branches (including current
trunk) 
for at least a year now. 
Configure phase seems irrelevant, but for the record:

# ./configure --enable-fpm --with-fpm-user=www --with-fpm-group=www
--with-config-
file-path=/opt/local/etc --with-
config-file-scan-dir=/opt/local/etc/php.d --sysconfdir=/opt/local/etc --
localstatedir=/var --disable-all --
prefix=/opt/local


-- 
Edit bug report at https://bugs.php.net/bug.php?id=65800&edit=1
-- 
Try a snapshot (PHP 5.4):   
https://bugs.php.net/fix.php?id=65800&r=trysnapshot54
Try a snapshot (PHP 5.5):   
https://bugs.php.net/fix.php?id=65800&r=trysnapshot55
Try a snapshot (trunk):     
https://bugs.php.net/fix.php?id=65800&r=trysnapshottrunk
Fixed in SVN:               https://bugs.php.net/fix.php?id=65800&r=fixed
Fixed in release:           https://bugs.php.net/fix.php?id=65800&r=alreadyfixed
Need backtrace:             https://bugs.php.net/fix.php?id=65800&r=needtrace
Need Reproduce Script:      https://bugs.php.net/fix.php?id=65800&r=needscript
Try newer version:          https://bugs.php.net/fix.php?id=65800&r=oldversion
Not developer issue:        https://bugs.php.net/fix.php?id=65800&r=support
Expected behavior:          https://bugs.php.net/fix.php?id=65800&r=notwrong
Not enough info:            
https://bugs.php.net/fix.php?id=65800&r=notenoughinfo
Submitted twice:            
https://bugs.php.net/fix.php?id=65800&r=submittedtwice
register_globals:           https://bugs.php.net/fix.php?id=65800&r=globals
PHP 4 support discontinued: https://bugs.php.net/fix.php?id=65800&r=php4
Daylight Savings:           https://bugs.php.net/fix.php?id=65800&r=dst
IIS Stability:              https://bugs.php.net/fix.php?id=65800&r=isapi
Install GNU Sed:            https://bugs.php.net/fix.php?id=65800&r=gnused
Floating point limitations: https://bugs.php.net/fix.php?id=65800&r=float
No Zend Extensions:         https://bugs.php.net/fix.php?id=65800&r=nozend
MySQL Configuration Error:  https://bugs.php.net/fix.php?id=65800&r=mysqlcfg

Reply via email to