ID:               40286
 User updated by:  gabriel at oxeva dot fr
 Reported By:      gabriel at oxeva dot fr
-Status:           Feedback
+Status:           Open
 Bug Type:         CGI related
 Operating System: Linux 2.6
 PHP Version:      4.4.4
 Assigned To:      dmitry
 New Comment:

strace -p <PID> shows the following:
read(3,  <unfinished ...>

and "gdb program <PID>" followed by "bt" gives:
(gdb) bt
#0  0xb7fe3410 in ?? ()
#1  0xbfd86618 in ?? ()
#2  0x00000008 in ?? ()
#3  0xbfd86600 in ?? ()
#4  0x008e14f3 in __read_nocancel () from /lib/tls/libc.so.6
#5  0x083ba23e in fcgi_read ()
#6  0x083bbb38 in FCGX_FPrintF ()
#7  0x0831ab22 in sapi_deactivate ()
#8  0x08314a3d in php_request_shutdown ()
#9  0x083bcdeb in main ()

Please note that I can't test with debugging symbols (the libraries and
PHP are stripped), as this binary runs in a production environment and
the bug occurs only under load.


Previous Comments:
------------------------------------------------------------------------

[2007-01-30 13:20:40] [EMAIL PROTECTED]

Could you please attach a debugger to a non-killed process and provide
a backtrace?

Does php-5.2 have the same problem?

------------------------------------------------------------------------

[2007-01-30 11:50:52] gabriel at oxeva dot fr

Throughout the report, "killed" means killed with signal 15 (TERM).

As stated in the report, the children are blocked in a syscall, which
means they can only be killed by signal 9 (KILL). The fastcgi_cleanup
function registered on shutdown kills with signal 15 (TERM). I think the
bug occurs when the children, under load, are executing a syscall at the
moment the parent is killed and starts fastcgi_cleanup. A quick
workaround would be to kill the children with signal 9 in
fastcgi_cleanup, at sapi/cgi/cgi_main.c:951.
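
To illustrate the workaround, here is a minimal sketch of a gentler
variant: send TERM first, wait a short grace period, and only then fall
back to KILL. It is not the code from cgi_main.c; terminate_child() and
its grace_ms parameter are hypothetical names made up for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PHP): ask a child to exit with
 * SIGTERM, poll for up to grace_ms milliseconds, then send SIGKILL.
 * Returns the number of the signal that ended the child, 0 if it
 * exited normally, or -1 on error.
 */
static int terminate_child(pid_t pid, int grace_ms)
{
    const struct timespec step = {0, 10 * 1000 * 1000}; /* 10 ms poll */
    int status = 0;
    int waited_ms = 0;
    pid_t r = 0;

    if (kill(pid, SIGTERM) == -1)
        return -1;

    /* Give the child a chance to react to SIGTERM. */
    while (waited_ms < grace_ms) {
        r = waitpid(pid, &status, WNOHANG);
        if (r == -1 && errno != EINTR)
            return -1;
        if (r == pid)
            break;
        nanosleep(&step, NULL);
        waited_ms += 10;
    }

    /* Still alive after the grace period: SIGKILL cannot be ignored. */
    if (r != pid) {
        kill(pid, SIGKILL);
        while ((r = waitpid(pid, &status, 0)) == -1) {
            if (errno != EINTR)
                return -1;
        }
    }

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

A cleanup routine could call such a helper once per child PID. Whether
a grace period is worth the extra shutdown delay is a judgment call;
sending KILL directly, as suggested above, is simpler.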

------------------------------------------------------------------------

[2007-01-30 11:40:32] [EMAIL PROTECTED]

From what I can see in the sources, FastCGI registers a signal handler
to kill its children on shutdown (see sapi/cgi/cgi_main.c, line 1219),
but this handler surely won't be called on SIGKILL. Hence the question
- what signal do you mean by "killed"?
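
The point about SIGKILL can be checked with a small sketch: POSIX does
not allow a handler to be installed for SIGKILL, so a cleanup handler
can only ever run for catchable signals such as SIGTERM. The names
noop_handler() and can_install_handler() are made up for this example
and do not appear in the PHP sources.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

/* Placeholder handler; a real one would run the shutdown cleanup. */
static void noop_handler(int signo)
{
    (void)signo;
}

/*
 * Hypothetical probe: returns 1 if a handler can be installed for
 * signo, 0 if sigaction() refuses (as it does for SIGKILL).
 * The previous disposition is restored before returning.
 */
static int can_install_handler(int signo)
{
    struct sigaction sa, old;

    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    if (sigaction(signo, &sa, &old) == -1)
        return 0;

    sigaction(signo, &old, NULL);
    return 1;
}
```

Calling can_install_handler(SIGTERM) should succeed, while
can_install_handler(SIGKILL) should be rejected by the kernel.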

------------------------------------------------------------------------

[2007-01-30 11:34:02] gabriel at oxeva dot fr

Description:
------------
Context:
When running PHP in FastCGI mode with a FastCGI Apache module (such as
mod_fcgid), everything runs fine when PHP_FCGI_CHILDREN is unset: only
1 process is spawned. When using PHP_FCGI_CHILDREN=n, the PHP parent
process forks n children and acts as a manager for them, wait()ing to
respawn them if they are killed or exit. The problem happens when the
FastCGI process manager run by the Apache module has to kill the parent
PHP process (it only knows the parent's PID) for any reason, such as
idle timeout, max lifetime, etc.
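
The fork-and-respawn pattern described above can be sketched roughly as
follows. This is an illustration only, not the code from cgi_main.c:
spawn_worker(), run_manager() and demo_work() are hypothetical names,
and the respawn count is bounded so the example terminates, whereas a
real manager would presumably keep respawning indefinitely.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for the request loop a real worker would run. */
static void demo_work(void)
{
}

/* Fork one worker that runs work() and then exits. */
static pid_t spawn_worker(void (*work)(void))
{
    pid_t pid = fork();

    if (pid == 0) {
        work();
        _exit(0);
    }
    return pid;
}

/*
 * Hypothetical manager: fork n workers, replace each worker that
 * terminates until `respawns` replacements have been made, then reap
 * whatever is left. Returns the number of workers reaped, or -1.
 */
static int run_manager(int n, int respawns, void (*work)(void))
{
    int running = 0;
    int reaped = 0;
    int i;

    for (i = 0; i < n; i++) {
        if (spawn_worker(work) == -1)
            break;
        running++;
    }

    while (running > 0) {
        int status;
        pid_t pid = wait(&status);

        if (pid == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        running--;
        reaped++;

        /* The parent only manages: a dead worker is simply replaced. */
        if (respawns > 0) {
            if (spawn_worker(work) != -1)
                running++;
            respawns--;
        }
    }
    return reaped;
}
```

In this sketch the manager never handles requests itself, which is why
killing only the manager leaves the workers without a supervisor.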

Problem:
While the PHP parent process is properly killed by the FastCGI process
manager, the children aren't killed. Instead they stay alive, waiting
for a new request which will never come (because the socket shared with
the parent is removed at the same time the parent is killed).

Reproduce code:
---------------
This is not always reproducible, as the problem only happens when the
PHP FastCGI processes are busy.

The only way to kill these "orphan" children is to send them signal 9
(to interrupt the blocking read() syscall they are executing).

Expected result:
----------------
In this example, the FastCGI process manager spawns PHP by fork()ing
then exec()ing /path/to/php, with environment PHP_FCGI_CHILDREN=2.

The PHP parent process is PID 10, and it forks 2 children, PID 11 and
12.

When PID 10 is killed with the normal signal 15 while all the PHP
processes are under load, PID 10 is killed, but the 2 children, PID 11
and 12, stay alive.

The expected result is that when the PHP parent process is killed, all
the children, in any processing state, are killed too.

Actual result:
--------------
strace of the child processes (PID 11 and 12) that are still alive gives:
# strace -p 11
Process 11 attached - interrupt to quit
read(3,  <unfinished ...>

PID 12 gives the same result.


------------------------------------------------------------------------


-- 
Edit this bug report at http://bugs.php.net/?id=40286&edit=1
