Re: mod_fcgid can kill all the services on the server via kill -15 -1

2011-05-04 Thread William A. Rowe Jr.
On 5/3/2011 11:51 PM, pqf wrote:
 Here is the new patch, anyone review it? I will commmit it if no one respond 
 :)

+1, commit away


Re: Re: mod_fcgid can kill all the services on the server via kill -15 -1

2011-05-03 Thread pqf
Hi, guys
Here is the new patch, anyone review it? I will commmit it if no one respond :)

Index: fcgid_proc_unix.c
===
--- fcgid_proc_unix.c   (revision 1099314)
+++ fcgid_proc_unix.c   (working copy)
@@ -402,6 +402,7 @@
 procnode-proc_id = tmpproc;
 
 if (rv != APR_SUCCESS) {
+memset(procnode-proc_id, 0, sizeof(procnode-proc_id));
 ap_log_error(APLOG_MARK, APLOG_ERR, rv, procinfo-main_server,
  mod_fcgid: can't run %s, wargv[0]);
 }
@@ -414,6 +415,11 @@
 /* su as root before sending signal, for suEXEC */
 apr_status_t rv;
 
+if (procnode-proc_id.pid == 0) {
+/* procnode-proc_id.pid be 0 while fcgid_create_privileged_process() 
fail */
+return APR_SUCCESS; 
+}
+
 if (ap_unixd_config.suexec_enabled  seteuid(0) != 0) {
 
 /* can't gain privileges to send signal (should not occur); do NOT
@@ -461,6 +467,7 @@
 /* Destroy pool */
 apr_pool_destroy(procnode-proc_pool);
 procnode-proc_pool = NULL;
+memset(procnode-proc_id, 0, sizeof(procnode-proc_id));
 
 return APR_CHILD_DONE;
 }



2011-05-04



Ryan 


发件人: pqfp...@mailtech.cn
发送时间: 2011-04-18 09:58
主 题: Re: mod_fcgid can kill all the services on the server via kill -15 -1
收件人: devdev@httpd.apache.org



Hi, all
Another question, does proc_wait_process() should update procnode-proc_id to 0 
too? or else mod_fcgid may send a signal to another irrelevant process while 
apache is shutting down? I don't follow up mod_fcgid for a while, I just took a 
glance, maybe it's updated somewhere else?
By the way, procnode-proc_id is set to 0 while apache startup, so why not 
update procnode-proc_id to 0 while fcgid_create_privileged_process() is fail? 
And then check magic number 0 rather than both -1 and 0,  in both 
proc_kill_gracefully() and proc_kill_force().

Cheers.
Ryan


 
Hello,


There is a very interesting, and quite a rare bug in mod_fcgid. It is easy to 
reproduce if you can cause fork to fail (which can be done with CloudLinux -- 
if anyone wants to replicate it).


Here is how it works: 
mod_fcgid tries to spawn a new process (proc_spawn_process in 
fcgid_proc_unix.c), but fork returns -1. 
More exactly fcgid_create_privileged_process function call returns error, and 
fills in tmpproc.pid with -1  tmpproc is assiged to procnode-proc_id).


Now, if at the same time service httpd restart is executed, function 
kill_all_subprocess in fcgid_pm_main.c will execute, and it will try to go 
through all procnodes, sending SIGTERM via proc_kill_gracefully, (and then 
SIGKILL via proc_kill_force) to procnode-proc_id.pid
Yet, one procnode will be pointing to procnode-proc_id.pid, causing kill -15 
-1 (kill all).
The end results all services on the server failing, including SSH, apache, 
syslogd, etc..


I guess the problem is really rare for most people. Also it is quite hard to 
diagnose, as it is completely not clear where the signal came from, and it took 
us some time to correlate them with apache restarts.. Yet due to our OS being 
used by shared hosts (where httpd restart is common thing), and our ability to 
limit memory available to processes on per virtual host bases (which causes 
fork to fail once that virtual host reaches memory limit), we see the issue 
quite often.


The solution is quite simple (not sure if it is the best / right solution), in 
file: fcgid_proc_unix.c, in methods proc_kill_gracefully, line:


rv = apr_proc_kill((procnode-proc_id), SIGTERM);


should be changed to:
   if (procnode-proc_id.pid != -1) {
  rv = apr_proc_kill((procnode-proc_id), SIGTERM);
   } else {
  rv = APR_SUCCESS;
   }


Similarly in proc_kill_force
rv = apr_proc_kill((procnode-proc_id), SIGKILL);
should be changed to:
   if (procnode-proc_id.pid != -1) {
  rv = apr_proc_kill((procnode-proc_id), SIGKILL);
   } else {
  rv = APR_SUCCESS;
   }


Regards,
Igor Seletskiy
CEO @ Cloud Linux Inc

Re: mod_fcgid can kill all the services on the server via kill -15 -1

2011-04-26 Thread Igor Seletskiy
Ryan,

I like this approach. Do you want us to prepare a patch against latest
version?
Also, we never committed to apache project. Would it be enough if we post a
patch here. Or what would be the process?


On Sun, Apr 17, 2011 at 9:58 PM, pqf p...@mailtech.cn wrote:

  Hi, all
 Another question, does proc_wait_process() should update procnode-proc_id
 to 0 too? or else mod_fcgid may send a signal to another irrelevant process
 while apache is shutting down? I don't follow up mod_fcgid for a while, I
 just took a glance, maybe it's updated somewhere else?
 By the way, procnode-proc_id is set to 0 while apache startup, so why not
 update procnode-proc_id to 0 while fcgid_create_privileged_process() is
 fail? And then check magic number 0 rather than both -1 and 0,  in both
 proc_kill_gracefully() and proc_kill_force().

 Cheers.
 Ryan



  Hello,

 There is a very interesting, and quite a rare bug in mod_fcgid. It is easy
 to reproduce if you can cause fork to fail (which can be done with
 CloudLinux -- if anyone wants to replicate it).

 *Here is how it works: *
 mod_fcgid tries to spawn a new process (proc_spawn_process in
 fcgid_proc_unix.c), but fork returns -1.
 More exactly fcgid_create_privileged_process function call returns error,
 and fills in tmpproc.pid with -1  tmpproc is assiged to procnode-proc_id).

 Now, if at the same time service httpd restart is executed, function
 kill_all_subprocess in fcgid_pm_main.c will execute, and it will try to go
 through all procnodes, sending SIGTERM via proc_kill_gracefully, (and then
 SIGKILL via proc_kill_force) to procnode-proc_id.pid
 Yet, one procnode will be pointing to procnode-proc_id.pid, causing kill
 -15 -1 (kill all).
 The end results all services on the server failing, including SSH, apache,
 syslogd, etc..

 I guess the problem is really rare for most people. Also it is quite hard
 to diagnose, as it is completely not clear where the signal came from, and
 it took us some time to correlate them with apache restarts.. Yet due to our
 OS being used by shared hosts (where httpd restart is common thing), and our
 ability to limit memory available to processes on per virtual host bases
 (which causes fork to fail once that virtual host reaches memory limit), we
 see the issue quite often.

 The solution is quite simple (not sure if it is the best / right solution),
 in file: fcgid_proc_unix.c, in methods proc_kill_gracefully, line:

 rv = apr_proc_kill((procnode-proc_id), SIGTERM);

 should be changed to:
if (procnode-proc_id.pid != -1) {
   rv = apr_proc_kill((procnode-proc_id), SIGTERM);
} else {
   rv = APR_SUCCESS;
}

 Similarly in proc_kill_force
 rv = apr_proc_kill((procnode-proc_id), SIGKILL);
 should be changed to:
if (procnode-proc_id.pid != -1) {
   rv = apr_proc_kill((procnode-proc_id), SIGKILL);
} else {
   rv = APR_SUCCESS;
}

 Regards,
 Igor Seletskiy
 CEO @ Cloud Linux Inc





Re: mod_fcgid can kill all the services on the server via kill -15 -1

2011-04-19 Thread Igor Seletskiy
I like this idea better then just checking for pid == -1, though sending
TERM signal to 0 shouldn't be that damaging (if damaging at all).
Mostly because apachectl will run in different process group, so it will not
be killed, and will finish restarting apache.
And unless apache itself is embeded or started by some other software --
signal shouldn't kill anything but apache.


On Sun, Apr 17, 2011 at 9:58 PM, pqf p...@mailtech.cn wrote:

  Hi, all
 Another question, does proc_wait_process() should update procnode-proc_id
 to 0 too? or else mod_fcgid may send a signal to another irrelevant process
 while apache is shutting down? I don't follow up mod_fcgid for a while, I
 just took a glance, maybe it's updated somewhere else?
 By the way, procnode-proc_id is set to 0 while apache startup, so why not
 update procnode-proc_id to 0 while fcgid_create_privileged_process() is
 fail? And then check magic number 0 rather than both -1 and 0,  in both
 proc_kill_gracefully() and proc_kill_force().

 Cheers.
 Ryan



  Hello,

 There is a very interesting, and quite a rare bug in mod_fcgid. It is easy
 to reproduce if you can cause fork to fail (which can be done with
 CloudLinux -- if anyone wants to replicate it).

 *Here is how it works: *
 mod_fcgid tries to spawn a new process (proc_spawn_process in
 fcgid_proc_unix.c), but fork returns -1.
 More exactly fcgid_create_privileged_process function call returns error,
 and fills in tmpproc.pid with -1  tmpproc is assiged to procnode-proc_id).

 Now, if at the same time service httpd restart is executed, function
 kill_all_subprocess in fcgid_pm_main.c will execute, and it will try to go
 through all procnodes, sending SIGTERM via proc_kill_gracefully, (and then
 SIGKILL via proc_kill_force) to procnode-proc_id.pid
 Yet, one procnode will be pointing to procnode-proc_id.pid, causing kill
 -15 -1 (kill all).
 The end results all services on the server failing, including SSH, apache,
 syslogd, etc..

 I guess the problem is really rare for most people. Also it is quite hard
 to diagnose, as it is completely not clear where the signal came from, and
 it took us some time to correlate them with apache restarts.. Yet due to our
 OS being used by shared hosts (where httpd restart is common thing), and our
 ability to limit memory available to processes on per virtual host bases
 (which causes fork to fail once that virtual host reaches memory limit), we
 see the issue quite often.

 The solution is quite simple (not sure if it is the best / right solution),
 in file: fcgid_proc_unix.c, in methods proc_kill_gracefully, line:

 rv = apr_proc_kill((procnode-proc_id), SIGTERM);

 should be changed to:
if (procnode-proc_id.pid != -1) {
   rv = apr_proc_kill((procnode-proc_id), SIGTERM);
} else {
   rv = APR_SUCCESS;
}

 Similarly in proc_kill_force
 rv = apr_proc_kill((procnode-proc_id), SIGKILL);
 should be changed to:
if (procnode-proc_id.pid != -1) {
   rv = apr_proc_kill((procnode-proc_id), SIGKILL);
} else {
   rv = APR_SUCCESS;
}

 Regards,
 Igor Seletskiy
 CEO @ Cloud Linux Inc





Re: mod_fcgid can kill all the services on the server via kill -15 -1

2011-04-17 Thread pqf
Hi, all
Another question, does proc_wait_process() should update procnode-proc_id to 0 
too? or else mod_fcgid may send a signal to another irrelevant process while 
apache is shutting down? I don't follow up mod_fcgid for a while, I just took a 
glance, maybe it's updated somewhere else?
By the way, procnode-proc_id is set to 0 while apache startup, so why not 
update procnode-proc_id to 0 while fcgid_create_privileged_process() is fail? 
And then check magic number 0 rather than both -1 and 0,  in both 
proc_kill_gracefully() and proc_kill_force().

Cheers.
Ryan


 
Hello,


There is a very interesting, and quite a rare bug in mod_fcgid. It is easy to 
reproduce if you can cause fork to fail (which can be done with CloudLinux -- 
if anyone wants to replicate it).


Here is how it works: 
mod_fcgid tries to spawn a new process (proc_spawn_process in 
fcgid_proc_unix.c), but fork returns -1. 
More exactly fcgid_create_privileged_process function call returns error, and 
fills in tmpproc.pid with -1  tmpproc is assiged to procnode-proc_id).


Now, if at the same time service httpd restart is executed, function 
kill_all_subprocess in fcgid_pm_main.c will execute, and it will try to go 
through all procnodes, sending SIGTERM via proc_kill_gracefully, (and then 
SIGKILL via proc_kill_force) to procnode-proc_id.pid
Yet, one procnode will be pointing to procnode-proc_id.pid, causing kill -15 
-1 (kill all).
The end results all services on the server failing, including SSH, apache, 
syslogd, etc..


I guess the problem is really rare for most people. Also it is quite hard to 
diagnose, as it is completely not clear where the signal came from, and it took 
us some time to correlate them with apache restarts.. Yet due to our OS being 
used by shared hosts (where httpd restart is common thing), and our ability to 
limit memory available to processes on per virtual host bases (which causes 
fork to fail once that virtual host reaches memory limit), we see the issue 
quite often.


The solution is quite simple (not sure if it is the best / right solution), in 
file: fcgid_proc_unix.c, in methods proc_kill_gracefully, line:


rv = apr_proc_kill((procnode-proc_id), SIGTERM);


should be changed to:
   if (procnode-proc_id.pid != -1) {
  rv = apr_proc_kill((procnode-proc_id), SIGTERM);
   } else {
  rv = APR_SUCCESS;
   }


Similarly in proc_kill_force
rv = apr_proc_kill((procnode-proc_id), SIGKILL);
should be changed to:
   if (procnode-proc_id.pid != -1) {
  rv = apr_proc_kill((procnode-proc_id), SIGKILL);
   } else {
  rv = APR_SUCCESS;
   }


Regards,
Igor Seletskiy
CEO @ Cloud Linux Inc