2012/8/15 pqf <p...@mailtech.cn>:
> Hi, all
> I prefer the following solution, can anyone review it?
> procmgr_post_spawn_cmd() will be blocked until process manager create a new
> fcgid process, the worst case is someone else take the new created process
> before I do, and I have to post another spawn command to PM again. The
> extreme case is loop FCGID_APPLY_TRY_COUNT but get no process slot.
>
>
> Index: fcgid_bridge.c
> ===================================================================
> --- fcgid_bridge.c (revision 1373226)
> +++ fcgid_bridge.c (working copy)
> @@ -30,7 +30,7 @@
>  #include "fcgid_spawn_ctl.h"
>  #include "fcgid_protocol.h"
>  #include "fcgid_bucket.h"
> -#define FCGID_APPLY_TRY_COUNT 2
> +#define FCGID_APPLY_TRY_COUNT 4
>  #define FCGID_REQUEST_COUNT 32
>  #define FCGID_BRIGADE_CLEAN_STEP 32
>
> @@ -447,19 +447,13 @@
>              if (bucket_ctx->procnode)
>                  break;
>
> -            /* Avoid sleeping the very first time through if there are no
> -               busy processes; the problem is just that we haven't spawned
> -               anything yet, so waiting is pointless */
> -            if (i > 0 || j > 0 || count_busy_processes(r, &fcgi_request)) {
> -                apr_sleep(apr_time_from_sec(1));
> -
> -                bucket_ctx->procnode = apply_free_procnode(r, &fcgi_request);
> -                if (bucket_ctx->procnode)
> -                    break;
> -            }
> -
>              /* Send a spawn request if I can't get a process slot */
>              procmgr_post_spawn_cmd(&fcgi_request, r);
> +
> +            /* Try again */
> +            bucket_ctx->procnode = apply_free_procnode(r, &fcgi_request);
> +            if (bucket_ctx->procnode)
> +                break;
>          }
>
>          /* Connect to the fastcgi server */

If you get rid of the sleep, Apache will not wait for a free process when all of them are busy, and this will lead to 503 errors.

Currently mod_fcgid waits up to FCGID_APPLY_TRY_COUNT * FCGID_REQUEST_COUNT * 1 second, which is usually 2 * 32 * 1 = 64 seconds. This means that if you have an overloaded vhost with a low FcgidMaxProcessesPerClass, it can bring the whole server down: each thread waits 64 seconds, so it doesn't take long before all threads are occupied.

In my setup we lowered the wait time and FCGID_REQUEST_COUNT to reduce the impact of an overloaded class, but I think the best solution would be to add an availability value to each class. The total wait time would be tied to it (by changing the sleep time and FCGID_APPLY_TRY_COUNT). If a request is unsuccessful, the availability is halved, so the next wait is shorter. This way a congested class drops to 0% availability, and new connections instantly get a 503 if there are no free slots. A successful wait increases the availability.

Regards,
Michal Grzedzicki
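
P.S. To make the availability idea concrete, here is a rough sketch, not a patch against mod_fcgid. The struct, the function names, BASE_SLEEP_MS and RECOVERY_STEP are all hypothetical. The additive recovery step in particular is only an assumption: all that is proposed above is that a successful wait increases availability, and a purely multiplicative rule could never climb back from 0%.

```c
#include <assert.h>

#define FCGID_APPLY_TRY_COUNT 2     /* value before the quoted patch */
#define FCGID_REQUEST_COUNT   32
#define BASE_SLEEP_MS         1000L /* the current 1-second sleep */
#define RECOVERY_STEP         10    /* assumed: percent regained per success */

/* Hypothetical per-class state; not an existing mod_fcgid structure. */
struct class_availability {
    int percent;                    /* 0..100 */
};

/* Worst-case time a request may wait for a slot in this class. */
long max_wait_ms(const struct class_availability *ca)
{
    long full = FCGID_APPLY_TRY_COUNT * FCGID_REQUEST_COUNT * BASE_SLEEP_MS;

    assert(ca->percent >= 0 && ca->percent <= 100);
    return full * ca->percent / 100;
}

/* The wait ended without a free slot (a 503): halve availability. */
void on_wait_failed(struct class_availability *ca)
{
    ca->percent /= 2;
}

/* A slot was obtained: raise availability, capped at 100. */
void on_wait_succeeded(struct class_availability *ca)
{
    ca->percent += RECOVERY_STEP;
    if (ca->percent > 100)
        ca->percent = 100;
}
```

With these numbers a healthy class starts with the full 64-second budget, and seven consecutive failures (100, 50, 25, 12, 6, 3, 1, 0) bring it to zero, at which point a request that finds no free slot gets its 503 immediately instead of tying up a thread.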