Re: Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693

Lazy Tue, 28 Aug 2012 04:40:55 -0700

2012/8/28 pqf <[email protected]>:
> So what can mod_fcgid do in this overloaded case?
> 1. mod_fcgid gets a request
> 2. mod_fcgid can't apply a free slot of an FCGI handler
> 3. mod_fcgid sends a spawn request to the PM
> 4. the PM denies the request (too many processes already)
> 5. Now....
>    for( i=1; i<64; i++)
>   {
>      a) mod_fcgid delays a while, then sends another spawn request to the PM
> and tries to apply a free slot again.
>      b) mod_fcgid sends another spawn request at once, even though the last
> request was denied.
>      c) ??
> (now it is a; b may not be a good idea. Any new ideas?)
>   }
>
> I think the bottleneck is too many requests and too few FCGI handlers. httpd (or
> mod_fcgid) either drops client connections or delays a while; is there no
> other way out?


My idea is to add an availability number to each class. It is decreased
when a wait fails and increased when a wait succeeds. Let's say we
want to wait at most 100 times 250 ms:


    int connected = 0;

    /* Multiply before dividing: if both values are integers,
       (avail / MAX_AVAIL) * 100 truncates to 0 whenever avail < MAX_AVAIL */
    for (i = 0; !connected && i <= get_class_avail() * 100 / MAX_AVAIL; i++) {
        /* Apply a process slot */
        bucket_ctx->procnode = apply_free_procnode(r, &fcgi_request);

        /* Send a spawn request if I can't get a process slot */
        if (!bucket_ctx->procnode) {
            /* procmgr_send_spawn_cmd() returns APR_SUCCESS if a process
               is created */
            if (procmgr_send_spawn_cmd(&fcgi_request, r) != APR_SUCCESS)
                apr_sleep(apr_time_from_msec(250));
            else
                /* Apply a process slot again; only reached when no slot is
                   held, so an acquired procnode is never overwritten */
                bucket_ctx->procnode = apply_free_procnode(r, &fcgi_request);
        }

        if (bucket_ctx->procnode) {
            if (proc_connect_ipc(bucket_ctx->procnode,
                                 &bucket_ctx->ipc) != APR_SUCCESS) {
                proc_close_ipc(&bucket_ctx->ipc);
                bucket_ctx->procnode->diewhy = FCGID_DIE_CONNECT_ERROR;
                return_procnode(r, bucket_ctx->procnode, 1 /* has error */ );
                bucket_ctx->procnode = NULL;
                decrease_avail();
            } else {
                increase_avail();
                connected = 1;
            }
        }
    }

    if (!connected) {
        decrease_avail();
        ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r,
                      "mod_fcgid: can't apply process slot for %s",
                      cmd_conf->cmdline);
        return HTTP_SERVICE_UNAVAILABLE;
    }

decrease_avail() might halve the availability each time it is called.
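
One way to picture the per-class availability counter is the sketch below. The struct class_avail, the MAX_AVAIL value of 100, the +1 step on success, and the pointer argument are all assumptions made for illustration; only the idea of halving on failure comes from the text above, and none of this is existing mod_fcgid code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical upper bound; the value 100 is an assumption */
#define MAX_AVAIL 100

/* Hypothetical per-class state; mod_fcgid has no such struct */
struct class_avail {
    int avail; /* ranges over 0 .. MAX_AVAIL */
};

/* Start every class at full availability */
void avail_init(struct class_avail *c)
{
    assert(c != NULL);
    c->avail = MAX_AVAIL;
}

/* A failed wait halves the availability (multiplicative decrease) */
void decrease_avail(struct class_avail *c)
{
    c->avail /= 2;
}

/* A successful connect restores it one step at a time (assumed step) */
void increase_avail(struct class_avail *c)
{
    if (c->avail < MAX_AVAIL)
        c->avail++;
}

/* Retry budget for the wait loop; multiply first so that integer
   division does not truncate the ratio to zero */
int max_retries(const struct class_avail *c)
{
    return c->avail * 100 / MAX_AVAIL;
}
```

Under these assumptions, seven consecutive failures take the counter from 100 down to 0 (100, 50, 25, 12, 6, 3, 1, 0). Because the loop above uses `<=`, a budget of 0 would still allow one attempt, which gives a successful connect the chance to raise the counter again.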


Availability should be dynamic, maybe controlled by the process manager and
returned to the threads handling connections by
procmgr_send_spawn_cmd(). It can depend on the total number of denied
spawn requests for a specific class, in a similar way to the score. Without
this, connections will get a 503 no sooner than after 25 seconds (100 waits
of 250 ms), which is still IMHO too long. Another improvement would be to
make the wait time shorter for classes that are not overloaded, to keep the
penalty of a denied spawn as low as possible.
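
The 25 second figure follows from 100 retries of 250 ms each. The sketch below works through that arithmetic and estimates how many threads a slow class can hold while its requests wait. Both function names are hypothetical, and the rate of 20 new connections per second is borrowed from the quoted example further down in this thread, not measured.

```c
#include <assert.h>

/* Worst-case time one request spends sleeping before a 503, in ms */
long worst_case_wait_ms(int retries, int sleep_ms)
{
    assert(retries >= 0 && sleep_ms >= 0);
    return (long)retries * sleep_ms;
}

/* Threads held at steady state: arrival rate times hold time
   (Little's law), with the hold time given in milliseconds */
long threads_occupied(int requests_per_sec, long wait_ms)
{
    assert(requests_per_sec >= 0 && wait_ms >= 0);
    return requests_per_sec * wait_ms / 1000;
}
```

With the full budget, worst_case_wait_ms(100, 250) gives 25000 ms, and 20 new connections per second would then hold about 500 threads. If the budget were cut to 12 retries, the wait would drop to 3000 ms and the same load would hold about 60 threads.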


I plan to work on that later.

>
>
> Another question. Is it necessary to call procmgr_init_spawn_cmd() from
> inside the for loop ?
> I took a brief look, it seems not necessary. I will move it out of loop and
> test.
>
> 2012-08-28
> ________________________________
> pqf
> ________________________________
> From: Lazy
> Sent: 2012-08-27 21:47
> Subject: Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693
> To: "dev"<[email protected]>
> Cc:
>
> 2012/8/16 pqf <[email protected]>:
>> How about this:
>> 1. procmgr_post_spawn_cmd() now returns a status code from the PM, so the
>> process handler now knows whether the spawn request was denied or not.
>> 2. if a new process is created, no sleep is needed.
>> 3. if no process is created, sleep a while
>
> sorry for the late reply,
>
> in the old code there was no sleep() between procmgr_post_spawn_cmd()
> and apply_free_procnode()
>
> sleep() was invoked only if there was no free procnode.
>
> This happened only if we were denied spawning a new process or, in some
> cases, if some other thread managed to use that procnode before us.
>
> Your change addresses the case where some other thread stole "our" newly
> spawned fcgi process: the old code waited 1s before trying to spawn
> another/recheck, the new code doesn't. I guess this is the original issue
> in stress tests where the total number of simultaneous connections doesn't
> exceed max fcgi processes. But when spawning is denied, the recovery time
> is still long: 1s.
>
>
> I was referring to cases when the spawn is denied.
>
> If a vhost is overloaded or someone added sleep(60) in the code,
> mod_fcgid blocks on all requests to that vhost
> for over a minute, and it is possible to occupy 1000 threads using
> under 20 new connections to the slow vhost
> per second. This can be mitigated by adding availability, which will
> impact the time spent waiting for a free process. An overloaded vhost will
> start to drop connections faster, preventing the web server from reaching
> the MaxClients
> limit.
>
> Another question. Is it necessary to call procmgr_init_spawn_cmd()
> from inside the for loop ?
>
>
>>
>> 2012-08-16
>> ________________________________
>> pqf
>> ________________________________
>> From: Lazy
>> Sent: 2012-08-16 16:47
>> Subject: Re: Re: mod_fcgid concurrency bottleneck, issue#53693
>> To: "dev"<[email protected]>
>> Cc:
>>
>> 2012/8/16 pqf <[email protected]>:
>>> Hi, Michal
>>> My solution does "add availability to each class", which is what the
>>> procmgr_post_spawn_cmd() call in each loop does.
>>> The sleep() call was introduced for a stress test without warm-up time. In
>>> this case, mod_fcgid will create more processes than with a slow start
>>> (each process handler can't apply a free slot at the very beginning, so it
>>> sends a request to the process manager to create one; it's easy to reach
>>> the max # of processes limit while httpd starts up, but the idle processes
>>> will be killed later). The sleep() call is a little like a "server side
>>> warm-up delay".
>>> But since someone said that with this sleep() removed the server works fine
>>> without a bottleneck (maybe he didn't notice the warm-up issue?), I thought
>>> removing the sleep() was a good idea. But reducing the sleep() time is fine
>>> with me too.
>>
>> I was referring to the case where all processes are busy: without
>> sleep(), handle_request() will quickly send spawn requests, which will
>> be denied by the process manager; with sleep(), handle_request() will
>> always wait quite a long time,
>> occupying slots
>>
>> --
>> Michal Grzedzicki
>>
>>
>
>

handle.diff
Description: Binary data
