Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693
Hi, all I took some times to setup a stress test for mod_fcgid, to see is there any other concurrency bottleneck left, it seems fine to me. I did a stress test with callgrind on mod_fcgid, this is the big picture: http://people.apache.org/~pqf/mod_fcgid_performance/performance.png and this is the output of callgrind: http://people.apache.org/~pqf/mod_fcgid_performance/callgrind.out As we can see, most cpu time are used in ap_create_environment() + init_environment(). ap_create_environment() copy environment vars to a char[][] buffer, and init_environment() copy char[][] to a char[] buffer for FCGI protocol. I have to keep using ap_create_environment() to be compatible with other cgi modules(cgid, cgi...), I will not shortcut these two calls for less cpu utils. And I make a strace trace to the system calls, to make sure there is no obvious IO throughput bottleneck, and no luck either 4622 13:17:16.737888 ... futex resumed ) = 0 4622 13:17:16.737920 futex(0x95ce65c, FUTEX_WAKE, 1 unfinished ... 4622 13:17:16.738038 ... futex resumed ) = 0 4622 13:17:16.740735 gettimeofday({1346217436, 740787}, NULL) = 0 4622 13:17:16.740835 getsockname(12, {sa_family=AF_INET6, sin6_port=htons(8080), inet_pton(AF_INET6, :::127.0.0.1, si n6_addr), sin6_flowinfo=0, sin6_scope_id=0}, [28]) = 0 4622 13:17:16.741008 fcntl64(12, F_GETFL) = 0x2 (flags O_RDWR) 4622 13:17:16.741056 fcntl64(12, F_SETFL, O_RDWR|O_NONBLOCK) = 0 4622 13:17:16.741824 gettimeofday({1346217436, 741849}, NULL) = 0 4622 13:17:16.741890 gettimeofday({1346217436, 741911}, NULL) = 0 4622 13:17:16.741969 read(12, GET /fcgi-bin/foo.pl HTTP/1.0\r\nH..., 8000) = 97 4622 13:17:16.742049 gettimeofday({1346217436, 742070}, NULL) = 0 4622 13:17:16.742108 gettimeofday({1346217436, 742129}, NULL) = 0 4622 13:17:16.742165 gettimeofday({1346217436, 742185}, NULL) = 0 4622 13:17:16.74 gettimeofday({1346217436, 742242}, NULL) = 0 4622 13:17:16.742278 gettimeofday({1346217436, 742299}, NULL) = 0 4622 13:17:16.742345 gettimeofday({1346217436, 742367}, NULL) = 0 4622 13:17:16.742404 gettimeofday({1346217436, 742424}, NULL) = 0 4622 13:17:16.742467 stat64(/usr/local/apache2/htdocs/fcgi-bin/foo.pl, {st_mode=S_IFREG|0755, st_size=229, ...}) = 0 4622 13:17:16.742697 semop(4292662, 0x17234c, 1) = 0 4622 13:17:16.742750 semop(4292662, 0x172358, 1) = 0 4622 13:17:16.742804 socket(PF_FILE, SOCK_STREAM, 0) = 13 4622 13:17:16.742864 connect(13, {sa_family=AF_FILE, path=/usr/local/apache2/logs/fcgidsock/4617.11}, 110) = 0 4622 13:17:16.743028 fcntl64(13, F_GETFL) = 0x2 (flags O_RDWR) 4622 13:17:16.743074 fcntl64(13, F_SETFL, O_RDWR|O_NONBLOCK) = 0 4622 13:17:16.743118 gettimeofday({1346217436, 743138}, NULL) = 0 4622 13:17:16.743174 writev(13, [{\1\1\0\1\0\10\0\0, 8}, {\0\1\0\0\0\0\0\0, 8}, {\1\4\0\1\3U\0\0, 8}, {\t\30UNIQUE_ID UD2l3MCoqlsAABILpXoAA..., 853}, {\1\4\0\1\0\0\0\0, 8}, {\1\5\0\1\0\0\0\0, 8}], 6) = 893 4622 13:17:16.743337 read(13, 0xa6f4c5b8, 8192) = -1 EAGAIN (Resource temporarily unavailable) 4622 13:17:16.743386 poll([{fd=13, events=POLLIN, revents=POLLIN}], 1, 4) = 1 4622 13:17:16.743789 read(13, \1\6\0\1\3\266\2\0Content-Type: text/plain..., 8192) = 984 4622 13:17:16.743879 gettimeofday( unfinished ... 4622 13:17:16.743968 ... gettimeofday resumed {1346217436, 743943}, NULL) = 0 4622 13:17:16.744037 close(13) = 0 4622 13:17:16.744101 semop(4292662, 0x17234c, 1) = 0 4622 13:17:16.744149 semop(4292662, 0x172358, 1) = 0 4622 13:17:16.744209 gettimeofday({1346217436, 744231}, NULL) = 0 4622 13:17:16.744266 gettimeofday({1346217436, 744287}, NULL) = 0 4622 13:17:16.744343 writev(12, [{HTTP/1.1 200 OK\r\nDate: Wed, 29 A..., 154}, {CONTEXT_DOCUMENT_ROOT=\/usr/loca..., 92 4}], 2 unfinished ... 4622 13:17:16.745033 ... writev resumed ) = 1078 4622 13:17:16.745076 gettimeofday({1346217436, 745098}, NULL) = 0 4622 13:17:16.745148 times({tms_utime=1416, tms_stime=1464, tms_cutime=0, tms_cstime=0}) = 499844951 4622 13:17:16.745255 gettimeofday( unfinished ... 4622 13:17:16.745800 ... gettimeofday resumed {1346217436, 745791}, NULL) = 0 4622 13:17:16.746013 shutdown(12, 1 /* send */ unfinished ... 4622 13:17:16.746554 ... shutdown resumed ) = 0 4622 13:17:16.746625 gettimeofday( unfinished ... 4622 13:17:16.746684 ... gettimeofday resumed {1346217436, 746658}, NULL) = 0 4622 13:17:16.746765 epoll_ctl(11, EPOLL_CTL_ADD, 12, {EPOLLIN, {u32=158210764, u64=158210764}} unfinished ... 4622 13:17:16.746833 ... epoll_ctl resumed ) = 0 4622 13:17:16.747587 gettimeofday( unfinished ... 4622 13:17:16.747904 ... gettimeofday resumed {1346217436, 747649}, NULL) = 0 4622 13:17:16.748672 futex(0x95ce690, FUTEX_WAIT, 176390, NULL unfinished ... 2012-09-05 pqf 发件人:pqf 发送时间:2012-08-28 09:34 主题:回复: Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693 收件人:devdev@httpd.apache.org 抄送: So what can mod_fcgid do in this overloaded? 1. mod_fcgid get a request 2. mod_fcgid can't apply a free slot
Re: Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693
2012/8/28 pqf p...@mailtech.cn: So what can mod_fcgid do in this overloaded? 1. mod_fcgid get a request 2. mod_fcgid can't apply a free slot of FCGI handler 3. mod_fcgid send a spawn request to PM 4. PM deny the request(for too much process already) 5. Now for( i=1; i64; i++) { a) mod_fcgid delay a while, and then send another spawn request to PM and try apply free slot again. b) mod_fcgid send another spawn request at once, even the last request is denied. c) ?? (now is a, b maybe not a good idea, any new idea?) } I think the bottleneck is too much request, too less FCGI handler. httpd(or mod_fcgid) either drop client connections or delay a while, there is no other way out? My idea is to add a availability number to each class. If a wait fails it will be decreased or increased if wait is successful. Let's say we want to wait max 100 times 250ms int connected=0; for(i=0; !connected i = (get_clas_avail()/MAX_AVAIL)*100; i++) { /* Apply a process slot */ bucket_ctx-procnode = apply_free_procnode(r, fcgi_request); /* Send a spawn request if I can't get a process slot */ /* procmgr_send_spawn_cmd() return APR_SUCCESS if a process is created */ if( !bucket_ctx-procnode (procmgr_send_spawn_cmd(fcgi_request, r) != APR_SUCCESS) ) apr_sleep(apr_time_from_msec(250)); else /* Apply a process slot */ bucket_ctx-procnode = apply_free_procnode(r, fcgi_request); if (bucket_ctx-procnode) { if (proc_connect_ipc(bucket_ctx-procnode, bucket_ctx-ipc) != APR_SUCCESS) { proc_close_ipc(bucket_ctx-ipc); bucket_ctx-procnode-diewhy = FCGID_DIE_CONNECT_ERROR; return_procnode(r, bucket_ctx-procnode, 1 /* has error */ ); bucket_ctx-procnode = NULL; decrease_avail(); }else { increase_avail(); connected=1; } } } if (!connected) { decrease_avail(); ap_log_rerror(APLOG_MARK, APLOG_WARNING, 0, r, mod_fcgid: can't apply process slot for %s, cmd_conf-cmdline); return HTTP_SERVICE_UNAVAILABLE; } decrease_avail() might halve availability each time called. Availability should be dynamic maybe controlled by processmanager and returned to the threads handling connections by procmgr_send_spawn_cmd(), it can depend on total number of denied spawn requests for specific class in a similar way as score, without this connections will be 503 not sooner then 25 seconds, which is still IMHO to long. Another improvement would bo to make wait time shorter for not overloaded classes to keep the penalty of denied spawn as low as possible. I plan to work on that later. Another question. Is it necessary to call procmgr_init_spawn_cmd() from inside the for loop ? I took a brief look, it seems not necessary. I will move it out of loop and test. 2012-08-28 pqf 发件人:Lazy 发送时间:2012-08-27 21:47 主题:Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693 收件人:devdev@httpd.apache.org 抄送: 2012/8/16 pqf p...@mailtech.cn: How about this: 1. procmgr_post_spawn_cmd() now return a status code from PM, so process handler now know the spawn request is denyed or not. 2. if a new process is created, no sleep is needed. 3. if no process is created, sleep a while sorry for the late reply, in the old code there ware no sleep() between procmgr_post_spawn_cmd() and apply_free_procnode() sleep() was invoked only if there ware no free procnode. This happened only if we ware denied spawning new process or in some cases if some other thread managed to use that procnode before us. Your change adresses cases if some other thread stole our newly spawned fcgi process, old code was waiting 1s before trying to spawn another/recheck, new code doesn't, I guess this is the orginal issue in stress tests when total number of simultaneous connections doesn't exceed max fcgi processes. But when spawning is denied recovery time is still long 1s. I was refering to cases when spawn is denied. If a vhost is overloaded or someone added sleep(60) in the code, mod_fcgid blocks on all request to that vhost for over a minute and it is possible to occupy 1000 threads using under 20 new connections to slow vhost per second. This can be mitingated by adding avaiability which will impact time spend on waiting for free process. Overloaded vhost will start to drop connections faster preventing the web-server reaching MaxClients limit. Another question. Is it necessary to call procmgr_init_spawn_cmd() from inside the for loop ? 2012-08-16 pqf
Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693
2012/8/16 pqf p...@mailtech.cn: How about this: 1. procmgr_post_spawn_cmd() now return a status code from PM, so process handler now know the spawn request is denyed or not. 2. if a new process is created, no sleep is needed. 3. if no process is created, sleep a while sorry for the late reply, in the old code there ware no sleep() between procmgr_post_spawn_cmd() and apply_free_procnode() sleep() was invoked only if there ware no free procnode. This happened only if we ware denied spawning new process or in some cases if some other thread managed to use that procnode before us. Your change adresses cases if some other thread stole our newly spawned fcgi process, old code was waiting 1s before trying to spawn another/recheck, new code doesn't, I guess this is the orginal issue in stress tests when total number of simultaneous connections doesn't exceed max fcgi processes. But when spawning is denied recovery time is still long 1s. I was refering to cases when spawn is denied. If a vhost is overloaded or someone added sleep(60) in the code, mod_fcgid blocks on all request to that vhost for over a minute and it is possible to occupy 1000 threads using under 20 new connections to slow vhost per second. This can be mitingated by adding avaiability which will impact time spend on waiting for free process. Overloaded vhost will start to drop connections faster preventing the web-server reaching MaxClients limit. Another question. Is it necessary to call procmgr_init_spawn_cmd() from inside the for loop ? 2012-08-16 pqf 发件人:Lazy 发送时间:2012-08-16 16:47 主题:Re: Re: mod_fcgid concurrency bottleneck, issue#53693 收件人:devdev@httpd.apache.org 抄送: 2012/8/16 pqf p...@mailtech.cn: Hi, Michal My solution do add availability to each class, which is the procmgr_post_spawn_cmd() call in each loop do. The sleep() call is intrudused for a stress test without warm up time, in this case, mod_fcgid will create more processes than a slow start one(each process handler can't apply a free slot on the very begining, so send a request to process manager to create one, it's easy to reach the max # of process limit while httpd startup, but the idle process will be killed later), the sleep() call is a little like a server side warm up delay. But since someone said remove this sleep(), the server work fine without bottleneck(Maybe he didn't notise the warm up issue?), so I thought remove the sleep() is a good idea. But reduce the time of sleep() is fine to me too. I was referring to the case where all processes are busy, without sleep(), handle_request() wil quickly send spawn requsts, whith will be denyed by process menager, with sleep() handle_request() will always wait quite a long time, occupying slots -- Michal Grzedzicki
回复: Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693
So what can mod_fcgid do in this overloaded? 1. mod_fcgid get a request 2. mod_fcgid can't apply a free slot of FCGI handler 3. mod_fcgid send a spawn request to PM 4. PM deny the request(for too much process already) 5. Now for( i=1; i64; i++) { a) mod_fcgid delay a while, and then send another spawn request to PM and try apply free slot again. b) mod_fcgid send another spawn request at once, even the last request is denied. c) ?? (now is a, b maybe not a good idea, any new idea?) } I think the bottleneck is too much request, too less FCGI handler. httpd(or mod_fcgid) either drop client connections or delay a while, there is no other way out? Another question. Is it necessary to call procmgr_init_spawn_cmd() from inside the for loop ? I took a brief look, it seems not necessary. I will move it out of loop and test. 2012-08-28 pqf 发件人:Lazy 发送时间:2012-08-27 21:47 主题:Re: Re: Re: mod_fcgid concurrency bottleneck, issue#53693 收件人:devdev@httpd.apache.org 抄送: 2012/8/16 pqf p...@mailtech.cn: How about this: 1. procmgr_post_spawn_cmd() now return a status code from PM, so process handler now know the spawn request is denyed or not. 2. if a new process is created, no sleep is needed. 3. if no process is created, sleep a while sorry for the late reply, in the old code there ware no sleep() between procmgr_post_spawn_cmd() and apply_free_procnode() sleep() was invoked only if there ware no free procnode. This happened only if we ware denied spawning new process or in some cases if some other thread managed to use that procnode before us. Your change adresses cases if some other thread stole our newly spawned fcgi process, old code was waiting 1s before trying to spawn another/recheck, new code doesn't, I guess this is the orginal issue in stress tests when total number of simultaneous connections doesn't exceed max fcgi processes. But when spawning is denied recovery time is still long 1s. I was refering to cases when spawn is denied. If a vhost is overloaded or someone added sleep(60) in the code, mod_fcgid blocks on all request to that vhost for over a minute and it is possible to occupy 1000 threads using under 20 new connections to slow vhost per second. This can be mitingated by adding avaiability which will impact time spend on waiting for free process. Overloaded vhost will start to drop connections faster preventing the web-server reaching MaxClients limit. Another question. Is it necessary to call procmgr_init_spawn_cmd() from inside the for loop ? 2012-08-16 pqf 发件人:Lazy 发送时间:2012-08-16 16:47 主题:Re: Re: mod_fcgid concurrency bottleneck, issue#53693 收件人:devdev@httpd.apache.org 抄送: 2012/8/16 pqf p...@mailtech.cn: Hi, Michal My solution do add availability to each class, which is the procmgr_post_spawn_cmd() call in each loop do. The sleep() call is intrudused for a stress test without warm up time, in this case, mod_fcgid will create more processes than a slow start one(each process handler can't apply a free slot on the very begining, so send a request to process manager to create one, it's easy to reach the max # of process limit while httpd startup, but the idle process will be killed later), the sleep() call is a little like a server side warm up delay. But since someone said remove this sleep(), the server work fine without bottleneck(Maybe he didn't notise the warm up issue?), so I thought remove the sleep() is a good idea. But reduce the time of sleep() is fine to me too. I was referring to the case where all processes are busy, without sleep(), handle_request() wil quickly send spawn requsts, whith will be denyed by process menager, with sleep() handle_request() will always wait quite a long time, occupying slots -- Michal Grzedzicki