On Thursday 11 of April 2013, Arkadiusz Miśkiewicz wrote: > Hello, > > On apache 2.2.22, 2.2.23 and 2.4.4 I'm able to reproduce a problem > where graceful restart takes very long time. Linux 3.7.10, glibc 2.17 here.
Summary of my findings 1) main process on gracefull restart calls /* kill off the idle ones */ ap_mpm_pod_killpg(pod, retained->max_daemons_limit); and that does dummy_connection() for each pod. This works fast (1-2s) in most cases where childrens are alive but not in this case - children exit earlier. 2) children exit in else if (retained->my_generation != ap_scoreboard_image->global->running_generation) { /* restart? */ /* yeah, this could be non-graceful restart, in which case the * parent will kill us soon enough, but why bother checking? */ die_now = 1; } 3) in my case main process works slowly enough that it doesn't finish doing ap_mpm_pod_killpg before all children exit like in 2) 4) so we are ending up in situation where main process is in middle of ap_mpm_pod_killpg while all children already exited in 2). Main process is continuing to do dummy_connection and these connect()s + dummy data send, polling is taking very long time (like 1-2s for each dummy_connection). This is the primary reason why graceful is painfully slow here (1-5 minutes) sometimes. 5) why connect() and socket sending in dummy_connect succeeds? If it failed then ap_mpm_pod_killpg loop would break and things would be fast. Because when all children exited there is main process that still holds listening socket and connection go to this socket. Verified that by doing external telnet connection to the same IP/port dummy_connection is doing - telnet connect() succeeded and telnet waits until next generation comes up. This is why connect()/apr_socket_send() in dummy_connect() dont fail. 6) possible solution - detect that sending dummy_connection data weren't processed by children by trying to read reply diff -ur httpd-2.4.4.org/server/mpm_unix.c httpd-2.4.4/server/mpm_unix.c --- httpd-2.4.4.org/server/mpm_unix.c 2012-07-03 21:38:58.000000000 +0200 +++ httpd-2.4.4/server/mpm_unix.c 2013-04-12 09:14:58.282929959 +0200 @@ -604,7 +604,17 @@ len = strlen(data); } - apr_socket_send(sock, data, &len); + rv = apr_socket_send(sock, data, &len); + if (rv == APR_SUCCESS) { + char *buffer[10]; // dummy buffer to see if child is alive + + len = sizeof(buffer); + rv = apr_socket_recv(sock, buffer, &len); + if (rv != APR_SUCCESS) { + ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, ap_server_conf, APLOGNO(00057) + "receiving response to dummy data to listener on %pI", lp->bind_addr); + } + } apr_socket_close(sock); apr_pool_destroy(p); without patch (40s): [Fri Apr 12 09:08:40.785226 2013] [mpm_prefork:notice] [pid 6790] AH00171: Graceful restart requested, doing restart [Fri Apr 12 09:09:20.968224 2013] [mpm_prefork:notice] [pid 6790] AH00163: Apache/2.4.4 (Unix) configured -- resuming normal operations with patch (3s): [Fri Apr 12 09:10:01.902779 2013] [mpm_prefork:notice] [pid 7532] AH00171: Graceful restart requested, doing restart [Fri Apr 12 09:10:04.973715 2013] [mpm_prefork:notice] [pid 7532] AH00163: Apache/2.4.4 (Unix) configured -- resuming normal operations strace with the patch: http://ixion.pld-linux.org/~arekm/apache4.txt 7) How to reproduce? Main process need to be slowed down, so all children have a chance to exit in 2) before 1) finishes. In my case "strace -ttT -s 300 -p pido-of-main-apache-process" is enough. Then run ab -n 100000 -c 100 http://apache.ip and in parallel issue graceful restart before ab finishes. Look at error_log if graceful time was longer than it should be. Strace should show dummy_connection connects(), polling etc. -- Arkadiusz Miśkiewicz, arekm / maven.pl