Re: 2.4.4 graceful restart taking long time [PATCH]

Arkadiusz Miśkiewicz Fri, 12 Apr 2013 00:22:41 -0700

On Thursday 11 of April 2013, Arkadiusz Miśkiewicz wrote:
> Hello,
> 
> On apache 2.2.22, 2.2.23 and 2.4.4 I'm able to reproduce a problem
> where graceful restart takes very long time. Linux 3.7.10, glibc 2.17 here.


Summary of my findings

1) main process on gracefull restart calls 
/* kill off the idle ones */
ap_mpm_pod_killpg(pod, retained->max_daemons_limit);

and that does dummy_connection() for each pod. This works fast
(1-2s) in most cases where childrens are alive but not in this
case - children exit earlier.

2) children exit in 
        else if (retained->my_generation !=
                 ap_scoreboard_image->global->running_generation) { /* 
restart? */
            /* yeah, this could be non-graceful restart, in which case the
             * parent will kill us soon enough, but why bother checking?
             */
            die_now = 1;
        }

3) in my case main process works slowly enough that it doesn't finish
doing ap_mpm_pod_killpg before all children exit like in 2)

4) so we are ending up in situation where main process is in middle
of ap_mpm_pod_killpg while all children already exited in 2). Main
process is continuing to do dummy_connection and these connect()s
+ dummy data send, polling is taking very long time (like 1-2s for each
dummy_connection). This is the primary reason why graceful is painfully
slow here (1-5 minutes) sometimes.

5) why connect() and socket sending in dummy_connect succeeds?
If it failed then ap_mpm_pod_killpg loop would break and things would
be fast.

Because when all children exited there is main process that still
holds listening socket and connection go to this socket. Verified that
by doing external telnet connection to the same IP/port dummy_connection
is doing - telnet connect() succeeded and telnet waits until next
generation comes up. This is why connect()/apr_socket_send()
in dummy_connect() dont fail.

6) possible solution - detect that sending dummy_connection data weren't
processed by children by trying to read reply

diff -ur httpd-2.4.4.org/server/mpm_unix.c httpd-2.4.4/server/mpm_unix.c
--- httpd-2.4.4.org/server/mpm_unix.c   2012-07-03 21:38:58.000000000 +0200
+++ httpd-2.4.4/server/mpm_unix.c       2013-04-12 09:14:58.282929959 +0200
@@ -604,7 +604,17 @@
         len = strlen(data);
     }
 
-    apr_socket_send(sock, data, &len);
+    rv = apr_socket_send(sock, data, &len);
+    if (rv == APR_SUCCESS) {
+        char *buffer[10]; // dummy buffer to see if child is alive
+
+        len = sizeof(buffer);
+        rv = apr_socket_recv(sock, buffer, &len);
+        if (rv != APR_SUCCESS) {
+            ap_log_error(APLOG_MARK, APLOG_DEBUG, rv, ap_server_conf, 
APLOGNO(00057)
+                         "receiving response to dummy data to listener on 
%pI", lp->bind_addr);
+        }
+    }
     apr_socket_close(sock);
     apr_pool_destroy(p);
 

without patch (40s):
[Fri Apr 12 09:08:40.785226 2013] [mpm_prefork:notice] [pid 6790] AH00171: 
Graceful restart requested, doing restart
[Fri Apr 12 09:09:20.968224 2013] [mpm_prefork:notice] [pid 6790] AH00163: 
Apache/2.4.4 (Unix) configured -- resuming normal operations

with patch (3s):
[Fri Apr 12 09:10:01.902779 2013] [mpm_prefork:notice] [pid 7532] AH00171: 
Graceful restart requested, doing restart
[Fri Apr 12 09:10:04.973715 2013] [mpm_prefork:notice] [pid 7532] AH00163: 
Apache/2.4.4 (Unix) configured -- resuming normal operations

strace with the patch:
http://ixion.pld-linux.org/~arekm/apache4.txt

7) How to reproduce?
Main process need to be slowed down, so all children have a chance to exit
in 2) before 1) finishes.

In my case "strace -ttT -s 300 -p pido-of-main-apache-process" is enough.

Then run ab -n 100000 -c 100 http://apache.ip and in parallel issue graceful 
restart before ab finishes.

Look at error_log if graceful time was longer than it should be. Strace should 
show
dummy_connection connects(), polling etc.

-- 
Arkadiusz Miśkiewicz, arekm / maven.pl

Re: 2.4.4 graceful restart taking long time [PATCH]

Reply via email to