Re: [RFC/PATCH] mpm_winnt: Fix several issues in the child process shutdown logic

2017-07-11 Thread Evgeny Kotkov
Evgeny Kotkov  writes:

>  (1) As a part of the process of shutting down worker threads, the code
>  around child.c:1170 currently posts an amount of I/O completion packets
>  equal to the amount of the threads blocked on the I/O completion port.
>  Then it sleeps until all these threads "acknowledge" the completion
>  packets by decrementing the global amount of blocked threads.
>
>  This can be improved to avoid unnecessary Sleep()'s, and make the
>  shutdown faster.  There is no need to block until the threads actually
>  receive the completion, as the shutdown process includes a separate step
>  that waits until the threads exit.  Instead of synchronizing on the
>  amount of the threads blocked on the I/O completion port, we can send
>  the number of IOCP_SHUTDOWN completion packets equal to the total
>  amount of threads and immediately proceed to the next step.

Committed in https://svn.apache.org/r1801635

>   (2) If the shutdown routine hits a timeout while waiting for the worker
>   threads to exit, it uses TerminateThread() to terminate the remaining
>   threads.
>
>   Using TerminateThread() can have dangerous consequences such as
>   deadlocks — say, if the the thread is terminated while holding a lock
>   or a heap lock in the middle of HeapAlloc(), as these locks would not
>   be released.  Or it can corrupt the application state and cause a crash.
>   See https://msdn.microsoft.com/en-us/library/windows/desktop/ms686717
>
>   Presumably, a much better alternative here would be to leave the cleanup
>   to the operating system by calling TerminateProcess().

Committed in https://svn.apache.org/r1801636

>   (3) Assuming (2) is in place, the code around child.c:1170 that waits for
>   multiple thread handles in batches can be simplified.  With (2), there
>   is no difference between ending the wait with one or multiple remaining
>   threads.  (Since we terminate the process if at least one thread is
>   still active when we hit a timeout.)
>
>   Therefore, instead of making an effort to evenly distribute and batch
>   the handles with WaitForMultipleObjects(), we could just start from
>   one end, and wait for one thread handle at a time.

Committed in https://svn.apache.org/r1801637


Regards,
Evgeny Kotkov


[RFC/PATCH] mpm_winnt: Fix several issues in the child process shutdown logic

2017-07-07 Thread Evgeny Kotkov
Hi all,

Recently we (Ivan Zhakov and I) found several issues in the code responsible
for shutting down a mpm_winnt's child process.

The attached patch should fix these issues:

 (Please note that the changes are combined into a single patch to make the
  review easier, but I'll commit them independently.)

 (1) As a part of the process of shutting down worker threads, the code
 around child.c:1170 currently posts an amount of I/O completion packets
 equal to the amount of the threads blocked on the I/O completion port.
 Then it sleeps until all these threads "acknowledge" the completion
 packets by decrementing the global amount of blocked threads.

 This can be improved to avoid unnecessary Sleep()'s, and make the
 shutdown faster.  There is no need to block until the threads actually
 receive the completion, as the shutdown process includes a separate step
 that waits until the threads exit.  Instead of synchronizing on the
 amount of the threads blocked on the I/O completion port, we can send
 the number of IOCP_SHUTDOWN completion packets equal to the total
 amount of threads and immediately proceed to the next step.

  (2) If the shutdown routine hits a timeout while waiting for the worker
  threads to exit, it uses TerminateThread() to terminate the remaining
  threads.

  Using TerminateThread() can have dangerous consequences such as
  deadlocks — say, if the the thread is terminated while holding a lock
  or a heap lock in the middle of HeapAlloc(), as these locks would not
  be released.  Or it can corrupt the application state and cause a crash.
  See https://msdn.microsoft.com/en-us/library/windows/desktop/ms686717

  Presumably, a much better alternative here would be to leave the cleanup
  to the operating system by calling TerminateProcess().

  (3) Assuming (2) is in place, the code around child.c:1170 that waits for
  multiple thread handles in batches can be simplified.  With (2), there
  is no difference between ending the wait with one or multiple remaining
  threads.  (Since we terminate the process if at least one thread is
  still active when we hit a timeout.)

  Therefore, instead of making an effort to evenly distribute and batch
  the handles with WaitForMultipleObjects(), we could just start from
  one end, and wait for one thread handle at a time.

Note that apart from this, the code around child.c:1146 that shuts down the
listeners, waits, and then proceeds to shutting down the worker threads is
prone to a subtle race.  Since the wait interval is hardcoded to 1 second,
there could still be an accepted connection after it.  And, as the worker
threads are being shut down, it is feasible that this accepted connection
won't ever find a suitable worker thread.  (Eventually, it is going to be
ungracefully closed).  I think that this can be fixed by properly waiting for
the listener threads to exit, and in the meanwhile, this would avoid having
the Sleep(1000) call altogether.  But this is something that I have now left
for future work.

I would greatly appreciate if someone could review or comment on the proposed
changes.  If anyone has an additional insight on the design or planned, but
unaccomplished changes to the mpm_winnt module, I would appreciate hearing
them as well.

Thanks!


Regards,
Evgeny Kotkov
Index: server/mpm/winnt/child.c
===
--- server/mpm/winnt/child.c(revision 1801135)
+++ server/mpm/winnt/child.c(working copy)
@@ -827,18 +827,6 @@ static DWORD __stdcall worker_main(void *thread_nu
 }
 
 
-static void cleanup_thread(HANDLE *handles, int *thread_cnt,
-   int thread_to_clean)
-{
-int i;
-
-CloseHandle(handles[thread_to_clean]);
-for (i = thread_to_clean; i < ((*thread_cnt) - 1); i++)
-handles[i] = handles[i + 1];
-(*thread_cnt)--;
-}
-
-
 /*
  * child_main()
  * Entry point for the main control thread for the child process.
@@ -890,7 +878,6 @@ void child_main(apr_pool_t *pconf, DWORD parent_pi
 HANDLE *child_handles;
 int listener_started = 0;
 int threads_created = 0;
-int watch_thread;
 int time_remains;
 int cld;
 DWORD tid;
@@ -1162,16 +1149,16 @@ void child_main(apr_pool_t *pconf, DWORD parent_pi
 /* Shutdown the worker threads
  * Post worker threads blocked on the ThreadDispatch IOCompletion port
  */
-while (g_blocked_threads > 0) {
+if (g_blocked_threads > 0) {
 ap_log_error(APLOG_MARK, APLOG_DEBUG, APR_SUCCESS, ap_server_conf, 
APLOGNO(00361)
  "Child: %d threads blocked on the completion port",
  g_blocked_threads);
-for (i=g_blocked_threads; i > 0; i--) {
-PostQueuedCompletionStatus(ThreadDispatchIOCP, 0,
-   IOCP_SHUTDOWN, NULL);
-}
-Sleep(1000);
 }
+
+