https://bz.apache.org/bugzilla/show_bug.cgi?id=66302

            Bug ID: 66302
           Summary: Passing health check does not recover worker from its
                    error state
           Product: Apache httpd-2
           Version: 2.4-HEAD
          Hardware: PC
                OS: Linux
            Status: NEW
          Keywords: PatchAvailable
          Severity: regression
          Priority: P2
         Component: mod_proxy_hcheck
          Assignee: bugs@httpd.apache.org
          Reporter: alessandro.cavali...@unibo.it
  Target Milestone: ---

Created attachment 38407
  --> https://bz.apache.org/bugzilla/attachment.cgi?id=38407&action=edit
mod_proxy_hcheck: recover worker from error state

While enabling mod_proxy_hcheck on some of our apache2 nodes, we encountered
unexpected behavior: sometimes, after a backend is rebooted, its worker status
stays at "Init Err" in the balancer manager until another request is made to
the backend, no matter how many health checks complete successfully.

The following list shows the sequence of events leading to the problem:

1. Watchdog triggers a health check and the request succeeds; worker status is
"Init Ok".
2. An HTTP request reaches apache2 while the backend is unreachable
(rebooting); worker status becomes "Init Err".
3. Watchdog triggers another health check, which succeeds because the backend
has recovered; worker status remains "Init Err".
4. Same as 3.
5. Same as 3.

The worker status recovers in only two ways: either "hcfails" health checks
fail and then "hcpasses" health checks succeed, or legitimate traffic retries
the failed worker, which may not happen for a long time for rarely used
applications.
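
For illustration only, the sequence and the two recovery paths described above
can be reproduced with a toy Python model. This is not the mod_proxy_hcheck
source: the Worker class, its fields, the HCPASSES/HCFAILS values and the rule
inside health_check() are all assumptions chosen to match the observed
behavior.

```python
# Toy model of the reported behavior. NOT the mod_proxy_hcheck source:
# every name and rule below is an assumption made for illustration.
from dataclasses import dataclass

HCPASSES = 2  # hypothetical "hcpasses" value
HCFAILS = 2   # hypothetical "hcfails" value


@dataclass
class Worker:
    in_error: bool = False   # what the balancer manager would show as "Err"
    hc_failed: bool = False  # set only by failed health checks
    passes: int = 0
    fails: int = 0

    def status(self) -> str:
        return "Init Err" if self.in_error else "Init Ok"


def client_request(w: Worker, backend_up: bool) -> None:
    """Ordinary traffic: a failed request sets the error state, a good one clears it."""
    w.in_error = not backend_up


def health_check(w: Worker, ok: bool) -> None:
    """Assumed rule: passing checks only recover a worker that health checks failed."""
    if ok:
        w.fails = 0
        if w.hc_failed:
            w.passes += 1
            if w.passes >= HCPASSES:
                w.hc_failed = w.in_error = False
                w.passes = 0
    else:
        w.passes = 0
        w.fails += 1
        if w.fails >= HCFAILS:
            w.hc_failed = w.in_error = True


w = Worker()
health_check(w, True)                 # step 1
step1 = w.status()
client_request(w, backend_up=False)   # step 2: backend is rebooting
step2 = w.status()
for _ in range(3):                    # steps 3-5: checks pass, status is stuck
    health_check(w, True)
step5 = w.status()

# Recovery path A: "hcfails" failed checks, then "hcpasses" passing ones.
for _ in range(HCFAILS):
    health_check(w, False)
for _ in range(HCPASSES):
    health_check(w, True)
after_hc_cycle = w.status()

# Recovery path B: ordinary traffic retries the worker and succeeds.
w2 = Worker(in_error=True)
client_request(w2, backend_up=True)
after_traffic = w2.status()

print(step1, step2, step5, after_hc_cycle, after_traffic, sep=" | ")
```

In this model the passing checks in steps 3-5 are ignored because hc_failed
was never set; only a full fail/pass cycle or a successful client request
clears the error.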


This surprised us, since we expected the worker status to recover after
"hcpasses" successful health checks; however, this does not seem to happen
when the error status is triggered by ordinary traffic to the backend (i.e.,
not by health checks).
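
The behavior we expected can be stated as a small rule. The Python sketch
below is only a simplified illustration of that expectation, not the attached
patch and not httpd code; the function name and its parameters are
hypothetical, and it deliberately ignores "hcfails".

```python
# Sketch of the expected behavior; hypothetical names, not the attached patch.
def error_after_checks(checks, hcpasses, in_error=True):
    """Return the error flag after a series of health-check results (True = pass).

    Assumed rule: `hcpasses` consecutive passing checks clear the error state,
    regardless of whether a health check or ordinary traffic set it.
    """
    streak = 0
    for ok in checks:
        streak = streak + 1 if ok else 0  # a failed check resets the streak
        if in_error and streak >= hcpasses:
            in_error = False
            streak = 0
    return in_error


# Worker put in error by ordinary traffic, then three passing health checks.
still_in_error = error_after_checks([True, True, True], hcpasses=2)
```

Under this rule the worker from step 2 of the list above would leave the error
state as soon as "hcpasses" consecutive checks pass, without waiting for
client traffic.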

We believe this behavior was accidentally introduced in r1725523. The patch we
are proposing seems to fix the problem in our environment.
