I am looking for 2 things mod_proxy_balancer cannot currently provide:

  1. Something to limit the maximum impact of having many dead members
     under a load balancer on normal requests.
         * The process of discovering that dead workers are still dead
           shouldn't overly impact any normal request (assuming there
           are live workers available)
         * Sample situation:
               o Load balancing over 10 ports, most of which do not
                 have an active backend (Tomcat) associated at the
                 time.  If there is only 1 backend alive, every 'retry'
                 seconds, a normal request is delayed by a period of
                 9*dead-connection-latency.  That's neither necessary
                 nor acceptable.
         * Possible solutions include:
               o Having an option to have a background thread ping the
                 backends rather than allowing normal requests to do so.
                     + In the case of mod_proxy_ajp, a cping would be
                       preferred here, rather than a full request.
               o Limiting the number of workers any single normal
                 request will attempt to recover
  2. Something to reduce the severity of log messages when discovering
     that a dead worker is still dead.
         * There is no need to fill the error logs with notices that a
           worker that has been dead is still dead.  This is good
           troubleshooting info and should be logged, but at a lower
           severity level that does not show up in the logs by default.
         * Depending on the solution to (1), this might just fall out
           of that.
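
To make the "limit the number of workers any single request will attempt to recover" idea from (1) a bit more concrete, here is a rough sketch in Python. It is not mod_proxy_balancer code and does not reflect how the module is actually structured; Worker, probe(), pick_worker(), demo(), and the max_recover parameter are all hypothetical names made up for illustration, and probe() merely stands in for a connect attempt or an AJP cping. The "still dead" message is logged at debug level, which is roughly what (2) asks for.

```python
import logging
from dataclasses import dataclass

log = logging.getLogger("balancer_sketch")


@dataclass
class Worker:
    name: str
    reachable: bool           # simulated: what a probe would find right now
    in_error: bool = False    # the balancer's current belief about the worker
    next_retry: float = 0.0   # earliest time a recovery probe is allowed


def probe(worker: Worker) -> bool:
    """Hypothetical stand-in for a connect attempt or an AJP cping."""
    return worker.reachable


def pick_worker(workers, now, retry=60.0, max_recover=1):
    """Return (worker, probes): a usable worker (or None) and the number of
    recovery probes this one request had to pay for."""
    probes = 0
    for w in workers:
        if not w.in_error or now < w.next_retry:
            continue
        if probes >= max_recover:
            break  # leave the remaining dead workers for later requests
        probes += 1
        if probe(w):
            w.in_error = False
            log.info("worker %s recovered", w.name)
        else:
            w.next_retry = now + retry
            # Still dead: worth recording, but only at debug level.
            log.debug("worker %s is still dead", w.name)
    for w in workers:
        if not w.in_error:
            return w, probes
    return None, probes


def demo(max_recover):
    """One live worker plus nine dead ones that are all due for a retry.
    Returns the probe count paid by each of three consecutive requests."""
    workers = [Worker("w0", reachable=True)]
    workers += [Worker(f"w{i}", reachable=False, in_error=True)
                for i in range(1, 10)]
    return [pick_worker(workers, now=100.0, max_recover=max_recover)[1]
            for _ in range(3)]
```

With max_recover=9 the first request pays for all nine probes (demo(9) gives [9, 0, 0]), which is the 9*dead-connection-latency case from the sample situation. With max_recover=1 the same work is spread over successive requests (demo(1) gives [1, 1, 1]), so no single request is delayed by more than one dead-connection latency. A background thread would take this further by removing the probes from the request path entirely.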

I had already started a discussion along these lines on the Tomcat development mailing list, as I have the same needs for both mod_proxy_ajp (for Apache 2.2 front ends) and mod_jk/isapi (for IIS front ends). Mladen Turk kindly pointed me to some work he had recently done on trunk for mod_jk to add a "watchdog" thread for periodic background work. He has also talked about adding a similar capability to Apache itself in the future. Jim Jagielski pointed out that the discussion should probably move over here, since parts of it affect Apache itself.

I need solutions to these problems one way or another, so if nothing else I'll have to hack something into our own fork of the code.

I have a fair amount of time to solve these problems, however, so I'd much rather see them solved in a good, general way that can be a value-add part of both mod_jk and mod_proxy -- rather than a one-off fork. Ideally the solutions would be somewhat consistent as well, for everyone's sanity.

Thoughts?  Suggestions?

--
Jess Holle
