A few comments (mainly on the proposal to piggy-back the load info header in
the responses):
*) The mechanism may not work in certain setups of the SLB (e.g. DSR).
*) For TLS, I presume this proposal assumes that the connections are terminated
at the SLB layer?
*) How does the proposal apply to newer protocols like SPDY and HTTP/2? There,
the requests/responses are streams within a connection, and AFAIK, headers like
Connection/Keep-Alive are no longer valid with SPDY (and perhaps even with
HTTP/2).
Thanks,
Sudheer
On Saturday, May 2, 2015 2:28 PM, Leif Hedstrom <[email protected]> wrote:
I’m Cc'ing this to [email protected], since I think this is something
some of our devs would be interested in. There are a few other replies to this
thread already, which can be seen on the archives.
As has been mentioned in another reply, I think the header name ought to be
something without the "X-" prefix (see RFC 6648): Backend-Info, Backend-Capacity-Info, or some
such?
Cheers,
— leif
> On Apr 29, 2015, at 10:54 PM, Jim Riggs <[email protected]> wrote:
>
> [ Long message and proposal follows. Bear with me. There are a lot of words,
> but that is because we need a lot of help/input! ;-) ]
>
> So, this has come up in the past several times, and we discussed it again
> this year at ApacheCon: How do we get the load balancer to make smarter, more
> informed decisions about where to send traffic?
>
> The different LB methods provide some different attempts at balancing
> traffic, but ultimately none of them is "smart" about its decision. Other
> than a member being in error state, the balancer makes its decision solely
> based on configuration (LB set, factor, etc.) and its own knowledge of the
> member (e.g. requests, bytes). What we have often discussed is a way to get
> some type of health/load/capacity information from the backend to make
> informed balancing decisions.
>
> One method is to use health checks (a la haproxy, AWS ELBs, etc.) that
> request one or more URLs and the response code/time indicates whether or not
> the service is up and available, allowing more proactive decisions. While
> this is better than our current state of reactively marking members in error
> state based on failed requests, it still provides a limited view of the
> health/state of the backend.
>
> We have also discussed implementing a way for backends to communicate a
> magical "load" number to the front end to take into account as it balances
> traffic. This would give a much better view into the backend's state, but
> requires some way to come up with this calculation that each backend
> system/server/service/app must provide. This then has to be implemented in
> all the various backends (e.g. httpd, tomcat, php-fpm, unicorn, mongrel,
> etc., etc.), probably a hard sell to all of those projects. And, the front
> end would have limited control over what that number means or how to use it.
>
> During JimJag's balancer talk at ApacheCon this year, he brought up this
> issue of "better, more informed" decision making again. I put some thought
> into it that night and came up with some ideas. Jim, Covener, Trawick,
> Ruggeri, and I then spent some time over the next couple of days talking it
> through and fleshing out some of the details.
>
> Based on all of that, below is what I am proposing. I have some initial code
> that I am working on to implement the different pieces of this, and I will
> put them up in bugz or somewhere when they're a little less rudimentary.
>
> --
>
> Our hope is to create a general standard that can be used by various
> projects, products, proxies, servers, etc., to have a more consistent way for
> a load balancer to request and receive useful internal state information from
> its backend nodes. This information can then be used by the *frontend*
> software/admin (this is the main change from what we have discussed before)
> to calculate a load factor appropriate for each backend node.
>
> This communication uses a new, standard HTTP header, "X-Backend-Info", that
> takes this form in RFC2616 BNF:
>
> backend-info = "version" "=" ( float | <"> float <"> )
> [
> *LWS "," *LWS
> #( numeric-entry | string-entry )
> ]
>
> numeric-entry = numeric-field "=" ( float | <"> float <"> )
> ; that is, numbers may optionally be enclosed in
> ; quotation marks
>
> float = 1*DIGIT [ "." 1*DIGIT ]
>
> numeric-field = "workers-max"
> ; maximum number of workers the backend supports
> | "workers-used"
> ; current number of used/busy workers
> | "workers-allocated"
> ; current number of allocated/ready workers
> | "workers-free"
> ; current number of workers available for use
> ; (generally the difference between workers-max and
> ; workers-used, though some implementations may have
> ; a different notion)
> | "uptime"
> ; number of seconds the backend has been running
> | "requests"
> ; number of requests the backend has processed
> | "memory-max"
> ; total amount of memory available in bytes
> | "memory-used"
> ; amount of used memory in bytes
> | "memory-allocated"
> ; amount of allocated/committed memory in bytes
> | "memory-free"
> ; amount of memory available for use (generally
> ; the difference between memory-max and memory-used,
> ; though some implementations may have a different
> ; notion)
> | "load-current"
> ; the (subjective) current load for the backend
> | "load-5"
> ; the (subjective) 5-minute load for the backend
> | "load-15"
> ; the (subjective) 15-minute load for the backend
>
> string-entry = string-field "=" ( token | quoted-string )
>
> string-field = "provider"
> ; informational description of backend information
> ; provider (module, container, subsystem, app, etc.)
>
>
> As used here, "worker" is an overloaded term whose precise meaning is
> backend-dependent. It might refer to processes, threads, pipelines, or
> whatever the backend system/server/service/app uses to measure or limit its
> number of active, processing connections.
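
For anyone who wants to experiment with the grammar above, here is a rough
Python sketch of how a frontend might parse an X-Backend-Info value. The name
parse_backend_info and the regular expressions are invented for illustration
and are not part of the proposal or of any httpd module. Field names that the
grammar does not list are simply kept as strings, which is an assumption the
proposal does not address.

```python
import re

# Numeric field names taken from the grammar above, plus the "version" field.
NUMERIC_FIELDS = {
    "version", "workers-max", "workers-used", "workers-allocated",
    "workers-free", "uptime", "requests", "memory-max", "memory-used",
    "memory-allocated", "memory-free", "load-current", "load-5", "load-15",
}

# float = 1*DIGIT [ "." 1*DIGIT ]
FLOAT_RE = re.compile(r"\d+(?:\.\d+)?")

# One entry: name "=" ( quoted-string | bare token ), then a comma or the end.
ENTRY_RE = re.compile(
    r'\s*([A-Za-z0-9-]+)=(?:"((?:[^"\\]|\\.)*)"|([^\s,"]+))\s*(?:,|$)'
)


def parse_backend_info(value):
    """Parse an X-Backend-Info header value into a dict (hypothetical helper)."""
    info = {}
    pos = 0
    while pos < len(value):
        m = ENTRY_RE.match(value, pos)
        if m is None:
            raise ValueError(f"malformed entry at offset {pos}")
        name = m.group(1).lower()
        if m.group(2) is not None:
            # Quoted value: undo backslash escapes.
            raw = re.sub(r"\\(.)", r"\1", m.group(2))
        else:
            raw = m.group(3)
        if name in NUMERIC_FIELDS:
            # Numbers may arrive bare or quoted; both end up here unquoted.
            if not FLOAT_RE.fullmatch(raw):
                raise ValueError(f"{name} is not a valid float: {raw!r}")
            info[name] = float(raw)
        else:
            # Assumption: unlisted fields are kept as plain strings.
            info[name] = raw
        pos = m.end()
    if "version" not in info:
        raise ValueError("missing required version field")
    return info
```

Calling parse_backend_info('version=1.0, provider="Backend X", workers-max=1000')
yields a dict with float values for version and workers-max and the string
"Backend X" for provider.
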
>
> The process-flow looks like this:
>
> 1. The frontend (periodically based on time or requests, or on demand) as
> part of either (1) a normal proxied request or (2) a dedicated health check
> adds an "X-Backend-Info" request header to a backend request, informing the
> backend that it wants node state information. I.e.:
>
> X-Backend-Info: version=1.0
>
> 2. The backend node receives a request with an "X-Backend-Info" header
> specifying a version it supports.
>
> 3. A supporting backend node SHOULD insert one or more "X-Backend-Info"
> response headers with any subset of the backend-info fields that it supports,
> including the required "version" field. The version of information provided
> MUST be less than or equal to the version requested. (The fields are
> standardized so that various frontends know what to expect, rather than each
> backend system/server/service/app creating its own fields/values.) E.g.:
>
> X-Backend-Info: version=1.0, provider="Backend X", workers-max=1000,
> workers-used=517, workers-free=483, uptime=19234,
> requests=85939
>
> 4. The backend MUST add the "X-Backend-Info" token to the "Connection"
> response header, making it a hop-by-hop field that is removed by the frontend
> from the downstream response (RFC2616 14.10 and RFC7230 6.1). [Note there
> appears to be an httpd bug here that I intend to submit and that needs to be
> addressed.]
>
> Connection: X-Backend-Info
>
> 5. The frontend parses the backend-info entries in the received
> "X-Backend-Info" response header. The values are then used as part of either
> an internal or an administrator-specified calculation to determine the load
> factor or weight of that node for subsequent requests.
>
> 6. The frontend MUST remove the "X-Backend-Info" hop-by-hop response header
> per RFCs.
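
The six steps above could be sketched roughly as follows in Python. This is
only an illustration under stated assumptions: headers are modelled as plain
dicts with exact-case keys, the function names are hypothetical, and the
backend is assumed to support exactly one version (1.0), so it stays silent
when a lower version is requested rather than violate the "less than or equal"
rule in step 3.

```python
import re

# Matches the leading version entry of a request header, e.g. "version=1.0".
VERSION_RE = re.compile(r'\s*version="?(\d+(?:\.\d+)?)"?')


def backend_response_headers(request_headers, state, supported_version=1.0):
    """Steps 2-4: build the extra response headers (hypothetical helper)."""
    requested = request_headers.get("X-Backend-Info")
    if requested is None:
        return {}  # frontend did not ask for node state
    m = VERSION_RE.match(requested)
    if m is None or float(m.group(1)) < supported_version:
        return {}  # cannot answer with a version <= the requested one
    entries = [f"version={supported_version}"]
    for name, value in state.items():
        if isinstance(value, str):
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            entries.append(f'{name}="{escaped}"')
        else:
            entries.append(f"{name}={value}")
    return {
        "X-Backend-Info": ", ".join(entries),
        # Step 4: mark the header hop-by-hop.
        "Connection": "X-Backend-Info",
    }


def strip_hop_by_hop(response_headers):
    """Step 6: split off Connection and every header it names.

    Returns (forwarded, removed); the frontend would parse the removed
    X-Backend-Info value (step 5) and send only `forwarded` downstream.
    """
    connection = response_headers.get("Connection", "")
    tokens = {t.strip().lower() for t in connection.split(",") if t.strip()}
    forwarded, removed = {}, {}
    for name, value in response_headers.items():
        if name.lower() == "connection" or name.lower() in tokens:
            removed[name] = value
        else:
            forwarded[name] = value
    return forwarded, removed
```

A real proxy would also have to cope with case-insensitive header names and
repeated headers, which this sketch ignores.
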
>
> --
>
> As for httpd implementation, this has two pieces. The first is when httpd is
> used as a backend node behind a load balancer and must provide X-Backend-Info
> response data. For this, I have created a module tentatively named
> mod_proxy_backend_info that does nothing except insert an output filter to
> populate the response header with version, provider, workers-*, requests,
> uptime, and load-* values when the request header is present. Here is an
> example request-response:
>
> % curl -v -H 'X-Backend-Info: version=1.0' http://localhost/
> * Trying 127.0.0.1...
> * Connected to localhost (127.0.0.1) port 80 (#0)
> > GET / HTTP/1.1
> > User-Agent: curl/7.41.0
> > Host: localhost
> > Accept: */*
> > X-Backend-Info: version=1.0
> >
> < HTTP/1.1 200 OK
> < Date: Thu, 30 Apr 2015 04:32:08 GMT
> < Server: Apache/2.4.9 (Unix) PHP/5.5.14
> < Last-Modified: Wed, 15 Apr 2015 14:04:54 GMT
> < ETag: "2d-513c3d4d78d80"
> < Accept-Ranges: bytes
> < Content-Length: 45
> < X-Backend-Info: version=1.0, provider="mod_proxy_backend_info [Apache/2.4.9
> (Unix) PHP/5.5.14]", workers-max=256, workers-busy=1, workers-ready=4,
> workers-free=255, uptime=1448, requests=3, load-current=1.737305,
> load-5=1.733887, load-15=1.668457
> < Connection: X-Backend-Info
> < Content-Type: text/html
> <
> <html><body><h1>It works!</h1></body></html>
>
>
> The second piece is when httpd is used as the load balancer. For this, I have
> created a module tentatively named mod_lbmethod_bybackendinfo that will:
>
> 1. Periodically (based on elapsed time, number of requests, or both since
> last update) insert the X-Backend-Info request header into a proxied request.
>
> 2. Parse and remove the X-Backend-Info response header.
>
> 3. Calculate the member's "informed" load factor based on a formula specified
> by the user/admin in the configuration. I hope to just use the existing
> lbfactor field to store this calculated value. Then we can use existing logic
> to balance based on lbset and lbfactor for subsequent requests.
>
> 4. Store the current time and request count in the member's data structure so
> the lbmethod knows when it needs to be updated again.
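
The bookkeeping in items 1 and 4 above might look something like this Python
sketch. MemberState, needs_update and record_update are hypothetical names,
not httpd structures, and the sketch assumes a refresh is due as soon as
either threshold (elapsed seconds or request count) is reached, which is only
one reading of "elapsed time, number of requests, or both".

```python
from dataclasses import dataclass


@dataclass
class MemberState:
    """Per-member data the lbmethod would keep (hypothetical layout)."""
    lbfactor: float = 1.0        # last calculated "informed" load factor
    last_update: float = 0.0     # timestamp of the last refresh, in seconds
    requests_at_update: int = 0  # request count at the last refresh
    requests: int = 0            # total requests sent to this member


def needs_update(member, now, update_seconds=30, update_requests=100):
    """True when the next proxied request should carry X-Backend-Info."""
    elapsed = now - member.last_update
    served = member.requests - member.requests_at_update
    return elapsed >= update_seconds or served >= update_requests


def record_update(member, now, new_factor):
    """Store the new factor plus the time and request count (item 4)."""
    member.lbfactor = new_factor
    member.last_update = now
    member.requests_at_update = member.requests
```

In practice `now` would come from a monotonic clock such as time.monotonic();
it is passed in explicitly here so the logic is easy to test.
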
>
>
> What I need from all of you:
>
> - Input/commentary on the proposed idea, approach, and implementation.
> Renaming things, additional fields that might be useful, etc., are all up for
> discussion.
>
> - Help with handling the configuration formula mentioned in #3 above. Can we
> just add some math operators to the expression parser to handle this? What
> all operations/functions might we need (+-*/? max? min? ternary if-then-else?
> ...)? A simple-ish example (something like this maybe?):
>
> <Proxy "balancer://...">
> BalancerMember ...
> ...
> ProxySet \
> lbmethod=bybackendinfo \
> backendupdateseconds=30 \
> backendupdaterequests=100 \
> backendformula="%{BACKEND:uptime} -lt 120 ? 1 : %{BACKEND:workers-free} / %{BACKEND:workers-max} * 100"
> </Proxy>
>
> - [Near-long-term] Help adding X-Backend-Info backend support and
> documentation to various projects. Tomcat, php-fpm, others(?) should be
> fairly easy to implement and submit patches. This work does us no good if
> none of our backends support it.
>
> - [Long-term] Help adding X-Backend-Info frontend support and documentation
> to various projects to help this become an "accepted ad-hoc standard"...or
> something like that. Nginx, haproxy, and many others would be targets.
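
Returning to the backendformula example in the ProxySet snippet above, here is
what that one formula would compute, written out as a small Python function.
This is not an expression parser, just the arithmetic, and it assumes the
formula is read as "if uptime < 120 then 1, else workers-free / workers-max *
100". The function name and the fallbacks for missing or zero values are
assumptions of this sketch; the proposal does not say what should happen in
those cases.

```python
def example_load_factor(info):
    """Evaluate the example formula on parsed backend info (hypothetical)."""
    # Treat a missing uptime as a freshly started backend (assumption).
    if info.get("uptime", 0.0) < 120:
        return 1.0
    workers_max = info.get("workers-max", 0.0)
    if workers_max <= 0:
        # Assumption: fall back to the minimum factor instead of dividing
        # by zero when workers-max is absent or zero.
        return 1.0
    return info.get("workers-free", 0.0) / workers_max * 100
```

With the values from the curl example (uptime=1448, workers-free=255,
workers-max=256) this gives 99.609375; a backend that has been up for less
than two minutes gets 1.
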
>
>
> Worn out from writing all of this and hopeful that someone other than me
> actually cares, I wish you all well today/tonight!
>
> - Jim
>