Re: Proposal/RFC: "informed" load balancing

Sudheer Vinukonda Wed, 06 May 2015 07:58:36 -0700

A few comments (mainly on the proposal to piggy-back the load info header on 
the responses):

*) The mechanism may not work in certain SLB setups (e.g. DSR).
*) For TLS, I presume this proposal assumes that the connections are 
   terminated at the SLB layer?
*) How does the proposal apply to newer protocols like SPDY and HTTP/2? There, 
   the requests/responses are streams within a connection, and AFAIK, headers 
   like Connection/Keep-Alive are no longer valid with SPDY (and perhaps, even 
   with HTTP/2).

Thanks,
Sudheer



     On Saturday, May 2, 2015 2:28 PM, Leif Hedstrom <zw...@apache.org> wrote:
   

 I’m Cc'ing this to d...@trafficserver.apache.org, since I think this is 
something some of our devs would be interested in. There are a few other 
replies to this thread already, which can be seen in the archives.

As has been mentioned in another reply, I think the header name ought to be 
something without the X- prefix (see RFC 6648). Backend-Info or 
Backend-Capacity-Info or some such?

Cheers,

— leif



> On Apr 29, 2015, at 10:54 PM, Jim Riggs <apache-li...@riggs.me> wrote:
> 
> [ Long message and proposal follows. Bear with me. There are a lot of words, 
> but that is because we need a lot of help/input! ;-) ]
> 
> So, this has come up in the past several times, and we discussed it again 
> this year at ApacheCon: How do we get the load balancer to make smarter, more 
> informed decisions about where to send traffic?
> 
> The different LB methods provide some different attempts at balancing 
> traffic, but ultimately none of them is "smart" about its decision. Other 
> than a member being in error state, the balancer makes its decision solely 
> based on configuration (LB set, factor, etc.) and its own knowledge of the 
> member (e.g. requests, bytes). What we have often discussed is a way to get 
> some type of health/load/capacity information from the backend to make 
> informed balancing decisions.
> 
> One method is to use health checks (a la haproxy, AWS ELBs, etc.) that 
> request one or more URLs and the response code/time indicates whether or not 
> the service is up and available, allowing more proactive decisions. While 
> this is better than our current state of reactively marking members in error 
> state based on failed requests, it still provides a limited view of the 
> health/state of the backend.
> 
> We have also discussed implementing a way for backends to communicate a 
> magical "load" number to the front end to take into account as it balances 
> traffic. This would give a much better view into the backend's state, but 
> requires some way to come up with this calculation that each backend 
> system/server/service/app must provide. This then has to be implemented in 
> all the various backends (e.g. httpd, tomcat, php-fpm, unicorn, mongrel, 
> etc., etc.), probably a hard sell to all of those projects. And, the front 
> end would have limited control over what that number means or how to use it.
> 
> During JimJag's balancer talk at ApacheCon this year, he brought up this 
> issue of "better, more informed" decision making again. I put some thought 
> into it that night and came up with some ideas. Jim, Covener, Trawick, 
> Ruggeri, and I then spent some time over the next couple of days talking it 
> through and fleshing out some of the details.
> 
> Based on all of that, below is what I am proposing. I have some initial code 
> that I am working on to implement the different pieces of this, and I will 
> put them up in bugz or somewhere when they're a little less rudimentary.
> 
> --
> 
> Our hope is to create a general standard that can be used by various 
> projects, products, proxies, servers, etc., to have a more consistent way for 
> a load balancer to request and receive useful internal state information from 
> its backend nodes. This information can then be used by the *frontend* 
> software/admin (this is the main change from what we have discussed before) 
> to calculate a load factor appropriate for each backend node.
> 
> This communication uses a new, standard HTTP header, "X-Backend-Info", that 
> takes this form in RFC2616 BNF:
> 
>    backend-info  = "version" "=" numeric-entry
>                    [
>                      *LWS "," *LWS
>                      #( numeric-entry | string-entry )
>                    ]
> 
>    numeric-entry = numeric-field "=" ( float | <"> float <"> )
>                    ; that is, numbers may optionally be enclosed in
>                    ; quotation marks
> 
>    float        = 1*DIGIT [ "." 1*DIGIT ]
> 
>    numeric-field = "workers-max"
>                    ; maximum number of workers the backend supports
>                  | "workers-used"
>                    ; current number of used/busy workers
>                  | "workers-allocated"
>                    ; current number of allocated/ready workers
>                  | "workers-free"
>                    ; current number of workers available for use
>                    ; (generally the difference between workers-max and
>                    ; workers-used, though some implementations may have
>                    ; a different notion)
>                  | "uptime"
>                    ; number of seconds the backend has been running
>                  | "requests"
>                    ; number of requests the backend has processed
>                  | "memory-max"
>                    ; total amount of memory available in bytes
>                  | "memory-used"
>                    ; amount of used memory in bytes
>                  | "memory-allocated"
>                    ; amount of allocated/committed memory in bytes
>                  | "memory-free"
>                    ; amount of memory available for use (generally
>                    ; the difference between memory-max and memory-used,
>                    ; though some implementations may have a different
>                    ; notion)
>                  | "load-current"
>                    ; the (subjective) current load for the backend
>                  | "load-5"
>                    ; the (subjective) 5-minute load for the backend
>                  | "load-15"
>                    ; the (subjective) 15-minute load for the backend
> 
>    string-entry  = string-field "=" ( token | quoted-string )
> 
>    string-field  = "provider"
>                    ; informational description of backend information
>                    ; provider (module, container, subsystem, app, etc.)
> 
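To make the grammar above concrete, here is a minimal parsing sketch in Python. It is not part of the proposal or of any existing module: the function name `parse_backend_info` and the choice to skip unknown fields are assumptions, and it only checks that "version" is present rather than that it comes first.

```python
import re

# Field names taken from the proposed grammar; "version" is numeric as well.
NUMERIC_FIELDS = {
    "version", "workers-max", "workers-used", "workers-allocated",
    "workers-free", "uptime", "requests", "memory-max", "memory-used",
    "memory-allocated", "memory-free", "load-current", "load-5", "load-15",
}
STRING_FIELDS = {"provider"}

# name "=" ( quoted-string | bare value ), followed by "," or end of input.
# Quoted strings may contain commas and backslash-escaped characters.
ENTRY_RE = re.compile(
    r'\s*([A-Za-z0-9-]+)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^,\s"]+))\s*(?:,|$)'
)
FLOAT_RE = re.compile(r"\d+(?:\.\d+)?")


def parse_backend_info(value):
    """Parse one X-Backend-Info header value into a dict.

    Numeric fields become floats, string fields become str.
    Unknown fields are skipped (an assumption, for forward compatibility).
    Malformed input raises ValueError.
    """
    value = value.strip()
    info = {}
    pos = 0
    while pos < len(value):
        m = ENTRY_RE.match(value, pos)
        if m is None:
            raise ValueError(f"malformed entry at offset {pos}")
        name = m.group(1).lower()
        raw = m.group(2) if m.group(2) is not None else m.group(3)
        if name in NUMERIC_FIELDS:
            # Numbers may optionally be enclosed in quotation marks.
            if not FLOAT_RE.fullmatch(raw):
                raise ValueError(f"{name}: not a number: {raw!r}")
            info[name] = float(raw)
        elif name in STRING_FIELDS:
            # Undo backslash escapes inside a quoted-string.
            info[name] = re.sub(r"\\(.)", r"\1", raw)
        pos = m.end()
    if "version" not in info:
        raise ValueError("missing required version field")
    return info
```

For example, `parse_backend_info('version=1.0, provider="Backend X", workers-max=1000')` yields a dict with a float `version`, the string `provider`, and a float `workers-max`.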
> 
> As used here, "worker" is an overloaded term whose precise meaning is 
> backend-dependent. It might refer to processes, threads, pipelines, or 
> whatever the backend system/server/service/app uses to measure or limit its 
> number of active, processing connections.
> 
> The process-flow looks like this:
> 
> 1. The frontend (periodically based on time or requests, or on demand) as 
> part of either (1) a normal proxied request or (2) a dedicated health check 
> adds an "X-Backend-Info" request header to a backend request, informing the 
> backend that it wants node state information. I.e.:
> 
>    X-Backend-Info: version=1.0
> 
> 2. The backend node receives a request with an "X-Backend-Info" header 
> specifying a version it supports.
> 
> 3. A supporting backend node SHOULD insert one or more "X-Backend-Info" 
> response headers with any subset of the backend-info fields that it supports, 
> including the required "version" field. The version of information provided 
> MUST be less than or equal to the version requested. (The fields are 
> standardized so that various frontends know what to expect, rather than each 
> backend system/server/service/app creating its own fields/values.) E.g.:
> 
>    X-Backend-Info: version=1.0, provider="Backend X", workers-max=1000,
>                    workers-used=517, workers-free=483, uptime=19234,
>                    requests=85939
> 
> 4. The backend MUST add the "X-Backend-Info" token to the "Connection" 
> response header, making it a hop-by-hop field that is removed by the frontend 
> from the downstream response (RFC2616 14.10 and RFC7230 6.1). [Note there 
> appears to be an httpd bug here that I intend to submit and that needs to be 
> addressed.]
> 
>    Connection: X-Backend-Info
> 
> 5. The frontend parses the backend-info entries in the received 
> "X-Backend-Info" response header. The values are then used as part of either 
> an internal or an administrator-specified calculation to determine the load 
> factor or weight of that node for subsequent requests.
> 
> 6. The frontend MUST remove the "X-Backend-Info" hop-by-hop response header 
> per the RFCs.
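Steps 4 and 6 rely on the generic hop-by-hop rule: any header named in "Connection" is dropped before the response is forwarded. A minimal Python sketch of that rule follows. The function name `split_hop_by_hop` and the list-of-pairs header representation are assumptions made for illustration, not how httpd or any other proxy implements it.

```python
def split_hop_by_hop(headers):
    """Split response headers into (forwardable, hop_by_hop) lists.

    `headers` is a list of (name, value) pairs as received from the backend.
    The Connection header itself, and every header it names, is hop-by-hop.
    """
    # Collect every token listed in any Connection header (case-insensitive).
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            listed.update(
                tok.strip().lower() for tok in value.split(",") if tok.strip()
            )

    forward, hop = [], []
    for name, value in headers:
        lname = name.lower()
        if lname == "connection" or lname in listed:
            hop.append((name, value))      # consumed by the frontend
        else:
            forward.append((name, value))  # passed on downstream
    return forward, hop
```

The frontend would read "X-Backend-Info" out of the hop-by-hop list and send only the forwardable list on to the client.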
> 
> --
> 
> As for httpd implementation, this has two pieces. The first is when httpd is 
> used as a backend node behind a load balancer and must provide X-Backend-Info 
> response data. For this, I have created a module tentatively named 
> mod_proxy_backend_info that does nothing except insert an output filter to 
> populate the response header with version, provider, workers-*, requests, 
> uptime, and load-* values when the request header is present. Here is an 
> example request-response:
> 
> % curl -v -H 'X-Backend-Info: version=1.0' http://localhost/
> *  Trying 127.0.0.1...
> * Connected to localhost (127.0.0.1) port 80 (#0)
> > GET / HTTP/1.1
> > User-Agent: curl/7.41.0
> > Host: localhost
> > Accept: */*
> > X-Backend-Info: version=1.0
> > 
> < HTTP/1.1 200 OK
> < Date: Thu, 30 Apr 2015 04:32:08 GMT
> < Server: Apache/2.4.9 (Unix) PHP/5.5.14
> < Last-Modified: Wed, 15 Apr 2015 14:04:54 GMT
> < ETag: "2d-513c3d4d78d80"
> < Accept-Ranges: bytes
> < Content-Length: 45
> < X-Backend-Info: version=1.0, provider="mod_proxy_backend_info [Apache/2.4.9 
> (Unix) PHP/5.5.14]", workers-max=256, workers-busy=1, workers-ready=4, 
> workers-free=255, uptime=1448, requests=3, load-current=1.737305, 
> load-5=1.733887, load-15=1.668457
> < Connection: X-Backend-Info
> < Content-Type: text/html
> <
> <html><body><h1>It works!</h1></body></html>
> 
> 
> The second piece is when httpd is used as the load balancer. For this, I have 
> created a module tentatively named mod_lbmethod_bybackendinfo that will:
> 
> 1. Periodically (based on elapsed time, number of requests, or both since 
> last update) insert the X-Backend-Info request header into a proxied request.
> 
> 2. Parse and remove the X-Backend-Info response header.
> 
> 3. Calculate the member's "informed" load factor based on a formula specified 
> by the user/admin in the configuration. I hope to just use the existing 
> lbfactor field to store this calculated value. Then we can use existing logic 
> to balance based on lbset and lbfactor for subsequent requests.
> 
> 4. Store the current time and request count in the member's data structure so 
> the lbmethod knows when it needs to be updated again.
> 
> 
> What I need from all of you:
> 
> - Input/commentary on the proposed idea, approach, and implementation. 
> Renaming things, additional fields that might be useful, etc., are all up for 
> discussion.
> 
> - Help with handling the configuration formula mentioned in #3 above. Can we 
> just add some math operators to the expression parser to handle this? What 
> all operations/functions might we need (+-*/? max? min? ternary if-then-else? 
> ...)? A simple-ish example (something like this maybe?):
> 
> <Proxy "balancer://...">
>  BalancerMember ...
>  ...
>  ProxySet \
>    lbmethod=bybackendinfo \
>    backendupdateseconds=30 \
>    backendupdaterequests=100 \
>    backendformula="%{BACKEND:uptime} -lt 120 ? 1 : %{BACKEND:workers-free} / 
>%{BACKEND:workers-max} * 100"
> </Proxy>
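To show what the example formula above computes, here is the same arithmetic as a small Python sketch. The function name `informed_lbfactor`, the fallback when fields are missing, and the rounding and clamping to 1..100 are all assumptions; the proposal leaves those details open.

```python
def informed_lbfactor(info, warmup_seconds=120, fallback=1):
    """Mirror the example backendformula:

        uptime < 120 ? 1 : workers-free / workers-max * 100

    `info` is a dict of parsed X-Backend-Info fields (numbers).
    """
    uptime = info.get("uptime")
    free = info.get("workers-free")
    maximum = info.get("workers-max")

    # A freshly started backend gets the minimum factor while it warms up.
    if uptime is not None and uptime < warmup_seconds:
        return 1
    # Assumption: without usable worker counts, fall back to a fixed factor.
    if free is None or not maximum:
        return fallback
    # Percentage of free workers, rounded and clamped to 1..100 (assumed range).
    return min(100, max(1, round(free / maximum * 100)))
```

With the values from the example response in step 3 (workers-free=483, workers-max=1000, uptime=19234), this gives a factor of 48.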
> 
> - [Near-long-term] Help adding X-Backend-Info backend support and 
> documentation to various projects. Tomcat, php-fpm, others(?) should be 
> fairly easy to implement and submit patches. This work does us no good if 
> none of our backends support it.
> 
> - [Long-term] Help adding X-Backend-Info frontend support and documentation 
> to various projects to help this become an "accepted ad-hoc standard"...or 
> something like that. Nginx, haproxy, and many others would be targets.
> 
> 
> Worn out from writing all of this and hopeful that someone other than me 
> actually cares, I wish you all well today/tonight!
> 
> - Jim
>
