On 09.08.2017 at 11:50, Martin Knoblauch wrote:
On Fri, Aug 4, 2017 at 11:47 PM, Rainer Jung <rainer.j...@kippdata.de>
wrote:

Hi Martin,

On 04.08.2017 at 10:53, Martin Knoblauch wrote:

Hi,

 just need some clarification on the mod_jk load balancing method "Next".
The documentation states:

"If method is set to N[ext] the balancer will again use the number of
sessions to find the best worker. All remarks concerning the Session
method
apply as well. The difference to the Session method is how the session
count is handled in the sliding time window. The Next method does not
divide by 2, instead it subtracts the current minimum number. This should
effectively result in a round-robin session balancing, thus the name Next.
Under high load, the two session balancing methods will result in a
similar
distribution, but Next will be better if you need to distribute small
numbers of sessions. "

 What exactly is the "current minimum number"? How is the minimum taken?
From all workers in the balancer set, or only the ACTive ones? I know, I
should look it up in the code :-)


I looked up the code I wrote 6 years ago.

First: when using the session based lb methods, mod_jk needs to estimate
session counts. No lb method of mod_jk contacts the backends to get real
data, instead mod_jk uses the request info it sees to estimate the backend
situation.

For session based methods, mod_jk counts requests that do not include a
session id, assuming that those are exactly the ones that create new
sessions. Of course:

a) a session id can be outdated, meaning mod_jk would not count the
request as session creating but in fact it would create a new one. One can
at least configure mod_jk to be aware of login pages which will always
create a new session (see
http://tomcat.apache.org/connectors-doc/reference/uriworkermap.html and
http://tomcat.apache.org/connectors-doc/reference/apache.html and there
look for "sticky_ignore").

b) a request without a session ID might not actually create a session,
depending on app details. There are additional config options to teach
mod_jk which URIs do not create sessions (see
http://tomcat.apache.org/connectors-doc/reference/uriworkermap.html and
http://tomcat.apache.org/connectors-doc/reference/apache.html and there
look for "stateless").

c) sessions time out in backends and users can log out. mod_jk does not
track that. One can remove the session cookie during the logout, so that
the "new" requests from that user will be counted by the mod_jk session
counter.

Because of these problems I typically recommend sticking to the default lb
method (request counting, not session counting). But sometimes apps have
resource usage dominated by sessions and then a "session" based lb method
can help, especially if you find a configuration which keeps the effect of
a)-c) above small.

Since all counting methods, not only session based ones, would count stuff
since the last restart of mod_jk, but the current backend load situation
depends much more on stuff that happened recently, we try to get rid of
past counts by reducing the counters regularly. By default this happens
once per minute and is done in a way that the counters are divided by 2
once per minute. That way old counter increases contribute less and less to
the current counter value. For the session based method this would mean we
assume half of the counted sessions die after one minute, 50% of the rest
during the next minute etc. Note that the counters are integers, so e.g. a
counter value of 1 will after division by 2 result in a new value 0. Most
often that is no problem, because on a loaded system numbers are big and
rounding down doesn't change a lot.
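
The halving step described above can be sketched in a few lines. This is only an illustration of the arithmetic, not mod_jk's actual C code; the function name and the sample counter values are made up for the example.

```python
def decay_by_halving(counters):
    """Halve every counter with integer division, once per maintenance interval."""
    return [value // 2 for value in counters]


# Hypothetical "session" counters of four workers.
counters = [9, 4, 1, 0]

after_one_minute = decay_by_halving(counters)           # [4, 2, 0, 0]
after_two_minutes = decay_by_halving(after_one_minute)  # [2, 1, 0, 0]

print(after_one_minute, after_two_minutes)
```

Note how the counter value 1 drops to 0 after a single interval because of integer rounding.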

The next request without session id will be sent to the worker with the
smallest such "session" counter.

The "Next" method varies that procedure by not dividing by 2 every
minute, but instead subtracting the minimum value of the backend counters.
Assume after the first minute, your 4 backends have "session" counters 2,
3, 3 and 2. Then the minimum is 2, so after the minute we correct the
values to 0, 1, 1 and 0. Then we add for the next minute new sessions to
that counter and again subtract the new minimum etc.
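
A minimal sketch of that subtraction, using the counters 2, 3, 3 and 2 from the example. The function name and the "new sessions" numbers for the second minute are assumptions made for illustration; this is not the mod_jk source.

```python
def decay_by_minimum(counters):
    """Subtract the current minimum from every counter."""
    minimum = min(counters)
    return [value - minimum for value in counters]


# Counters after the first minute, as in the example above.
counters = decay_by_minimum([2, 3, 3, 2])   # [0, 1, 1, 0]

# Hypothetical new sessions counted during the second minute.
new_sessions = [1, 1, 2, 1]
counters = [c + n for c, n in zip(counters, new_sessions)]  # [1, 2, 3, 1]
counters = decay_by_minimum(counters)                       # [0, 1, 2, 0]

print(counters)
```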

When would that be helpful? It was for an application with really huge
sessions but small session numbers. There was a risk that if for a minute
only 0 or 1 sessions were created per backend, after dividing by 2
all workers were again 0.
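
To make that risk concrete, the following sketch compares both decay variants on a minute in which each backend saw at most one new session. The counter values and function names are invented for the example.

```python
def decay_by_halving(counters):
    """Divide-by-2 decay with integer rounding."""
    return [value // 2 for value in counters]


def decay_by_minimum(counters):
    """'Next'-style decay: subtract the current minimum."""
    minimum = min(counters)
    return [value - minimum for value in counters]


# Hypothetical minute: two backends got one new session each, two got none.
counters = [1, 0, 1, 0]

halved = decay_by_halving(counters)    # [0, 0, 0, 0] -> the difference is lost
shifted = decay_by_minimum(counters)   # [1, 0, 1, 0] -> the difference survives

print(halved, shifted)
```

With halving, all four workers look identical again, so the balancer has no information to prefer the two backends that did not receive a session.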

You can actually track the counters via the status worker, where they are
exposed as column "V" (load balancer value).

Regards,

Rainer


Hi Rainer,

 thanks a lot for the comprehensive write-up. Very useful. Just it does not
answer my question on which workers are considered when determining the
"minimum number" :-) Will all workers be considered, or only those in ACT
state?

Ah, I didn't get that question, because you didn't mention the worker states.

The current minimum will be taken over all workers which are in activation state active and are also not in error.

The subtraction of the minimum from the lb value will be done for every worker. Workers who are not active or in error and whose value is already smaller than the minimum taken over the possibly smaller set of workers will have their value set to 0 instead of becoming negative.
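
As a rough sketch of the rule just described (the data structure, the field names and the handling of the "no eligible worker" case are assumptions for illustration, not the mod_jk implementation):

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    lb_value: int
    active: bool = True      # activation state is "active"
    in_error: bool = False   # worker is currently in error state


def subtract_minimum(workers):
    """Take the minimum over active, non-error workers and subtract it everywhere."""
    eligible = [w.lb_value for w in workers if w.active and not w.in_error]
    if not eligible:
        return  # assumption: leave values untouched if no worker qualifies
    minimum = min(eligible)
    for w in workers:
        # Workers outside the eligible set may be below the minimum: clamp to 0.
        w.lb_value = max(w.lb_value - minimum, 0)


workers = [
    Worker("a", 5),
    Worker("b", 3),
    Worker("c", 1, active=False),   # e.g. disabled, value below the minimum
    Worker("d", 7, in_error=True),
]
subtract_minimum(workers)
print([(w.name, w.lb_value) for w in workers])
# [('a', 2), ('b', 0), ('c', 0), ('d', 4)]
```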

 The reason why I am interested in the session based methods is exactly
that the application has a relatively small number of "sessions", which
tend to be heavy weight (memory, I/O and CPU). The request methods tend to
not lead to a good distribution of load here.

Understood.

 What I really would be interested in is a balancer method that actually
looks at the worker backends themselves to determine the load and state
they are in. And I did not find a lot (any) pointers. I imagine that this
is a difficult issue that may lead to its own problems (bad latency, ...).

There would be two approaches for this:

- polling the load situation in intervals. For this one would need to define (configure) the poll URL and the format of the expected response, the format could be fixed, like an integer number. One would also have to think about how to extrapolate the numbers between the poll intervals, and code concurrent HTTP(S) requests with timeouts etc.

- piggy-back the info via responses, e.g. in a custom HTTP response header (configurable header name), that the module would strip, with a fixed format, e.g. an integer value. This approach would mean generating the load value must be cheap in the back end. It would be much simpler to implement, but whenever a node does not get requests for some time, we again have no idea about the current load situation, so some extrapolation is still needed.

Note that "extrapolation" would also be needed because if we have a farm of reverse proxies in front of our backends, it would be nice to get a consolidated view of load.

There is currently no code to support either of the two approaches. A naive implementation of the second approach wouldn't be hard, but the problem of extrapolating would not be solved in the naive impl. With such an impl you would then add a custom response header, e.g. X-SESSION-COUNT, and mod_jk would update its lb value whenever (and only if) it sees such a response header. Would that actually help?
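
To show what such a naive implementation would roughly do, here is a sketch. No such feature exists in mod_jk; the function, the dictionary-based header handling and the treatment of malformed values are all hypothetical. Only the header name X-SESSION-COUNT comes from the text above.

```python
LOAD_HEADER = "X-SESSION-COUNT"  # example header name from the mail


def apply_load_header(lb_values, worker, response_headers):
    """Update the worker's lb value from the header, if present, and strip it.

    Returns the headers that would be forwarded to the client.
    """
    forwarded = {}
    for name, value in response_headers.items():
        if name.lower() != LOAD_HEADER.lower():
            forwarded[name] = value
            continue
        try:
            reported = int(value.strip())
        except ValueError:
            continue  # assumption: drop a malformed header, keep the old value
        if reported >= 0:
            lb_values[worker] = reported
    return forwarded


lb_values = {"node1": 5, "node2": 7}
headers = {"Content-Type": "text/html", "x-session-count": "12"}
to_client = apply_load_header(lb_values, "node1", headers)

print(lb_values)   # {'node1': 12, 'node2': 7}
print(to_client)   # {'Content-Type': 'text/html'}
```

A worker that receives no requests never reports a new value here, which is exactly the extrapolation gap mentioned above.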

Regards,

Rainer


