Rainer Jung wrote:
On 06.05.2009 14:35, jean-frederic clere wrote:
Jess Holle wrote:
Rainer Jung wrote:
Yes, I think the counter/aging discussion is for the baseline, i.e. when
we do not have any information channel to or from the backend nodes.
As soon as mod_cluster comes into play, we can use more up-to-date real
data and only need to decide how to interpret it and how to interpolate
during the update interval.
Should general support for a query URL be provided in
mod_proxy_balancer? Or should this be left to mod_cluster?
Can you explain more? I don't get the question.
Does mod_cluster provide yet another approach top to bottom (separate
from mod_jk and mod_proxy/mod_proxy_ajp)?
Mod_cluster is just a balancer for mod_proxy but due to the dynamic
creation of balancers and workers it can't get in the httpd-trunk code
right now.
It would seem nice to me if mod_jk and/or mod_proxy_balancer could do
health checks, but you have to draw the line somewhere on growing any
given module, and if mod_jk and mod_proxy_balancer are not going in
that direction, at some point mod_cluster may be in my future.
Cool :-)
There are several different subsystems, and as I understand
mod_cluster, it already carefully separates them:
1) Dynamic topology detection (optional)
What are our backend nodes? If you do not want to statically configure
them, you need some mechanism based on either
- registration: backend nodes register at one or multiple topology
management nodes; the addresses of those are either configured, or they
announce themselves on the network via broadcast or multicast.
- detection: the topology manager receives broadcast or multicast packets
from the backend nodes. The nodes do not need to know the topology
manager, only the multicast address.
A more advanced variant would also learn the forwarding rules (e.g. URLs
to map) from the backend nodes.
In the simpler case, the topology would be configured statically.
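To make the two options in 1) concrete, here is a minimal sketch of a topology manager that accepts both explicit registration and received advertisements. The class name, the "host:port" payload format and the 30 second expiry are assumptions for illustration only; this is not how mod_cluster itself stores or exchanges this data.

```python
import time
from dataclasses import dataclass, field


@dataclass
class TopologyManager:
    """Hypothetical registry of backend nodes (illustrative only)."""
    ttl: float = 30.0                           # assumed: seconds before a silent node is dropped
    nodes: dict = field(default_factory=dict)   # "host:port" -> time the node was last seen

    def register(self, host, port, now=None):
        """Registration: a backend node announces itself to the manager."""
        self.nodes[f"{host}:{port}"] = time.time() if now is None else now

    def on_multicast(self, payload, now=None):
        """Detection: parse an advertisement the manager received, e.g. b'10.0.0.5:8009'."""
        host, port = payload.decode("ascii").rsplit(":", 1)
        self.register(host, int(port), now)

    def active(self, now=None):
        """Return the nodes seen within the last ttl seconds."""
        now = time.time() if now is None else now
        return sorted(n for n, seen in self.nodes.items() if now - seen <= self.ttl)
```

Both paths end in the same table, so the rest of the system does not need to care how a node became known. A statically configured topology would simply fill the table once at startup.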
2) Dynamic state detection
a) Liveness
b) Load numbers
Both could be either polled by (possible scalability issues) or pushed to
a state manager. Push could be done over TCP (the state manager's address
could be sent to the backend once it was detected in step 1, or defined
statically). Maybe one would use both ways, e.g. push for active state
changes, like when an admin stops a node, and poll for things driven by
the state manager. Not sure.
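A rough sketch of a state manager that supports both ways of collecting liveness and load. The class, the `probe` callable and the state layout are hypothetical names chosen for illustration; a real probe would be something like an HTTP or AJP ping against the backend.

```python
class StateManager:
    """Hypothetical store for liveness and load per backend node."""

    def __init__(self, probe):
        self.probe = probe   # assumed: callable(node) -> (alive, load), supplied by the caller
        self.state = {}      # node -> {"alive": bool, "load": float}

    def push(self, node, alive, load=0.0):
        """Push path: a backend reports a state change itself, e.g. an admin stop."""
        self.state[node] = {"alive": alive, "load": load}

    def poll(self, nodes):
        """Poll path: the manager asks each node; the cost grows with the node count."""
        for node in nodes:
            try:
                alive, load = self.probe(node)
            except OSError:
                alive, load = False, 0.0   # an unreachable node counts as not alive
            self.state[node] = {"alive": alive, "load": load}
```

Whichever path wrote last wins here, which is the simplest possible rule; whether a push should take precedence over a later poll is exactly the kind of question that is still open.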
3) Balancing
Would be done based on the data collected by the state manager.
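As a minimal illustration of balancing on collected data, assume the state is a plain dict mapping each node to its liveness and load (a hypothetical layout, not an existing httpd structure). Picking the live node with the lowest load is only one possible policy, shown here because it is the shortest to write down.

```python
def pick_backend(state):
    """Return the live node with the lowest reported load, or None if none is live."""
    live = {node: s["load"] for node, s in state.items() if s["alive"]}
    if not live:
        return None
    return min(live, key=live.get)


state = {
    "a": {"alive": True, "load": 0.7},
    "b": {"alive": True, "load": 0.2},
    "c": {"alive": False, "load": 0.0},   # not alive, so never chosen despite load 0.0
}
chosen = pick_backend(state)
```

The balancer only reads the dict, so it stays independent of how topology and state were gathered.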
It's not clear at all whether those three should be glued together
tightly or kept as separate pieces. I had the impression the general
direction is more about separating them and allowing multiple
experiments, like mod_cluster and mod_heartbeat.
The interaction would be done via some common data container, e.g.
slotmem or, in a distributed (multiple Apaches) situation, memcache or
similar.
Does this make sense?
Yes.
I've been working around #1 by using pre-designated port ranges for
backends, e.g. configuring for balancing over a port range of 10 and
only having a couple of servers running in this range most of the
time. That's fine as long as one quiets Apache's error logging so that
it only complains about backends that are *newly* unreachable rather
than complaining each time a backend is retried. I supplied a patch for
this some time back.
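The logging idea can be sketched as follows. This is not the patch itself, just an illustration of the principle in Python with made-up names: remember which backends are already known to be down and log only on a change.

```python
import logging

log = logging.getLogger("balancer")


class ReachabilityLog:
    """Logs a backend only when it changes between reachable and unreachable."""

    def __init__(self):
        self.down = set()   # backends currently known to be unreachable

    def report(self, backend, reachable):
        """Record a retry result; return True if a message was logged."""
        if not reachable and backend not in self.down:
            self.down.add(backend)
            log.error("backend %s is unreachable", backend)
            return True
        if reachable and backend in self.down:
            self.down.discard(backend)
            log.info("backend %s is reachable again", backend)
            return True
        return False   # no state change, stay quiet
```

With a port range of 10 and only two servers running, this yields eight messages once instead of eight messages on every retry cycle.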
#2 and #3 are huge, however, and it would be good to see something firm
rather than experimental in these areas sooner rather than later.
--
Jess Holle