Re: load balancer cluster set

2006-08-07 Thread Klaus Wagner
 Load balancing really belongs at the network layer.
depends on your needs
 
 IBM released free load-balancing software for linux and windows about
 1997.  My former employer's integration group (about 3 people) got a
 fully redundant implementation running (on 4 pcs) in about 4 months.
ack
 
 The company abandoned the free s/w version for hardware implementations
 on Cisco gear (and others) within about 6 months.  I'm sure the price
 for proprietary hardware has dropped substantially since then.

I disagree: the price is still at the same level (only the release
numbers went up).

I also disagree with the idea that Cisco has the perfect LB
solution. In fact they do not. The problem is not the distribution
itself, even though it cannot spread EQUAL load across the servers,
because

a) most applications need application stickiness
b) neither round robin nor the other implementations manage to keep equal
load on the servers, because they can't measure the SERVERS' load
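
Point b) can be illustrated with a small sketch. The per-request costs below are made-up numbers (think CPU milliseconds), and the function name is only illustrative; the point is just that a balancer which counts requests, rather than measuring server load, can hand out equal request counts and still produce very unequal load.

```python
from itertools import cycle

def round_robin_load(request_costs, servers):
    """Give each request to the next server in turn and total the cost per server."""
    load = {name: 0 for name in servers}
    for cost, name in zip(request_costs, cycle(servers)):
        load[name] += cost
    return load

# Hypothetical costs: every third request happens to be an expensive one.
costs = [1, 1, 50, 1, 1, 50, 1, 1, 50]
load = round_robin_load(costs, ["a", "b", "c"])
print(load)  # each server got three requests, but "c" got all the expensive ones
```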

But again, the problems are elsewhere. There are quite a few methods
to provide stickiness, and all of them fail in some kinds of use. The
appliances can't watch cookies 100% correctly, and if they are told to
introduce their own, they mess up the protocol. IP stickiness has issues
when upstream proxies are used in a balanced way.

Finally, SSL stickiness does not work correctly either (and it is
generally a bad idea anyway).

But there is a solution: give up handling SSL yourself and outsource
that to the Cisco appliances completely.

Finally, to clear up a myth: the load balancing in Cisco's LBs is NOT
done in hardware. In fact they use general-purpose CPUs to make LB
decisions. Only plain routing happens in hardware.

On the other hand ... to be nice to Cisco ... they have done a great job
of keeping this thing stable. Servers crash far more often than these
appliances. And they have done a great job on failover from one
LB to another, because they use routing protocols to announce the active
LB, which is way faster than taking over an IP address on a server and
starting a software LB.

So ... there are reasons to use appliances and there are reasons not to
use them (flexible balancing, application-layer stickiness, and so on).

regards klaus



Re: load balancer cluster set

2006-08-07 Thread Guy Hulbert
[ I had given up this thread but since I started it ... ]

Apart from minor details I agree with this comment anyway.

On Mon, 2006-07-08 at 13:48 +0200, Klaus Wagner wrote:
<snip>
  on Cisco gear (and others) within about 6 months.  I'm sure the price
  for proprietary hardware has dropped substantially since then.
 
 disagree the price is still at the same level (just the releases went 
 up)

What do you mean disagree (this is a case of fact rather than opinion
--- i might be wrong but it's pointless to disagree).  

The price in 1997 for a Local Director or PiX (h/w was same at that
time) was about $40K.  Either it's still at that level or not.  IIRC,
the Pix, at least, has come down quite a bit ... but there has been more
competition in firewalls than in load balancers so you may be right.

<snip>
 Finally to clean up with myths: The loadbalancing in ciscos LBs is NOT
 done in hardware. In fact they use multi purpose CPUs to do LB
<snip>

What myths?  Whether you call it h/w or s/w is semantics.

The LD and PiX used OTS h/w (Ppro, PC mobo) in 1997 (i opened one).  The
interesting piece was a proprietary daughter board but AFAIK, that ran
Cisco's own O/S (which, i understand, is derived from some version of
BSD).  I think they've merged all the functionality in s/w and any use
of non-OTS h/w is more likely for energy efficiency and cost than
performance.

I understood (5+ years ago) that Cisco was moving in the direction of
providing Pix functionality on their routers ... but speculation is
pointless ... I'm sure all the info is on their site.

--gh




load balancer cluster set

2006-07-31 Thread Jim Jagielski

I'm trying to figure out which impl of the
LB cluster set makes the most sense and would appreciate
the feedback.

Basically, I see 2 different methods:

   1. Members in all cluster sets which have the same or
  lower set numbers are checked

   2. Only members in a specific set number are checked. If
  none are usable, skip to the next cluster set.

In other words, let's assume members a, b and c are in
set 0 and d, e and f are in set 1 and g, h and i are in
set 2. We check a, b and c and they are not usable, so
we now start checking set 1. Should we re-check the
members in set 0 (maybe they are usable now) or
just check members of set 1 (logically, the question
is whether we are doing a <= set# or == set#)? I have
both methods coded and am flip-flopping on which
makes the most sense. I'm leaning towards #1 (<=set#).
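
The difference between the two methods can be sketched as below. This is only an illustration, not the httpd code: the `Member` class, its field names and the two helper functions are hypothetical. It shows the moment the balancer has moved on to set 1 and member a of set 0 has become usable again in the meantime.

```python
from dataclasses import dataclass

@dataclass
class Member:
    name: str
    lbset: int     # cluster set number; lower numbers are tried first
    usable: bool

def candidates_le(members, current_set):
    """Method 1: usable members in every set numbered <= current_set."""
    return [m.name for m in members if m.lbset <= current_set and m.usable]

def candidates_eq(members, current_set):
    """Method 2: usable members in exactly current_set."""
    return [m.name for m in members if m.lbset == current_set and m.usable]

# Set 0 was found unusable, so we are now looking at set 1.
# In the meantime member "a" has recovered; "b" and "c" are still down.
members = [
    Member("a", 0, True), Member("b", 0, False), Member("c", 0, False),
    Member("d", 1, True), Member("e", 1, True), Member("f", 1, True),
]
with_recheck = candidates_le(members, 1)     # set 0 is looked at again
without_recheck = candidates_eq(members, 1)  # set 0 is ignored
print(with_recheck, without_recheck)
```

With method 1 the recovered member a is a candidate again straight away; with method 2 it is only seen the next time the search starts over from set 0.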

Comments?


Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 10:08 -0400, Jim Jagielski wrote:
 I'm trying to figure out which impl of the
 LB cluster set makes the most sense and would appreciate
 the feedback.
 
<snip>
 Comments?

Are you implementing load balancing/clustering in Apache HTTP Server ?

Why ?

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Plüm, Rüdiger, VF EITO


 -Original Message-
 From: Jim Jagielski 


 In other words, let's assume members a, b and c are in
 set 0 and d, e and f are in set 1 and g, h and i are in
 set 2. We check a, b and c and they are not usable, so
 we now start checking set 1. Should we re-check the
 members in set 0 (maybe they are usable now) or
 just check members of set 1 (logically, the question
 is whether we are doing a <= set# or == set#)? I have
 both methods coded and am flip-flopping on which
 makes the most sense. I'm leaning towards #1 (<=set#).

I would also lean towards #1, as this means that once cluster set 0
has failed and is back again we use it again, which seems
natural to me. OTOH I guess we need to consider session stickiness
in this case. So sessions that have been migrated to set 1 should
stay there until they vanish or someone knocks them out by disabling
this cluster set (BTW:
<feature-creep>
 will it be possible to disable complete cluster sets via the manager?
</feature-creep>
), thus forcing them back to cluster set 0.
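
A rough sketch of that behaviour, under the assumption that each session remembers the member it was last routed to. The `pick_member` function and the dict layout are hypothetical and only meant to show the rule: an existing session keeps its member while that member is usable, and new or orphaned sessions go to the lowest-numbered usable set.

```python
def pick_member(session_route, members):
    """members maps name -> (lbset, usable). Returns a member name or None."""
    # A session whose current member is still usable stays where it is,
    # even if a lower-numbered set has come back in the meantime.
    if session_route in members and members[session_route][1]:
        return session_route
    # Otherwise fall back to the usable member in the lowest set.
    usable = [(lbset, name) for name, (lbset, ok) in members.items() if ok]
    return min(usable)[1] if usable else None

# Set 0 (member "a") is back, but a session was migrated to "d" in set 1.
members = {"a": (0, True), "d": (1, True)}
stays = pick_member("d", members)        # the session stays on set 1
fresh = pick_member(None, members)       # a new session goes to set 0

# Disabling "d" (standing in for disabling its set) forces the session back.
members["d"] = (1, False)
forced_back = pick_member("d", members)
print(stays, fresh, forced_back)
```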

Regards

Rüdiger



Re: load balancer cluster set

2006-07-31 Thread Graham Leggett
On Mon, July 31, 2006 4:29 pm, Guy Hulbert wrote:

 Are you implementing load balancing/clustering in Apache HTTP Server ?

It was implemented quite a while ago.

 Why ?

Because it's useful?

Regards,
Graham
--




Re: load balancer cluster set

2006-07-31 Thread Jim Jagielski


On Jul 31, 2006, at 10:51 AM, Plüm, Rüdiger, VF EITO wrote:





-Original Message-
From: Jim Jagielski




In other words, let's assume members a, b and c are in
set 0 and d, e and f are in set 1 and g, h and i are in
set 2. We check a, b and c and they are not usable, so
we now start checking set 1. Should we re-check the
members in set 0 (maybe they are usable now) or
just check members of set 1 (logically, the question
is whether we are doing a <= set# or == set#)? I have
both methods coded and am flip-flopping on which
makes the most sense. I'm leaning towards #1 (<=set#).


(BTW:
<feature-creep>
 will it be possible to disable complete cluster sets via the manager?
</feature-creep>
) and thus forcing them back to cluster set 0.



At present, we are more member-centric than set-centric.
You can disable individual members of a set, but not a whole set.
If useful, this could be added at some point...

Re: load balancer cluster set

2006-07-31 Thread Jim Jagielski


On Jul 31, 2006, at 10:29 AM, Guy Hulbert wrote:


On Mon, 2006-31-07 at 10:08 -0400, Jim Jagielski wrote:

I'm trying to figure out which impl of the
LB cluster set makes the most sense and would appreciate
the feedback.


<snip>

Comments?


Are you implementing load balancing/clustering in Apache HTTP  
Server ?




This is part of the Apache 2.2.x release


Why ?


People want it.



Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 11:18 -0400, Jim Jagielski wrote:
  Why ?
 
 People want it.

Thought so :-(

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Jim Jagielski
Guy Hulbert wrote:
 
 On Mon, 2006-31-07 at 11:18 -0400, Jim Jagielski wrote:
   Why ?
  
  People want it.
 
 Thought so :-(
 

Why :-(   ??

-- 
===
   Jim Jagielski   [|]   [EMAIL PROTECTED]   [|]   http://www.jaguNET.com/
If you can dodge a wrench, you can dodge a ball.


Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 16:54 +0200, Graham Leggett wrote:
  Why ?
 
 Because it's useful?

Nope.  

Load balancing really belongs at the network layer.

IBM released free load-balancing software for linux and windows about
1997.  My former employer's integration group (about 3 people) got a
fully redundant implementation running (on 4 pcs) in about 4 months.

The company abandoned the free s/w version for hardware implementations
on Cisco gear (and others) within about 6 months.  I'm sure the price
for proprietary hardware has dropped substantially since then.

But, I suppose, if people want it ...

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Graham Leggett
On Mon, July 31, 2006 5:32 pm, Guy Hulbert wrote:

 People want it.

 Thought so :-(

Why the :-(...?

httpd tries to deliver what people will find useful, and load balancing is
a very useful part of a multi tier webserver architecture.

Regards,
Graham
--




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 17:42 +0200, Graham Leggett wrote:
 On Mon, July 31, 2006 5:32 pm, Guy Hulbert wrote:
 
  People want it.
 
  Thought so :-(
 
 Why the :-(...?
 
 httpd tries to deliver what people will find useful, and load balancing is
 a very useful part of a multi tier webserver architecture.

Despite the technical criticism I already posted, I suppose I might find
it useful to have a cheap load-balancing solution at some time in the
future.

However, I see the 'perchild' mpm as a much more pressing need.  I
looked into WebDav about 12 months ago and several people were looking
for this functionality.  I have looked at the alternatives and none of
them are really attractive.

I would also like to see some low-level technical documentation but I am
afraid that I will find the code is that.  I will write some before I
do any work on 'perchild' ... assuming I actually try to do this ... it
won't be a quick project for me.

 
 Regards,
 Graham
 --
 
 

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Graham Leggett
On Mon, July 31, 2006 5:42 pm, Guy Hulbert wrote:

 Nope.

 Load balancing really belongs at the network layer.

I disagree. Load balancing should happen at the layer most capable of
making the most effective balancing decisions.

At the network layer, your metrics are pretty much volume of data or
response time of TCP transaction, and for many purposes these metrics
are fine. For many other purposes, lots of data or a long time does
not mean a loaded server, and you need a better tuned metric that more
accurately represents your real load.
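
As a sketch of that point, with invented numbers and hypothetical names: a backend that serves a few slow reports can look "loaded" by response time while being mostly idle, and one that serves fast pages can look fine while nearly out of workers. Which backend gets picked depends entirely on the metric.

```python
def least_loaded(backends, metric):
    """Return the name of the backend with the smallest value of the given metric."""
    return min(backends, key=lambda name: backends[name][metric])

# Made-up figures for illustration only.
backends = {
    "app1": {"response_ms": 900, "busy_workers": 2},   # slow reports, mostly idle
    "app2": {"response_ms": 40,  "busy_workers": 48},  # fast pages, nearly saturated
}
by_network = least_loaded(backends, "response_ms")   # what a response-time metric picks
by_app = least_loaded(backends, "busy_workers")      # what an application metric picks
print(by_network, by_app)
```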

 But, I suppose, if people want it ...

One size doesn't fit all.

Regards,
Graham
--




Re: load balancer cluster set

2006-07-31 Thread Jim Jagielski
Graham Leggett wrote:
 
 On Mon, July 31, 2006 5:32 pm, Guy Hulbert wrote:
 
  People want it.
 
  Thought so :-(
 
 Why the :-(...?
 
 httpd tries to deliver what people will find useful, and load balancing is
 a very useful part of a multi tier webserver architecture.
 

Still not sure why that's a bad thing

-- 
===
   Jim Jagielski   [|]   [EMAIL PROTECTED]   [|]   http://www.jaguNET.com/
If you can dodge a wrench, you can dodge a ball.


Re: load balancer cluster set

2006-07-31 Thread Jim Jagielski
Guy Hulbert wrote:
 
 On Mon, 2006-31-07 at 16:54 +0200, Graham Leggett wrote:
   Why ?
  
  Because it's useful?
 
 Nope.  
 
 Load balancing really belongs at the network layer.
 
 IBM released free load-balancing software for linux and windows about
 1997.  My former employer's integration group (about 3 people) got a
 fully redundant implementation running (on 4 pcs) in about 4 months.
 
 The company abandoned the free s/w version for hardware implementations
 on Cisco gear (and others) within about 6 months.  I'm sure the price
 for proprietary hardware has dropped substantially since then.
 
 But, I suppose, if people want it ...
 

People want to simplify things.

-- 
===
   Jim Jagielski   [|]   [EMAIL PROTECTED]   [|]   http://www.jaguNET.com/
If you can dodge a wrench, you can dodge a ball.


Re: load balancer cluster set

2006-07-31 Thread Mladen Turk

Jim Jagielski wrote:

I'm trying to figure out which impl of the
LB cluster set makes the most sense and would appreciate
the feedback.

Basically, I see 2 different methods:

   1. Members in all cluster sets which have the same or
  lower set numbers are checked

   2. Only members in a specific set number are checked. If
  none are usable, skip to the next cluster set.

In other words, let's assume members a, b and c are in
set 0 and d, e and f are in set 1 and g, h and i are in
set 2. We check a, b and c and they are not usable, so
we now start checking set 1. Should we re-check the
members in set 0 (maybe they are usable now) or
just check members of set 1 (logically, the question
is whether we are doing a <= set# or == set#)? I have
both methods coded and am flip-flopping on which
makes the most sense. I'm leaning towards #1 (<=set#).

Comments?



Something I planned to implement:

<Proxy balancer://clusterName#groupRoute1>
   BalancerMember .. 1.1
   BalancerMember .. 1.2
</Proxy>
<Proxy balancer://clusterName#groupRoute2>
   BalancerMember .. 2.1
   BalancerMember .. 2.2
</Proxy>
<Proxy balancer://clusterName#groupRoute3>
   BalancerMember .. 3.1
   BalancerMember .. 3.2
</Proxy>

In case you have session stickiness, where
jvmRoute is equal for all group members, and
all members of groupRoute1 fail,
groupRoute1 will still be favored, depending
on the retry timeout. Now if all members of
groupRoute2 fail, the next election will still
first try to check the corresponding sticky
route members (if they are ready for retry).

So, if you always first try the corresponding
members of the session route balancer, you will
always favor them over the others.
I think this is close to your #1.
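
A minimal sketch of such an election, assuming each member records whether it is usable and when it last failed. The `elect` function, the dict keys and the timestamps are hypothetical; this is not the planned implementation, just the rule described above: try the session's own route group first, counting failed members as candidates again once their retry timeout has passed.

```python
def elect(session_group, members, now, retry_timeout):
    """members: list of dicts with 'name', 'group', 'ok' and 'failed_at' keys."""
    def ready(m):
        # Usable, or in error but past its retry timeout.
        return m["ok"] or now - m["failed_at"] >= retry_timeout

    # Members of the session's own route group are always tried first.
    preferred = [m for m in members if m["group"] == session_group and ready(m)]
    others = [m for m in members if m["group"] != session_group and ready(m)]
    chosen = preferred or others
    return chosen[0]["name"] if chosen else None

members = [
    {"name": "1.1", "group": "groupRoute1", "ok": False, "failed_at": 100},
    {"name": "2.1", "group": "groupRoute2", "ok": True, "failed_at": 0},
]
before_retry = elect("groupRoute1", members, now=110, retry_timeout=60)
after_retry = elect("groupRoute1", members, now=160, retry_timeout=60)
print(before_retry, after_retry)
```

Before the retry timeout the session is served from groupRoute2; once the timeout has passed, the sticky group is tried again first.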

Next step is to add the shared memory slot for
balancer, so it can be dynamically maintained.

--
Mladen.





Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 12:04 -0400, Jim Jagielski wrote:
  Nope.  
  
  Load balancing really belongs at the network layer.
<snip>
  But, I suppose, if people want it ...
  
 
 People want to simplify things.

The simple solution is to buy a bigger piece of hardware or outsource
the problem to the relevant experts.

Trying to do meaningful load-balancing within an application will not be
simple.  At the router it is simple.  All the required data is present
in one spot.

Look.  I really don't want to discourage you.  Especially, since it has
been claimed that the work has already been done.

The real danger, I see, is that you try to become all things to all
people when there do not seem to be the resources to solve problems which
are very specific to the core application.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Graham Leggett
On Mon, July 31, 2006 6:16 pm, Guy Hulbert wrote:

 At the network layer, your metrics are pretty much volume of data or

 Nope.

 Routers can look at anything in the packets which is not encrypted.
 They can also measure server response (by packet stats) directly or via
 SNMP.  There are all sorts of things that *cannot* be done on the server
 without introducing all sorts of p2p communications requirements.

I'm sure they can. This doesn't make them the right solution for all cases.

In a multi tier architecture, you already have front end servers
implementing URL strategies, common logging, all sorts of other things.

Adding an extra router layer to handle load balancing, when your already
existing frontend can do this job is not only extra cost, but extra
complexity and an additional point of failure.

Regards,
Graham
--




Re: load balancer cluster set

2006-07-31 Thread Colm MacCarthaigh
On Mon, Jul 31, 2006 at 12:22:03PM -0400, Guy Hulbert wrote:
 The simple solution is to buy a bigger piece of hardware or outsource
 the problem to the relevant experts.
 
 Trying to do meaningful load-balancing within an application will not be
 simple.  At the router it is simple.  All the required data is present
 in one spot.

Load-balancing can be implemented at any arbitrary point in the stack
(Ethernet/IP/DNS/TCP/HTTP/Application) and each has its own problems and
features. There is nothing particularly appealing about doing it at the
routing layer (though it does present a few novel options like using
anycast or a TCP redirect), and doing it there has a few problems of its
own.

Either way, the more options and the more flexibility, the better. In
the real world, you may find that many operations use multiple 
load-balancing techniques together (e.g. Google uses DNS, L2, L3 and 
L4 load-balancing).

-- 
Colm MacCárthaigh    Public Key: [EMAIL PROTECTED]


Re: load balancer cluster set

2006-07-31 Thread Graham Leggett
On Mon, July 31, 2006 6:22 pm, Guy Hulbert wrote:

 The real danger, I see, is that you try to become all things to all
 people when there does not seem to be resources to solve problems which
 are very specific to the core application.

Apache httpd is capable not only of switching things off, but removing
unnecessary features entirely as the admin sees fit, so this is a non
problem.

I get the sense that you would rather the developers scratch your itch (in
the form of perchild), rather than theirs (in the form of lb). Getting
perchild going would be great, but I don't see it as any more or less
important than lb.

Regards,
Graham
--




Re: load balancer cluster set

2006-07-31 Thread Brian Akins

Guy Hulbert wrote:


However, you may not be able to wait until the linux router project
picks this up  (but it might be worth looking to see what is
available).


Most of the load-balancing we are discussing on this list is not for 
directly customer facing applications.  These are proxies for 
application servers generally, but they need to be highly available.  We 
are not trying to replace Cisco CSMs.  But a hardware HTTP-only aware 
$20k device is not needed when I just need to load balance an app across 
4 tomcat instances, for example.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
Graham.

I already accept that this seems to be a fait accompli.  So I am just
arguing for entertainment purposes.

If the solution is a p2p one then it might be somewhat interesting.
Otherwise, it just seems (to me) to be re-inventing the wheel ...
potentially very badly.

Adding load-balancing/clustering to software projects seems to be
popular (i know of others :-) lately ... it seems like the idea that
every end-user application adds features until it can do email.

On Mon, 2006-31-07 at 18:26 +0200, Graham Leggett wrote:
 On Mon, July 31, 2006 6:16 pm, Guy Hulbert wrote:
 
  At the network layer, your metrics are pretty much volume of data or
 
  Nope.
 
  Routers can look at anything in the packets which is not encrypted.
  They can also measure server response (by packet stats) directly or via
  SNMP.  There are all sorts of things that *cannot* be done on the server
  without introducing all sorts of p2p communications requirements.
 
 I'm sure they can. This doesn't make them the right solution for all cases.
 
 In a multi tier architecture, you already have front end servers
 implementing URL strategies, common logging, all sorts of other things.

The 1997 system I referenced was already a multi-tier architecture.  The
integration group was implementing systems on large world-wide private
networks.

 
 Adding an extra router layer to handle load balancing, when your already
 existing frontend can do this job is not only extra cost, but extra
 complexity and an additional point of failure.

Without knowing the specific network involved this is just wanking.

The implementation of the IBM software solution I
described previously required 4 PCs precisely because of the problem of
redundancy, monitoring and failover.  The PCs were paired with a
heartbeat running on loop-back interfaces.

Do you know anyone running an apache service over the internet without a
router somewhere?  I doubt that IP via carrier pigeon has sufficient
bandwidth.

My only interest in this is that you are putting all the additional
complexity into the Apache server.

 
 Regards,
 Graham
 --
 
 

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
I didn't read this very carefully.

On Mon, 2006-31-07 at 18:26 +0200, Graham Leggett wrote:
 I'm sure they can. This doesn't make them the right solution for all
 cases.
 
 In a multi tier architecture, you already have front end servers
 implementing URL strategies, common logging, all sorts of other
 things.
 
 Adding an extra router layer to handle load balancing, when your
 already
<snip>

This seems reasonable, given paragraph 2 (URL strategies etc.), but not for
the reasons I've omitted (and responded to separately).  However, I
still don't think this will scale the way router-based solutions can
(already :-).

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Jim Jagielski
 
 My only interest in this is you are putting all the additional
 complexity into the Apache server.
 

Considering the very common usage of Apache being used as
a reverse proxy and the need for URL-specific forwarding,
adding a cluster-like ability to Apache is the obvious
next step.

Will it remove the need for others? Not at all.

-- 
===
   Jim Jagielski   [|]   [EMAIL PROTECTED]   [|]   http://www.jaguNET.com/
If you can dodge a wrench, you can dodge a ball.


Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 18:31 +0200, Graham Leggett wrote:
 I get the sense that you would rather the developers scratch your itch

Their itch is not a problem for me ... and it isn't something I would
necessarily use apache for ... though for a small to medium scale setup
it might be very useful.

 (in
 the form of perchild), rather than theirs (in the form of lb).

Absolutely :-).  I have no intention of writing any code for perchild if
someone else (undoubtedly far more qualified than I) happens to want to
do it.

After looking at the code from subversion and having thought a little
more about 'perchild' I can see a few difficulties and I can see good
reasons why it may not have been worked on.

The reason I am interested in perchild is that combined with WebDav and
Reiser4 it will be possible to create general business applications
which make subversion look like a toy.  Perchild looks like the missing
piece.  It is extremely inconvenient for everything on the back-end to
be owned by one user.

The client-side support for WebDav has been present in windows since
1998.  For some reason, Microsoft just seems to have stopped there.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Graham Leggett
On Mon, July 31, 2006 6:39 pm, Guy Hulbert wrote:

 I already accept that this seems to be a fait accompli.  So I am just
 arguing for entertainment purposes.

Which in turn means you're just wasting people's time.

Regards,
Graham
--




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 12:50 -0400, Jim Jagielski wrote:
  
  My only interest in this is you are putting all the additional
  complexity into the Apache server.
  
 
 Considering the very common usage of Apache being used as
 a reverse proxy and the need for URL-specific forwarding,
 adding a cluster-like ability to Apache is the obvious
 next step.

Oh well.  If it is obvious then ok :-).

 
 Will it remove the need for others? Not at all.

If it is p2p then it might ... in the long run.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Jim Jagielski
Guy Hulbert wrote:
 
 Absolutely :-).  I have no intention of writing any code for perchild if
 someone else (undoubtedly far more qualified than I) happens to want to
 do it.
 
 After looking at the code from subversion and having thought a little
 more about 'perchild' I can see a few difficulties and I can see good
 reasons why it may not have been worked on.
 

perchild is an MPM that would be very useful if it was ever
done. However, to make it very portable is also not trivial,
and it requires additional APR capability which would need to
be added as well...

One reason for a generic scoreboard would be to help make
perchild easier, since we could store the passed fd's in this
location alleviating some of the current problems.

-- 
===
   Jim Jagielski   [|]   [EMAIL PROTECTED]   [|]   http://www.jaguNET.com/
If you can dodge a wrench, you can dodge a ball.


Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 17:30 +0100, Colm MacCarthaigh wrote:
 Either way, the more options and the more flexibility, the better.

This is not true.  There is always a limit.  The difficult part is to
know when you've reached it, of course.

Also, it is a design choice.  For example, perl (TMTOWTDI) versus python.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 19:00 +0200, Graham Leggett wrote:
 On Mon, July 31, 2006 6:39 pm, Guy Hulbert wrote:
 
  I already accept that this seems to be a fait accompli.  So I am just
  arguing for entertainment purposes.
 
 Which in turn means you're just wasting people's time.

It's your choice whether to respond or not.

The exchange is very valuable to me since I am learning a lot more about
the project than I would in any other way.

If I do decide to put work into perchild it will be a very big
investment from my pov ...

 
 Regards,
 Graham
 --
 
 

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 13:05 -0400, Jim Jagielski wrote:
 One reason for a generic scoreboard would be to help make
 perchild easier, since we could store the passed fd's in this
 location alleviating some of the current problems.

Thanks.  

I've seen all the traffic on the scoreboard and this is very useful
context ...

-- 
--gh




Scoreboard was Re: load balancer cluster set

2006-07-31 Thread Brian Akins



I've seen all the traffic on the scoreboard and this is very useful
context ...


Also, I am using a similar scoreboard mechanism to collect lots of per 
 worker stats without the extendedstatus overhead.


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: load balancer cluster set

2006-07-31 Thread Graham Leggett
On Mon, July 31, 2006 6:43 pm, Guy Hulbert wrote:

 This seems reasonable.  Given paragraph 2 (URL strategies etc) Not for
 the reasons I've omitted (and responded to separately).  However, I
 still don't think this will scale the way router-based solutions can
 (already :-).

Users of mod_backhand (for httpd v1.3) would disagree, it's a similar
solution that has been around for years. The lb support in v2.x will
hopefully eventually allow users of mod_backhand to migrate to v2.x from
v1.3.

Regards,
Graham
--




Re: Scoreboard was Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 13:21 -0400, Brian Akins wrote:
  I've seen all the traffic on the scoreboard and this is very useful
  context ...
 
 Also, I am using a similar scoreboard mechanism to collect lots of per 
   worker stats without the extendedstatus overhead.

I've been following discussion as much as I am able.

What (I think) I really need to understand is how the request handling
and thread pool code interact.  For 'perchild' I would also need to
understand how setuid and threading work together.  Looking at the
example configs, I've been guessing that the 'perchild' server forks
several threaded processes, one for each required UID, but at least one
comment I saw recently indicates I might be entirely wrong.

I wonder if there is any apache-specific documentation available at this
level of detail?  I have unix-specific references.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 19:34 +0200, Graham Leggett wrote:
 On Mon, July 31, 2006 6:43 pm, Guy Hulbert wrote:
 
  This seems reasonable.  Given paragraph 2 (URL strategies etc) Not for
  the reasons I've omitted (and responded to separately).  However, I
  still don't think this will scale the way router-based solutions can
  (already :-).
 
 Users of mod_backhand (for httpd v1.3) would disagree, it's a similar
 solution that has been around for years. The lb support in v2.x will
 hopefully eventually allow users of mod_backhand to migrate to v2.x from
 v1.3.

Is google using mod_backhand ?

That's the ultimate case, after all :-)

 
 Regards,
 Graham
 --
 
 

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Brian Akins

Guy Hulbert wrote:

That's the ultimate case, after all :-)


Not necessarily.  Google's answer is to throw tons of hardware at stuff. 
Which is great if you have unlimited space, power, and cooling.  Some 
other sites do some rather interesting things with a relatively small 
number of servers



--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 19:34 +0200, Graham Leggett wrote:
 Users of mod_backhand (for httpd v1.3) would disagree, it's a similar

Greenspun:
http://philip.greenspun.com/scratch/scaling.adp

Asks the right question:

How are load balancers actually built?

and suggests: zeus, mod_backhand, and router solutions but unfortunately
does not give a direct answer.

However, two paragraphs down:

Failover from a broken load balancer to a working one is
essentially a network configuration challenge, beyond the scope
of this textbook. Basically what is required are two identical
load balancers and cooperation with the next routing link in the
chain that connects your server farm to the public Internet.
Those upstream routers must know how to route requests for the
same IP address to one or the other load balancer depending upon
which is up and running. What keeps this from becoming an
endless spiral of load balancing is that the upstream routers
aren't actually looking into the TCP packets to find the GET
request. They're doing the much simpler job of IP routing.

This points up the difficulty of trying to solve the problem at the
application level.

My point was that free routing solutions to this problem were already
available since 1997.

 solution that has been around for years. The lb support in v2.x will

The mod_backhand site seems to date from 2000 and Greenspun's article
is dated 2003, which also seems to be the date of the latest release of
mod_backhand  ...

 hopefully eventually allow users of mod_backhand to migrate to v2.x
 from
 v1.3.

... it certainly seems to be important to create the migration path but
you have yet to convince me that the scalability is the same.

However, you have certainly convinced me to try the apache solution once
it is available ... I have a customer who might need it in a year or so.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 13:54 -0400, Brian Akins wrote:
 Guy Hulbert wrote:
  That's the ultimate case, after all :-)
 
 Not necessarily.  Google's answer is to throw tons of hardware at
 stuff. 

The point of contention was scalability ... from a human point of view
it is really annoying to have to solve a problem twice but from the
business pov, outgrowing your load balancer might only be a good thing.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Garrett Rooney

On 7/31/06, Guy Hulbert [EMAIL PROTECTED] wrote:

On Mon, 2006-31-07 at 13:54 -0400, Brian Akins wrote:
 Guy Hulbert wrote:
  That's the ultimate case, after all :-)

 Not necessarily.  Google's answer is to throw tons of hardware at
 stuff.

The point of contention was scalability ... from a human point of view
it is really annoying to have to solve a problem twice but from the
business pov, outgrowing your load balancer might only be a good thing.


Oh please, 99.% of users have nowhere near the scalability
constraints that google operates under.  Are you saying that because
some do we shouldn't provide solutions that work for the rest?

-garrett


Re: load balancer cluster set

2006-07-31 Thread Brian Akins

Guy Hulbert wrote:


The point of contention was scalability ... from a human point of view
it is really annoying to have to solve a problem twice but from the
business pov, outgrowing your load balancer might only be a good thing.



Yes.  But most load balancers can only do layer 7 load balancing. 
Sometimes it is necessary to have very application specific routing. 
Also, in general, most hardware load balancers base their algorithms on 
things such as response time.  Sometimes, it is necessary to know the 
general health of the backend servers.
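To illustrate the distinction drawn above between balancing on response time and balancing on backend health, here is a minimal sketch. It is not taken from any real load balancer: the `Backend` type, the self-reported `load` figure, and the `max_load` threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Backend:
    name: str
    response_ms: float  # what the balancer can observe from outside
    load: float         # hypothetical self-reported load, 0.0 (idle) to 1.0 (saturated)


def pick_by_response_time(backends: List[Backend]) -> Backend:
    """Choose the backend that answered fastest, ignoring its internal state."""
    return min(backends, key=lambda b: b.response_ms)


def pick_by_health(backends: List[Backend], max_load: float = 0.9) -> Optional[Backend]:
    """Choose the least loaded backend among those below the (assumed) threshold."""
    healthy = [b for b in backends if b.load < max_load]
    return min(healthy, key=lambda b: b.load) if healthy else None


# A backend can answer quickly while being close to saturation.
backends = [
    Backend("x", response_ms=12.0, load=0.95),
    Backend("y", response_ms=30.0, load=0.30),
]
fastest = pick_by_response_time(backends)
healthiest = pick_by_health(backends)
```

In this made-up example the two policies disagree: the response-time policy picks `x`, while the health-based policy picks `y`.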


--
Brian Akins
Chief Operations Engineer
Turner Digital Media Technologies


Re: load balancer cluster set

2006-07-31 Thread Rainer Jung

Jim Jagielski wrote:

I'm trying to figure out which impl of the
LB cluster set makes the most sense and would appreciate
the feedback.

Basically, I see 2 different methods:

   1. Members in all cluster sets which have the same or
  lower set numbers are checked

   2. Only members in a specific set number are checked. If
  none are usable, skip to the next cluster set.
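One possible reading of the two methods, sketched in Python. This is an interpretation of the descriptions above, not the mod_proxy_balancer code; `Member`, `lbset`, and the helper names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Member:
    name: str
    lbset: int          # cluster set number (hypothetical field name)
    usable: bool = True


def method1_candidates(members: List[Member], current_set: int) -> List[Member]:
    """Method 1: usable members in all sets numbered <= current_set."""
    return [m for m in members if m.usable and m.lbset <= current_set]


def method2_candidates(members: List[Member], current_set: int) -> List[Member]:
    """Method 2: usable members in exactly the current set."""
    return [m for m in members if m.usable and m.lbset == current_set]


def first_usable_pool(
    members: List[Member],
    candidates: Callable[[List[Member], int], List[Member]],
) -> Tuple[Optional[int], List[Member]]:
    """Walk the set numbers in ascending order until some pool is non-empty."""
    for set_no in sorted({m.lbset for m in members}):
        pool = candidates(members, set_no)
        if pool:
            return set_no, pool
    return None, []


members = [
    Member("a", 0, usable=False),
    Member("b", 1),
    Member("c", 1),
    Member("d", 2),
]
pool1 = [m.name for m in method1_candidates(members, 2)]  # sets 0..2
pool2 = [m.name for m in method2_candidates(members, 2)]  # set 2 only
```

Under this reading the two methods differ only once the balancer has moved past the first set: method 1 keeps lower-numbered members in the pool, method 2 looks at one set at a time.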


We have two different use cases for grouping. One is the case where the 
targets keep some state and replicate the state only to some of the 
other targets. If the set of targets is split into disjoint replication 
groups, it would make sense to use any other member of the same 
replication group, in case the sticky member is dead. This situation 
might be used e.g. for a tomcat cluster, where we only can do one to all 
replication. So a huge cluster needs to be split into disjoint 
replication groups. So for a sticky situation and a request that 
contains a target ID, I think 2 makes the most sense.


In case the backends use a more elaborate replication scheme, 
mod_proxy_balancer would need some additional way of getting the 
information about replication members, like encoding them into the 
Cookie. Unfortunately, there's no standard for this.


If we are in a non-sticky session, or the request has no target ID, we 
are back to pure load-balancing (no routing). In this case I think there 
should be a way of expressing preferences for target workers. That's 
closer to number 1.


For mod_jk 1.2.18 we included distance as a measurement of preference 
for the non-sticky case (and the case where we are sticky, but the 
whole cluster set is down), and we have had domain for about 2 years to 
configure replication sets.


I assume this is what Mladen is after. So my answer would be: some of 1. 
and some of 2, depending on the request info and the target status.
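A rough sketch of how "some of 1 and some of 2" might be combined. It borrows the words domain and distance from the mod_jk description above, but the `Worker` type and the `choose` function are hypothetical and are not the mod_jk or mod_proxy_balancer implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Worker:
    name: str
    domain: str        # replication group the worker belongs to
    distance: int = 0  # lower value means more preferred
    alive: bool = True


def choose(workers: List[Worker], target_id: Optional[str] = None) -> Optional[Worker]:
    """Pick a worker for one request.

    Sticky request (target_id given): use the sticky worker, or failing that
    another live member of the same replication group (closer to method 2).
    Otherwise, or if the whole group is down: prefer the live worker with the
    lowest distance (closer to method 1).
    """
    if target_id is not None:
        sticky = next((w for w in workers if w.name == target_id), None)
        if sticky is not None:
            if sticky.alive:
                return sticky
            peers = [w for w in workers if w.alive and w.domain == sticky.domain]
            if peers:
                return peers[0]
    alive = [w for w in workers if w.alive]
    if not alive:
        return None
    # A real balancer would spread load among equally distant workers;
    # this sketch simply takes the first one.
    return min(alive, key=lambda w: w.distance)


workers = [
    Worker("t1", "A", 0, alive=False),
    Worker("t2", "A", 0),
    Worker("t3", "B", 1),
    Worker("t4", "B", 1),
]
failover = choose(workers, "t1")   # sticky member dead, same group used
sticky_hit = choose(workers, "t3") # sticky member alive
no_session = choose(workers)       # no target ID, distance decides
```

The request info (target ID present or not) and the target status (alive or not) together decide which of the two behaviours applies.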


I would love it if someone came up with a more consistent model.

Rainer



Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 14:02 -0400, Garrett Rooney wrote:
 On 7/31/06, Guy Hulbert [EMAIL PROTECTED] wrote:
  On Mon, 2006-31-07 at 13:54 -0400, Brian Akins wrote:
   Guy Hulbert wrote:
That's the ultimate case, after all :-)
  
   Not necessarily.  Google's answer is to throw tons of hardware at
   stuff.
 
  The point of contention was scalability ... from a human point of view
snip
 
 Oh please, 99.% of users have nowhere near the scalability
 constraints that google operates under.  Are you saying that because
 some do we shouldn't provide solutions that work for the rest?
 
 -garrett

Nope.

Graham asserted that mod_backhand was sufficiently scalable ... which I
inferred to mean sufficiently scalable to make a router-based solution
unnecessary.

For practical use, it seems to be the best solution available for a
small-scale site.  The commercial solutions do not seem to have changed
since 1997 ... it is more disappointing that the linux-router project
does not seem to have come far enough yet to solve this problem
properly.  At least it did not turn up obviously in the responses to
'google: mod_backhand scalable'.

-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Rainer Jung
My experience: some organisations have a network group that is able to 
understand application communication behaviour and does a very good job of 
making most of these features available via their load balancer 
appliances, and then benefits from their central administration, GUIs etc.


On the other hand in some organisations there is a deep split between 
the server/app guys and the network guys, and you will not succeed in 
making the network use the high-level features of their gear.


So in principle most can be done on both sides, but often it's the 
experience of the people that decides where to actually build the 
solution.


I did both solutions successfully and even had companies move from one to 
the other when they changed their organization.


I don't think it's worth discussing technically where the features 
belong. In practice, it's not really a technical question.


Just my point of view.

Rainer

Brian Akins wrote:

Guy Hulbert wrote:


The point of contention was scalability ... from a human point of view
it is really annoying to have to solve a problem twice but from the
business pov, outgrowing your load balancer might only be a good thing.



Yes.  But most load balancers can only do layer 7 load balancing. 
Sometimes it is necessary to have very application specific routing. 
Also, in general, most hardware load balancers base their algorithms on 
things such as response time.  Sometimes, it is necessary to know the 
general health of the backend servers.


Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
On Mon, 2006-31-07 at 20:15 +0200, Rainer Jung wrote:
 So in principle most can be done on both sides, but often it's the 
 experience of the people, that decides on where to actually build the 
 solution.

Yup.

 
 I did both solutions successfully and even had companies move from one
 to the other when they changed their organization.

Yup.

 
 I think it's not worth to technically discuss, where the features
 belong to. In practise, it's not really a technical question.

Yup.

It seems that linux router is the wrong name.  Here is the correct
project:

http://www.linuxvirtualserver.org/

I really have not looked seriously at load balancing for about 10 years.
It seems that mod_backhand is a good solution that is out there but
needs to be ported to apache2 because people already need it.

If it were *me* writing the code, I would still look to see whether
there is a reasonable alternative ...

Anyhow, I apologize for the long digression ... it's keeping me from
working too :-).


-- 
--gh




Re: load balancer cluster set

2006-07-31 Thread Guy Hulbert
FWIW, this seems much more likely:
http://www.ultramonkey.org/about.shtml

In particular:
http://www.ultramonkey.org/3/installation-debian.sarge.html

On Mon, 2006-31-07 at 14:29 -0400, Guy Hulbert wrote:
 It seems that linux router is the wrong name.  Here is the correct
 project:
 
 http://www.linuxvirtualserver.org/
 
snip
 If it were *me* writing the code, I would still look to see whether
 there is a reasonable alternative ...
 
 Anyhow, I apologize for the long digression ... it's keeping me from
 working too :-).

again

-- 
--gh