Since this seems to apply to everyone ...

I don't have a problem with polling the down server.  It's relatively simple
to ping or attempt a connection to a server with a short timeout.  If it
times out, move on.  Otherwise, you'd never know when that server comes back
up.

Your random distribution was the only thing I was commenting on.  Your logic
for whether to 'keep polling' a down server has _nothing_ to do with whether
you use totally random selection or a random start followed by round-robin.
Nothing!

That said, a good way to accomplish this would be:

Store servers in an array of structs (C won't let you put initializers in
the struct declaration, so zero the fields after the malloc):

struct server {
    ip_addr address;  /* however you represent the remote address */
    int down;         /* consecutive failed connection attempts */
    int try;          /* passes skipped since the last real attempt */
};

struct server *servers; /* malloc them, then zero down and try ... */

1) pick a random start # from 0 to (num_servers - 1)
2) if servers[this_server].down > 0 then:
  - servers[this_server].try++;
  - if (try < down), skip this server: advance to the next one, go to (2)
  - else, reset try to 0 and fall through to (3)
3) connect to the server
  - if it failed: servers[this_server].down++; advance, go to (2)
4) (success) set servers[this_server].down to 0, advance
5) if there's another message, go to (2)

That's off the top of my head -- but it would make use of a down server fall
off linearly (not exponentially).  Over time, if a server stays down, it
gets polled for a connection less and less often.

----- Original Message -----
From: "David Dyer-Bennet" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, August 02, 2000 1:03 PM
Subject: Re: updated load balancing qmail-qmqpc.c mods


> JuanE <[EMAIL PROTECTED]> writes on 2 August 2000 at 16:37:36 GMT
>  >
>  > I agree with you both (Jay and Michael), at least partially. I agree
>  > that although what Jay proposes will work, it is too much computation
>  > and that a simpler round-robin (after picking an initial position)
>  > would suffice.
>  >
>  > My comment is that in the event of a down server, the simple round
>  > robin will flood the next server in the chain with twice the load of
>  > the others. Jay's solution does not do this (at a high computational
>  > cost).
>  >
>  > What I proposed earlier is just one of *many* solutions that
>  > addresses the flood problem at a lower computational cost.
>  >
>  > Jay, I agree with you that selecting the same server many times in a
>  > row is not an issue. This is guaranteed by the Law of Averages (for
>  > you math whizzes out there, the Law of Large Numbers).
>
> Sounds like making repeated random picks is the way to go.
>
> If no server is down, your one random pick will handle the mail (same
> cost as picking random starting point for round-robin).
>
> If a server is down *and you hit it*, you pay the cost of a second
> random pick.  This is slightly expensive, but you only pay it when
> you need it.  It's cheaper in elapsed time than trying the next server
> and having it refuse the connection due to overload, for example.  And
> it spreads the load more evenly.
>
> On the third hand, if the servers can manage their incoming
> connections intelligently (say with tcpserver :-) ), the one after a
> down one in a round-robin, while it will get hit a lot, can refuse
> some of the connections, which will then go on to the next after it.
> So you aren't really constrained to running all your servers at less
> than 50% capacity, and the one after the down one won't actually melt
> down.
> --
> Photos: http://dd-b.lighthunters.net/ Minicon: http://www.mnstf.org/minicon
> Bookworms: http://ouroboros.demesne.com/ SF: http://www.dd-b.net/dd-b
> David Dyer-Bennet / Welcome to the future! / [EMAIL PROTECTED]
>
>
>