Re: updated load balancing qmail-qmqpc.c mods

2000-08-09 Thread Bruno Wolff III

On Wed, Aug 02, 2000 at 05:08:28PM +,
  JuanE [EMAIL PROTECTED] wrote:
 
 I did not think of that. Good suggestion.
 
 It seems like it would be a good compropmise if you can take your down
 server out of the rotation relatively quickly. If not, then you'll waste
 considerable time polling the busy server (and consequently having your
 connections rejected by tcpserver) while all other servers are breezing at
 50% load.

That may be hard to do. A lot of places may have the list of IP addresses
cached and they typically expire over a time longer than a server will
be down.



Re: updated load balancing qmail-qmqpc.c mods

2000-08-09 Thread David Dyer-Bennet

Bruno Wolff III [EMAIL PROTECTED] writes on 9 August 2000 at 09:12:29 -0500
  On Wed, Aug 02, 2000 at 05:08:28PM +,
JuanE [EMAIL PROTECTED] wrote:
   
   I did not think of that. Good suggestion.
   
   It seems like it would be a good compropmise if you can take your down
   server out of the rotation relatively quickly. If not, then you'll waste
   considerable time polling the busy server (and consequently having your
   connections rejected by tcpserver) while all other servers are breezing at
   50% load.
  
  That may be hard to do. A lot of places may have the list of IP addresses
  cached and they typically expire over a time longer than a server will
  be down.

Which is why, if using round-robin DNS instead of a local
load-balancing front-end, I'd want to keep a couple of spare
configured servers ready to configure with the IP of the down
machine.  (This might well be cheaper than the load-balancing
system.)
-- 
Photos: http://dd-b.lighthunters.net/ Minicon: http://www.mnstf.org/minicon
Bookworms: http://ouroboros.demesne.com/ SF: http://www.dd-b.net/dd-b 
David Dyer-Bennet / Welcome to the future! / [EMAIL PROTECTED]



Re: updated load balancing qmail-qmqpc.c mods

2000-08-03 Thread Pavel Kankovsky

On 2 Aug 2000, Frank D. Cringle wrote:

 Generating a random permutation algorithmically is not too easy.

Oh really?

  int i, j, x;
  int a[N];

  for (i = 0; i  N; ++i)
a[i] = i;
  for (i = N - 1; i  0; --i) {
j = random(i);
x = a[i]; a[i] = a[j]; a[j] = x;
  }

where random(i) is a "good enough" PRNG generating an integer between
0 and i - 1 with the uniform distribution. The result is in a[].

The proof of correctness (i.e. that the aformentioned piece of code
generates a pseudorandomly chosen permutation of (0, ..., N-1) with
the uniform distribution) is left as an exercise for the reader.

--Pavel Kankovsky aka Peak  [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."




Re: updated load balancing qmail-qmqpc.c mods

2000-08-03 Thread Frank D. Cringle

Pavel Kankovsky [EMAIL PROTECTED] writes:
 On 2 Aug 2000, Frank D. Cringle wrote:
 
  Generating a random permutation algorithmically is not too easy.
 
 Oh really?
 
 [ swap each element with a randomly chosen partner ]

Yes, that will do it.  When I was originally working with this stuff
the context involved using "real" random bits and we needed to
conserve them.  In principle you only need log2(N!) bits, and then it
is hard.  I had forgotten the details.  I also made a typo in the
reference to Sedgewick's article.  It was 1977, not 1997.  Time flies
when you are having fun.

-- 
Frank Cringle,  [EMAIL PROTECTED]
voice: (+49 2304) 467101; fax: 943357



Re: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread Frank D. Cringle

"Austad, Jay" [EMAIL PROTECTED] writes:
 From: JuanE [mailto:[EMAIL PROTECTED]]
 Assume you have 4 servers A, B, C, D, and that server C is down. When your
 random seed hits server A, B or D (with probability 1/4, respectively) the
 message will go through fine. When your random seeds hits server C (again
 with p=1/4), you will realize that it is down and proceed to D. So in
 effect you will hit servers A and B with probability 1/4 each, and server D
 with probability 1/2. This is not very good load balancing, because

 You are correct.  However, since qmail-qmqpc does not continually run, it is
 not possible to do a non-random round robin (as far as I know).

Choose a permutation of the server list at random.  Then step through
this list until you hit a good server or exhaust the list.

 1. A B C D
 2. A B D C
 3. A C B D
 4. A C D B
 
24. D C B A

If your random number is 4, you try A, then C, then D and finally B.

If the number of servers is reasonably small, say up to 7, you could
enumerate all permutations as a static array in the code and then use
a random index into the array to determine the order.  Generating a
random permutation algorithmically is not too easy.  You can either
approximate by shuffling or make a direct choice using some bignum
arithmetic described in "Permutation Generation Methods", Robert
Sedgewick, Computing Surveys, Vol. 9, No. 2, June 1997.

-- 
Frank Cringle,  [EMAIL PROTECTED]
voice: (+49 2304) 467101; fax: 943357



Re: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread JuanE


Jay,

A small modification to what you have done can get rid of the problem of
banging one server. Instead of just sampling a random starting point for
the search, just get a random number between 1 and N (N being the number of
servers) and if that server is down just sample again until you find a
server that's up. This way, you will redistribute the load randomly to all
servers and not just the next one on the list.

What I propose sounds very expensive computationally, but there might be a
way to bypass using rand. If there is a way to get the time down to
microseconds (I believe gettimeofday() will do this), then you can use the
microseconds in the time as your random variable. Just take the number of
microseconds mod N. This should be much faster than using rand and close
enough to uniform for this purpose (anybody disagree??).

If this is feasible, then it sounds much simpler than trying to deamonize
qmqpc.

Hope this helps.
JES

Austad, Jay writes:

 You are correct.  However, since qmail-qmqpc does not continually run, it is
 not possible to do a non-random round robin (as far as I know).  If qmqpc
 could be turned into a process that ran continuously, it would be possible
 to do regular round-robin and there would be no need for the random stuff at
 all, this may also increase performance because it wouldn't have to read
 /var/qmail/control/qmqpservers every time it started.  I haven't looked at
 it, but does qmail-inject work by just piping it's data into qmail-queue
 (qmail-queue is a link to qmail-qmqpc in a mini-qmail setup)?  If so,
 couldn't qmail-queue be turned into a named pipe, and qmqpc could be
 modified slightly to continously run and wait for input on that named pipe?
 Just a random thought.  Of course, I'm not a programmer and I may not have
 any idea what the hell I'm talking about. :)
 
 This approach that I'm using obviously isn't going to spread the load
 between servers exactly evenly, but it should be close enough when mailing
 large amounts of mail, unless one qmqp server dies.  However, our qmqp
 servers have port 628 for QMQP monitored closely and I'll get paged within a
 couple of minutes if it goes down, so I'm not too worried about it.  I guess
 I could redo it so instead of just rotating the array by a random number, it
 would shuffle it it randomly.  That would keep it from banging on the server
 immediately after the dead one, and give everything an even chance again.
 
 Jay
 
 -Original Message-
 From: JuanE [mailto:[EMAIL PROTECTED]]
 Sent: Tuesday, August 01, 2000 6:57 PM
 To: [EMAIL PROTECTED]
 Subject: Re: updated load balancing qmail-qmqpc.c mods
 
 
 
 Jay,
 
 If I understand this correctly, then I think there is a flaw in this type
 of random round robin. I think I can best describe it with an example.
 
 Assume you have 4 servers A, B, C, D, and that server C is down. When your
 random seed hits server A, B or D (with probability 1/4, respectively) the
 message will go through fine. When your random seeds hits server C (again
 with p=1/4), you will realize that it is down and proceed to D. So in
 effect you will hit servers A and B with probability 1/4 each, and server D
 with probability 1/2. This is not very good load balancing, because
 basically the load of the server that's down is inherited by the next
 server, which is not what you set out to accomplish.
 
 With simple, non-random round-robin, when ever you hit server C, you will
 skip it and go to server D. The next message will go to server A, the next
 to B, etc... Thus, the probability of hitting server D is 1/3 (as well as A
 and B).
 
 All this rests on my interpretation of what you have done, since I am not
 familiar with the code.
 
 JES
 
 Austad, Jay writes:
 
  Ok, I fixed it.  It now just rotates the serverindex.pos[] by a random
  amount, and then loops through until it finds a good server. rand() is
 still
  seeded with milliseconds from the system clock.  I used memcpy() to move
 the
  array around instead of loops to make it more efficient.
  
  Would it be possible to run qmqpc as some sort of daemon?  Then it could
  have memory of which server it contacted last and make sure it went to the
  next one, instead of having the chance of picking the same one again.
  
  Jay
  
  
  
  #include sys/types.h
  #include string.h
  #include memory.h
  #include sys/socket.h
  #include stddef.h
  #include netinet/in.h
  #include arpa/inet.h
  #include stdlib.h
  #include sys/time.h
  #include sys/timeb.h
  #include "substdio.h"
  #include "getln.h"
  #include "readwrite.h"
  #include "exit.h"
  #include "stralloc.h"
  #include "slurpclose.h"
  #include "error.h"
  #include "sig.h"
  #include "ip.h"
  #include "timeoutconn.h"
  #include "timeoutread.h"
  #include "timeoutwrite.h"
  #include "auto_qmail.h"
  #include "control.h"
  #include &q

Re: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread JuanE


I like the idea of statically listing them, but as you mentioned it is not
very scalable. The shuffling can be very expensive. I believe it would grow
with N^2. Instead of shuffling, just plain random sampling would be better
computationally. See me previous posting.

I think this is a "profile, don't speculate" scenario.

JES

Frank D. Cringle writes:

 "Austad, Jay" [EMAIL PROTECTED] writes:
  From: JuanE [mailto:[EMAIL PROTECTED]]
  Assume you have 4 servers A, B, C, D, and that server C is down. When your
  random seed hits server A, B or D (with probability 1/4, respectively) the
  message will go through fine. When your random seeds hits server C (again
  with p=1/4), you will realize that it is down and proceed to D. So in
  effect you will hit servers A and B with probability 1/4 each, and server D
  with probability 1/2. This is not very good load balancing, because
 
  You are correct.  However, since qmail-qmqpc does not continually run, it is
  not possible to do a non-random round robin (as far as I know).
 
 Choose a permutation of the server list at random.  Then step through
 this list until you hit a good server or exhaust the list.
 
  1. A B C D
  2. A B D C
  3. A C B D
  4. A C D B
  
 24. D C B A
 
 If your random number is 4, you try A, then C, then D and finally B.
 
 If the number of servers is reasonably small, say up to 7, you could
 enumerate all permutations as a static array in the code and then use
 a random index into the array to determine the order.  Generating a
 random permutation algorithmically is not too easy.  You can either
 approximate by shuffling or make a direct choice using some bignum
 arithmetic described in "Permutation Generation Methods", Robert
 Sedgewick, Computing Surveys, Vol. 9, No. 2, June 1997.
 
 -- 
 Frank Cringle,  [EMAIL PROTECTED]
 voice: (+49 2304) 467101; fax: 943357
 






Re: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread Michael T. Babcock

Re-read my point: its unnecessary.  I didn't say it wouldn't work.  I said
the CPU use of doing it this way was unnecessary over a simpler round-robin
approach (After picking an initial random server).

Note: I think using an array of pointers to server addresses would allow you
to do your rotations a lot faster.  Think "point quicksort".

- Original Message -
From: "Austad, Jay" [EMAIL PROTECTED]


 Even if one server is getting hit multiple times in a row, I don't think
it
 will matter.  I put in 10 servers in my qmqpservers file and some well
 placed printf's in qmqpc and ran it 100,000 times several times through.
 Each server was getting picked about 10% of the time +/- 1% or so.  If
we're
 sending millions of messages, this +/- 1% isn't going to make a
significant
 difference.




RE: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread Austad, Jay

I agree with you, I forgot to mention that, sorry.  I didn't have enough
Mountain Dew yet.  :)

-Original Message-
From: Michael T. Babcock [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, August 02, 2000 10:44 AM
To: Austad, Jay; [EMAIL PROTECTED]
Subject: Re: updated load balancing qmail-qmqpc.c mods


Re-read my point: its unnecessary.  I didn't say it wouldn't work.  I said
the CPU use of doing it this way was unnecessary over a simpler round-robin
approach (After picking an initial random server).

Note: I think using an array of pointers to server addresses would allow you
to do your rotations a lot faster.  Think "point quicksort".

- Original Message -
From: "Austad, Jay" [EMAIL PROTECTED]


 Even if one server is getting hit multiple times in a row, I don't think
it
 will matter.  I put in 10 servers in my qmqpservers file and some well
 placed printf's in qmqpc and ran it 100,000 times several times through.
 Each server was getting picked about 10% of the time +/- 1% or so.  If
we're
 sending millions of messages, this +/- 1% isn't going to make a
significant
 difference.



Re: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread JuanE


I agree with you both (Jay and Michael), at least partially. I agree that
altough what Jay proposes will work, it is too much computation and that a
simpler round-robin (after picking initial position) would suffice.

My comment is that in the event of a down server, the simple round robin
will flood the next server in the chain with twice the load of the others.
Jay's solution does not do this (at a high computational cost).

What I proposed earlier is just one of *many* solutions that addresses the
flood problem at a lower computational cost.

Jay, I agree with you that selecting the same server many times in a row is
not an issue. This is guaranteed by the Law of Averages (for you math
wizzes out there, the Law of Large Numbers).

JES

Austad, Jay writes:

 I agree with you, I forgot to mention that, sorry.  I didn't have enough
 Mountain Dew yet.  :)
 
 -Original Message-
 From: Michael T. Babcock [mailto:[EMAIL PROTECTED]]
 Sent: Wednesday, August 02, 2000 10:44 AM
 To: Austad, Jay; [EMAIL PROTECTED]
 Subject: Re: updated load balancing qmail-qmqpc.c mods
 
 
 Re-read my point: its unnecessary.  I didn't say it wouldn't work.  I said
 the CPU use of doing it this way was unnecessary over a simpler round-robin
 approach (After picking an initial random server).
 
 Note: I think using an array of pointers to server addresses would allow you
 to do your rotations a lot faster.  Think "point quicksort".
 
 - Original Message -
 From: "Austad, Jay" [EMAIL PROTECTED]
 
 
  Even if one server is getting hit multiple times in a row, I don't think
 it
  will matter.  I put in 10 servers in my qmqpservers file and some well
  placed printf's in qmqpc and ran it 100,000 times several times through.
  Each server was getting picked about 10% of the time +/- 1% or so.  If
 we're
  sending millions of messages, this +/- 1% isn't going to make a
 significant
  difference.
 






Re: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread JuanE


David Dyer-Bennet writes:

 JuanE [EMAIL PROTECTED] writes on 2 August 2000 at 16:37:36 GMT
   
   I agree with you both (Jay and Michael), at least partially. I agree that
   altough what Jay proposes will work, it is too much computation and that a
   simpler round-robin (after picking initial position) would suffice.
   
   My comment is that in the event of a down server, the simple round robin
   will flood the next server in the chain with twice the load of the others.
   Jay's solution does not do this (at a high computational cost).
   
   What I proposed earlier is just one of *many* solutions that addresses the
   flood problem at a lower computational cost.
   
   Jay, I agree with you that selecting the same server many times in a row is
   not an issue. This is guaranteed by the Law of Averages (for you math
   wizzes out there, the Law of Large Numbers).
 
 Sounds like making repeated random picks is the way to go.  
 
 If no server is down, your one random pick will handle the mail (same
 cost as picking random starting point for round-robin).
 
 If a server is down *and you hit it*, you pay the cost of a second
 random pick.  This is slightly expensive, but you only pay it when
 you need it.  It's cheaper in elapsed time than trying the next server
 and having it refuse the connection due to overload, for example.  And
 it spreads the load more evenly.
 
 On the third hand, if the servers can manage their incoming
 connections intelligently (say with tcpserver :-) ), the one after a
 down one in a round-robin, while it will get hit a lot, can refuse
 some of the connections, which will then go on to the next after it.
 So you aren't really constrained to running all your servers at less
 than 50% capacity, and the one after the down one won't actually melt
 down. 
 -- 
 Photos: http://dd-b.lighthunters.net/ Minicon: http://www.mnstf.org/minicon
 Bookworms: http://ouroboros.demesne.com/ SF: http://www.dd-b.net/dd-b 
 David Dyer-Bennet / Welcome to the future! / [EMAIL PROTECTED]
 

I did not think of that. Good suggestion.

It seems like it would be a good compropmise if you can take your down
server out of the rotation relatively quickly. If not, then you'll waste
considerable time polling the busy server (and consequently having your
connections rejected by tcpserver) while all other servers are breezing at
50% load.

JES



Re: updated load balancing qmail-qmqpc.c mods

2000-08-02 Thread Michael T. Babcock

My pseudo code was supposed to infer that one would re-select (randomly, if
you wish) a server at a certain % of the time, based on how many times it
had been polled and turned out to be down.  Simply replace my first
(increment, go to #2) with (go to #1).

- Original Message -
From: "JuanE" [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wednesday, August 02, 2000 12:37 PM
Subject: Re: updated load balancing qmail-qmqpc.c mods



 I agree with you both (Jay and Michael), at least partially. I agree that
 altough what Jay proposes will work, it is too much computation and that a
 simpler round-robin (after picking initial position) would suffice.

 My comment is that in the event of a down server, the simple round robin
 will flood the next server in the chain with twice the load of the others.
 Jay's solution does not do this (at a high computational cost).