On Sun, 12 Jul 2009 18:32:14 -0400, you wrote:
>I understand that for a project with more than 100000 active
>clients, situations will arise where the servers are the bottleneck,
>and need to be "protected" from overload. But I have a
>philosophical problem with restraining the _clients_ when it is the
>_servers_ that get overloaded. To my mind, if a project cannot
>afford the hardware to service 100000 clients, then it should not
>*accept* 100000 clients.
There's a big difference between having the necessary hardware to service 100k active clients that connect randomly once per day, and having the hardware to service 100k active clients when all 100k try to connect at once...

Since the "problem" being discussed in this thread is uploads, let's make some worst-case numbers: 5 M tasks "in progress", the project for some reason has a multi-day outage, and since this is the worst case, let's say all 5 M tasks are trying to upload... Let's also say 200k active clients.

Now, of course, the absolute worst case in this scenario, with the BOINC client default of 2 uploads/project, is that all 200k active clients try sending 400k uploads the second the servers come back up, but in practice you've got the backoff mechanism. In BOINC, the exponential backoff is between 1 minute and 4 hours, so let's say each file is on average backed off 10k seconds (a little less than 3 hours). Meaning, you have:

5 M tasks / 10k seconds = 500 new connection attempts per second...

Now, I've no idea if this is "good" or "bad" for a server, but let's say that once uploads finally start working again, each upload takes 10 seconds; with 500 new connections per second, there will be 5000 active connections at once...

Now, I don't know how the old, re-introduced client change really should work, but let's say that instead of each file doing a random backoff, the client does the random backoff. If so, you'll have:

200k active clients / 10k seconds = 20 new connection attempts per second...

With each upload taking 10 seconds, that's 200 active connections at once... There's a big difference between 5000 active connections and 200 active connections at once...

Are my numbers completely unrealistic? Well, if a project does 1 M tasks/day, there will on average be 11.6 new uploads per second. With 10 seconds per upload, this means 116 active uploads at once. 200 active connections is 72% more load than "normal operation", so servers should have a fairly good chance of handling this extra load after a multi-day outage. 5000 active connections, on the other hand, is 43x the "normal operation" load, and at least to me it's unrealistic for projects to have servers that can handle a peak load that only happens after long outages.

Now, if a project can't handle the "normal operation" load, it's got problems, and changing the client won't help much, if at all. In that case, limiting the number of clients, limiting how much work is sent to each client, using longer WUs, or something similar, are better options. For the higher load that always accompanies planned or unexpected outages, on the other hand, changing the client so it gives less than 2x the normal load instead of 10x-50x will be a big improvement...
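For anyone who wants to re-check the arithmetic above, here is a minimal Python sketch of the same back-of-envelope calculation. The inputs (5 M pending tasks, 200k active clients, 10k seconds average backoff, 10 seconds per upload, 1 M tasks/day) are just the assumptions made in this mail, not measurements from any real project:

# Back-of-envelope estimate of upload load after an outage.
tasks_in_progress = 5_000_000   # assumed tasks waiting to upload after the outage
active_clients    = 200_000     # assumed active clients
avg_backoff_s     = 10_000      # assumed average backoff (a bit under 3 hours)
upload_time_s     = 10          # assumed duration of one upload
tasks_per_day     = 1_000_000   # assumed "normal operation" throughput

# Per-file backoff: every pending upload retries independently.
per_file_rate  = tasks_in_progress / avg_backoff_s    # -> 500 connections/second
per_file_conns = per_file_rate * upload_time_s        # -> 5000 concurrent connections

# Per-client backoff: each client retries as a single unit.
per_client_rate  = active_clients / avg_backoff_s     # -> 20 connections/second
per_client_conns = per_client_rate * upload_time_s    # -> 200 concurrent connections

# Normal operation: uploads spread evenly over the day.
normal_rate  = tasks_per_day / 86_400                 # -> ~11.6 uploads/second
normal_conns = normal_rate * upload_time_s            # -> ~116 concurrent uploads

print("per-file backoff:   %.0f conn/s, %.0f concurrent (%.0fx normal)"
      % (per_file_rate, per_file_conns, per_file_conns / normal_conns))
print("per-client backoff: %.0f conn/s, %.0f concurrent (%.0f%% above normal)"
      % (per_client_rate, per_client_conns, (per_client_conns / normal_conns - 1) * 100))
print("normal operation:   %.1f conn/s, %.0f concurrent" % (normal_rate, normal_conns))

The output just reproduces the 500-vs-20 connections/second and 5000-vs-200 concurrent-connection figures; the exact percentage above normal shifts by a point or so depending on rounding.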
-- 
"I make so many mistakes. But then just think of all the mistakes I don't make, although I might."