On Wednesday 04 January 2006 20:40, Greg Edwards wrote:
> Hi team,
> I was interested to read the thoughts on expanding the Primenet servers,
> having separate comms and database servers etc. This is all a great idea,
> but it reminded me instantly of the SETI experience. I've been a SETI
> (Classic) user sporadically since early 2000, and they were constantly
> expanding, splitting off functions onto additional servers, upgrading
> disk arrays, adding bandwidth etc. as they grew. Every few weeks there
> would be a problem of some sort and one part of the system would be down
> for a day, and then it would take many days for the whole assemblage to
> restabilise, with servers gradually ploughing through their backlog,
> other servers starved of data, users unable to get fresh datasets or
> check their stats on the web server(s), etc.
Overcentralised design! The way I see it, one of the benefits of distributing the server is to gain reliability. If your system design is such that there are more critical points through which everything has to pass, it is likely to be less reliable rather than more. Conversely, if you hive off statistical reports, failure of the statistical report generation server - even when its functions are not duplicated elsewhere - is annoying rather than a serious impediment to the progress of the project as a whole.

> So I'm just wondering whether, whilst Primenet may be a simple system
> that's somewhat overloaded at times, keeping it simple could be a good
> conservative choice??

If the problem were only overloading, I'd tend to agree. But I fear that a significant part of the problem may be network scanning - it's really hard to avoid this if the server has to be open to the whole net. On the other hand, if you have N servers, each of which can log results and issue assignments, it doesn't actually matter all that much if on average half of them have crashed - about N/2 will still be running, so that, with a few retries, the client workload can proceed.

There are two models I'd like to explore:

a) the model where the N servers act as intermediaries with the master server, so that the master server has to communicate only with the intermediaries - net hackers, rogue clients etc. are simply unable to contact the master server directly, and therefore find it difficult to crash it.

b) the model where each of the N servers acts as master for a portion of the database - e.g. make an 8-bit hash of the exponent and split the master database into 256 parts, then distribute these over the N servers somehow (maybe taking into account each server's resources and the bandwidth of its connecting links).
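To make model (b) concrete, here's a minimal Python sketch of the hash-based partitioning. The function names, the mixing constant and the weighting scheme are my own illustrative choices, not anything Primenet actually does - the only parts taken from the idea above are the 8-bit hash of the exponent, the 256 database parts, and distributing parts unevenly according to server resources:

```python
# Model (b) sketch: split the master database into 256 parts by an
# 8-bit hash of the exponent, and map each part to one of N servers.
# Everything here is hypothetical scaffolding for illustration.

def part_of(exponent: int) -> int:
    """8-bit hash of the exponent, selecting one of 256 database parts."""
    # Knuth-style multiplicative mix; any stable 8-bit hash would do.
    return ((exponent * 2654435761) >> 24) & 0xFF

def build_part_map(num_servers: int, weights=None) -> list:
    """Assign each of the 256 parts to a server.

    `weights` lets better-resourced or better-connected servers take
    proportionally more parts, as suggested in the text.
    """
    weights = weights or [1] * num_servers
    total = sum(weights)
    part_map = []
    for server in range(num_servers):
        share = round(256 * weights[server] / total)
        part_map.extend([server] * share)
    # Round-off correction: pad or trim to exactly 256 entries.
    while len(part_map) < 256:
        part_map.append(num_servers - 1)
    return part_map[:256]

def server_for(exponent: int, part_map: list) -> int:
    """Which server is master for this exponent's slice of the database."""
    return part_map[part_of(exponent)]
```

A client (or a peer relaying a stored result) only needs the part map to find the responsible server; the internal split into 256 parts stays invisible to it, which is the point made below about the detailed structure being hidden from the client.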
Now each of the servers has its own pool of assignments to issue; when a result for a different server's database is received, the server simply stores it for relaying to the appropriate peer at mutual convenience.

Both of these models are robust and expandable almost without limit. The first has the benefit of allowing a design which permits the intermediaries to be run by untrusted volunteers; the second distributes everything except the overall project status reports, whilst the detailed structure can be made essentially invisible to the client.

> Perhaps just upgrade what you have, i.e. faster cpu, more disk, in one go.

If this will really achieve the required result, fine. But I have serious doubts....

Regards
Brian Beesley
_______________________________________________
Prime mailing list
[email protected]
http://hogranch.com/mailman/listinfo/prime
