On Sep 30, 2009, at 12:15 AM, Raistmer wrote:
>>> Every couple of years, you could set up a new reference machine
>>> running parallel with the old reference machine. These two machines
>>> could be dialed in so that the credit on some reference tasks was
>>> made to be identical. Then the new reference machine can run solo.
>>> Since this machine would not have to be an extremely high powered
>>> server, it would be easier to get it donated.
>>
>> If we instead had a calibrated system as I have suggested, there is
>> no need for reference machines at all ... the new work would be
>> issued and the network would establish the parameters.
>
> You propose to replace one machine per few years overhead (if we
> calibrate by real tasks even this is not overhead) with all machines
> per week overhead. For what reason?
Ok, one more into the fire ... :)

Assume my "Gold Plated" scheme is in place. At minimum there are 5 different projects supplying a suite of calibration tasks. Good choices are SaH, Einstein, Rosetta, MW, PG, WCG, Collatz (? guessing here, I have only been running them a couple weeks ?) ... the idea is a spread of task classes: FP heavy, INT heavy, etc. I can explain the rationale for several of the projects, and these are JUST SUGGESTIONS. Some of the task inputs would be, as in the case of SaH, a suite of known artificial signals and real live samples we run to death; the same basic scheme can be used for Einstein.

Once a month, or WHATEVER THE DESIGNATED INTERVAL IS, a calibration task from one of these attached projects, NOT ALL OF THEM ... one AND ONLY ONE task ... will be run. If a task from SaH is run, then no task from the other projects will be run; over time, assuming proper programming, the tasks will rotate. Since we know what the answer should be, we are validating the ENTIRE PROCESS from one end to the other.

The anti-social can opt out ... I would not have allowed this, but in BOINC hysteria overrules good practice, and so we allow opt-out (though I would never let an opt-out take part in adaptive replication/validation).

Over time the results use whatever mathematical magic is needed to establish the operational speed of that machine in CS ... note that the point is to establish this with more reliability, with real work on real machines over real execution times, so that the instability of the benchmarks is eliminated as an issue. My personal feeling is that using other mechanisms we can fill the gap and the current benchmark can be eliminated ... the ways to accomplish that are discussed elsewhere ... but basically by looking up similar machines and by either making the first task a calibration task or by running the first task set to completion before allowing download to fill the cache.
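To make the rotation and rate-estimation ideas above concrete, here is a minimal sketch. Everything in it is hypothetical (the names `pick_calibration_project`, `HostCalibration`, the use of a median, the sample numbers); it is just one way the "mathematical magic" could look, not real BOINC code:

```python
import statistics

# Hypothetical opt-in projects supplying calibration tasks -- just the
# suggestions above; any spread of FP-heavy / INT-heavy work would do.
PROJECTS = ["SaH", "Einstein", "Rosetta", "MW", "PG", "WCG", "Collatz"]

def pick_calibration_project(interval_index):
    """One AND ONLY ONE project's task runs per designated interval;
    a simple round-robin covers every attached project over time."""
    return PROJECTS[interval_index % len(PROJECTS)]

class HostCalibration:
    """Tracks a machine's operational speed in CS (cobblestones) per
    hour, derived from real calibration tasks on real hardware rather
    than from synthetic benchmarks."""
    def __init__(self):
        self.samples = []  # observed CS/hour from each calibration run

    def record_run(self, granted_cs, elapsed_hours):
        self.samples.append(granted_cs / elapsed_hours)

    def cs_per_hour(self):
        # Median rather than mean, so one noisy run doesn't skew the
        # established rate.
        return statistics.median(self.samples)

host = HostCalibration()
host.record_run(granted_cs=120.0, elapsed_hours=2.0)  # 60 CS/h
host.record_run(granted_cs=150.0, elapsed_hours=2.0)  # 75 CS/h
host.record_run(granted_cs=130.0, elapsed_hours=2.0)  # 65 CS/h
print(pick_calibration_project(0))  # -> SaH
print(host.cs_per_hour())           # -> 65.0
```

The more different workloads feed `record_run`, the more "accurate" the estimate, per the argument above.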
In that this is a system function, the cost of the time would be paid as is normal for a task from that project, with a bonus to encourage opt-in (I love economics: pay to play; or is it play to get paid?). Projects that obviously cannot make a calibration task of their own can either use another project's or simply "piggy-back" on the system-wide data ... in effect we would be establishing that, for example, my i7 920 does n CS per hour, and that is what CPDN/Orbit would award for their work. The point is that instead of requiring the counting of FLOPS or loops or anything else, we establish a generalized earnings rate for a specific computer using a collective of work. The more different workloads we use, the more "accurate" our estimate.

> It will bring literally zero increase in our faith in result
> validity above what we have right now. Proof of the validator, a
> single program running on one or a few servers and accepting data,
> could and should be made in the lab; there is absolutely no need to
> load this task onto participants' PCs. And calibration of
> participants' PCs themselves will bring NO ADDITIONAL level of
> security, just because random errors are still possible. You can't
> replace redundancy with such calibration. So calibration will go
> together with redundancy, bringing no new security and being pure
> overhead. For what reason ???

We make the assumption that the validator will catch errors ... yet we know that the validator is a program written by people. The point is that if I make a SaH signal and the program returns 15 signals, there is a problem somewhere ... yet if that bad answer is paired up with another bad answer that is the same, the validators will accept both answers. And more and more projects are going to adaptive replication and validating on one task ... so the idea that redundant computing is going to catch errors is slowly being eroded ...
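Two small sketches of the points above, with made-up names and numbers (nothing here is an actual BOINC interface): the piggy-back credit award is just elapsed time multiplied by the host's system-wide calibrated rate, and the known-answer check catches a bad result even when two hosts would have agreed on the same wrong answer:

```python
EXPECTED_SIGNALS = 1  # we injected exactly one artificial signal

def check_calibration_result(reported_signals):
    """Since we know what the answer should be, a wrong count flags a
    problem somewhere in the end-to-end process -- something pairwise
    validation cannot do if both replicas return the same bad answer."""
    return reported_signals == EXPECTED_SIGNALS

def award_credit(elapsed_hours, calibrated_cs_per_hour):
    """Credit for a project with no calibration task of its own:
    real run time times the host's established CS/hour rate, with
    no FLOP or loop counting required."""
    return elapsed_hours * calibrated_cs_per_hour

print(check_calibration_result(15))  # -> False (the 15-signal example)
# e.g. a host calibrated at 65 CS/hour running a 10-hour CPDN task
print(award_credit(10.0, 65.0))      # -> 650.0
```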
We already have the concept of adaptive validation, if you care to look it up; this is actually more or less an extension of that concept, but in this case we are looking for those systemic errors that everyone but me seems quite comfortable with ... THIS IS NOT THE COMPLETE PROPOSAL ... there are a myriad of details ... but I know that if I make it longer no one will read it ... but this is the core ...
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
