On Sep 28, 2009, at 11:15 PM, Raistmer wrote:

>> The example I used in the past is this. SaH is basically a signal
>> hunter. When was the last time that a test work unit with known
>> signals in the input data was subjected to analysis? If anyone who
>> reads this board knows this, they have not yet answered the question.
>> All the testing I know of is to use a task of real data that we
>> assume we know the contents of, because we have run it through the
>> software. And because the answer today matches the answer of
>> yesterday, we assume that the software is correct. Unless the
>> software of yesterday was bad ... then we are just making today
>> match yesterday's bad analysis.
>
> It's a good point. Maybe Eric could answer whether such checks were
> performed in the lab before releasing the SaH application.
> But anyway, such testing should be done only once per algorithm
> change; it should be done in the project's lab and not on
> participants' PCs.
Here is where you and I disagree, because I am looking at a broader
picture than just in-lab proofs. For one thing, even if we prove that
the software works on the lab computer with limited tests and an alpha
test, you neglect the well-proven need for the broader beta test. And
because that external environment is always in flux, I want to keep
testing in field conditions to see how SPs, video drivers, and all the
other software that we are now pretending has no effect actually
affect things ... except it does ... GPU Grid is having no end of
trouble because of issues with the drivers / CUDA version and possibly
with the GTX 260 cards or maybe their BIOS software (or whatever they
call it).

> Validity of the algorithm used is a fundamental question of course,
> but it should be solved BEFORE the app goes public,
> because such calibration tasks should calibrate the Validator in
> the first place.

Again, you are assuming that a single test, or even a test suite, will
catch all conditions. I have talked about several potential situations
where the validator will be happy with two agreeable and wrong
answers. It is entirely possible that I am wrong and there are no
problems out there ... I don't think so, because if there is one thing
I have been able to do, it is to monitor the boards, and what I see
does not give me confidence that we have stable systems ... but at the
moment we do no testing to validate what we "know" is true. And if you
are not measuring your error rate, you have no idea what it is ...

More interestingly, we do not know what effects other software
components may be having on BOINC science applications while they are
running ... I proved earlier this year that Trac issue #6 (I think it
is), which refers to the "Heartbeat" problems (and is rated
"Critical"), is still alive and well and can allow IBERCIVIS and some
other projects' tasks to cause tasks for other projects to crash ...
Bottom line: software interacts in strange ways that are not
predictable in the lab ...
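A toy sketch of the failure mode described above (entirely hypothetical
Python, not the real BOINC validator code): two hosts running the same
buggy binary return identical wrong answers, and a quorum-of-two
validator that only checks agreement happily marks both as valid. The
names `buggy_app`, `validate`, and the signal counts are all invented
for illustration.

```python
# Hypothetical sketch: quorum agreement does not imply correctness.

TRUE_SIGNAL_COUNT = 7  # ground truth for a known-signal calibration task

def buggy_app(workunit_id):
    # Both hosts run the same flawed binary, so both miss two signals
    # in exactly the same way.
    return TRUE_SIGNAL_COUNT - 2

def validate(result_a, result_b):
    # A minimal quorum-of-two check: it compares results to each other,
    # never to ground truth.
    return result_a == result_b

a = buggy_app("wu_0001")
b = buggy_app("wu_0001")

print(validate(a, b))          # True  -> the pair is declared "valid"
print(a == TRUE_SIGNAL_COUNT)  # False -> yet both answers are wrong
```

The point of the sketch: a correlated systematic error is invisible to
agreement-based validation, which is exactly why independently known
ground truth (calibration tasks) adds information the quorum cannot.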
> Then, being calibrated, the Validator can do the same calibration
> work on user-returned results, deciding which result is good and
> which is wrong.
> There is no need for enormous resource waste running such
> calibration tasks on each and every PC joined to the project.

Me personally, I would pay the price ... then again, I am an engineer
and I love knowing the accuracy of what I am doing. Were I running a
project, you would be running calibration tasks if you were attached,
even if you did not know it ... because if I were not measuring my
error rate, how could I present results with a straight face, saying
that I know what they mean?

Running the calibration tasks on each PC is not necessarily needed to
test the speed of the PCs, I agree, but let us not forget the other
part of the point of this: to increase our confidence in the machines
that are calculating our results. We can debate the number of machines
that need to be tested to obtain higher confidence, and with
experience that number most likely will drop over time ...

I love the terms applied to the proposal when we have not even agreed
upon the rate of tests, or even the extent of the allowable opt-out,
or any other details ... "ENORMOUS"? How do you know it is enormous?
If we allow opt-out it is zero ... and the system usually cited to
justify rejection runs one SaH task a week ... which means one CUDA
system can run off that system's contribution in an afternoon ...

But the real question is: how large is our hubris that we know the
quality of our answers, when we have bothered to measure so little in
our processes? To keep saying that the validators will proof the
answers, you have to prove that the validators have never made a
mistake ... and bug fixes to the validators prove that they are not
infallible ...

> Your approach in this part could be compared with such a situation:
> I refuse to use a ruler; I always want to compare all lengths I need
> to measure against the meter etalon.
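To make the "measure your error rate" point concrete, here is a
hypothetical sketch of what a calibration-task check could look like:
occasionally send a host tasks whose correct answers are known in
advance, then count disagreements. All names (`KNOWN_ANSWERS`,
`host_results`, the tolerance) are illustrative assumptions, not any
existing BOINC server interface.

```python
# Hypothetical sketch: estimating a host's error rate from
# calibration tasks with known ground-truth answers.

KNOWN_ANSWERS = {"cal_001": 42.0, "cal_002": 17.5, "cal_003": 99.1}

# Results this host actually returned for those calibration tasks
# (invented data; the third answer is wrong).
host_results = {"cal_001": 42.0, "cal_002": 17.5, "cal_003": 98.0}

TOLERANCE = 1e-6  # assumed acceptable numeric deviation

errors = sum(
    1
    for task, truth in KNOWN_ANSWERS.items()
    if abs(host_results[task] - truth) > TOLERANCE
)
error_rate = errors / len(KNOWN_ANSWERS)
print(f"observed error rate: {error_rate:.2f}")  # prints: observed error rate: 0.33
```

With even a small stream of such tasks per host, the project gets a
measured error rate instead of an assumed one; without it, the error
rate is simply unknown.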
> What would become of geometry if each and every measurement could be
> done only after direct comparison with the etalon of the meter? ...

Walk into my web, said the spider ... sorry, but this is covered in
the proposal and matches standard calibration techniques. The answer
is that not all the extant PCs would be calibrated against the
top-level standard ... but once a PC was calibrated against the
primary standard, it could be used to calibrate standards at level 2
(in that it would be a level-1 standard, calibrated against the
primary standard) ... level-2 standards would be able to calibrate
level 3, and there is where I would stop ... in effect you have a
tree with a broader spread at each level ...

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
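P.S. The three-level calibration tree sketched in my reply fans out
geometrically, which is why it avoids the "everything against the
etalon" problem. A back-of-envelope calculation, with a purely assumed
fan-out of 20 hosts per standard:

```python
# Hypothetical arithmetic for a three-level calibration tree:
# one primary standard calibrates FANOUT level-1 hosts, each of
# which calibrates FANOUT level-2 hosts, and so on down to level 3.

FANOUT = 20  # hosts each standard can calibrate (assumed for illustration)
LEVELS = 3   # stop at level 3, as proposed

total = sum(FANOUT ** level for level in range(1, LEVELS + 1))
print(total)  # 20 + 400 + 8000 = 8420 hosts covered
```

So even a modest fan-out reaches thousands of hosts while each
individual PC is only ever compared against its parent standard, never
directly against the primary one.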
