> I inject it into one node in our network I will process it and get back
> and I can see if the system did detect the one pulse as expected. If so,
> as Martin correctly stated we have one end to end test ... now, do that
> over a bunch of nodes, at this point we can start to answer several
> questions ... are they all detecting the one pulse and if so what are the
> limits of error on that return. There is no noise so the answers should
> be identical... but they won't be ... different OS, different processors,
> different compilers, on and on ... but now we can start to determine
> where are the sources of these errors ... and oh by the way isolate the
> machines that are returning just bad answers ... the side effect is that
> we have also done the benchmark.

Again, you describe the correct procedure, just applied at the wrong level. There are simply TOO MANY reasons (and you gave good examples of some of them) for a host to return an invalid result. Investigating which types of error are dominant, and which OSes or hardware are more faulty than others, could well be the aim of a dedicated BOINC-based project - it would be a very interesting investigation, no objections. But all that information is not directly relevant to a particular project such as SETI.

If the peaks in two returned results differ in number, or their powers differ by more than some predefined error, the results are considered different. It is out of the project's scope to judge what led to the error, or which result is right and which is wrong - that is unknown! The validator simply states that the results disagree and asks another host (see the sketch below).

I'll give an example from real life, from our work on the optimized AstroPulse (AP) app. When the Linux build of the optimized app was done, there were many cases of result inconsistency between Linux and Windows hosts. Linux vs. Linux validated OK, and Windows vs. Windows did too. The conclusion? The Windows and Linux versions processed the same data too differently to validate against each other within the given error range. Changes were made to eliminate that variation. This example shows that the current approach is enough to catch differences between OSes, compilers and precision standards.

BTW, we (at Lunatics) still don't know which results were "more correct" from a scientific point of view. That is just another level. If we did go to that level, then yes, indeed, we would construct artificial tasks with an a priori known result and feed them to our app and to the current version of the validator. Maybe that is even worth doing (to be personally sure all is fine), but there is no need for every participant to do the same. The validator code is the same for everyone; anyone can look for errors in it personally, there is no need to do that via distributed computing.
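To make that comparison concrete, here is a minimal C++ sketch of the kind of check described above. The type and names (Peak, results_agree, rel_tolerance) are mine, not the real SETI/AP validator code, which is more involved (it also has to pair peaks up by position, not just by index):

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Hypothetical representation of one detected peak in a returned result.
    struct Peak {
        double power;  // reported peak power
    };

    // Compare two returned results as described above: same number of peaks,
    // and each peak's power within a predefined relative error. Pairing peaks
    // by index is a simplification. The check is symmetric: it reports
    // agreement or disagreement, never which side is "right".
    bool results_agree(const std::vector<Peak>& a,
                       const std::vector<Peak>& b,
                       double rel_tolerance)  // e.g. 0.01 for 1%
    {
        if (a.size() != b.size()) return false;  // different number of peaks
        for (std::size_t i = 0; i < a.size(); ++i) {
            double diff  = std::fabs(a[i].power - b[i].power);
            double scale = std::fmax(std::fabs(a[i].power),
                                     std::fabs(b[i].power));
            if (diff > rel_tolerance * scale) return false;  // power differs too much
        }
        return true;  // results agree within the predefined error
    }

If results_agree() returns false, the server does not know which host erred; it only issues the workunit to another host and waits for a quorum.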
> Which is why I suggest a suite.

Yes, I got that. I am just trying to point out that the primary goal, to improve the science, is unreachable by this method IMO, and that the secondary goals carry too much overhead to be worth implementing once the primary goal is gone.

> Which is because there has been so much negativity about one aspect or
> another without considering the whole.

The whole is constructed from parts, some more important, some less. If the important parts are questionable, we could even drop the less important ones from consideration. But in fact the less important ones were considered too; nothing was missed IMO.

> If you want, and some have, you can harp on the fact that the "benchmark"
> task will take as long as a real task and that this is "waste" if you
> insist on ignoring the other purposes served.

No, no, no - the other purposes were analyzed one by one too.

> I have it ... and if all participants were trustworthy then there is no
> need to check up on them ... though even Reagan said "Trust, but verify"

LoL, Lenin said it first :)

But seriously: anyone who is unsure can do their own small investigation, provided the project is open source. For a closed-source project... yes, all we can do is trust the project's founders. Calibration tasks would come from the same project, the validator comes from the same project, and BOINC itself can say nothing about a specific project's results... We have no choice but to trust them, the projects' founders.

Actually, even here one could run some experiments. For example, I can overclock my GPU/CPU to a degree where a correct result becomes highly improbable (or just intentionally damage the finished result before returning it to the project), then run a few tasks from the project in question. If I see it accepting such obviously broken results and saying "all fine, give us more"... well, I would have big doubts about what "science" this project does, and could either abandon it or even post the results of such an experiment to the forums to make them public. Not bad testing, BTW ;) (a sketch of the damage variant is below).

> Besides, a million people calling Eric?

Forums -> bring attention to the problem -> the problem gets reproduced elsewhere by other people... sooner or later the project's devs will know.
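For the "intentionally damage the result" experiment, a rough sketch of the idea. Purely for illustration it assumes the result file is raw binary doubles; real projects use their own result formats, so this is a template of the approach, not a ready tool:

    #include <cstdio>

    // Corrupt one value in a (hypothetically raw-double) result file before
    // upload. If the project's validator then reports the task as valid,
    // something is wrong with its checking.
    int main(int argc, char** argv) {
        if (argc != 2) {
            std::fprintf(stderr, "usage: %s <result_file>\n", argv[0]);
            return 1;
        }
        std::FILE* f = std::fopen(argv[1], "rb+");
        if (!f) {
            std::perror("fopen");
            return 1;
        }
        double v;
        if (std::fread(&v, sizeof v, 1, f) == 1) {
            v *= 10.0;                        // damage far beyond any sane tolerance
            std::fseek(f, 0, SEEK_SET);       // required between read and write
            std::fwrite(&v, sizeof v, 1, f);  // write the broken value back
        }
        std::fclose(f);
        return 0;
    }

A healthy project should mark such a result invalid once its quorum partner disagrees; a project that validates it anyway deserves the public forum post.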
