On Sep 28, 2009, at 12:01 PM, [email protected] wrote:

> Most of the computation errors that computers make are because they are
> overheating or grossly overclocked.  These tend to produce random errors,
> and it is these that the redundancy will weed out, as the odds of two random
> answers of 128 bits being the same is a very good approximation of 0.  One
> in 10^36, about.  Redundancy also weeds out cheats.  Because of this, all
> projects that do not have a quick calculation from the answer back to the
> original question should implement redundancy.
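As a back-of-the-envelope check on the odds quoted above (a hypothetical sketch, not anything from BOINC itself): the chance that two independent, uniformly random 128-bit answers coincide is 2^-128, which works out to roughly one in 3.4 x 10^38, even smaller than the one-in-10^36 figure.

```python
# Probability that two independent, uniformly random 128-bit
# answers happen to be bit-for-bit identical: 1 / 2**128.
from fractions import Fraction

space = 2**128                      # number of possible 128-bit answers
p_collision = Fraction(1, space)    # exact collision probability

# 2**128 = 340282366920938463463374607431768211456, about 3.4e38,
# so p_collision is about 2.9e-39 -- a very good approximation of 0.
print(space)
print(float(p_collision))
```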

And these are in part the systems that we should be trying to find.  I
don't know how many times I have had a person tell me that overclocking
cannot be at fault because their computer works fine on project X but
not on project Y ... so there must be a problem with project Y ...

And the logical flaw here is that you assume two things: first, that
there is redundancy, and second, that validation will always catch
errors.  Neither is necessarily true.  Because of pressure from people
like you, projects are moving more and more to single redundancy and
essentially hoping that they catch the errors with validation.
Watching the boards, even on the limited projects I am looking at these
days, shows that this is not necessarily true.  Were it true, there
would not be so much work on the validators by the projects ... to put
it another way, validators are just as buggy as software in general ...

The truth is we do not have any metrics on the quality of the computing
systems in use, because we don't look, and you keep squinting as hard
as possible to avoid seeing the potential for error.

As to cheats, most of our efforts to catch cheats seem to come down
harder on innocents than on detecting real cheating.  Sadly, this is
another place where we measure nothing, so we have no way of knowing
whether we are catching cheats or not.  Anecdotal evidence says that
the cheats are alive and well, and it is just the poor guy who spent a
long time calculating a result that gets hammered with a "no
validation" on his or her work through no fault of their own.  For
those who bother to think back, this is another place where I have
argued that the draconian policy of disallowing credit is a
disincentive to the average guy and not to the cheats, because the
cheats have other avenues to game the system ... my suggestion was to
award a nominal fraction of the claim for the try.

I would point out that if we were characterizing systems better, it
would be possible to see whether systems are returning results in
numbers consistent with the capabilities of the resources within those
systems.

> Catching something like the Intel bug of a few years ago would require
> running a short task exactly once in the lifetime of the installation of
> BOINC on a particular hardware platform.  It does NOT require wasting a
> collective million hours of CPU time every week doing gold-plated
> benchmarks.

I am using that as an example of how you are dancing about to avoid the
possibility that two computers can return the same answer, which means
that they will pass validation even though the answer is still wrong.
Twice.  I am not saying that this guarantees we will catch those
errors, but at least it is better than not looking at all and
*ASSUMING* that redundant results will catch the error ... which they
won't if there is no redundancy.
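A toy sketch of the point above (hypothetical code, not BOINC's actual validator): when validation means "two replicas agree bit-for-bit", a deterministic bug that makes two hosts compute the same wrong value passes just as cleanly as two correct results.  All values below are made up for illustration.

```python
# Toy quorum validator: accept a result when any two replicas
# agree exactly.  A sketch only, not BOINC's real validator.
def validate(results):
    """Return the consensus value if any two results match, else None."""
    for i, a in enumerate(results):
        for b in results[i + 1:]:
            if a == b:
                return a
    return None

correct = 0.333605     # what the task should produce (made-up number)
buggy = 0.333739       # same wrong value from two hosts sharing a flaw

# Two healthy hosts: validation passes, and the answer is right.
assert validate([correct, correct]) == correct
# Two hosts with the same deterministic bug: validation ALSO passes,
# but the "validated" answer is wrong.  Agreement is not correctness.
assert validate([buggy, buggy]) == buggy
# Genuinely random errors, by contrast, disagree and are caught.
assert validate([correct, buggy]) is None
```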

> One suggestion is to use a reputation based method.  If validation fails
> enough times on your machine, your reputation degrades, and you get to do a
> reference task, and if you fail that, you may have your quota instantly
> dropped to 1.  If your quota is 1 and you are doing a reference task, all
> you get is reference tasks until your computer starts returning valid
> results.  If a machine always generates valid results, then you might reduce
> the redundancy for that machine to something like 1 in 5, so that it would
> be allowed to solo 4 out of 5 tasks.

The reputation-based method is already part of the process, but the
reputation on project X is not shared with project Y, and we are also
not looking to see whether there is an explanation for why the failures
occur, because once again we are not capturing the metrics.  Is it
systemic, or specific to a single system, or what?  Adaptive
Validation, I think, is the term used ...
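The scheme quoted above could be sketched roughly as follows.  Every name and threshold here is invented for illustration; this is not BOINC's actual adaptive-replication code.

```python
# Hypothetical sketch of the quoted reputation scheme.
# Class name, limits, and quota values are all made up.
class HostReputation:
    FAIL_LIMIT = 3     # validation failures before reference-task mode
    TRUST_LIMIT = 20   # consecutive valid results before spot-checking

    def __init__(self):
        self.failures = 0
        self.valid_streak = 0
        self.quota = 100           # tasks per day (illustrative default)
        self.reference_only = False

    def record_result(self, valid):
        if valid:
            self.valid_streak += 1
            self.failures = 0
            self.reference_only = False
            self.quota = 100       # restore quota once results validate
        else:
            self.failures += 1
            self.valid_streak = 0
            if self.failures >= self.FAIL_LIMIT:
                # Failed too often: quota drops to 1 and the host
                # gets only reference tasks until it validates again.
                self.quota = 1
                self.reference_only = True

    def replication_needed(self):
        """Trusted hosts solo ~4 of 5 tasks; everyone else gets 2 replicas."""
        if self.valid_streak >= self.TRUST_LIMIT:
            return 1 if self.valid_streak % 5 else 2
        return 2
```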

> Most machines work almost all of the time.  There are a few machines that
> fail always.

And the ones that fail all the time are not the issue ... most of the
time the smoke gives them away ... but you left out the class I have
been talking about: the one that does not fail all the time, but also
does not work all of the time.  My systems are usually rock solid on
just about every project extant.  Yet I have had consistent problems
with long-running models on PG ... Why?  I don't know.  Is it my
machines?  Their code?  Phases of the moon?  We don't know, because we
don't measure.  If I run PG I select sub-projects that don't take over
30 minutes ...

I am just saying that we can kill several birds with one stone.  First,
we get an improved and reliable credit system.  Second, we can start to
improve the confidence we have in the systems that are producing our
results.  Third, we can start to prove out some of the software we are
using (this may or may not be possible for all projects, but with one
or two examples maybe we can start to figure out the rest).  And
lastly, we can start to learn about those systems, how they operate,
and how their operation impacts the projects.

> jm7
>
> "Paul D. Buck" <p.d.b...@comcast.net>
>
> To: "Lynn W. Taylor" <[email protected]>
> cc: [email protected], BOINC Developers Mailing List <[email protected]>
> Date: 09/28/2009 02:42 PM
> Subject: Re: [boinc_dev] [boinc_alpha] Card Gflops in BOINC 6.10
>
> On Sep 28, 2009, at 11:13 AM, Lynn W. Taylor wrote:
>
>> The benchmark affects the estimated run time, and the amount of work
>> downloaded.  It affects credit, and credit is "fun" but it's not
>> science.
>
> Then you are also guilty of not reading the proposal.  I have always
> said that while running calibration tasks the same compensation would
> be paid for a calibration task as for any other task.  In fact, I said
> that it could qualify for a bonus to encourage participation in the
> system.  Yet we meet resistance, as you and John express it, because
> you don't seem interested in any attempt to improve the operation of
> the system as a whole.
>
>> As you correctly said, it will measure something about the quality
>> of the network.
>>
>> What John said (and you apparently didn't read) is that the science
>> application produces a result, and that result is either valid (the
>> calculations were performed correctly) or wrong (the result of the
>> calculations is incorrect and useless).
>
> Validation is not a magic wand; it usually means only that two (or
> more) computers returned the same answer, not that the answer is in
> fact correct.  With more and more projects moving to single-system
> answers, this leg of redundancy is disappearing.  I grant that there
> are some cases where there can be an absolute validation, but I am
> suspicious of some of the claims of infallibility, because history
> warns against assuming that computer-generated results are correct
> results.  Another case in point is the FDIV bug, where two Intel
> processors would return identical wrong results ... how many other
> bugs like that exist?  We don't know ... but we assume that they
> don't exist because we have not seen them ... of course we are also
> not looking ... and working very hard to never look ...
>
>> It's a two way street, Paul.
>
> Yes, it should be.  Sadly, what usually happens is that people don't
> read what I write but skim it, and then reject it in favor of doing
> nothing, or on grounds that are spurious.
>
>> If you have a technical argument, present it (and only the technical
>> argument).  When you accuse people of not reading, or not adopting
>> your ideas because they're your ideas, that becomes a self-
>> fulfilling prophesy.
>
> If you have suggestions to improve one of my ideas you will find that
> I am always happy to include them in the proposal.  And I only accuse
> people of not reading when it is obvious that they are not reading:
> when they object on the grounds that I do x when I have clearly stated
> that we will not do x ... or that I don't have feature y when I have
> clearly defined that I have feature y ...
>
> So, if I am restricted to presenting only technical ideas and
> technical arguments against those ideas, why are you free to do
> otherwise?
>
> As to the other point, well, it is not that they are just my ideas ...
> UCB pretty much rejects all ideas that do not originate at UCB ... or,
> it is proposed as a UCB idea a year or two after it was proposed ...
>
>
>

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
