On Sep 29, 2009, at 5:34 AM, [email protected] wrote:

> I have mostly not been hearing that live work would be the reference.
> What I have mostly been hearing is that we should do reference tasks
> where the result is known, and the FLOP count can be known as well.
> Running these frequently is what I was objecting to as it wastes large
> amounts of otherwise useful processor time.

I know ... and of course I have also been told that I cannot point out that  
this is another clear indication that you have not been carefully  
reading what I have been writing.  If I explain all the ins and outs  
completely, the complaint is that it is too long; if I stick to the  
specific objections, the complaint is that I am not saying enough.  I  
don't mind either option, but pick one: do you want me to stick to the  
specific point you object to, or do you want me to be complete?

Were we to implement my proposal there would be two or more classes of  
work.  All would be "real" work in that the test tasks would be just  
like the real tasks: they would take just as long to process because  
there would be no difference in the construction of the task.  To put  
it another way, in the context of SaH, a test task would use the same  
type of input file; the only difference would be that the data within  
would be artificially generated.  In other words, a known signal.
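To make the idea concrete, here is a minimal sketch of what "artificially generated data with a known signal" could look like. This is not SaH's actual file format or detection code; the function names, the sine-plus-noise model, and all parameter values are my own illustrative assumptions.

```python
import math
import random

def make_test_signal(n=1024, freq=0.05, amplitude=2.0, noise_sigma=1.0, seed=42):
    """Generate synthetic 'telescope' data: Gaussian noise with a known
    sinusoidal signal injected at a chosen frequency and amplitude.
    Because the injection parameters are known, the expected result of
    processing this task is known in advance."""
    rng = random.Random(seed)
    return [amplitude * math.sin(2 * math.pi * freq * i) + rng.gauss(0.0, noise_sigma)
            for i in range(n)]

def detect_power(data, freq):
    """Single-frequency power estimate (a DFT-style correlation).  A test
    validator could check that the power at the injected frequency
    dominates the power at any other frequency."""
    n = len(data)
    re = sum(x * math.cos(2 * math.pi * freq * i) for i, x in enumerate(data))
    im = sum(x * math.sin(2 * math.pi * freq * i) for i, x in enumerate(data))
    return (re * re + im * im) / n
```

A reference machine (or any host under test) processes the file exactly as it would a live task; the project then checks the reported detection against the parameters it injected.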

Back to the waste.  I will reiterate: this would be a viable argument  
if you were not so sanguine about rampant waste in other aspects of  
the BOINC system.  Sorry, but to me this rings false and trivial, even  
more so because I think it is a straw man argument with little basis  
in actual systems extant.  Worse, your hypothetical one-task-per-week  
machine is such a pitiful resource that its loss would barely impact  
the projects: one GPU will do the work of that machine's lifetime in  
an afternoon ...

Lastly, like redundant computing and the validation of duplicate or  
triplicate results, this is a cost we should be willing to pay for the  
increased quality of known results.
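For comparison, the cost of redundant computing mentioned above works the same way: each workunit is sent to several hosts and a result is accepted only when a quorum agrees. The sketch below is a simplified, hypothetical stand-in for that idea (BOINC's real validators are project-specific daemons, and result comparison is usually fuzzy rather than exact equality).

```python
from collections import Counter

def validate_quorum(results, min_quorum=2):
    """Minimal sketch of quorum-based validation: accept a canonical
    result only when at least `min_quorum` independent hosts agree.

    `results` maps a host identifier to that host's reported result
    (hashable here for simplicity; real results need tolerant comparison).
    Returns the agreed value, or None if no quorum has been reached."""
    counts = Counter(results.values())
    value, votes = counts.most_common(1)[0]
    if votes >= min_quorum:
        return value
    return None  # no agreement yet; issue more replicas
```

The duplicate or triplicate processing is pure overhead in FLOP terms, yet projects accept it because it catches faulty hosts, exactly the trade-off argued for test tasks.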

As to waste, what could be more wasteful than finding out that large  
batches of results are potentially unusable because a series of flaws  
allowed bad data to pass validation?

How big is the problem? You may be right and there is no problem.  But  
just as security by obscurity is not a safe answer, neither is  
pretending that these potential problems don't exist.

> I agree that having a reference machine doing work would eliminate the
> need for gold plated benchmarks.  However, it does not entirely
> eliminate the need for basic benchmarks as there is a very wide range
> in computation speeds for the computers that are in use on BOINC
> projects.  We still need the basic benchmarks (the 5 minute variety
> that we have now) to give us a starting point for the CPU scheduler
> and work fetch algorithms.

Again, I don't agree.  There are many ways to make the estimates,  
including using the processor identification information to find  
matches in the database and estimating capabilities from that data.   
As for new workloads, aside from Alpha projects this is another  
canard: from the alpha and beta testing we know how long tasks  
take ... unless the tasks are inherently non-deterministic, in which  
case once again the point is that we cannot estimate run time anyway ...
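The database-match idea above can be sketched in a few lines. Everything here is hypothetical: the table, the model strings, and the throughput figures are illustrative placeholders, not measurements, and this is not BOINC's actual scheduler code.

```python
# Hypothetical table mapping CPU model strings (as reported by the
# client) to estimated per-core throughput in GFLOPS, built from data
# already collected from existing hosts.  Values are illustrative only.
KNOWN_CPUS = {
    "Intel(R) Core(TM)2 Duo E8400": 3.0,
    "AMD Phenom(tm) 9950": 2.3,
}

def estimate_gflops(cpu_model, default=1.0):
    """Estimate a new host's speed from database matches on its reported
    CPU model instead of running a synthetic benchmark; fall back to a
    conservative default when the model has not been seen before."""
    return KNOWN_CPUS.get(cpu_model, default)

def estimate_runtime_seconds(task_gflop_count, cpu_model):
    """Initial runtime estimate for the CPU scheduler and work fetch:
    task size divided by the estimated host speed."""
    return task_gflop_count / estimate_gflops(cpu_model)
```

The estimate only needs to be a starting point; once the host returns its first few results, actual runtimes can replace the table lookup.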
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev