I know you have some slower systems that have broad attachment and  
this is a panic issue for you ... but ...

When they repaired the Hubble a few months ago, they went into a
mode where little science is being done because they are testing and
calibrating, so that when they do go back to doing science it will be
accurate.  For a collection of scientists and engineers, we as a group
are very reluctant to actually subject our systems to any sort of
tests.  And since there are so many who seem to delight in running
their systems at the upper edge of stability, I find this even more
curious ...

Just as important: I just lost one of my GPUs to a slight case of
death.  But had it been doing bad calculations before it failed badly
enough to drive pink screens?  I don't know ... and if it had, for how
long?

When I suggested testing regimes in the past I did say, and still do,  
that we should be using full length tasks for testing.  Yes it is a  
cost of doing business and yes it will reduce the amount of results  
returned.  But I thought the point was to get known good results ...  
not just results of doubtful quality.

However, suppose we did implement a scheme such as I suggested in
"Calibration" or as Dr. Anderson suggested in the new paper.  If we are
coordinating across projects, there is no more reason for more than one
project to issue the "calibration" task per week than there is a need
to run the benchmark every hour.
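
A rough sketch of the arithmetic, in case it helps.  The 65 projects
and the 5 day interval are the figures from the quoted message below;
the one hour length for a coordinated weekly calibration task is purely
my assumption for illustration, and the function name is made up.

```python
def overhead_fraction(tasks_per_period, task_hours, period_hours):
    """Fraction of available CPU time spent on calibration in one period."""
    return tasks_per_period * task_hours / period_hours


FIVE_DAYS = 5 * 24  # hours
ONE_WEEK = 7 * 24   # hours

# Every project issues its own reference job every 5 days (quoted scenario).
per_project_short = overhead_fraction(65, 5 / 60, FIVE_DAYS)  # 5 minute jobs
per_project_long = overhead_fraction(65, 1.0, FIVE_DAYS)      # 1 hour jobs

# One coordinated calibration task per week (assumed to take 1 hour).
coordinated = overhead_fraction(1, 1.0, ONE_WEEK)

for label, frac in [
    ("per project, 5 min each", per_project_short),
    ("per project, 1 hour each", per_project_long),
    ("coordinated, 1 hour per week", coordinated),
]:
    print(f"{label}: {frac:.1%}")
```

The point is only that the cost scales with the number of tasks issued
per period, not with the number of projects, once the projects
coordinate.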

Assume we have 50 total projects, and from those we cull to the ones
that have relatively short task run times (eliminating Orbit and CPDN,
for example).  Your client would then ask, on schedule, for a
calibration task from one of those projects that supplies calibration
tasks and to which you are attached.  One could argue that other
projects should be allowed to issue those calibration tasks should they
choose to do so, but they would essentially be issuing, for example,
SaH test tasks from the CPDN site ... which may or may not be
desirable ...
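
To make the selection step concrete, here is a minimal sketch.  None of
this is existing BOINC client code: the Project record, the 4 hour
cutoff, the run times, and the rotate-by-week rule are all assumptions
made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    attached: bool              # is this host attached to the project?
    supplies_calibration: bool  # does the project issue calibration tasks?
    typical_task_hours: float   # made-up figure, for the cull only


def calibration_sources(projects, max_task_hours=4.0):
    """Attached projects that supply calibration tasks and run short tasks."""
    return [
        p for p in projects
        if p.attached
        and p.supplies_calibration
        and p.typical_task_hours <= max_task_hours
    ]


def pick_source(projects, week_number, max_task_hours=4.0):
    """Rotate through the eligible projects, one calibration task per week."""
    sources = sorted(calibration_sources(projects, max_task_hours),
                     key=lambda p: p.name)
    if not sources:
        return None
    return sources[week_number % len(sources)]


# Hypothetical host; the run times are invented, not measured.
projects = [
    Project("SaH", True, True, 2.0),
    Project("MilkyWay", True, True, 0.5),
    Project("CPDN", True, True, 2000.0),  # culled: tasks far too long
]
```

With this list, pick_source(projects, 0) returns MilkyWay and
pick_source(projects, 1) returns SaH; CPDN is never asked.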

On completion, just as we store the information about the running of
the benchmark, we would store the information about the running of the
calibration task.

Thus we validate the hardware with known tasks with known answers and  
if the suite is large enough it will make it challenging to come up  
with a cheat list ... just like some compilers had benchmark detection  
code ... we make test task detection harder ... and we do not increase  
the overhead significantly.
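
A minimal sketch of what I mean by known tasks with known answers.  The
task ids, the reference values, the tolerance, and the shape of the
stored record are all hypothetical; a real suite would be much larger
and the comparison would be whatever the project's validator uses.

```python
import math
import random

# Hypothetical suite: task id -> known-good answer.
SUITE = {
    "cal-001": 1.2345,
    "cal-002": 98.7654,
    "cal-003": 0.000321,
}


def choose_task(suite, rng):
    """Pick a task at random so the host cannot predict which one it gets."""
    return rng.choice(sorted(suite))


def validate(task_id, reported, suite, rel_tol=1e-6):
    """True if the reported answer matches the known answer within tolerance."""
    return math.isclose(reported, suite[task_id], rel_tol=rel_tol)


def record_run(history, task_id, reported, elapsed_seconds, suite):
    """Store the outcome the way benchmark results are stored today."""
    entry = {
        "task": task_id,
        "elapsed_seconds": elapsed_seconds,
        "passed": validate(task_id, reported, suite),
    }
    history.append(entry)
    return entry


history = []
task = choose_task(SUITE, random.Random())
good = record_run(history, task, SUITE[task], 1800.0, SUITE)
bad = record_run(history, "cal-001", 1.3, 1750.0, SUITE)  # a failing GPU
```

The larger the suite, the less a lookup table of answers helps a
cheater, and the elapsed time of a passing run is the calibration
figure we keep.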

Oh, and eventually we get rid of the benchmark ...

On Sep 21, 2009, at 6:16 AM, [email protected] wrote:

> I missed the detail about "reference jobs".  If these are per project,
> they have to be extremely short and run very infrequently.  With 65 or
> so projects, if every one has a full length job that runs every 5 days,
> NO progress on actual science will ever occur on machines that are
> attached to many projects.  Even if they last 5 minutes, that is still
> over 5 hours per 5 days, or about 4% of the available CPU time.  If they
> take an hour each, that is 65 hours / 5 days, or just under half of the
> available CPU time.
>
> I see two options for running reference tasks:
>
> 1)  Have a standard reference task that BOINC runs once, not once per
> project (but this works out the same as running benchmarks).
> 2)  Limit them to 5 minutes each, and run them once a month.
>
> jm7
>
>
>
> From:    "Paul D. Buck" <p.d.b...@comcast.net>
> Sent by: boinc_dev-bounces@ssl.berkeley.edu
> Date:    09/21/2009 06:54 AM
> To:      David Anderson <[email protected]>
> cc:      BOINC Developers Mailing List <[email protected]>
> Subject: Re: [boinc_dev] [boinc_alpha] Card Gflops in BOINC 6.10
>
> Though it looks like the conversation died down again ... I think
> there are a couple points yet to be made.
>
> If I had one and only one objection to make, it is that this system
> seems to be based upon the benchmarking system without any attempt
> being made to correct for its deficiencies (as best I can tell). To
> my mind the worst feature of the benchmarks was not that they were
> inaccurate, but that they cannot be replicated.  Repeated runs, even
> on systems that are quiescent, can report results that cover a spread
> of as much as 20%.
>
> The concept of a "reference job" I am happy to see as that was the
> cornerstone of the proposal I made for use of calibration to quantify
> and test our systems in the BOINC universe. See:
> http://www.boinc-wiki.info/Improved_Benchmarking_System_Using_Calibration_Concepts
>
>
> I still see SaH as one of the "best" sources in that the source is
> public and probably the best understood.  Most importantly it should
> be relatively easy to make known test tasks by hand that have known
> characteristics that can be tested and to a great extent perhaps even
> tested with instrumented code so that precise counts of FLOPS could be
> made.
>
> An assumption is made that the GPU versions will be more efficient.  I
> think Aqua found that the converse is true (I do not know this for
> sure, it was in a post I read the other day in discussing projects
> with GPU applications that they dropped the GPU version because it was
> worse than the CPU version - multi-threaded).
>
> It may be that I am too dense to get it, but I also do not see how
> this proposal would adequately address the quality metrics we might
> extract from those projects where there are applications that span the
> types and classes of computing resources.  For example, the two "best"
> projects at this time are MilkyWay and Collatz in that they have
> applications that span all three of the currently available computing
> resources: CPU, Nvidia CUDA, and ATI Stream.
>
> And finally, the issue of optimized applications vs. "stock"
> application ... the hardware will report the same FLOPS but it seems
> to me the faster execution time of the optimized application will
> cause problems.
>
> Oops, two more finallies: you would require a change to all science
> applications to make this effective, and you still require the projects
> to make an initial estimate regardless of its accuracy (predicted
> number of app units).
>
> On Aug 28, 2009, at 12:45 PM, David Anderson wrote:
>
>> I'm coming around to the viewpoint that projects shouldn't be
>> expected to supply estimates of job duration or application
>> performance.
>> I think it's feasible to maintain these estimates dynamically,
>> based on actual job runtimes.
>> I've sketched a set of changes that would accomplish this:
>> http://boinc.berkeley.edu/trac/wiki/AutoFlops
>> Comments welcome.
>>
>> BTW, a bonus of the proposed design is that it provides
>> a project-independent credit-granting policy.
>>
>> -- David
>>
>> Richard Haselgrove wrote:
>>> ...  if projects are expected to fine-tune performance metrics down
>>> to the individual plan_class level, then I'm sorry, but they just
>>> won't. I've had to shout (loudly and repeatedly) at both AQUA and
>>> GPUGrid to get them to adjust rsc_fpops_est to within an order of
>>> magnitude of reality (in AQUA's case, two orders of magnitude).
>> _______________________________________________
>> boinc_dev mailing list
>> [email protected]
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
>
