Fixing the estimates is hard. Worth improving, but not a reliable fix strategy by itself.
Improving the percent complete and estimated remaining run time calculation is a lot easier - but the proposal is that this be a non-default fix, which makes it also unreliable, because projects cannot be relied upon to opt into the fix.

Duration Correction Factor - either this is a form of improved calculation, or else it relies on opt-in from the projects or opt-in from the user; the latter would be disastrous, and both are unreliable.

Reliable fix strategy:
1) Improve the default percent complete and estimated remaining run time calculations - so that the default becomes linear.
2) Provide a dynamic-calculations opt-in flag for those projects wishing to stay with the original runtime estimates.

Gross errors (including failure to opt in) then become the fault of the project, not the BOINC client, and especially not the user. Also try to improve the dynamic calculations (weighted less heavily against the linear result).

~~~~~
"Rightful liberty is unobstructed action according to our will within limits drawn around us by the equal rights of others. I do not add 'within the limits of the law' because law is often but the tyrant's will, and always so when it violates the rights of the individual." - Thomas Jefferson


On Thursday, February 13, 2014 4:10 PM, "McLeod, John" <john.mcl...@sap.com> wrote:

Another approach is to have the client reinstate some form of Duration Correction Factor - so that the client does not have to wait for the server to update the estimates. It might make sense to make the DCF for projects that use the server-side calculation react faster than the DCF for projects that do not. This would also have to be done with the understanding that the server-side correction is in effect, so new downloads would start with no DCF applied at first.

>
>From: Jon Sonntag [mailto:j...@thesonntags.com]
>Sent: Thursday, February 13, 2014 2:45 PM
>To: McLeod, John
>Cc: elliott...@verizon.net; BOINC Developers Mailing List @berkeley.edu
>Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
>
>When the original runtime estimate sent with a WU is close, the current
>dynamic algorithm works very well. I think we all agree on that. I think we
>also agree that the way to give volunteers the best initial experience on a
>project is to have the first several workunits actually complete somewhere
>close to the original estimate. That, or have the estimate adjust as quickly
>as possible to correct the time remaining.
>
>There are at least two approaches to fixing the duration and progress done.
>One way is to fix the estimates. The host_app_version table helps, but only
>if the project uses flops and spends the majority of its time doing flop
>calculations. The old credit system allowed for both flops and iops and/or a
>flops-to-integer-math ratio. Having something like that for estimates would
>help a lot.
>
>Another way is to allow the estimate to continue to be off, but attempt to
>hide that fact by improving the percent complete and estimated remaining run
>time calculations. Not all projects would be able to take advantage of this;
>the rest would continue using the current logic. Those projects that can
>report progress in a linear fashion would set a non-default server config
>option, passed down to the client, that allows the client to do remaining
>runtime estimates in a linear way.
>
>Why can't we just put in better estimates in the first place? Unfortunately,
>the flops-to-iops ratios are not the same from one processor to another,
>especially with GPUs. For example, how do you find a good original estimate
>with such a large standard deviation in the Collatz AMD OpenCL app?
>
>mean:    9.138804e+15
>stdev:   7.139728e+15
>samples: 5000
>
>The values range from 1.900000e+13 to 1.197880e+17. There are orders of
>magnitude of difference in how some GPUs handle integer math, relative to
>their projected_flops, versus others. Unfortunately, the host_app_version
>table only has flops and not iops, which would, I suspect, have a much
>smaller standard deviation.
>
>Jon Sonntag
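
For concreteness, the linear calculation being proposed here is just an extrapolation from the reported fraction done and the elapsed time. A minimal sketch (illustrative only; the function and parameter names are placeholders, not the client's actual fields):

#include <algorithm>

// Illustrative sketch only, not the BOINC client's actual code.
// Purely linear model: project total run time from the fraction done
// reported by the app and the run time elapsed so far.
double linear_remaining_time(double fraction_done, double elapsed) {
    if (fraction_done <= 0) return -1;                // no progress reported yet
    fraction_done = std::min(fraction_done, 1.0);
    double projected_total = elapsed / fraction_done; // e.g. 10% at 5 min -> 50 min total
    return projected_total - elapsed;                 // e.g. 40 min remaining
}

This is the same arithmetic as the 10%-at-5-minutes example further down the thread; the open questions are what to show before any progress has been reported, and what to do for apps whose progress is not actually linear.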
>On Thu, Feb 13, 2014 at 8:33 AM, McLeod, John <john.mcl...@sap.com> wrote:
>
>If and only if the progress is actually linear. There are projects where the
>first 10% of the time runs the progress bar to 90%, mostly because there are
>non-determinable portions of the run time.
>
>So:
>10% at 5 minutes
>20% at 10 minutes
>30% at 15 minutes
>40% at 20 minutes
>50% at 25 minutes
>60% at 30 minutes
>70% at 35 minutes
>80% at 40 minutes
>90% at 45 minutes
>Done at 2 hours...
>
>It appears to be linear until it isn't. If everything were as nice as you say
>for all of the projects, then, yes, we could move to a strictly linear model.
>The point is that it isn't.
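
One way to hedge against exactly this failure mode is to not trust the linear extrapolation until a task is well along, and to lean on the (DCF-corrected) original estimate early on. A rough sketch of such a blend (illustrative only; the weighting scheme here is made up and this is not the BOINC client's actual algorithm):

// Illustrative only: blend the project-supplied estimate with linear
// extrapolation from reported progress, trusting the progress more as
// the task nears completion.
double blended_remaining_time(
    double fraction_done,      // progress reported by the app, 0..1
    double elapsed,            // run time so far (seconds)
    double original_estimate,  // server-supplied duration estimate (seconds)
    double dcf                 // duration correction factor, 1.0 if unused
) {
    double corrected = original_estimate * dcf;
    double est_remaining = corrected > elapsed ? corrected - elapsed : 0;
    if (fraction_done <= 0) return est_remaining;
    double linear_remaining = elapsed * (1 - fraction_done) / fraction_done;
    double w = fraction_done;  // trust progress more the further along we are
    return w * linear_remaining + (1 - w) * est_remaining;
}

With the numbers above (90% done at 45 minutes against a corrected 2-hour estimate), this reports roughly 12 minutes remaining instead of 5 - still wrong, but less badly wrong, which is the kind of combination of initial estimate, DCF, % complete, and time spent that John describes further down the thread.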
>-----Original Message-----
>From: boinc_dev [mailto:boinc_dev-boun...@ssl.berkeley.edu] On Behalf Of Jon Sonntag
>Sent: Wednesday, February 12, 2014 5:28 PM
>To: elliott...@verizon.net
>Cc: BOINC Developers Mailing List @berkeley.edu
>Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
>
>If after 5 minutes a workunit is 10% done, and after 10 minutes it is 20%
>done, I don't need a domain expert. A 4th grade student should be able to
>calculate that it will take a total of 50 minutes to complete and that 40
>minutes remain.
>
>Jon Sonntag
>
>P.S. I went to a tax professional once. They charged a lot and they got it
>wrong. The IRS corrected it and sent me a refund.
>
>On Tue, Feb 11, 2014 at 6:18 AM, Charles Elliott <elliott...@verizon.net> wrote:
>
>> Although I am a CS grad student, I urge you to reconsider choosing CS grad
>> students to work on this problem, and consider instead using domain experts
>> in statistics and/or Operations Research or Systems, or perhaps even an
>> interdisciplinary team. Old research shows that it is much more
>> cost-effective to hire domain experts and teach them to program computers
>> than it is to hire CS grads and try to teach them the domain. Suppose your
>> income tax preparation were a complex process. Which would you want to do
>> it: a CS grad who wrote the fastest program possible, or a tax law expert
>> who could save you months of work on an IRS tax audit and keep you out of
>> jail?
>>
>> Charles Elliott
>>
>> -----Original Message-----
>> From: boinc_dev [mailto:boinc_dev-boun...@ssl.berkeley.edu] On Behalf Of David Anderson
>> Sent: Monday, February 10, 2014 10:58 PM
>> To: boinc_dev@ssl.berkeley.edu
>> Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
>>
>> In general we've put statistics-gathering into the server rather than the
>> client because
>> - it gives uniform data over the entire host population
>> - it puts the data all in one place
>>
>> Currently these statistics are just the bare essentials:
>> mean and standard deviation of elapsed time, turnaround time, and
>> credit-related quantities.
>> We maintain these per (host, app version) and per app version.
>> We use them to estimate job duration and to compute credit.
>>
>> As you point out, there are many other types of info we could track, and
>> many visualizations that could be offered.
>> This is an area where having a few CS grad students working on BOINC would
>> be a big help.
>>
>> -- David
>>
>> On 10-Feb-2014 4:01 PM, Max Power wrote:
>> >
>> > Many types of distributed computing applications don't do uniform
>> > processing (and reporting on percent done) like SETI, Astropulse or
>> > Einstein ... and the biological science applications (and image
>> > rendering ones) have taken some time to discipline the reporting of
>> > percent done.
>> >
>> > What the BOINC client does not do is use the hashsums of computing
>> > applications (as sometimes they run in pairs, as in Climate Prediction)
>> > to form a local knowledge base of:
>> >
>> > -- work unit size (average, median, standard deviation)
>> > -- work unit computation length (average, median, standard deviation)
>> > -- completed work unit average size (average, median, standard deviation)
>> > -- disk use (average, median, standard deviation)
>> > -- these could be uplinked to the BOINC design groups and the projects
>> > themselves ... as you probably have to do an SQL query to find this
>> > stuff out
>> > -- the "STATS" tab is almost totally devoid of usable statistics ...
>> > and the ones above relating to runtime are graphable and usable ...
>> >
>> > I am not saying this will fix the wonky estimated run time problem ...
>> > only regular application reporting to the BOINC client will ever do
>> > that. However, the averaged knowledge from these parameters could
>> > improve it when the daft application is not reporting.
>> >
>> > MP, DSN @ H
>> >
>> > -----Original Message-----
>> > From: McLeod, John
>> > Sent: 10 February 2014 05:48
>> > To: Jon Sonntag ; BOINC Developers Mailing l...@berkeley.edu
>> > Subject: Re: [boinc_dev] Estimated Time Remaining
>> >
>> > Not all applications report smooth % complete. So the calculation of
>> > time remaining involves the initial estimate as well. Given the bad
>> > information for both % complete and the initial estimate, there is
>> > no method of predicting how much longer the task will take that is
>> > completely right. The most reliable appears to be to combine the
>> > initial estimate, the DCF (if in use for the project), the % complete,
>> > and the time spent already (the only really well known item in the list)
>> > to come up with an estimate.
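
The local knowledge base Max describes, and the per-(host, app version) mean and standard deviation David mentions, both come down to keeping running moments per tracked quantity. A minimal sketch of one way to accumulate them on the client side (illustrative only; the struct and names are made up, not actual BOINC code, and medians would additionally require retaining samples):

#include <cmath>

// Illustrative sketch of an online mean/standard-deviation accumulator
// (Welford's algorithm) that could back per-app statistics such as work
// unit run time or disk use.
struct RunningStats {
    long long n = 0;
    double mean = 0.0;
    double m2 = 0.0;   // sum of squared deviations from the running mean

    void add(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }
    double stdev() const {
        return n > 1 ? std::sqrt(m2 / (n - 1)) : 0.0;
    }
};

// Usage: keep one RunningStats per (project, app) and per quantity
// (run time, work unit size, disk use, ...) and call add() as tasks finish.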
_______________________________________________
boinc_dev mailing list
boinc_dev@ssl.berkeley.edu
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.