Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
There are two different values: rsc_fpops_bound is used for the resource limit, and rsc_fpops_est is used for the estimated time and progress. So wanting to avoid 'resource limit exceeded' errors has nothing to do with providing a good estimate.

On Monday, 17 February 2014, Richard Haselgrove r.haselgr...@btopenworld.com wrote:

> Unfortunately, some projects take the easy cop-out - applying a massive rsc_fpops_bound to circumvent 'resource limit exceeded', instead of resolving it properly from first principles. I suspect that some of the smaller projects have enough on their plate getting their heads round their own scientific research needs, and don't have enough time and energy left to switch into computer scientist mode and concentrate on the boincification of their application and workflow.

--
Nicolás

___
boinc_dev mailing list
boinc_dev@ssl.berkeley.edu
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
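A minimal sketch of the distinction between the two workunit values, assuming both are divided by the host's effective speed to obtain seconds. The function names, the `host_flops` figure, and the example workunit are hypothetical; the real client logic is more involved.

```python
def static_runtime_estimate(rsc_fpops_est: float, host_flops: float) -> float:
    """Estimated runtime in seconds, derived from the workunit's rsc_fpops_est."""
    return rsc_fpops_est / host_flops


def resource_limit_seconds(rsc_fpops_bound: float, host_flops: float) -> float:
    """Elapsed-time ceiling in seconds, derived from the workunit's rsc_fpops_bound."""
    return rsc_fpops_bound / host_flops


# Hypothetical workunit: the bound is 10x the estimate, so the job can overrun
# its estimate considerably without hitting the resource limit.
host_flops = 2e9          # assumed effective host speed, in FLOPS
rsc_fpops_est = 7.2e12    # drives the displayed estimate and progress
rsc_fpops_bound = 7.2e13  # drives the abort threshold only

estimate = static_runtime_estimate(rsc_fpops_est, host_flops)
limit = resource_limit_seconds(rsc_fpops_bound, host_flops)
print(f"estimate: {estimate:.0f} s, limit: {limit:.0f} s")
```

Under these assumptions, raising rsc_fpops_bound widens the safety margin without touching the estimate, which is why an inflated rsc_fpops_est is not needed to avoid 'resource limit exceeded'.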
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
On 14/02/14 8:30, David Anderson wrote:

> I'd prefer to figure out why the static estimates are off. If an app's jobs are of a size proportional to wu.rsc_fpops_est, the static estimates should be almost exact, even for a host's first jobs.

The static estimates are often very rough because it's sometimes not that easy to get a handle on accurate ones. In the extreme, the estimate just gets set in such a way that you don't run into a 'resource limit exceeded' issue, so you add at least some headroom. However, even in such a case you may know that your app will behave more or less linearly. This is why I think having the said opt-in flag would be very useful for such apps.

Cheers,
Oliver
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
Hi John,

On 14/02/14 3:48, McLeod, John wrote:

> Having the current method be opt in is no better than having a new method be opt in - for exactly the same reasons.

I concur with William: if projects fail to opt in to the linear/dynamic flag, they'll only hurt themselves. This is a self-correcting issue, as projects have a strong interest in retaining volunteers and not driving them away by using/causing sub-optimal runtime estimates.

Best,
Oliver

> From: William [mailto:bcdecbi...@yahoo.com]
> Sent: Thursday, February 13, 2014 9:02 PM
> To: McLeod, John; Jon Sonntag
> Cc: BOINC Developers Mailing List
> Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
>
> Fixing the estimates is hard. Worth improving, but not a reliable fix strategy by itself.
>
> Improving the percent complete and estimated remaining run time calculation is a lot easier - but the proposal is that this be a non-default fix, which makes it also unreliable because projects cannot be relied upon to opt into the fix.
>
> Duration Correction Factor - either this is a form of improved calculation or else it relies on opt-in from the projects or opt-in from the user, the latter being disastrous and both being unreliable.
>
> Reliable fix strategy:
> 1) Improve the default percent complete and estimated remaining run time calculations - this becomes linear.
> 2) Provide a dynamic calculations opt-in flag for those projects wishing to stay with original runtime estimates.
>
> Gross errors (including failure to opt-in) now become the fault of the project, not the BOINC client and especially not the user.
>
> Also try to improve the dynamic calculations (less heavily weighted against the linear result).
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
Unfortunately, some projects take the easy cop-out - applying a massive rsc_fpops_bound to circumvent 'resource limit exceeded', instead of resolving it properly from first principles. I suspect that some of the smaller projects have enough on their plate getting their heads round their own scientific research needs, and don't have enough time and energy left to switch into computer scientist mode and concentrate on the boincification of their application and workflow.

> From: Oliver Bock oliver.b...@aei.mpg.de
> To: David Anderson da...@ssl.berkeley.edu; Rytis Slatkevičius ryti...@gmail.com
> Cc: BOINC Developers Mailing List boinc_dev@ssl.berkeley.edu
> Sent: Monday, 17 February 2014, 9:22
> Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
>
> On 14/02/14 8:30, David Anderson wrote:
>> I'd prefer to figure out why the static estimates are off. If an app's jobs are of a size proportional to wu.rsc_fpops_est, the static estimates should be almost exact, even for a host's first jobs.
>
> The static estimates are often very rough because it's sometimes not that easy to get a handle on accurate ones. In the extreme, the estimate just gets set in such a way that you don't run into a 'resource limit exceeded' issue, so you add at least some headroom. However, even in such a case you may know that your app will behave more or less linearly. This is why I think having the said opt-in flag would be very useful for such apps.
>
> Cheers,
> Oliver
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
No, William wants linear to be the default and the old version to be opt in. I would rather see if there was some way to fix this without splitting it. If there isn't, then make the linear version be opt in, not the old one.

-----Original Message-----
From: Oliver Bock [mailto:oliver.b...@aei.mpg.de]
Sent: Monday, February 17, 2014 4:26 AM
To: McLeod, John; William; Jon Sonntag
Cc: BOINC Developers Mailing List
Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...

Hi John,

On 14/02/14 3:48, McLeod, John wrote:
> Having the current method be opt in is no better than having a new method be opt in - for exactly the same reasons.

I concur with William: if projects miss to opt for using the linear/dynamic flag they'll only hurt themselves. This is a self-correcting issue as projects have a strong interest in retaining volunteers and not drive them away by using/causing sub-optimal runtime estimates.

Best,
Oliver
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
On 17/02/14 14:31, McLeod, John wrote:

> No, William wants linear to be the default and the old version to be Opt in.

Right, I see. He sort of mixed up "dynamic calculations opt-in flag" with "original runtime estimates", where it's actually the other way round: the intended new linear switch would cause a fully dynamic calculation.

> If there isn't, then make the linear version be opt in, not the old one.

Yep, that's what I'd like to see (see earlier mails).

Oliver
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
This is exactly why linear should be the default. Dynamic should be available only to those projects that care enough to set it up properly. Linear should apply to the lazy ones unless and until they take the time to deliberately opt in.

> Unfortunately, some projects take the easy cop-out - applying a massive rsc_fpops_bound to circumvent 'resource limit exceeded', instead of resolving it properly from first principles. I suspect that some of the smaller projects have enough on their plate getting their heads round their own scientific research needs, and don't have enough time and energy left to switch into computer scientist mode and concentrate on the boincification of their application and workflow.

~ Rightful liberty is unobstructed action according to our will within limits drawn around us by the equal rights of others. I do not add 'within the limits of the law' because law is often but the tyrant's will, and always so when it violates the rights of the individual. - Thomas Jefferson

On Monday, February 17, 2014 9:02 AM, Oliver Bock oliver.b...@aei.mpg.de wrote:

> On 17/02/14 14:31, McLeod, John wrote:
>> No, William wants linear to be the default and the old version to be Opt in.
>
> Right, I see. He sort of mixed up dynamic calculations opt-in flag with original runtime estimates where it's actually the other way round - the intended new linear switch would cause a fully dynamic calculation.
>
>> If there isn't, then make the linear version be opt in, not the old one.
>
> Yep, that's what I'd like to see (see earlier mails).
>
> Oliver
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
On 17/02/14 15:39, William wrote:

> This is exactly why linear should be the default. Dynamic should be available only to those projects that care enough to set it up properly. Linear should apply to the lazy ones unless and until they take the time to deliberately opt in.

And this is where I think you got the (new) terms wrong: dynamic means not to use any static estimates but to rely completely on what the app reports as fraction done. This is (de facto) reliable for apps with the said linear behavior. Bottom line: the linear option isn't the opposite/alternative of a dynamic estimate, they go hand in hand.

Cheers,
Oliver
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
OK, to clarify:

Old method: mixing the project's pre-estimate with a linear estimate based on what the app reports as fraction done, according to a mixing curve that heavily favors the project's pre-estimate early in the computation. This old method should become a project opt-in.

New method: linear estimate based on what the app reports as fraction done. This should become the new default.

Possibly an even better method: keep track of actual timing results for the last N runs and use the average of all those to either smooth (improve the accuracy of) or replace what the app reports as fraction done. Another opt-in (or two), but these would be user opt-ins (vs. project opt-ins). Overrides either the new default method or the project opt-in method (if used).

On Monday, February 17, 2014 10:55 AM, Oliver Bock oliver.b...@aei.mpg.de wrote:

> On 17/02/14 15:39, William wrote:
>> This is exactly why linear should be the default. Dynamic should be available only to those projects that care enough to set it up properly. Linear should apply to the lazy ones unless and until they take the time to deliberately opt in.
>
> And this is where I think you got the (new) terms wrong: dynamic means not to use any static estimates but to rely completely on what the app reports as fraction done. This is (de facto) reliable for apps with the said linear behavior. Bottom line: the linear option isn't the opposite/alternative of a dynamic estimate, they go hand in hand.
>
> Cheers,
> Oliver
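The three methods William lists can be sketched roughly as follows. This is an illustration under stated assumptions, not the client's code: the thread only says the mixing curve "heavily favors the project's pre-estimate early", so the squared weight below is a placeholder, and all function and class names are hypothetical.

```python
from collections import deque


def linear_remaining(elapsed: float, fraction_done: float) -> float:
    """'New method': extrapolate purely from the app's reported fraction done."""
    if fraction_done <= 0:
        raise ValueError("no progress reported yet")
    return elapsed * (1.0 - fraction_done) / fraction_done


def blended_remaining(static_total: float, elapsed: float, fraction_done: float) -> float:
    """'Old method' (sketch): mix the project's pre-estimate with the linear one.

    The weight fraction_done**2 is an assumed placeholder curve that favors
    the pre-estimate early in the computation.
    """
    static_remaining = max(static_total - elapsed, 0.0)
    if fraction_done <= 0:
        return static_remaining
    weight = fraction_done ** 2
    return (1.0 - weight) * static_remaining + weight * linear_remaining(elapsed, fraction_done)


class RunHistory:
    """'Possibly even better method' (sketch): average the last N actual runtimes."""

    def __init__(self, n: int) -> None:
        self.runs = deque(maxlen=n)  # older runs fall off automatically

    def record(self, total_seconds: float) -> None:
        self.runs.append(total_seconds)

    def remaining(self, elapsed: float):
        if not self.runs:
            return None  # no history yet; caller falls back to another method
        mean_total = sum(self.runs) / len(self.runs)
        return max(mean_total - elapsed, 0.0)
```

For example, with a pre-estimate of 3600 s and no progress reported after 600 s, `blended_remaining(3600, 600, 0.0)` returns the static remainder of 3000 s, while `linear_remaining` has nothing to extrapolate from until the app reports a nonzero fraction done.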
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
The current method works tolerably well for all projects. Strict linear based on % complete will fail miserably for a fairly high proportion of projects.

Please note that we had this discussion years ago when we did the original work. It was determined that straight linear based on % complete would not work at all for some projects - therefore it was determined to be unsuitable.

From: William [mailto:bcdecbi...@yahoo.com]
Sent: Monday, February 17, 2014 11:14 AM
To: Oliver Bock; McLeod, John; Jon Sonntag
Cc: BOINC Developers Mailing List
Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...

OK, to clarify:

Old method: mixing project's pre-estimate with linear estimate based on what the app reports as fraction done, according to a mixing curve that heavily favors the project's pre-estimate early in the computation. This old method should become a project opt-in.

New method: linear estimate based on what the app reports as fraction done. This should become the new default.

Possible even better method: keep track of actual timing results for the last N runs and use the average of all those to either smooth (improve the accuracy of) or replace what the app reports as fraction done. Another opt-in (or two), but these would be user opt-ins (vs. project opt-ins). Overrides either the new default method or the project opt-in method (if used).
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
Hi!

> The question, however, is this: is BOINC as smart as a 4th grader - can it avoid falsely claiming that work units won't finish on time, thus misleading users into aborting work units that appear to have absolutely no chance of making their deadline?

Maybe the real problem is that the info BOINC Manager displays as an estimate has no measure of confidence or uncertainty attached (it lacks error bars, if you want). If an estimate is just "it will take 100 hrs, 10 min, 5 sec", updated every second, the user reaction will (understandably) be different compared to something like (say) "100 hrs (+/- 80 hrs)".

No matter what scheme is used for the estimation, there will always be some uncertainty, and while I'm not familiar with the BOINC code on the projected runtime, if things like the standard deviation of runtime per task are kept (I think this was mentioned), plus some [new] project-supplied measure of uncertainty of the flops estimate for a workunit, it might be possible to give a better (more useful, less misleading) estimate by including an uncertainty.

Just my 2 cents.

Cheers
HBE

-
Heinz-Bernd Eggenstein
Max Planck Institute for Gravitational Physics
Callinstrasse 38
D-30167 Hannover, Germany
Tel.: +49-511-762-19466 (Room 037)

From: William bcdecbi...@yahoo.com
To: Jon Sonntag j...@thesonntags.com, elliott...@verizon.net
Cc: BOINC Developers Mailing List boinc_dev@ssl.berkeley.edu
Date: 02/13/2014 12:53 AM
Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
Sent by: boinc_dev boinc_dev-boun...@ssl.berkeley.edu

> The question, however, is this: is BOINC as smart as a 4th grader - can it avoid falsely claiming that work units won't finish on time, thus misleading users into aborting work units that appear to have absolutely no chance of making their deadline?
>
> Signs point to NO.
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
On 13/02/14 11:00, Heinz-Bernd Eggenstein wrote:

> If an estimate is just "it will take 100 hrs, 10 min, 5 sec", updated every second, the user reaction will (understandably) be different compared to something like (say) "100 hrs (+/- 80 hrs)".

In addition to HBE's approach above, I really like Michael's suggestion to allow the apps to tell the client that their fraction done reporting is indeed accurate, i.e. because of the app's known linear progress behavior. In such cases the estimate should be fully dynamic. This should be an opt-in flag, overriding the default static/dynamic approach.

JM2C,
Oliver
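HBE's idea of showing an estimate together with its uncertainty could look roughly like the sketch below. It assumes a list of past runtimes (in hours) for comparable tasks is available to whatever renders the estimate; the thread only mentions that mean and standard deviation of elapsed time are tracked, so the data source, the function names, and the sample figures are hypothetical.

```python
import statistics


def estimate_with_uncertainty(past_runtimes_hours):
    """Return (mean, sample standard deviation) of past runtimes, in hours."""
    mean = statistics.mean(past_runtimes_hours)
    spread = statistics.stdev(past_runtimes_hours)  # needs at least two samples
    return mean, spread


def format_estimate(mean_hours: float, spread_hours: float) -> str:
    """Render the estimate with error bars instead of false second-level precision."""
    return f"{mean_hours:.0f} hrs (+/- {spread_hours:.0f} hrs)"


# Hypothetical history of three comparable tasks on one host.
mean, spread = estimate_with_uncertainty([20, 100, 180])
print(format_estimate(mean, spread))
```

Rounding to whole hours is a deliberate choice here: a spread this wide makes minutes and seconds meaningless, which is the point of the suggestion.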
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
If and only if the progress is actually linear. There are projects where the first 10% of the run time takes the progress bar to 90%, mostly because parts of the run time are non-determinable. So:

10% at 5 minutes
20% at 10 minutes
30% at 15 minutes
40% at 20 minutes
50% at 25 minutes
60% at 30 minutes
70% at 35 minutes
80% at 40 minutes
90% at 45 minutes
Done at 2 hours...

It appears to be linear until it isn't. If everything were as nice as you say for all of the projects, then, yes, we could move to a strictly linear model. The point is that it isn't.

-----Original Message-----
From: boinc_dev [mailto:boinc_dev-boun...@ssl.berkeley.edu] On Behalf Of Jon Sonntag
Sent: Wednesday, February 12, 2014 5:28 PM
To: elliott...@verizon.net
Cc: BOINC Developers Mailing List @berkeley.edu
Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...

If after 5 minutes, a workunit is 10% done and after 10 minutes it is 20% done, I don't need a domain expert. A 4th grade student should be able to calculate that it will take a total of 50 minutes to complete and that 40 minutes remain.

Jon Sonntag

P.S. I went to a tax professional once. They charged a lot and they got it wrong. The IRS corrected it and sent me a refund.
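To make the "linear until it isn't" point concrete, here is a small sketch that applies a purely linear projection to the progress reports listed in the reply above (10% every 5 minutes up to 90%, then done at 2 hours). The function names are made up for illustration; this is not BOINC client code.

```python
def linear_total_minutes(elapsed_min, percent_done):
    """Projected total run time, assuming progress is linear in time."""
    if percent_done <= 0:
        raise ValueError("no progress reported yet")
    return elapsed_min * 100 / percent_done


def linear_remaining_minutes(elapsed_min, percent_done):
    """Projected time remaining under the same linear assumption."""
    return linear_total_minutes(elapsed_min, percent_done) - elapsed_min


# Progress reports from the message: (elapsed minutes, percent done).
REPORTS = [(5 * k, 10 * k) for k in range(1, 10)]
ACTUAL_TOTAL_MIN = 120  # "Done at 2 hours"

if __name__ == "__main__":
    for elapsed, percent in REPORTS:
        predicted = linear_remaining_minutes(elapsed, percent)
        actual = ACTUAL_TOTAL_MIN - elapsed
        print(f"{percent:3d}% at {elapsed:2d} min: "
              f"linear says {predicted:.0f} min left, actually {actual} min left")
```

Every report projects the same 50-minute total, so at the 45-minute mark the linear model claims 5 minutes remain while 75 minutes are actually left.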
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
High priority does not mean that the task will not complete on time. It means that if the tasks run in normal Round Robin between projects and First In First Out within a project, there is a risk that some tasks will not complete on time.

-----Original Message-----
From: boinc_dev [mailto:boinc_dev-boun...@ssl.berkeley.edu] On Behalf Of William
Sent: Wednesday, February 12, 2014 6:50 PM
To: Jon Sonntag; elliott...@verizon.net
Cc: BOINC Developers Mailing List @berkeley.edu
Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...

The question, however, is this: is BOINC as smart as a 4th grader - can it avoid falsely claiming that work units won't finish on time, thus misleading users into aborting work units that appear to have absolutely no chance of making their deadline? Signs point to NO.
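The distinction made at the top of this message (a task is flagged because the normal run order would put a deadline at risk, not because the task is doomed) can be illustrated with a toy simulation. This is a deliberately simplified sketch: one project, one CPU, tasks run back to back. The task list is invented, and reordering by earliest deadline is an assumption used for illustration rather than a description of what the client does.

```python
def deadline_misses(tasks):
    """Run tasks back to back on one CPU and return the names that finish late.

    tasks: list of (name, remaining_hours, deadline_hours_from_now), in run order.
    """
    clock = 0.0
    missed = []
    for name, remaining, deadline in tasks:
        clock += remaining          # this task finishes at the new clock value
        if clock > deadline:
            missed.append(name)
    return missed


def earliest_deadline_first(tasks):
    """Return the same tasks reordered so the nearest deadline runs first."""
    return sorted(tasks, key=lambda task: task[2])


# Hypothetical queue in arrival (FIFO) order.
QUEUE = [("a", 4.0, 10.0), ("b", 4.0, 6.0), ("c", 1.0, 12.0)]

if __name__ == "__main__":
    print("FIFO misses:", deadline_misses(QUEUE))
    print("Reordered misses:", deadline_misses(earliest_deadline_first(QUEUE)))
```

In FIFO order task "b" would finish late, so it is the one that needs to jump the queue; once it does, every task in this example still meets its deadline.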
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
When the original runtime estimate sent with a WU is close, the current dynamic algorithm works very well. I think we all agree on that. I think we also agree that the way to give volunteers the best initial experience on a project is to have the first several workunits actually complete somewhere close to the original estimate. That, or have the estimate adjust as quickly as possible to correct the time remaining.

There are at least two approaches to fixing the duration and progress done.

One way is to fix the estimates. The host_app_version table helps, but only if the project uses flops and spends the majority of its time doing flop calculations. The old credit system allowed for both flops and iops and/or a flops to integer math ratio. Having something like that for estimates would help a lot.

Another way is to allow the estimate to continue to be off but attempt to hide that fact by improving the percent complete and estimated remaining run time calculations. Not all projects would be able to take advantage of this; those that cannot would continue using the current logic. Those projects that can report progress in a linear fashion would set a non-default server config option that would be passed down to the client to allow the client to do remaining runtime estimates in a linear way.

Why can't we just put in better estimates in the first place? Unfortunately, the flops to iops ratios are not the same from one processor to another, especially with GPUs. For example, how do you find a good original estimate with such a large standard deviation in the Collatz AMD OpenCL app?

mean: 9.138804e+15
stdev: 7.139728e+15
samples: 5000

The values range from 1.90e+13 to 1.197880e+17. There are orders of magnitude of difference in the way some GPUs handle integer math, relative to their projected_flops, compared with others. Unfortunately, the host_app_version table only has flops and not iops which would, I suspect, have a much smaller standard deviation.

Jon Sonntag

On Thu, Feb 13, 2014 at 8:33 AM, McLeod, John john.mcl...@sap.com wrote:

If and only if the progress is actually linear. There are projects where the first 10% of the time runs the progress bar to 90% mostly because there are non-determinable portions of the time. So:

10% at 5 minutes
20% at 10 minutes
30% at 15 minutes
40% at 20 minutes
50% at 25 minutes
60% at 30 minutes
70% at 35 minutes
80% at 40 minutes
90% at 45 minutes
Done at 2 hours...

It appears to be linear until it isn't. If everything were as nice as you say for all of the projects, then, yes, we could move to a strictly linear model. The point is that it isn't.
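To put the Collatz AMD OpenCL figures quoted in this message in perspective, the short calculation below derives the relative spread from the posted mean, standard deviation and range. Only those four numbers come from the message; the variable names are illustrative.

```python
import math

# Figures posted for the Collatz AMD OpenCL app (5000 samples).
MEAN = 9.138804e15
STDEV = 7.139728e15
LOW, HIGH = 1.90e13, 1.197880e17

# Standard deviation relative to the mean.
coefficient_of_variation = STDEV / MEAN

# How far apart the slowest and fastest observed values are.
range_ratio = HIGH / LOW
orders_of_magnitude = math.log10(range_ratio)

if __name__ == "__main__":
    print(f"coefficient of variation: {coefficient_of_variation:.2f}")
    print(f"max/min ratio: {range_ratio:.0f}")
    print(f"orders of magnitude: {orders_of_magnitude:.1f}")
```

The standard deviation is roughly 78% of the mean and the extremes differ by a factor of about 6300, close to four orders of magnitude, which is why a single static estimate fits few hosts well.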
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
Fixing the estimates is hard. Worth improving, but not a reliable fix strategy by itself.

Improving the percent complete and estimated remaining run time calculation is a lot easier - but the proposal is that this be a non-default fix, which makes it also unreliable because projects cannot be relied upon to opt into the fix.

Duration Correction Factor - either this is a form of improved calculation or else it relies on opt-in from the projects or opt-in from the user, the latter being disastrous and both being unreliable.

Reliable fix strategy:
1) Improve the default percent complete and estimated remaining run time calculations - this becomes linear.
2) Provide a dynamic calculations opt-in flag for those projects wishing to stay with original runtime estimates.

Gross errors (including failure to opt-in) now become the fault of the project, not the BOINC client and especially not the user. Also try to improve the dynamic calculations (less heavily weighted against the linear result).

~ Rightful liberty is unobstructed action according to our will within limits drawn around us by the equal rights of others. I do not add 'within the limits of the law' because law is often but the tyrant's will, and always so when it violates the rights of the individual. - Thomas Jefferson

On Thursday, February 13, 2014 4:10 PM, McLeod, John john.mcl...@sap.com wrote:

Another is to have the client reinstate some form of Duration Correction Factor – so that the client did not have to wait for the server to update the estimates. It might make sense to make the DCF for projects that use the server side calculation react faster than the DCF for projects that do not. This would also have to be done with the understanding that the server side correction is in effect so new downloads would start with no DCF applied at first.
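The client-side Duration Correction Factor idea quoted in this message (start with no correction, then react faster or slower depending on the project) could be sketched as a simple exponentially smoothed ratio. This is only an illustration of the idea: the smoothing rule, the two rate constants and the function names are assumptions, not the formula the BOINC client uses or used.

```python
# Hypothetical smoothing rates: larger means the factor reacts faster.
FAST_ALPHA = 0.5   # e.g. projects where server-side correction is also in effect
SLOW_ALPHA = 0.1   # e.g. projects relying on the client-side factor alone


def update_dcf(dcf, estimated_s, actual_s, alpha):
    """Move the correction factor toward the observed actual/estimated ratio."""
    if estimated_s <= 0 or not 0 < alpha <= 1:
        raise ValueError("estimate must be positive and alpha in (0, 1]")
    ratio = actual_s / estimated_s
    return (1 - alpha) * dcf + alpha * ratio


def corrected_estimate(estimated_s, dcf):
    """Apply the factor to a fresh estimate; a factor of 1.0 means no correction."""
    return estimated_s * dcf


if __name__ == "__main__":
    dcf = 1.0  # new downloads start with no DCF applied
    for estimated, actual in [(100.0, 200.0), (100.0, 210.0), (100.0, 190.0)]:
        dcf = update_dcf(dcf, estimated, actual, FAST_ALPHA)
        print(f"observed ratio {actual / estimated:.2f}, factor now {dcf:.3f}")
```

With the faster rate, one task that runs twice as long as estimated already moves the factor from 1.0 to 1.5; with the slower rate the same task only moves it to about 1.1.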
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
I want to make sure everyone realizes that some apps inherently can't supply an accurate fraction done.

It's easy if your app has the form

for i=1,N
    fixed computation

It's not easy if your app has the form

lengthy preprocessing
for i=1,n
    fixed computation
lengthy postprocessing

or

while true do
    fixed computation
    if convergence criterion met
        break
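The three application shapes described in this message can be sketched in a few lines to show where a fraction done comes from in each case. Everything here is illustrative: the phase weights are guesses an app author would have to supply, and the halving loop merely stands in for some computation that stops on a convergence test.

```python
def fixed_loop_fraction(i, n):
    """Fraction done after i of n equal-cost iterations (the easy case)."""
    return i / n


def phased_fraction(phase_weights, phase_index, phase_fraction):
    """Fraction done with pre/main/post phases, given guessed relative weights.

    The result is only as good as the guessed weights.
    """
    total = sum(phase_weights)
    done = sum(phase_weights[:phase_index]) + phase_weights[phase_index] * phase_fraction
    return done / total


def iterations_until_converged(x, tol=1e-6):
    """Stand-in for a convergence loop: the trip count depends on the input."""
    count = 0
    while True:
        x /= 2              # "fixed computation"
        count += 1
        if x < tol:         # "convergence criterion met"
            break
    return count


if __name__ == "__main__":
    print(fixed_loop_fraction(3, 12))
    print(phased_fraction([1, 2, 1], 1, 0.5))
    print(iterations_until_converged(1.0), iterations_until_converged(1000.0))
```

In the last case the number of iterations is only known once the loop has finished, so any fraction reported along the way has to rest on a guess about the total.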
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
This is exactly why Jon and Mike are proposing two different modes of runtime estimation: for linear projects (probably applications only, as a project can have multiple apps), add a flag to mark the application as linear; the remaining projects would use the same method as now. Since the new developers probably won't know about the flag, the current estimate should be the default.

David, I want to make sure: is it that you oppose the suggestion itself, or is it the time constraints that stop you from implementing it? If the latter, we could submit patches.

--
Pagarbiai / Sincerely
Rytis Slatkevičius
+370 670 7

On Fri, Feb 14, 2014 at 4:54 AM, David Anderson da...@ssl.berkeley.edu wrote:

I want to make sure everyone realizes that some apps inherently can't supply an accurate fraction done.

It's easy if your app has the form

for i=1,N
    fixed computation

It's not easy if your app has the form

lengthy preprocessing
for i=1,n
    fixed computation
lengthy postprocessing

or

while true do
    fixed computation
    if convergence criterion met
        break
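A rough sketch of the two-mode proposal in this message: a per-application flag selects linear extrapolation, and everything else keeps the existing behaviour by default. The flag name `app_is_linear` is hypothetical, and the fallback branch is only a stand-in (static estimate minus elapsed time), not the client's actual current method.

```python
def estimate_remaining(elapsed_s, fraction_done, static_estimate_s, app_is_linear=False):
    """Return estimated seconds remaining for a running task.

    app_is_linear: hypothetical per-application flag set by the project.
    """
    if app_is_linear and fraction_done > 0:
        # Linear mode: project the total from the progress reported so far.
        return elapsed_s / fraction_done - elapsed_s
    # Default mode: placeholder for the existing estimate-based calculation.
    return max(static_estimate_s - elapsed_s, 0.0)


if __name__ == "__main__":
    # 10 minutes in, 20% done, static estimate of 2 hours.
    print(estimate_remaining(600.0, 0.2, 7200.0, app_is_linear=True))
    print(estimate_remaining(600.0, 0.2, 7200.0))
```

Keeping the flag off by default matches the argument above that developers who do not know about the flag should get the current behaviour.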
[boinc_dev] Estimated Time Remaining, frictional reporting : Duration Correction Factor CLIENT + Duration Correction Factor SERVER is a good autonomous solution ...
Estimated Time Remaining, frictional reporting : Duration Correction Factor CLIENT + Duration Correction Factor SERVER is a good autonomous solution for all parties involved.

This approach allows for some sensible Artificial Intelligence to be done at the Client.
-- The Client probably has better knowledge about possible work unit completion [and overall progress] than the project server ever will.
-- This is sort of the nature of distributed computing, when one is using different kinds of computers to run the numbers.

The Client can build up a behavioral database [on a per application basis], and use statistics for deriving Duration Correction Factor CLIENT.

The Client -- Server hint data structure
-- The Client need only pass back to the server 2 or 3 numbers to correct Duration Correction Factor SERVER.
-- If the Server does not support decoding the data structure it will be ignored.
-- BOINC Server version numbers can be used to turn this hinting off or on, thus avoiding unnecessary traffic generation. Well, I hope this is possible...
-- How or where this will be transmitted, received and processed (and used) I do not know.

The Data Structure
-- Version number : Unsigned Byte
-- Certainty (%) : (Signed or Unsigned) Byte (5 decimal to 95d in steps of 5), signed to use sign as parity bit...
-- Seconds Delta : Signed Int ~= 9 hours differential possible per communication; Math = [32k/60]/60
-- Sample age seconds : Unsigned Byte (only use 100 to 180 decimal, mostly 101 to 123 used), resample greater than 67s
-- Reserved for future use : Unsigned Int, could provide the last unsigned int of Unix Time (signed int64) timestamp of sample time
-- Optional : Sequence Number : Byte -- aka how many times has a correction been sent after UTC 00:00 ... Keep it under 100!
-- Checksum : CRC-16-CCITT, CRC32-mpeg if Client ID and Application Hashsum are used.
-- So under 80 bits, certainly under 100 bits.
-- Adding the Client ID (and Application Hashsum) will up the bits used, but still keep the traffic overhead to under 1000 bits.

Meanwhile, Duration Correction Factor SERVER is still available. Modifying Duration Correction Factor SERVER (to accept hints from Duration Correction Factor CLIENT) may take some re-designing, but can evolve over time. Possibly 2 or 3 correctional constants might be needed at Duration Correction Factor SERVER, but can only be derived from real time Client performance data.

MP
DSN @ H

-----Original Message-----
From: McLeod, John
Sent: 13 February 2014 13:09
To: Jon Sonntag
Cc: BOINC Developers Mailing List @berkeley.edu
Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...

Another is to have the client reinstate some form of Duration Correction Factor – so that the client did not have to wait for the server to update the estimates. It might make sense to make the DCF for projects that use the server side calculation react faster than the DCF for projects that do not. This would also have to be done with the understanding that the server side correction is in effect so new downloads would start with no DCF applied at first.
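One way the hint record listed under "The Data Structure" in this message might be laid out on the wire is sketched below with Python's standard `struct` and `binascii` modules. Several details are assumptions made for the sketch and are not stated in the message: 16-bit widths for "Signed Int" and "Unsigned Int" (chosen because 32k seconds is about 9 hours), little-endian byte order, an unsigned certainty byte, and a CRC start value of 0xFFFF. The function names are hypothetical.

```python
import binascii
import struct

# version, certainty %, seconds delta, sample age, reserved, sequence number
_BODY = struct.Struct("<BBhBHB")


def pack_hint(version, certainty_pct, seconds_delta, sample_age_s, reserved=0, sequence=0):
    """Pack one client-to-server hint and append a CRC-CCITT checksum."""
    if certainty_pct % 5 or not 5 <= certainty_pct <= 95:
        raise ValueError("certainty must be 5..95 in steps of 5")
    body = _BODY.pack(version, certainty_pct, seconds_delta, sample_age_s, reserved, sequence)
    crc = binascii.crc_hqx(body, 0xFFFF)  # 16-bit CRC, polynomial 0x1021
    return body + struct.pack("<H", crc)


def unpack_hint(packet):
    """Verify the checksum and return the six fields as a tuple."""
    body = packet[:-2]
    (crc,) = struct.unpack("<H", packet[-2:])
    if binascii.crc_hqx(body, 0xFFFF) != crc:
        raise ValueError("checksum mismatch")
    return _BODY.unpack(body)


if __name__ == "__main__":
    packet = pack_hint(version=1, certainty_pct=80, seconds_delta=-1200, sample_age_s=110)
    print(len(packet) * 8, "bits:", packet.hex())
    print(unpack_hint(packet))
```

Under these assumptions the record is 10 bytes (80 bits) with the optional sequence number included and 72 bits without it, which is in line with the size budget given in the list.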
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
If after 5 minutes, a workunit is 10% done and after 10 minutes it is 20% done, I don't need a domain expert. A 4th grade student should be able to calculate that it will take a total of 50 minutes to complete and that 40 minutes remain.

Jon Sonntag

P.S. I went to a tax professional once. They charged a lot and they got it wrong. The IRS corrected it and sent me a refund.

On Tue, Feb 11, 2014 at 6:18 AM, Charles Elliott elliott...@verizon.net wrote:

Although I am a CS grad student, I urge you to reconsider choosing CS grad students to work on this problem and consider instead using domain experts in statistics and/or Operations Research or Systems, or perhaps even an interdisciplinary team. Old research shows that it is much more cost-effective to hire domain experts and teach them to program computers than it is to hire CS grads and try to teach them the domain.

Suppose your income tax preparation was a complex process. Which would you want to do it: a CS grad who wrote the fastest program possible, or a tax law expert who could save you months of work on an IRS tax audit and keep you out of jail?

Charles Elliott
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
The question, however, is this: is BOINC as smart as a 4th grader - can it avoid falsely claiming that work units won't finish on time, thus misleading users into aborting work units that appear to have absolutely no chance of making their deadline? Signs point to NO.

~ Rightful liberty is unobstructed action according to our will within limits drawn around us by the equal rights of others. I do not add 'within the limits of the law' because law is often but the tyrant's will, and always so when it violates the rights of the individual. - Thomas Jefferson

On Wednesday, February 12, 2014 5:28 PM, Jon Sonntag j...@thesonntags.com wrote:

If after 5 minutes, a workunit is 10% done and after 10 minutes it is 20% done, I don't need a domain expert. A 4th grade student should be able to calculate that it will take a total of 50 minutes to complete and that 40 minutes remain.

Jon Sonntag

P.S. I went to a tax professional once. They charged a lot and they got it wrong. The IRS corrected it and sent me a refund.
Re: [boinc_dev] Estimated Time Remaining, frictional reporting ...
Although I am a CS grad student, I urge you to reconsider choosing CS grad students to work on this problem and consider instead using domain experts in statistics and/or Operations Research or Systems, or perhaps even an interdisciplinary team. Old research shows that it is much more cost-effective to hire domain experts and teach them to program computers than it is to hire CS grads and try to teach them the domain. Suppose your income tax preparation was a complex process. Which would you want do it: a CS grad who wrote the fastest program possible, or a tax law expert who could save you months of work on an IRS tax audit and keep you out of jail? Charles Elliott -Original Message- From: boinc_dev [mailto:boinc_dev-boun...@ssl.berkeley.edu] On Behalf Of David Anderson Sent: Monday, February 10, 2014 10:58 PM To: boinc_dev@ssl.berkeley.edu Subject: Re: [boinc_dev] Estimated Time Remaining, frictional reporting ... In general we've put statistics-gathering into server rather than client because - it gives uniform data over the entire host population - it puts the data all in one place Currently these statistics are just the bare essentials: mean and standard deviation of elapsed time, turnaround time, and credit-related quantities. We maintain these per (host, app version) and per app version. We use them to estimate job duration and to compute credit. As you point out, there are many other types of info we could track, and many visualizations that could offered. This is an area were having a few CS grad students working on BOINC would be a big help. -- David On 10-Feb-2014 4:01 PM, Max Power wrote: Many types of distributed computing applications don't due uniform processing (and reporting on percent done) like SETI, Astropulse or Einstein ... and the biological science applications (and image rendering ones) have taken some time to discipline the reporting of percent done. 
What the BOINC Client does not do is use the hashsums of computing applications (as sometimes they run in pairs, as in Climate Prediction) to form a local knowledge base of
-- work unit size (average, median, standard deviation)
-- work unit computation length (average, median, standard deviation)
-- completed work unit average size (average, median, standard deviation)
-- disk use (average, median, standard deviation)

-- these could be uplinked to the BOINC design groups and the projects themselves ... as you probably have to do an SQL query to find this stuff out
-- THE STATS tab is almost totally devoid of usable statistics ... and the ones above relating to runtime are graphable and usable ...

I am not saying this will fix the wonky estimated run time problem ... only regular application reporting to the BOINC client will ever do that. However, the averaged knowledge from these parameters could improve it when the daft application is not reporting.

MP, DSN @ H

-----Original Message-----
From: McLeod, John
Sent: 10 February 2014 05:48
To: Jon Sonntag ; BOINC Developers Mailing l...@berkeley.edu
Subject: Re: [boinc_dev] Estimated Time Remaining

Not all applications report a smooth % complete, so the calculation of time remaining involves the initial estimate as well. Given the bad information supplied for both % complete and the initial estimate, no method of predicting how much longer the task will take is completely right. The most reliable approach appears to be to combine the initial estimate, the DCF (if in use for the project), the % complete, and the time already spent (the only really well-known item in the list) to come up with an estimate.
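The local knowledge base Max Power proposes in the quoted message, keyed by application hashsum and summarised by average, median and standard deviation, might look roughly like this. It is a sketch under stated assumptions: the class and method names are hypothetical, nothing here is an existing BOINC client interface, and only run time is tracked (work unit size and disk use would follow the same pattern).

```python
import statistics
from collections import defaultdict


class AppRuntimeStats:
    """Hypothetical per-application store of completed-task run times."""

    def __init__(self):
        # Maps an application hashsum (any string key) to its run times
        # in seconds, one entry per completed work unit.
        self._runtimes = defaultdict(list)

    def record(self, app_hash, runtime_s):
        """Add the run time of one completed work unit."""
        self._runtimes[app_hash].append(float(runtime_s))

    def summary(self, app_hash):
        """Return count, mean, median and stdev, or None if no samples."""
        samples = self._runtimes.get(app_hash, [])
        if not samples:
            return None
        return {
            "count": len(samples),
            "mean": statistics.fmean(samples),
            "median": statistics.median(samples),
            # The sample standard deviation needs at least two values.
            "stdev": statistics.stdev(samples) if len(samples) > 1 else 0.0,
        }


stats = AppRuntimeStats()
for runtime in (100, 200, 300):
    stats.record("example-app-hash", runtime)
print(stats.summary("example-app-hash"))
```

The median is included alongside the mean because a few unusually long tasks can pull the mean well away from the typical run time, and the summary could serve as a fallback estimate when an application reports no progress.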