It's an attractive idea. Reading between the lines, I take the suggestion to be separate computation deadline adjustments for each project rather than the single global one we have now. Makes sense to me. How long do you think it might take the statistics to stabilize for various projects? The pattern for s...@h should only take a couple of months? But a project which goes down maybe once a year, due to major hardware failure or such, might be an exception to your "problem is fixed"?
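To make the per-project idea concrete, here is a minimal sketch of how the client might track the three delays from the quoted proposal and pull the computation deadline in by mean + 3*stddev for each. Everything named here (SmoothedDelay, ProjectDelays, computation_deadline, the 0.05 smoothing weight) is hypothetical and only for illustration; it is not the actual AVERAGE_VAR code or anything else in the BOINC client, and it uses exponential smoothing rather than a true mean and standard deviation.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical helper (not BOINC's AVERAGE_VAR): exponentially smoothed
// mean and variance of one kind of delay, in seconds.
struct SmoothedDelay {
    double mean = 0.0;
    double var = 0.0;
    int n = 0;

    // alpha is an assumed smoothing weight; a smaller value forgets old
    // samples more slowly.
    void update(double sample, double alpha = 0.05) {
        if (n == 0) {
            mean = sample;   // first sample seeds the mean
            var = 0.0;
        } else {
            double diff = sample - mean;
            double incr = alpha * diff;
            mean += incr;
            var = (1.0 - alpha) * (var + diff * incr);
        }
        n++;
    }

    // Safety margin as proposed in the quoted message: mean + 3*stddev.
    double margin() const { return mean + 3.0 * std::sqrt(var); }
};

// One set of statistics per attached project (hypothetical layout).
struct ProjectDelays {
    SmoothedDelay upload;   // attempted upload -> completed upload
    SmoothedDelay report;   // attempted report -> completed report
    SmoothedDelay finish;   // first at 100% or higher -> completed task

    double total_margin() const {
        return upload.margin() + report.margin() + finish.margin();
    }
};

// Computation deadline = report deadline minus the project's margins,
// never earlier than the current time.
double computation_deadline(double report_deadline,
                            const ProjectDelays& d, double now) {
    return std::max(now, report_deadline - d.total_margin());
}
```

The storage cost is three small structs per project, which is why data space and computation should not be the obstacle. With smoothing, a once-a-year outage would fade from the statistics well before the next one, so that case would still slip through.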
Given an implementation like the AVERAGE_VAR used for the elapsed time per flop data behind app_version estimate scaling, neither the data space nor the computational cost would be significant. That's just exponential smoothing rather than a true mean, of course, but should be adequate. I'm not sure the value of the changes justifies the recoding and testing of the client software which would be needed. The idea in just slightly different form is the first in the Core client section of the requested programming help - http://boinc.berkeley.edu/trac/wiki/DevProjects - so I presume if someone with the needed skills makes the effort it would be accepted.

-- 
Joe

On 16 Aug 2010 at 9:02, [email protected] wrote:

> What I would like to see for this is to have the client keep track of:
>
> 1) The mean and standard deviation of the time from attempted upload to
> completed upload.
> 2) The mean and standard deviation of the time from attempted report to
> completed report.
> 3) The mean and standard deviation of the time from first at 100% or
> higher to completed task.
>
> Subtract mean + 3*stddev for each of these from the computation deadline,
> and the problem is fixed. BOINC will attempt to complete tasks for
> projects that have long waits for uploads and reports at times much earlier
> than they would have. This will also have the effect of reducing the
> anguish among users that have late work.
>
> jm7
>
>
> From: "Josef W. Segur" <jse...@westelcom.com>
> Sent by: <boinc_dev-bounce[email protected]u>
> To: David Anderson <[email protected]>, [email protected]
> Date: 08/15/2010 12:06 AM
> Subject: [boinc_dev] Deadlines during outages
>
>
> The BOINC core client attempts to get work completed, uploaded, and reported
> somewhat before the actual report deadline. If the project is down that
> attempt will fail.
> For the 3 day outages at s...@h, the only way of making
> the computation deadline early enough is to set at least 3 days for the
> "Connect about every" preference, and that may not be practical if the
> computer is also doing work for other projects.
>
> A missed deadline means the inefficiency of sending another replication.
> Thinking about ways to avoid that, it seems a method of extending result
> deadlines until there's a chance for completed work to be reported may be
> useful not only for the s...@h scheduled outages, but also for any outage
> of significant length at most projects.
>
> Here's one method which seems practical, implemented with a config boolean
> "defer_res_report_deadline" which the project would set when not able to
> accept reports, clear when the outage had finished and enough time had
> passed for completed work to be reported. This is a pseudo-diff starting
> at about line 225 of transitioner.cpp:
>
> ======================================================================
>     // Scan this WU's results, and
>     // 1) count those in various server states;
>     // 2) identify time-out results and update their server state and outcome
>     // 3) find the max result suffix (in case need to generate new ones)
>     // 4) see if we have a new result to validate
>     //    (outcome SUCCESS and validate_state INIT)
>     //
> +   x = now + 4*3600 + rand()%(4*3600); // 4 to 8 hour deferral if needed
>     for (i=0; i<items.size(); i++) {
>         TRANSITIONER_ITEM& res_item = items[i];
>
>         if (!res_item.res_id) continue;
>         ntotal++;
>
>         rs = result_suffix(res_item.res_name);
>         if (rs > max_result_suffix) max_result_suffix = rs;
>
>         switch (res_item.res_server_state) {
>         case RESULT_SERVER_STATE_UNSENT:
>             nunsent++;
>             break;
>         case RESULT_SERVER_STATE_IN_PROGRESS:
>             if (res_item.res_report_deadline < now) {
> +               if (config.defer_res_report_deadline) {
> +                   res_item.res_report_deadline = x;
> +                   retval = transitioner.update_result(res_item);
> +                   ninprogress++;
> +               } else {
>                 log_messages.printf(MSG_NORMAL,
>                     "[WU#%d %s] [RESULT#%d %s] result timed out (%d < %d) server_state:IN_PROGRESS=>OVER; outcome:NO_REPLY\n",
>                     wu_item.id, wu_item.name, res_item.res_id, res_item.res_name,
>                     res_item.res_report_deadline, (int)now
>                 );
>                 ...
>                 ...
> +               }
> ======================================================================
>
> I've not attempted to add log messages and such, that's just a sketch.
> There are many obvious variations possible, but this one seems most
> generally useful to me.
> --
> Joe

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
