As has been mentioned a few times, there's a fundamental difficulty
in applying host punishment quickly: the punishment can depend on when
a result is validated.

While the result_overflow case is specific to s...@h, it's characterized
by very quick execution. It might make sense for the core client to
notice when the last n tasks done by an app_version have all completed
in less than 1/20 of the estimated time, and to report that suspicious
data to the Scheduler. If n gets above some reasonable threshold, the
Scheduler could apply limits and/or ensure that the host's results are
not accepted without validation.
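
A minimal sketch of that client-side check, for illustration only. The
class and parameter names are hypothetical and are not existing BOINC
client code; the 1/20 ratio comes from the paragraph above, while the
default limit of 5 is just a placeholder for "some reasonable amount".

```python
# Hypothetical sketch, not BOINC code: track consecutive suspiciously
# fast tasks for a single app_version.
FAST_DIVISOR = 20  # "less than 1/20 of the estimated time"


class FastFinishTracker:
    def __init__(self, limit=5):
        # limit stands in for "some reasonable amount"; 5 is a guess.
        self.limit = limit
        self.streak = 0  # the "n" of the description above

    def record(self, elapsed_secs, estimated_secs):
        """Record one completed task.

        Returns True when the run of fast tasks has grown above the
        limit, i.e. when the client would flag it to the Scheduler.
        """
        # Integer-friendly form of elapsed < estimated / 20.
        if elapsed_secs * FAST_DIVISOR < estimated_secs:
            self.streak += 1
        else:
            self.streak = 0  # a normal-length task breaks the run
        return self.streak > self.limit
```

Resetting the count on any normal-length task matches "the last n tasks
... have all completed" quickly; what the Scheduler does with the flag
is left open here.
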

On the other end of the scale, if an app_version takes more than 90% of
the rsc_fpops_bound to complete a task, that probably also indicates a
developing problem. It's not a good idea to send a lot of work to a host
which may soon be unable to complete it, so using that to trigger limits
might be worthwhile too. Parenthetically, I'm assuming David's code to
adjust rsc_fpops_est is accompanied by code to ensure the bound is still
sensibly larger than the estimate.
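
The two checks in that paragraph could look roughly like this. It is a
sketch under stated assumptions: the function names are hypothetical,
work done is approximated as elapsed time multiplied by the
app_version's estimated FLOPS, and the 2x margin between bound and
estimate is an arbitrary stand-in for "sensibly larger". Only the 90%
figure comes from the text.

```python
# Hypothetical sketch, not BOINC code.
NEAR_BOUND_FRACTION = 0.9  # "more than 90% of the rsc_fpops_bound"
MIN_BOUND_RATIO = 2.0      # assumed meaning of "sensibly larger"


def near_fpops_bound(elapsed_secs, flops, rsc_fpops_bound):
    """True if the task used more than 90% of its fpops bound.

    Assumes fpops used can be approximated as elapsed_secs * flops.
    """
    return elapsed_secs * flops > NEAR_BOUND_FRACTION * rsc_fpops_bound


def bound_is_sane(rsc_fpops_est, rsc_fpops_bound):
    """True if the bound still leaves the assumed margin over the estimate."""
    return rsc_fpops_bound >= MIN_BOUND_RATIO * rsc_fpops_est
```

A task that trips near_fpops_bound would be a candidate for the same
kind of limits as the fast-finish case.
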
--
Joe