I have been thinking some about what statistics we actually want to capture
for both work fetch and CPU scheduling.

For CPU scheduling, we want the worst case, and for work fetch we want the
average case.  CPU scheduling should be all about ensuring that deadlines
are met, and work fetch should be all about ensuring that there is enough
work on the computer to get through the disconnected period.

Unfortunately, just using the maximum DCF ever recorded does not allow for
changing situations at projects.  I would suggest that we keep a list
spanning at least 5x the maximum time from task download to deadline, at
least a month, and containing at least one entry for each project
application version.  When a task is completed, its actual DCF is inserted
into the list, and any task already in the list with a recorded DCF less
than that of the just-completed task is removed.  At the same time, any
task whose recorded completion date is older than the discard date is also
removed.  The current DCF for an application is then: 1 if no DCF is
recorded for any application of the project; the maximum DCF recorded for
the application, if one exists; or the maximum DCF recorded for the
project, if the application is new.
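
The steps above amount to a monotonic list: because every insert removes
smaller values first, the list stays non-increasing in DCF and ordered by
completion time, so the front entry is always the maximum over the
retention window.  A minimal sketch, with hypothetical names and a single
fixed retention window standing in for the per-application rules:

```cpp
#include <deque>

// One retained sample: the actual DCF of a finished task and when it finished.
struct DcfEntry {
    double dcf;           // actual duration correction factor of the task
    double completed_at;  // completion time in seconds (any monotonic clock)
};

class DcfHistory {
    std::deque<DcfEntry> entries;  // non-increasing in dcf, ascending in time
    double retention;              // discard window, e.g. max(5x download-to-deadline, one month)
public:
    explicit DcfHistory(double retention_secs) : retention(retention_secs) {}

    // Record a completed task's actual DCF at time `now`.
    void on_task_completed(double dcf, double now) {
        // Remove tasks already in the list whose recorded DCF is smaller...
        while (!entries.empty() && entries.back().dcf < dcf) {
            entries.pop_back();
        }
        entries.push_back({dcf, now});
        // ...and tasks whose completion date is older than the discard date.
        while (!entries.empty() && entries.front().completed_at < now - retention) {
            entries.pop_front();
        }
    }

    // Current DCF: the maximum retained value, or 1 if nothing is recorded.
    double current_dcf() const {
        return entries.empty() ? 1.0 : entries.front().dcf;
    }
};
```

Since removals from the back keep the entries in completion-time order, the
age-based purge only ever needs to look at the front.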

For work fetch, mean DCF is probably what we want to use even if the
distribution is not approximately normal.  I would suggest using the
current DCF calculation without the quick-rise part, so the number will
drift towards a new value if the project changes its fpops estimate.

jm7

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
