I have been thinking some about what statistics we actually want to capture for both work fetch and CPU scheduling.
For CPU scheduling we want the worst case, and for work fetch we want the average case. CPU scheduling should be all about ensuring that deadlines are met, and work fetch should be all about ensuring that there is enough work on the computer to get through the disconnected period.

Unfortunately, just using the maximum DCF ever recorded does not allow for changing situations at projects. I would suggest that we keep a list that spans a duration of at least 5x the maximum time from task download to deadline, at least a month, and at least one entry for each project application version. When any task is completed, its actual DCF is inserted into the list, and the list is truncated: every task already in the list whose recorded DCF is less than the DCF of the task just completed is removed. Also, at this time any task whose recorded completion date is older than the discard date would be removed from the list.

The current DCF for an application is then: 1 if there is no DCF recorded for any application for the project; the maximum DCF recorded for the application, if one exists; or the maximum DCF recorded for the project, if it is a new application.

For work fetch, using the mean DCF is probably what we want, even if the distribution is not approximately normal. I would suggest that we use the current DCF calculation for that, without the quick-rise part of the calculation, so the number will drift towards a new value if the project changes its fpops estimate.

jm7

_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
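To make the list-maintenance and lookup rules concrete, here is a minimal sketch in Python. All names (`DcfHistory`, `record`, `current_dcf`) are mine, not BOINC client code, and the sketch does not enforce the "at least one entry per application version" floor during pruning:

```python
from dataclasses import dataclass

@dataclass
class DcfEntry:
    dcf: float          # actual DCF observed for the completed task
    completed: float    # completion time, seconds since epoch
    app_version: str    # project application version that ran the task

class DcfHistory:
    """Per-project DCF history, as described above: inserting a completed
    task removes all older entries with a smaller DCF, and entries older
    than the discard date are expired."""

    def __init__(self, retention_secs):
        self.retention_secs = retention_secs
        self.entries = []   # insertion (completion-time) order

    def record(self, dcf, completed, app_version):
        # Drop every entry whose DCF is below the new one: it can never
        # again be the maximum, so the list stays non-increasing in DCF.
        self.entries = [e for e in self.entries if e.dcf >= dcf]
        self.entries.append(DcfEntry(dcf, completed, app_version))
        # Expire entries older than the discard date.
        discard = completed - self.retention_secs
        self.entries = [e for e in self.entries if e.completed >= discard]

    def current_dcf(self, app_version):
        if not self.entries:
            return 1.0      # no DCF recorded for any application
        app = [e.dcf for e in self.entries if e.app_version == app_version]
        if app:
            return max(app)  # max DCF recorded for this application
        # New application: fall back to the project-wide maximum.
        return max(e.dcf for e in self.entries)
```

A worked example: after recording DCFs 1.2 and 0.8 for one application version, the current DCF is 1.2; recording 2.0 prunes both earlier entries, and a brand-new application version inherits the project maximum of 2.0.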
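The "without the quick-rise part" suggestion amounts to using the same exponential smoothing in both directions, rather than jumping straight up when a task runs longer than estimated. A minimal sketch, where the 10% smoothing weight is an illustrative assumption and not the client's actual constant:

```python
def update_work_fetch_dcf(current, raw, weight=0.1):
    """Drift-only DCF update for work fetch: no quick-rise branch.

    Whether the observed (raw) DCF is above or below the current value,
    the estimate moves only a fixed fraction of the gap per completed
    task, so it drifts toward a new level if the project changes its
    fpops estimates. The weight of 0.1 is assumed for illustration.
    """
    return current + weight * (raw - current)
```

Under this rule, if a project's tasks suddenly start completing with a raw DCF of 2.0 while the estimate sits at 1.0, the estimate rises geometrically toward 2.0 over subsequent completions instead of jumping there on the first one.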
