[email protected] wrote:
[...]
> I really don't understand the reasoning of schedule(job) or reschedule.
> There can never be a perfect understanding of what just changed in the
> system because one of the things that changes all of the time is the
> estimated remaining runtime of tasks, and this is one of the items that
> needs to drive the calculation of what is going to miss deadline. What is
> going to miss deadline depends on all of the other tasks on the host, and a
> single task cannot be isolated from the rest for this test.

There appears to be a phenomenal amount of effort, both in programming and in scheduler CPU time, going into trying to meet all deadlines exactly, down to the last millisecond and for all eventualities.

We should never run a system to be so finely deadline critical. It becomes overly difficult and unstable.

Perhaps the deadline exactness, and the attempt to meet all scenarios precisely, should be relaxed somewhat? "Chill out" on the scheduler and the development effort? In other words, move to a KISS solution?

New rule: If we are going to accept that the project servers are going to be, or can be, unreasonable, then the client must have the option to be equally obnoxious (but completely honest) and reject WUs that are unworkable, rather than attempting a critical futility (and failing days later).

Add a bit of margin and then you can have only the TSI and user suspend/release as your scheduler trigger events. The work fetch and send then just independently keeps the WU cache filled but not overfilled.

The next question is whether it is better/simpler to keep one FIFO (a linked list) for all resource instances of the same type, or a FIFO per resource instance.

Immediately junk unstarted WUs for resend elsewhere if deadline trouble ensues, rather than panic and try to scrape them in late.

That will also implement a natural feedback loop for project admins to get in tune with what deadlines are reasonable for their WUs. They will see from their return rate whether they are being too aggressive compared to the capability and cache sizes/lengths of their volunteers.

In my view, the present system is wide open to queue-jumping abuse by a project demanding short deadlines... (Just set a silly short deadline, put everyone into EDF, and accept all the results back regardless of when they come.)

Simple?

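To make the "FIFO plus margin, junk what cannot make it" idea concrete, here is a rough sketch in Python. It is not BOINC client code: the names `WorkUnit`, `triage_fifo` and `margin` are made up for illustration, and it assumes a single FIFO for one resource instance, with runtime estimates and deadlines in seconds on the same clock.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class WorkUnit:
    # Hypothetical record; the field names are illustrative, not BOINC's.
    name: str
    est_runtime: float   # estimated remaining runtime, in seconds
    deadline: float      # absolute deadline, same clock as `now`
    started: bool = False


def triage_fifo(queue, now, margin=0.1):
    """Walk one FIFO in order and split it into (kept, rejected).

    A WU is kept if its projected finish time, padded by `margin`,
    is within its deadline. An unstarted WU that would miss is
    rejected so the server can resend it elsewhere. A started WU is
    always kept, since work has already been invested in it.
    """
    kept, rejected = deque(), []
    t = now
    for wu in queue:
        finish = t + wu.est_runtime * (1.0 + margin)
        if wu.started or finish <= wu.deadline:
            kept.append(wu)
            t = finish  # later WUs wait behind this one
        else:
            rejected.append(wu)  # junk it now rather than fail days later
    return kept, rejected


if __name__ == "__main__":
    q = deque([
        WorkUnit("A", est_runtime=100, deadline=1000),
        WorkUnit("B", est_runtime=500, deadline=300),   # unworkable
        WorkUnit("C", est_runtime=100, deadline=400),
    ])
    kept, rejected = triage_fifo(q, now=0.0)
    print("kept:", [w.name for w in kept])
    print("rejected:", [w.name for w in rejected])
```

The check would only need to run on the trigger events mentioned above (the TSI and user suspend/release), and a rejected WU never delays the ones queued behind it. How the rejection is reported back to the project server is left out here.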
Regards,
Martin

-- 
--------------------
Martin Lomas
m_boincdev ml1 co uk.ddSPAM.dd
--------------------
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.
