[email protected] wrote:
[...]
> I really don't understand the reasoning of schedule(job) or reschedule.
> There can never be a perfect understanding of what just changed in the
> system because one of the things that changes all of the time is the
> estimated remaining runtime of tasks, and this is one of the items that
> needs to drive the calculation of what is going to miss deadline.  What is
> going to miss deadline depends on all of the other tasks on the host, and a
> single task cannot be isolated from the rest for this test.

There appears to be a phenomenal amount of effort, both in programming 
and in scheduler CPU time, going into trying to meet every deadline 
exactly, down to the last millisecond and for all eventualities.

We should never run a system that is so finely deadline-critical. It 
becomes overly difficult and unstable.

Perhaps the deadline exactness and trying to meet all scenarios 
precisely should be relaxed somewhat? "Chill-out" on the scheduler and 
development effort?

In other words, move to a KISS solution?


New rule:

If we are going to accept that the project servers are, or can be, 
unreasonable, then the client must have the option to be equally 
obnoxious (but completely honest) and reject WUs that are unworkable, 
rather than making a futile attempt (and failing days later).

Add a bit of margin and then the only scheduler trigger events you need 
are the TSI and user suspend/release. Work fetch and send then run 
independently, just keeping the WU cache filled but not overfilled.
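
As a rough sketch of the accept/reject test described above (the names 
WorkUnit and accepts, the field names, and the 10% margin are my own 
assumptions for illustration, not anything in the BOINC client):

```python
from dataclasses import dataclass

@dataclass
class WorkUnit:
    name: str
    est_runtime: float   # estimated seconds of processing needed
    deadline: float      # seconds from now until the report deadline

def accepts(queue, wu, margin=0.10):
    """Return True if wu, appended to the end of a FIFO queue, is still
    projected to finish before its deadline with some safety margin."""
    backlog = sum(q.est_runtime for q in queue)
    projected_finish = (backlog + wu.est_runtime) * (1.0 + margin)
    return projected_finish <= wu.deadline

# Two WUs already queued: 3 hours of work ahead of any newcomer.
queue = [WorkUnit("a", 3600, 86400), WorkUnit("b", 7200, 86400)]

ok = accepts(queue, WorkUnit("c", 3600, 20000))         # fits with margin
too_tight = accepts(queue, WorkUnit("d", 3600, 15000))  # honest rejection
```

The check only runs when work arrives, so it adds nothing to the 
scheduler's trigger events.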


The next question is whether it is better/simpler to keep a single FIFO 
(a linked list) for all resource instances of the same type, or one 
FIFO per resource instance.
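
To make the two layouts concrete, here is a minimal sketch (the function 
names and the round-robin assignment are assumptions of mine, purely for 
illustration):

```python
from collections import deque

def next_for_idle_instance(fifo):
    """Option 1: one shared FIFO per resource type. Whichever instance
    goes idle takes the WU at the head of the queue."""
    return fifo.popleft() if fifo else None

def assign_round_robin(wus, n_instances):
    """Option 2: one FIFO per instance. Each WU is bound to an instance
    on arrival (round-robin here, as a simple placeholder policy)."""
    fifos = [deque() for _ in range(n_instances)]
    for i, wu in enumerate(wus):
        fifos[i % n_instances].append(wu)
    return fifos

shared = deque(["wu1", "wu2", "wu3", "wu4"])
first = next_for_idle_instance(shared)

per_instance = assign_round_robin(["wu1", "wu2", "wu3", "wu4"], 2)
```

My expectation is that the shared FIFO balances itself when runtime 
estimates are off, whereas per-instance FIFOs can leave one instance 
idle while another is backed up, unless WUs are moved between queues.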

If deadline trouble ensues, immediately junk unstarted WUs so they can 
be resent elsewhere, rather than panicking to scrape them in late.
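
A minimal sketch of that culling pass, assuming a plain FIFO run order 
(Task, cull_unstarted, the field names, and the 10% margin are all 
hypothetical, not taken from the client code):

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    remaining: float      # estimated seconds of work left
    deadline: float       # seconds from now until the report deadline
    started: bool = False

def cull_unstarted(fifo, margin=0.10):
    """Walk the FIFO in run order. Keep anything already started, and any
    unstarted task still projected to finish in time; drop the rest so
    the project can resend them elsewhere."""
    kept, dropped = [], []
    backlog = 0.0
    for task in fifo:
        projected_finish = (backlog + task.remaining) * (1.0 + margin)
        if task.started or projected_finish <= task.deadline:
            kept.append(task)
            backlog += task.remaining
        else:
            dropped.append(task)
    return kept, dropped

fifo = [
    Task("a", 4000, 10000, started=True),
    Task("b", 6000, 8000),    # cannot make it behind "a"
    Task("c", 3000, 9000),    # fits once "b" is gone
]
kept, dropped = cull_unstarted(fifo)
```

Dropping "b" early is what lets "c" make its deadline without any 
panic-mode scheduling.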

That will also implement a natural feedback loop for project admins to 
get in tune with what deadlines are reasonable for their WUs. They will 
see from their return rate whether they are being too aggressive 
compared to the capability and cache sizes/lengths of their volunteers.

In my view, the present system is wide open to queue-jumping abuse by a 
project demanding short deadlines... (Just set a silly short deadline, 
put everyone into EDF, and accept all the results back whenever they 
arrive.)


Simple?

Regards,
Martin


-- 
--------------------
Martin Lomas
m_boincdev ml1 co uk.ddSPAM.dd
--------------------