Maybe, but much of the rest of your discussion covers points I have been
trying to make about my 4 core and better machines.

For example, your point #1 below is one I have been making for over a
year.  On multi-core machines, particularly those with 4 cores or more (and
the higher the core count, the less sense the preemptions make), preemption
of tasks occurs far more often than it should.  One of the causes of this
appears to have been addressed in 6.10.29, judging from the situation I saw
arise on IBERCIVIS the other day ...

One of the simpler ways to address this would be to adjust the interval
between running the scheduler and enforcing the schedule.  For example, on
high core count machines with work on hand, I certainly question the point
of rescheduling everything on the completion of each download.
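One way to avoid rescheduling on every download completion is to coalesce
the requests so the scheduler runs at most once per interval.  A minimal
sketch of that idea; the class and its fields are hypothetical, not BOINC
client code:

```python
class RescheduleLimiter:
    """Coalesce reschedule requests so the CPU scheduler runs at most
    once per min_interval seconds (hypothetical helper, not BOINC code)."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last_run = float("-inf")
        self.pending = False

    def request(self, now):
        """Called on events such as a completed download."""
        if now - self.last_run >= self.min_interval:
            self.last_run = now
            self.pending = False
            return True          # run the scheduler immediately
        self.pending = True      # defer; a later tick will pick it up
        return False

    def tick(self, now):
        """Periodic check: run a deferred reschedule once the interval
        has elapsed."""
        if self.pending and now - self.last_run >= self.min_interval:
            self.last_run = now
            self.pending = False
            return True
        return False
```

With a 60-second interval, a burst of download completions triggers one
immediate reschedule and a single deferred one, instead of one per file.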

I have more or less proved this with a TSI set to 12 hours, and I don't
miss deadlines.

On Jan 15, 2010, at 6:25 AM, [email protected] wrote:

> Doesn't help dual core machines.  It is on a dual that the problem is
> currently happening.
> 
> jm7
> 
> On Jan 15, 2010, at 6:15 AM, [email protected] wrote:

> The problem with least slack is that we do not always have good estimates,
> and sometimes a task's run time is grossly overestimated, making its slack
> negative.  Every other task then has to reach negative slack itself before
> it runs, so tasks with earlier deadlines are condemned to be late.  Least
> slack may work if the run times are known exactly, but it has a tendency to
> come apart when the slack times are not known very well.  Least slack will
> not solve the problem I am seeing on my machine at the moment.
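The failure mode described above can be shown with a toy least-slack
ordering; the run times and deadlines below are invented for illustration:

```python
def slack(task, now):
    """Slack = time to deadline minus estimated remaining run time."""
    return task["deadline"] - now - task["est_remaining"]

def least_slack_order(tasks, now):
    """Least-slack-first: most negative slack runs first."""
    return sorted(tasks, key=lambda t: slack(t, now))

now = 0
tasks = [
    # A's run time is grossly overestimated, so its slack is negative.
    {"name": "A", "deadline": 10_000, "est_remaining": 50_000},
    {"name": "B", "deadline": 2_000,  "est_remaining": 1_000},
]
order = least_slack_order(tasks, now)
# A (slack -40000) jumps ahead of B (slack +1000) even though
# B's deadline is five times earlier.
```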
> 
> The EDF simulation was suggested in order to not do preemption in many
> cases where it was not needed.  EDF is still needed in these cases, but
> preemption is not.  There are a couple of reasons for this.
> 
> 1)  I have watched CPDN get 15 second time slices because tasks from a
> different project needed EDF immediately after download.  They would
> complete on time if they waited for something else to finish its normal
> time slot, but they do preempt and get instant CPU.
> 2)  I have a large number of tasks with 0 to 5 seconds of CPU time left
> from the other project taking up swap file space.  These could very well
> have been completed.  Note that when a task is running, the project is
> getting 100% of a CPU, even if that is more than the resource fraction for
> that project.  This means that the task will exit the deadline danger
> sometime during its run even without an overestimate of the run time.
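The point in #1, that a deadline task can often simply wait for the running
task's normal time slot, amounts to a check like the following; the fields
and the simplified model are assumptions, not the client's actual EDF
simulation:

```python
def needs_preemption(running, deadline_task, now):
    """Return True only if the deadline-trouble task cannot afford to
    wait for the running task to finish its current time slot.
    Simplified single-CPU model with invented field names."""
    finish_of_running = now + running["remaining_slice"]
    # Latest moment the deadline task can start and still finish in time.
    latest_start = deadline_task["deadline"] - deadline_task["est_remaining"]
    return finish_of_running > latest_start
```

In the CPDN case described above, the newly downloaded task's latest start
is hours away, so the check returns False and no 15-second slicing occurs.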

My post:
> 
> Could we not get closer to this with a simpler modification?
> 
> As I understand the system as it now stands, an MP task gets **ALL** CPUs
> when it runs.  What would be wrong with throttling this down by one or
> more CPUs so that, for example, on an 8 CPU system the task only gets 6 or
> 7 CPUs, leaving the others free for use by other tasks?
> 
> I suggested this before as a preference setting where the participant can
> set not only the total number of CPUs to be used by BOINC, but also the
> maximum number applied to any one MP task.  Like the current setting, this
> could be a percentage.  So I could say that no MP task may use more than
> 50% of the CPUs; on my quad that would mean it could use 2, and on the i7s
> it could use only 4 cores.
> 
> At 75% it would be 3 and 6 respectively.
> 
> I mean, the same task on my quad would get 4 cores while on the i7 it would
> get 8 ... why is it wrong to restrict the task on the i7 to 4 cores?
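The suggested preference reduces to a one-line computation; the setting
name here is hypothetical:

```python
import math

def max_cpus_per_mp_task(total_cpus, max_pct):
    """Cap on CPUs one multi-CPU task may use, given a percentage
    preference (hypothetical setting, analogous to the existing
    'use at most X% of the CPUs' preference)."""
    return max(1, math.floor(total_cpus * max_pct / 100))

# At 50%: a quad allows 2 cores per MP task, an 8-core i7 allows 4.
# At 75%: 3 and 6 respectively.
```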
> 
> On Jan 15, 2010, at 5:59 AM, [email protected] wrote:
> 
>> I understand very well that bin packing is NP complete.  However, we
>> usually do not have exactly enough work to fill the bins to capacity and
>> so have quite a bit of slack to play with.  The key is to use that slack
>> to our best advantage.
>> 
>> Yes, EDF all the time is a pitfall that needs to be avoided as there are
>> scenarios where long running projects get starved.  Note that in this
>> case, work fetch would be stopped until nearly the end of the scenario,
>> and no new work would be coming in.
>> 
>> Because the problem is NP complete it is very difficult to prove that
>> there is no case where there is a disaster with the proposed change.
>> However, heuristics are all about taking the probabilities of scenarios
>> and attempting to figure out which tool to use in each scenario.
>> 
>> Since the scenario that I indicated is actually happening on one of my
>> machines only a few weeks after the change to the scheduler, it has to be
>> at least moderately common.
>> 
>> If there is a set of tasks in deadline trouble, we need to run those
>> tasks in EDF in order to meet deadlines.  Giving multi CPU tasks higher
>> priority in this list because they are multi CPU tasks is a way of
>> ensuring an arbitrary number of tasks are an arbitrary amount of time
>> late.  If the scenario was such that the multi CPU task had an earlier
>> deadline than one of the single CPU tasks, the multi CPU task would run
>> in preference to that single CPU task.
>> 
>> The only scenarios I can think of that get into trouble by running a
>> single CPU task with an earlier deadline in preference to a multiple CPU
>> task with a later deadline, where both are in deadline trouble, also have
>> the feature that there is no way at all to do the packing such that both
>> deadlines can be met under any circumstances, i.e. impossible situations
>> that we need to get work fetch to avoid for us.
>> 
>> The CPU scheduling heuristic that we used up until multi CPU tasks were
>> added has been:
>> 
>> If there are tasks in deadline trouble
>>   Run tasks in EDF.
>> If there are any CPUs still available
>>   Run some tasks in RR.
>> 
>> Currently it appears to be:
>> 
>> If there is a multi CPU task in deadline trouble
>>   Run the multi CPU task
>> else if there are any single CPU tasks in deadline trouble
>>   Run the single CPU tasks in EDF.
>> 
>> If there are any CPUs remaining and there is a multi CPU task that is in
>> the top N tasks
>>   run the multi CPU task.
>> else if there are any CPUs remaining
>>   run some single CPU tasks in RR.
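The original heuristic above can be sketched as runnable code; the task
fields and the deadline-trouble predicate here are simplified assumptions,
not the actual client logic:

```python
def schedule(tasks, ncpus, in_trouble):
    """Original heuristic: fill CPUs with deadline-trouble tasks in EDF
    order, then fill any remaining CPUs round-robin (least recent CPU
    time first).  `in_trouble` is a caller-supplied predicate; all
    details are simplified for illustration."""
    chosen = []
    troubled = sorted((t for t in tasks if in_trouble(t)),
                      key=lambda t: t["deadline"])
    chosen.extend(troubled[:ncpus])
    if len(chosen) < ncpus:
        rest = sorted((t for t in tasks if t not in chosen),
                      key=lambda t: t["recent_cpu"])
        chosen.extend(rest[:ncpus - len(chosen)])
    return [t["name"] for t in chosen]
```

Note there is no special case for multi-CPU tasks: a task in trouble runs
purely on its deadline, which is the point of moving back to this scheme.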
>> 
>> All I am proposing to do is to move back closer to the original heuristic
>> (which seemed to be working fairly well until multi CPU tasks were
>> added).
>> The current method of determining what to run has a proven major problem.
>> 
>> BTW, rather than returning work late, I have currently suspended
>> processing on the multi CPU task in order to complete the tasks that are
>> due today and have not been started yet.  All CPUs are currently busy
>> doing work, and doing it this way will allow the work to be returned.
>> 
>> jm7
>> 
>> 
>> 
>> From: David Anderson <[email protected]>
>> Sent by: [email protected]
>> To: [email protected]
>> Cc: [email protected]
>> Date: 01/14/2010 04:28 PM
>> Subject: Re: [boinc_dev] CPU prioritization.
>> 
>> [email protected] wrote:
>>> I was thinking about the problem I was seeing, and I came up with a
>> simple
>>> scenario to demonstrate the problem.
>>> 
>> ...
>> 
>> The underlying problem is that bin-packing
>> (or multiprocessor scheduling) is NP-complete.
>> EDF is non-optimal in some cases on multiprocessors.
>> No matter what scheduling policy we use
>> (short of exponential-time exhaustive search)
>> there will be scenarios where it misses deadlines
>> that could be met by some other schedule.
>> This problem exists whether or not there are multi-CPU jobs.
>> 
>> In the example you give, suppose we reach the point where
>> we have a multi-CPU job and a 1-CPU both in deadline trouble.
>> Things work out better if we run the 1-CPU job.
>> But how do we know this?  What policy says so?
>> If there is such a policy, how do we know it doesn't
>> fail disastrously in other scenarios?
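David's point that EDF is non-optimal on multiprocessors can be seen in a
three-job example (all numbers invented): running the two earliest-deadline
jobs first starves the long job, while running the long job first meets
every deadline.

```python
import heapq

def finish_times(jobs, ncpus):
    """Non-preemptive list scheduling: assign each job, in the given
    order, to the earliest-free CPU; return finish time per job name."""
    cpus = [0.0] * ncpus
    heapq.heapify(cpus)
    done = {}
    for name, length, deadline in jobs:
        start = heapq.heappop(cpus)
        heapq.heappush(cpus, start + length)
        done[name] = start + length
    return done

jobs = [("A", 3, 3), ("B", 1, 2), ("C", 1, 2)]  # (name, run time, deadline)
edf = sorted(jobs, key=lambda j: j[2])          # EDF order: B, C, A
alt = [("A", 3, 3), ("B", 1, 2), ("C", 1, 2)]   # run the long job first

edf_done = finish_times(edf, 2)  # A finishes at 4, past its deadline of 3
alt_done = finish_times(alt, 2)  # every job meets its deadline
```

On two CPUs, EDF fills both with B and C, forcing A to start at time 1 and
miss; starting A first leaves room for B and C on the other CPU.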
>> _______________________________________________
>> boinc_dev mailing list
>> [email protected]
>> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
>> To unsubscribe, visit the above URL and
>> (near bottom of page) enter your email address.
