On Apr 25, 2009, at 1:48 AM, Richard Haselgrove wrote:

> Paul D. Buck wrote
>
>> Again, the point being that we still have the issue where BOINC
>> continually changes its mind about what should be running.
>>
>> One of the causes is that we call the scheduling routine so
>> often. But that is not all of it. Even if we do throttle the
>> number of calls to this routine it will not stop the switching,
>> because the deadline rules are still skee-woof, but it will help
>> reduce the onset slightly.
>
> Again, I disagree.
>
> 'Request CPU reschedule' doesn't (shouldn't) automatically mean
> that a change will happen - I snipped many occurrences from my logs
> where the result was a NOP. 'Request CPU reschedule' calls a TEST to
> see IF a reschedule is necessary/appropriate. Jiggling with the
> number of times the test is run won't make the slightest difference
> if the test itself is flawed.

In cpu_sched at lines 812 to 816 we have the "guard" code:

    if (now - last_time > CPU_SCHED_ENFORCE_PERIOD) {
        must_enforce_cpu_schedule = true;
    }
    if (!must_enforce_cpu_schedule) return false;
    must_enforce_cpu_schedule = false;

which SEEMS intended to limit calls to this routine to
once per 60 seconds. I am not clear on this ... I can only find
"must_enforce_cpu_schedule" 4 times in the file. And the logic and
scoping rules *I* know say this should not work at all ... which may
be why it does not work at all ...
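
One possible reading of that guard, sketched here as an assumption
rather than a statement about the real client: if something else sets
must_enforce_cpu_schedule whenever a "Request enforce CPU schedule"
message is logged, then the elapsed-time test only forces a pass after
60 seconds of quiet, and it never blocks a pass that was requested.
The struct, the request_enforce_schedule() helper, the run counter and
the last_time update are hypothetical; only the flag name, the constant
name and the guard itself come from the excerpt above.

```cpp
#include <cassert>

// 60 seconds, as described in the post; the real value is not shown in the excerpt.
constexpr double CPU_SCHED_ENFORCE_PERIOD = 60.0;

// Hypothetical stand-in for the client state, for illustration only.
struct SchedulerSketch {
    bool must_enforce_cpu_schedule = false;
    double last_time = 0.0;
    int runs = 0;  // how many times the body past the guard was reached

    // Assumed: what a "Request enforce CPU schedule" log line corresponds to.
    void request_enforce_schedule() { must_enforce_cpu_schedule = true; }

    bool enforce_schedule(double now) {
        assert(now >= last_time);  // time is expected to move forward

        // The guard as quoted.
        if (now - last_time > CPU_SCHED_ENFORCE_PERIOD) {
            must_enforce_cpu_schedule = true;
        }
        if (!must_enforce_cpu_schedule) return false;
        must_enforce_cpu_schedule = false;

        // Assumed bookkeeping; the excerpt does not show where last_time is set.
        last_time = now;
        ++runs;
        return true;
    }
};
```

Under that assumption every request gets through, however recent the
previous pass was, so the guard acts as a "run at least once per
period" rule and not as a "run at most once per period" limit.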

Whatever the intent, it is not working as it should. For one thing, it
does not limit entry into this routine to a maximum of once per
60 seconds ... which would be the wrong rule in any case. It should
be once per 60 seconds unless there is an idle resource ...
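
A minimal sketch of the rule proposed here, assuming the caller already
knows whether a resource is idle. The function name, its parameters and
the kEnforcePeriod constant are hypothetical and are not taken from the
BOINC sources.

```cpp
#include <cassert>

// Assumed period of 60 seconds, matching the figure used in this post.
constexpr double kEnforcePeriod = 60.0;

// Hypothetical helper: returns true when a scheduling pass should run.
// An idle resource always allows a pass; otherwise at most one pass
// is allowed per kEnforcePeriod seconds.
bool should_run_schedule(double now, double last_run, bool idle_resource) {
    assert(now >= last_run);  // time is expected to move forward
    if (idle_resource) return true;
    return now - last_run >= kEnforcePeriod;
}
```

With this shape, a checkpoint 10 seconds after the last pass would be
ignored unless something is sitting idle.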



As proof of my contention that this code is not working look at the  
following "trimmed" log:

4/25/2009 12:17:11 PM           Re-reading cc_config.xml
4/25/2009 12:17:11 PM           [cpu_sched_debug] Request CPU reschedule: Core client configuration
4/25/2009 12:17:12 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:12 PM           [cpu_sched_debug] schedule_cpus(): start
4/25/2009 12:17:12 PM           [cpu_sched_debug] Request enforce CPU schedule: schedule_cpus
4/25/2009 12:17:12 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:12 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:14 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:14 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:14 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:14 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:16 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:16 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:16 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:16 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:16 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:19 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:19 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:19 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:21 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:21 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:21 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:30 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:30 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:30 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:30 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:46 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:46 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:46 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:48 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:48 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:48 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:53 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:53 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:53 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:17:55 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:55 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:55 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:18:11 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:11 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:12 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:18:17 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:17 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:17 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:17 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:17 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:18:19 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:19 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:19 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:18:36 PM           [cpu_sched_debug] Request CPU reschedule: application exited
4/25/2009 12:18:36 PM   rose...@home    Computation for task frb_0_8_mike_chosen_cst_hb_t331__IGNORE_THE_REST_1DNLA_7_11068_50_0 finished
4/25/2009 12:18:36 PM           [cpu_sched_debug] Request CPU reschedule: handle_finished_apps
4/25/2009 12:18:36 PM           [cpu_sched_debug] schedule_cpus(): start
4/25/2009 12:18:36 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:36 PM   PrimeGrid       Resuming task ap26_4523565_4523568_0_0 using ap26 version 102
4/25/2009 12:18:36 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:18:37 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:37 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:37 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:18:38 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:38 PM   rose...@home    Started upload of frb_0_8_mike_chosen_cst_hb_t331__IGNORE_THE_REST_1DNLA_7_11068_50_0_0
4/25/2009 12:18:38 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:38 PM           [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:39 PM   rose...@home    Sending scheduler request: To fetch work.
4/25/2009 12:18:39 PM   rose...@home    Requesting new tasks
4/25/2009 12:18:40 PM           [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:40 PM           [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:40 PM           [cpu_sched_debug] enforce_schedule: end

4/25/2009 12:18:42 PM   rose...@home    Finished upload of frb_0_8_mike_chosen_cst_hb_t331__IGNORE_THE_REST_1DNLA_7_11068_50_0_0
4/25/2009 12:18:45 PM   rose...@home    Scheduler request completed: got 1 new tasks
4/25/2009 12:18:45 PM           Re-reading cc_config.xml

In this minute and a half (12:17:11 to 12:18:45, about 94 seconds) the
routine was entered 17 times ... on average about once every 5 to 6
seconds, nowhere near once per 60 seconds.
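
For anyone who wants to repeat the count on a longer log, a small
sketch of the arithmetic follows. The two helper functions are
hypothetical; the only things taken from the log above are the marker
text "enforce_schedule(): start" and the window from 12:17:11 to
12:18:45.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count non-overlapping occurrences of `needle` in `log`.
std::size_t count_occurrences(const std::string& log, const std::string& needle) {
    assert(!needle.empty());  // an empty needle would never advance
    std::size_t count = 0;
    for (std::size_t pos = log.find(needle); pos != std::string::npos;
         pos = log.find(needle, pos + needle.size())) {
        ++count;
    }
    return count;
}

// Average number of seconds per pass over an observation window.
double average_interval(double window_seconds, std::size_t passes) {
    assert(passes > 0);  // avoid dividing by zero
    return window_seconds / static_cast<double>(passes);
}
```

Counting "enforce_schedule(): start" in the trimmed log gives 17
passes, and average_interval(94.0, 17) works out to roughly 5.5
seconds per pass.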

I will be happy to send the full log of this time period to anyone who
wants it ...

If we can just cut down on the lunacy I might be able to start to make  
sense out of the behavior of the system and be able to get better logs  
so we can then pursue the issue of why the rules are so bad.  Now,  
with this obsessive calling to re-schedule I may have to wade through  
100-1,000 of these iterations just to find the key elements.

I know, Richard, you don't see as much need for this, but this is the
reason I cannot do more on the other end.

Oh, and I won't even mention the waste of compute time doing nothing
every few seconds ...