On Apr 25, 2009, at 1:48 AM, Richard Haselgrove wrote:
> Paul D. Buck wrote
>
>> Again, the point being that we still have the issue where BOINC
>> continually changes its mind about what should be running.
>>
>> One of the causes is that we call the scheduling routine so
>> often. But that is not all of it. Even if we do throttle the
>> number of calls to this routine it will not stop the switching
>> because the deadline rules are still skew-whiff, but it will help
>> reduce the onset slightly.
>
> Again, I disagree.
>
> 'Request CPU reschedule' doesn't (shouldn't) automatically mean
> that a change will happen - I snipped many occurrences from my logs
> where the result was NOP. 'Request CPU reschedule' calls a TEST to
> see IF a reschedule is necessary/appropriate. Jiggling with the
> number of times the test is run won't make the slightest difference
> if the test itself is flawed.

In cpu_sched at lines 812 to 816 we have the "guard" code:

    if (now - last_time > CPU_SCHED_ENFORCE_PERIOD) {
        must_enforce_cpu_schedule = true;
    }
    if (!must_enforce_cpu_schedule) return false;
    must_enforce_cpu_schedule = false;

which SEEMS intended to limit calls to this routine to once per 60
seconds. I am not clear on this ... I can find
"must_enforce_cpu_schedule" only 4 times in the file, and the logic and
scoping rules *I* know say this should not work at all ... which may
be why it does not work at all ...
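
To make the question concrete, here is a minimal sketch of how a guard
of this shape behaves IF the flag is persistent state that other code
can also set. That is an assumption on my part, not something the five
quoted lines confirm. The struct, request_enforce_schedule(), the
"last_time = now" update and simulate() are all hypothetical stand-ins;
only must_enforce_cpu_schedule, last_time and CPU_SCHED_ENFORCE_PERIOD
come from the quoted code.

```cpp
// Sketch only: a stand-in for the client state, not the BOINC source.
constexpr double CPU_SCHED_ENFORCE_PERIOD = 60.0;

struct SchedulerSketch {
    bool must_enforce_cpu_schedule = false;  // assumed to persist between calls
    double last_time = 0.0;                  // assumed: time of the last full pass
    int passes = 0;                          // passes that got beyond the guard

    // Assumed counterpart of the "Request enforce CPU schedule" log lines.
    void request_enforce_schedule() { must_enforce_cpu_schedule = true; }

    // Same guard shape as the quoted code.
    bool enforce_schedule(double now) {
        if (now - last_time > CPU_SCHED_ENFORCE_PERIOD) {
            must_enforce_cpu_schedule = true;
        }
        if (!must_enforce_cpu_schedule) return false;
        must_enforce_cpu_schedule = false;
        last_time = now;  // assumption: the timer restarts after a full pass
        ++passes;
        return true;
    }
};

// Polls once per second for `seconds` seconds. A request is raised every
// `request_every` seconds (0 means no requests). Returns the pass count.
int simulate(int seconds, int request_every) {
    SchedulerSketch s;
    for (int t = 1; t <= seconds; ++t) {
        if (request_every > 0 && t % request_every == 0) {
            s.request_enforce_schedule();
        }
        s.enforce_schedule(static_cast<double>(t));
    }
    return s.passes;
}
```

Under these assumptions the timer alone gives one pass per period, but
every request gets straight past the guard, so the period is only a
ceiling on how long the routine can go WITHOUT running.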

Whatever the intent, it is not working as it should. For one thing, it
does not limit entry into this routine to a maximum of once per
60 seconds ... which would be the wrong rule in any case. It should
be once per 60 seconds unless there is an idle resource ...
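
As a sketch of the rule I am proposing (not anything that exists in the
client today), the check could look something like this. The function
name, its parameters and the constant are hypothetical, and how the
number of idle resources would be obtained is left open.

```cpp
// Sketch of the proposed rule: at most one pass per period, unless a
// resource is sitting idle, in which case run right away.
constexpr double ENFORCE_PERIOD_SECONDS = 60.0;  // hypothetical name

bool should_enforce(double now, double last_enforced, int idle_resources) {
    if (idle_resources > 0) return true;  // never leave a resource idle
    return now - last_enforced >= ENFORCE_PERIOD_SECONDS;
}
```

With nothing idle, a call 10 seconds after the last pass is refused and
a call 60 seconds after it is allowed; with one idle resource the call
is allowed immediately.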

As proof of my contention that this code is not working, look at the
following "trimmed" log:

4/25/2009 12:17:11 PM Re-reading cc_config.xml
4/25/2009 12:17:11 PM [cpu_sched_debug] Request CPU reschedule: Core client configuration
4/25/2009 12:17:12 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:12 PM [cpu_sched_debug] schedule_cpus(): start
4/25/2009 12:17:12 PM [cpu_sched_debug] Request enforce CPU schedule: schedule_cpus
4/25/2009 12:17:12 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:12 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:14 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:14 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:14 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:14 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:16 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:16 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:16 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:16 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:16 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:19 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:19 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:19 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:21 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:21 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:21 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:30 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:30 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:30 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:30 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:46 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:46 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:46 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:48 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:48 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:48 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:53 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:53 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:53 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:17:55 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:17:55 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:17:55 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:11 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:11 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:12 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:17 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:17 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:17 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:17 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:17 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:19 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:19 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:19 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:36 PM [cpu_sched_debug] Request CPU reschedule: application exited
4/25/2009 12:18:36 PM rose...@home Computation for task frb_0_8_mike_chosen_cst_hb_t331__IGNORE_THE_REST_1DNLA_7_11068_50_0 finished
4/25/2009 12:18:36 PM [cpu_sched_debug] Request CPU reschedule: handle_finished_apps
4/25/2009 12:18:36 PM [cpu_sched_debug] schedule_cpus(): start
4/25/2009 12:18:36 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:36 PM PrimeGrid Resuming task ap26_4523565_4523568_0_0 using ap26 version 102
4/25/2009 12:18:36 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:37 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:37 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:37 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:38 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:38 PM rose...@home Started upload of frb_0_8_mike_chosen_cst_hb_t331__IGNORE_THE_REST_1DNLA_7_11068_50_0_0
4/25/2009 12:18:38 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:38 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:39 PM rose...@home Sending scheduler request: To fetch work.
4/25/2009 12:18:39 PM rose...@home Requesting new tasks
4/25/2009 12:18:40 PM [cpu_sched_debug] Request enforce CPU schedule: Checkpoint reached
4/25/2009 12:18:40 PM [cpu_sched_debug] enforce_schedule(): start
4/25/2009 12:18:40 PM [cpu_sched_debug] enforce_schedule: end
4/25/2009 12:18:42 PM rose...@home Finished upload of frb_0_8_mike_chosen_cst_hb_t331__IGNORE_THE_REST_1DNLA_7_11068_50_0_0
4/25/2009 12:18:45 PM rose...@home Scheduler request completed: got 1 new tasks
4/25/2009 12:18:45 PM Re-reading cc_config.xml

In this minute and a half the routine was called ~17 times ... or more
like once every 5 seconds, nowhere near once per 60 seconds.
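
For anyone who wants to repeat the count on their own log, a rough
sketch follows. It counts the lines containing a marker such as
"enforce_schedule(): start" and divides the elapsed seconds by that
count. The function names are mine and purely illustrative, and the
elapsed time is passed in by hand rather than parsed from the
timestamps. In the trimmed log above the marker appears 17 times
between 12:17:12 and 12:18:40, which is 88 seconds.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts the log lines that contain `marker`.
std::size_t count_calls(const std::vector<std::string>& lines,
                        const std::string& marker) {
    std::size_t n = 0;
    for (const std::string& line : lines) {
        if (line.find(marker) != std::string::npos) ++n;
    }
    return n;
}

// Mean spacing in seconds when `calls` calls are spread over
// `span_seconds`. Returns 0 when there were no calls.
double mean_spacing(double span_seconds, std::size_t calls) {
    return calls == 0 ? 0.0 : span_seconds / static_cast<double>(calls);
}
```

Feeding in 88 seconds and 17 calls gives a spacing of a little over
5 seconds per call.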

If anyone wants the full log for this time period, I will be happy to
send it ...

If we can just cut down on the lunacy I might be able to start to make
sense of the behavior of the system and get better logs, so we can
then pursue the issue of why the rules are so bad. As it is, with this
obsessive calling to re-schedule, I may have to wade through 100 to
1,000 of these iterations just to find the key elements.

I know, Richard, you don't see as much need for this, but this is the
reason I cannot do more on the other end.

Oh, and I won't even mention the waste of compute time doing nothing
once every 5 seconds or so ...
_______________________________________________
boinc_dev mailing list
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev