On Apr 27, 2009, at 12:42 PM, [email protected] wrote:
> We need to run the check for what should be run on a variety of events
> and at least once every task switch interval.
An argument you have made before, and not defended with any evidence,
if I may return the favor.
Your only statement is that it doesn't hurt anything to do the test.
I should put you in the position of having to prove that negative...
but I won't. But if it does not hurt anything to waste the time, then
I can use the same argument to push for more accurate benchmarks run
every hour. It won't hurt to run them... we would have more accurate
numbers... and they don't take that long to run... heck, why not every
half hour? Even better...
> We need to do schedule enforcement on a different set of events, one
> of which is going to occur very frequently - checkpoint.
And all we need to do is check whether the application that has just
checkpointed is at its TSI. There is no, repeat, no need to do
anything else. Yet we do the full monty.
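Something like this is all the checkpoint path would need. A minimal
sketch, assuming hypothetical names (ACTIVE_TASK as written here,
on_task_checkpoint, slice_start, and TSI_SECONDS are my inventions,
not the actual client code):

    #include <ctime>

    // Hypothetical sketch -- these names are mine, not the client's.
    struct ACTIVE_TASK {
        time_t slice_start;             // when this task last got the resource
        bool preempt_requested = false;
    };

    const double TSI_SECONDS = 12 * 3600;   // my 12-hour task switch interval

    // Called when a task writes a checkpoint: check only THAT task
    // against the TSI.  No full reschedule of every resource.
    void on_task_checkpoint(ACTIVE_TASK& at) {
        double ran = difftime(time(nullptr), at.slice_start);
        if (ran >= TSI_SECONDS) {
            // The task has used its slice and is at a safe point to
            // preempt; swap in the next task on this one resource only.
            at.preempt_requested = true;
        }
        // Otherwise do nothing: no enforce_schedule(), no schedule_cpus().
    }

The checkpoint event only ever affects the task that checkpointed, so
checking that one task is all the work the event justifies.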
I just spent about 6 hours running a test on 5 systems and fiddling
with the logs:

- Xeon 32: dual Xeons with HT for 4 CPUs, XP Pro 32-bit, around 50
  projects, 1 CUDA GPU (9800 GT), about 3,000-4,000 CS per day w/o
  CUDA (if memory serves)
- Xeon 64: dual Xeons with HT for 4 CPUs, XP Pro 64-bit, ABC and
  YoYo, 1 ATI GPU on MW, about 3,500-4,500 CS per day w/o CUDA (if
  memory serves)
- Mac Pro: OS X, 8 cores, Xeons, about 25 projects, no GPU
- Q9300: 4 cores, XP Pro 32-bit, standard 50 projects, 1 CUDA GPU
  (GTX 280)
- i7 940: 8 CPUs, XP Pro 32-bit, standard 50 projects, 4 CUDA GPUs
  (2 GTX 295 cards), about twice as fast as the Q9300, ~10,000-12,000
  CS per day w/o CUDA (if memory serves)
I logged these systems for 3 hours while I was away; they all used
the same cc_config file. I don't feel this run is as typical as my
normal load, because of other issues I have raised (and which were
seemingly ignored there also); but at any rate I got these results
(a sketch of how such counts can be tallied from the log follows the
table):
                                                   Xeon 32  Xeon 64  Mac Pro  Q9300   i7
enforce_schedule(): start                              241      387      272    262  407
Request CPU reschedule: application exited               3       11       14     22   19
Request CPU reschedule: Core client configuration        1        1        1      1    1
Request CPU reschedule: files downloaded                 1        9       15     26   21
Request CPU reschedule: handle_finished_apps             3       11       14     22   19
Request CPU reschedule: Idle state change                2        2        2      2    6
Request CPU reschedule: Scheduling period elapsed        0        4        0      0
Request enforce CPU schedule: Checkpoint reached        86      379      129     92  302
Request enforce CPU schedule: schedule_cpus              7       17       25     47   44
schedule_cpus(): start                                   7       17       25     48   44
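For anyone who wants to reproduce the tally: the numbers above are
just counts of each message in the client log over the 3-hour window.
Here is a hypothetical sketch of the sort of counter that produces
them; the log file name (stdoutdae.txt) is an assumption, and the
message strings are the ones from the table:

    #include <fstream>
    #include <iostream>
    #include <map>
    #include <string>
    #include <vector>

    // Hypothetical sketch: tally scheduler-related messages in a BOINC
    // client log.  The log path is an assumption; adjust to whatever
    // your client actually writes.
    int main() {
        const std::vector<std::string> reasons = {
            "enforce_schedule(): start",
            "Request CPU reschedule: application exited",
            "Request CPU reschedule: Core client configuration",
            "Request CPU reschedule: files downloaded",
            "Request CPU reschedule: handle_finished_apps",
            "Request CPU reschedule: Idle state change",
            "Request CPU reschedule: Scheduling period elapsed",
            "Request enforce CPU schedule: Checkpoint reached",
            "Request enforce CPU schedule: schedule_cpus",
            "schedule_cpus(): start",
        };

        std::map<std::string, int> counts;
        std::ifstream log("stdoutdae.txt");   // assumed log file name
        std::string line;
        while (std::getline(log, line)) {
            // Count every line containing one of the reason strings.
            for (const auto& r : reasons) {
                if (line.find(r) != std::string::npos) counts[r]++;
            }
        }
        for (const auto& r : reasons) {
            std::cout << counts[r] << "  " << r << "\n";
        }
    }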
So, you tell me: how does juggling the schedule this often, for this
many reasons, make sense?
And I don't buy the excuse that we have to check after every
checkpoint... why? My TSI is 12 hours... 99% of the tasks should run
to completion before ever hitting the TSI.
I also don't buy the after-download argument. If Work Fetch is
working correctly, I should not be fetching more work than can be
handled. If we are, obsessing in the Resource Scheduler is not the
way to fix a broken work fetch algorithm.
And, I point out again: had I been running SaH I would have completed
about 15 more tasks on the Xeon 32, 18-20 more on the Q9300, and 72
more on the i7... also, were MW still issuing reliable work, I would
have been doing their tasks in about 10 seconds apiece (60K CS per
day on that machine alone, the Xeon 64).
The super-deadline project does not exist; it died because it had
unrealistic expectations. So why do you keep using it as an excuse to
defend the indefensible? I really don't get it.
Under what scenario do we really, really need to reschedule every
resource in a system twice a minute, or even more often than that?
When a resource comes free at task completion or at expiry of the
TSI, I can buy rescheduling that one resource... but not the whole
system... even Windows does not do that...
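To make that concrete, here is the shape of what I mean. A
hypothetical sketch, not the actual client code; Resource,
reschedule_resource, and the rest are my names:

    #include <queue>

    // Hypothetical sketch of per-resource rescheduling.  None of these
    // names come from the actual client; this just illustrates touching
    // one freed resource instead of re-running the full scheduler
    // across the whole system.
    struct Task { int id; };

    struct Resource {                 // one CPU core or one GPU
        Task* running = nullptr;
        std::queue<Task*> ready;      // tasks queued for this resource
    };

    // Called only when THIS resource frees up (task finished or its
    // TSI expired).  The other resources are left untouched.
    void reschedule_resource(Resource& r) {
        if (!r.ready.empty()) {
            r.running = r.ready.front();
            r.ready.pop();
        } else {
            r.running = nullptr;      // nothing to run; resource idles
        }
    }

One freed resource, one queue pop; everything else keeps running
undisturbed.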
_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.