David Anderson wrote:
> Please set <cpu_sched_debug> and send message log
That's going to be a lot of output...
One brief snapshot is attached showing a s...@h task restart by the scheduler.
Background:
cc_config.xml:
<cc_config>
  <options>
    <save_stats_days>365</save_stats_days>
  </options>
  <log_flags>
    <http_debug>0</http_debug>
    <work_fetch_debug>0</work_fetch_debug>
    <debt_debug>0</debt_debug>
    <sched_op_debug>0</sched_op_debug>
    <cpu_sched_debug>1</cpu_sched_debug>
  </log_flags>
</cc_config>
This is a 4-core PC with one nVidia GTS 250 512MB CUDA graphics card.
I have the following projects/shares (cpdn is idle and set "no new work"):
egrep '((project_name)|(debt)|(_share))' client_state.xml
<project_name>cosmol...@home</project_name>
<short_term_debt>-14653.551452</short_term_debt>
<long_term_debt>-39543.342118</long_term_debt>
<cuda_debt>-56.903380</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>5.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>climateprediction.net</project_name>
<short_term_debt>0.000000</short_term_debt>
<long_term_debt>0.000000</long_term_debt>
<cuda_debt>0.000000</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>40.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>a...@home</project_name>
<short_term_debt>0.000000</short_term_debt>
<long_term_debt>-61474.936649</long_term_debt>
<cuda_debt>-76165.322433</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>5.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>lhcathome</project_name>
<short_term_debt>0.000000</short_term_debt>
<long_term_debt>-78934.047863</long_term_debt>
<cuda_debt>-28.775104</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>20.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>a...@home</project_name>
<short_term_debt>-3826.820241</short_term_debt>
<long_term_debt>-99667.540401</long_term_debt>
<cuda_debt>-37867.724068</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>5.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>Artificial Intelligence System</project_name>
<short_term_debt>0.000000</short_term_debt>
<long_term_debt>-1729003.855207</long_term_debt>
<cuda_debt>-82841.400267</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>10.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>or...@home</project_name>
<short_term_debt>0.000000</short_term_debt>
<long_term_debt>-81245.498066</long_term_debt>
<cuda_debt>-39.828411</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>10.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>Collatz Conjecture</project_name>
<short_term_debt>-1999.030270</short_term_debt>
<long_term_debt>-97879.493556</long_term_debt>
<cuda_debt>0.000000</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>5.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>milky...@home</project_name>
<short_term_debt>0.000000</short_term_debt>
<long_term_debt>-85576.225036</long_term_debt>
<cuda_debt>-30.553987</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>5.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>s...@home</project_name>
<short_term_debt>86400.000000</short_term_debt>
<long_term_debt>0.000000</long_term_debt>
<cuda_debt>-103507.984849</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>5.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>PrimeGrid</project_name>
<short_term_debt>-60416.926068</short_term_debt>
<long_term_debt>-101914.403636</long_term_debt>
<cuda_debt>-45.959851</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>5.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
<project_name>boincsimap</project_name>
<short_term_debt>-5505.256030</short_term_debt>
<long_term_debt>0.000000</long_term_debt>
<cuda_debt>0.000000</cuda_debt>
<ati_debt>0.000000</ati_debt>
<resource_share>20.000000</resource_share>
<ams_resource_share>0.000000</ams_resource_share>
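
For anyone wanting to tabulate the egrep output above rather than read it by eye, here is a minimal sketch in Python. It assumes the input is exactly the flat one-tag-per-line text shown above; the function names parse_projects and share_fractions are made up for this illustration and are not part of BOINC.

```python
import re

# Matches one flat "<tag>value</tag>" line as printed by the egrep above.
TAG_RE = re.compile(r"<(\w+)>(.*?)</\1>")


def parse_projects(text):
    """Group the flat tag lines into one dict per <project_name> (hypothetical helper)."""
    projects = []
    for line in text.splitlines():
        m = TAG_RE.search(line)
        if m is None:
            continue
        tag, value = m.groups()
        if tag == "project_name":
            # A project_name line starts a new record.
            projects.append({"project_name": value})
        elif projects:
            # Debt and share lines belong to the most recent project.
            projects[-1][tag] = float(value)
    return projects


def share_fractions(projects):
    """Return (name, resource_share / total) pairs, in input order."""
    total = sum(p.get("resource_share", 0.0) for p in projects)
    if total == 0:
        return []
    return [(p["project_name"], p.get("resource_share", 0.0) / total) for p in projects]


if __name__ == "__main__":
    # Two records copied from the listing above.
    sample = """\
<project_name>PrimeGrid</project_name>
<short_term_debt>-60416.926068</short_term_debt>
<resource_share>5.000000</resource_share>
<project_name>boincsimap</project_name>
<short_term_debt>-5505.256030</short_term_debt>
<resource_share>20.000000</resource_share>
"""
    for name, frac in share_fractions(parse_projects(sample)):
        print(f"{name}: {frac:.1%}")
```

A list of pairs is returned instead of a dict because two of the obfuscated project names above are identical and would otherwise collide. Over all twelve projects listed, the resource shares sum to 135, so a share of 5 works out to roughly 3.7% of the total.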
Small section of BOINC log attached.
> Martin wrote:
>> This has just got to be "non-optimal"...
>>
>> I run s...@h at a low priority amongst other projects that have a higher
>> priority.
>>
>> I'm running the nVidia CUDA GPU application for s...@h using the anonymous
>> platform. There is no CPU application specified for s...@h.
>>
>> No other projects utilise the GPU.
>>
>> My intention, to maximise utilisation of the hardware, is for s...@h to
>> run only on the GPU whilst the other, non-GPU projects run in parallel
>> on the CPU.
>>
>>
>> On first trying the s...@h GPU application, all ran fine and continuously.
>>
>> Now, after a few days, s...@h is running only periodically despite s...@h WUs
>> queued up waiting to run.
>>
>> The CPU utilisation is less than 1% for the s...@h GPU application.
>>
>>
>> Is this a case of s...@h debt freezing it out? Even though that makes the
>> GPU idle?
--
--------------------
Martin Lomas
m_boincdev ml1 co uk.ddSPAM.dd
--------------------
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] Request CPU reschedule: Idle state change
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] schedule_cpus(): start
19-Nov-2009 13:18:21 [s...@home] [cpu_sched_debug] scheduling 12au09ac.4517.22976.13.10.34_3 (coprocessor job, FIFO)
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] reserving 1.000000 of coproc CUDA
19-Nov-2009 13:18:21 [Collatz Conjecture] [cpu_sched_debug] highest debt: -1878.815032 collatz_1258317870_215942_1
19-Nov-2009 13:18:21 [Collatz Conjecture] [cpu_sched_debug] scheduling collatz_1258317870_215942_1 (CPU job, debt order)
19-Nov-2009 13:18:21 [...@home] [cpu_sched_debug] highest debt: -3706.605003 abc_sieve_wu_00346805_1
19-Nov-2009 13:18:21 [...@home] [cpu_sched_debug] scheduling abc_sieve_wu_00346805_1 (CPU job, debt order)
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] highest debt: -5495.297491 9111601.078859_1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.078859_1 (CPU job, debt order)
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] highest debt: -12295.297491 9111601.079450_1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.079450_1 (CPU job, debt order)
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] Request enforce CPU schedule: schedule_cpus
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] enforce_schedule(): start
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] preliminary job list:
19-Nov-2009 13:18:21 [s...@home] [cpu_sched_debug] 0: 12au09ac.4517.22976.13.10.34_3 (MD: no; UTS: no)
19-Nov-2009 13:18:21 [Collatz Conjecture] [cpu_sched_debug] 1: collatz_1258317870_215942_1 (MD: no; UTS: yes)
19-Nov-2009 13:18:21 [...@home] [cpu_sched_debug] 2: abc_sieve_wu_00346805_1 (MD: no; UTS: yes)
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] 3: 9111601.078859_1 (MD: no; UTS: yes)
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] 4: 9111601.079450_1 (MD: no; UTS: yes)
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] final job list:
19-Nov-2009 13:18:21 [s...@home] [cpu_sched_debug] 0: 12au09ac.4517.22976.13.10.34_3
19-Nov-2009 13:18:21 [Collatz Conjecture] [cpu_sched_debug] 1: collatz_1258317870_215942_1
19-Nov-2009 13:18:21 [...@home] [cpu_sched_debug] 2: abc_sieve_wu_00346805_1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] 3: 9111601.078859_1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] 4: 9111601.079450_1
19-Nov-2009 13:18:21 [s...@home] [cpu_sched_debug] scheduling 12au09ac.4517.22976.13.10.34_3
19-Nov-2009 13:18:21 [Collatz Conjecture] [cpu_sched_debug] scheduling collatz_1258317870_215942_1
19-Nov-2009 13:18:21 [...@home] [cpu_sched_debug] scheduling abc_sieve_wu_00346805_1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.078859_1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.079450_1
19-Nov-2009 13:18:21 [PrimeGrid] [cpu_sched_debug] psp_llr_40746063_5 sched state 1 next 1 task state 0
19-Nov-2009 13:18:21 [cosmol...@home] [cpu_sched_debug] wu_110909_101603_1_1_0 sched state 1 next 1 task state 0
19-Nov-2009 13:18:21 [...@home] [cpu_sched_debug] abc_sieve_wu_00346805_1 sched state 2 next 2 task state 1
19-Nov-2009 13:18:21 [cosmol...@home] [cpu_sched_debug] wu_110909_101607_0_0_0 sched state 1 next 1 task state 0
19-Nov-2009 13:18:21 [s...@home] [cpu_sched_debug] 12au09ac.4517.22976.13.10.34_3 sched state 1 next 2 task state 0
19-Nov-2009 13:18:21 [Collatz Conjecture] [cpu_sched_debug] collatz_1258317870_215942_1 sched state 2 next 2 task state 1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] 9111601.078859_1 sched state 2 next 2 task state 1
19-Nov-2009 13:18:21 [boincsimap] [cpu_sched_debug] 9111601.079450_1 sched state 2 next 2 task state 1
19-Nov-2009 13:18:21 [s...@home] Restarting task 12au09ac.4517.22976.13.10.34_3 using setiathome_enhanced version 608
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] enforce_schedule: end
19-Nov-2009 13:19:21 [---] [cpu_sched_debug] enforce_schedule(): start
19-Nov-2009 13:19:21 [---] [cpu_sched_debug] preliminary job list:
19-Nov-2009 13:19:21 [s...@home] [cpu_sched_debug] 0: 12au09ac.4517.22976.13.10.34_3 (MD: no; UTS: no)
19-Nov-2009 13:19:21 [Collatz Conjecture] [cpu_sched_debug] 1: collatz_1258317870_215942_1 (MD: no; UTS: yes)
19-Nov-2009 13:19:21 [...@home] [cpu_sched_debug] 2: abc_sieve_wu_00346805_1 (MD: no; UTS: yes)
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] 3: 9111601.078859_1 (MD: no; UTS: yes)
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] 4: 9111601.079450_1 (MD: no; UTS: yes)
19-Nov-2009 13:19:21 [---] [cpu_sched_debug] final job list:
19-Nov-2009 13:19:21 [s...@home] [cpu_sched_debug] 0: 12au09ac.4517.22976.13.10.34_3
19-Nov-2009 13:19:21 [Collatz Conjecture] [cpu_sched_debug] 1: collatz_1258317870_215942_1
19-Nov-2009 13:19:21 [...@home] [cpu_sched_debug] 2: abc_sieve_wu_00346805_1
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] 3: 9111601.078859_1
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] 4: 9111601.079450_1
19-Nov-2009 13:19:21 [s...@home] [cpu_sched_debug] scheduling 12au09ac.4517.22976.13.10.34_3
19-Nov-2009 13:19:21 [Collatz Conjecture] [cpu_sched_debug] scheduling collatz_1258317870_215942_1
19-Nov-2009 13:19:21 [...@home] [cpu_sched_debug] scheduling abc_sieve_wu_00346805_1
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.078859_1
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.079450_1
19-Nov-2009 13:19:21 [PrimeGrid] [cpu_sched_debug] psp_llr_40746063_5 sched state 1 next 1 task state 0
19-Nov-2009 13:19:21 [cosmol...@home] [cpu_sched_debug] wu_110909_101603_1_1_0 sched state 1 next 1 task state 0
19-Nov-2009 13:19:21 [...@home] [cpu_sched_debug] abc_sieve_wu_00346805_1 sched state 2 next 2 task state 1
19-Nov-2009 13:19:21 [cosmol...@home] [cpu_sched_debug] wu_110909_101607_0_0_0 sched state 1 next 1 task state 0
19-Nov-2009 13:19:21 [s...@home] [cpu_sched_debug] 12au09ac.4517.22976.13.10.34_3 sched state 2 next 2 task state 1
19-Nov-2009 13:19:21 [Collatz Conjecture] [cpu_sched_debug] collatz_1258317870_215942_1 sched state 2 next 2 task state 1
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] 9111601.078859_1 sched state 2 next 2 task state 1
19-Nov-2009 13:19:21 [boincsimap] [cpu_sched_debug] 9111601.079450_1 sched state 2 next 2 task state 1
19-Nov-2009 13:19:21 [---] [cpu_sched_debug] enforce_schedule: end
19-Nov-2009 13:20:21 [---] [cpu_sched_debug] enforce_schedule(): start
19-Nov-2009 13:20:21 [---] [cpu_sched_debug] preliminary job list:
19-Nov-2009 13:20:21 [s...@home] [cpu_sched_debug] 0: 12au09ac.4517.22976.13.10.34_3 (MD: no; UTS: no)
19-Nov-2009 13:20:21 [Collatz Conjecture] [cpu_sched_debug] 1: collatz_1258317870_215942_1 (MD: no; UTS: yes)
19-Nov-2009 13:20:21 [...@home] [cpu_sched_debug] 2: abc_sieve_wu_00346805_1 (MD: no; UTS: yes)
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] 3: 9111601.078859_1 (MD: no; UTS: yes)
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] 4: 9111601.079450_1 (MD: no; UTS: yes)
19-Nov-2009 13:20:21 [---] [cpu_sched_debug] final job list:
19-Nov-2009 13:20:21 [s...@home] [cpu_sched_debug] 0: 12au09ac.4517.22976.13.10.34_3
19-Nov-2009 13:20:21 [Collatz Conjecture] [cpu_sched_debug] 1: collatz_1258317870_215942_1
19-Nov-2009 13:20:21 [...@home] [cpu_sched_debug] 2: abc_sieve_wu_00346805_1
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] 3: 9111601.078859_1
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] 4: 9111601.079450_1
19-Nov-2009 13:20:21 [s...@home] [cpu_sched_debug] scheduling 12au09ac.4517.22976.13.10.34_3
19-Nov-2009 13:20:21 [Collatz Conjecture] [cpu_sched_debug] scheduling collatz_1258317870_215942_1
19-Nov-2009 13:20:21 [...@home] [cpu_sched_debug] scheduling abc_sieve_wu_00346805_1
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.078859_1
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] scheduling 9111601.079450_1
19-Nov-2009 13:20:21 [PrimeGrid] [cpu_sched_debug] psp_llr_40746063_5 sched state 1 next 1 task state 0
19-Nov-2009 13:20:21 [cosmol...@home] [cpu_sched_debug] wu_110909_101603_1_1_0 sched state 1 next 1 task state 0
19-Nov-2009 13:20:21 [...@home] [cpu_sched_debug] abc_sieve_wu_00346805_1 sched state 2 next 2 task state 1
19-Nov-2009 13:20:21 [cosmol...@home] [cpu_sched_debug] wu_110909_101607_0_0_0 sched state 1 next 1 task state 0
19-Nov-2009 13:20:21 [s...@home] [cpu_sched_debug] 12au09ac.4517.22976.13.10.34_3 sched state 2 next 2 task state 1
19-Nov-2009 13:20:21 [Collatz Conjecture] [cpu_sched_debug] collatz_1258317870_215942_1 sched state 2 next 2 task state 1
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] 9111601.078859_1 sched state 2 next 2 task state 1
19-Nov-2009 13:20:21 [boincsimap] [cpu_sched_debug] 9111601.079450_1 sched state 2 next 2 task state 1
19-Nov-2009 13:20:21 [---] [cpu_sched_debug] enforce_schedule: end
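
With a longer capture, the interesting entries in a log like the one above are probably the "sched state A next B" lines where A and B differ, since that is where the scheduler changes what a task is doing. The following is a rough Python sketch for picking those out. It assumes the line layout seen in this snapshot, tolerates lines that the mail client has wrapped, and deliberately leaves the numeric state codes uninterpreted; the helper names unwrap and state_changes are invented for this example.

```python
import re

# A new log entry starts with a timestamp such as "19-Nov-2009 13:18:21 ".
STAMP_RE = re.compile(r"^\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2} ")

# Layout of the per-task state lines, as seen in the snapshot above.
STATE_RE = re.compile(
    r"^(?P<stamp>\S+ \S+) \[(?P<project>[^\]]*)\] \[cpu_sched_debug\] "
    r"(?P<task>\S+) sched state (?P<cur>\d+) next (?P<next>\d+) "
    r"task state (?P<task_state>\d+)$"
)


def unwrap(lines):
    """Rejoin mail-wrapped continuation lines onto the entry they belong to."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if STAMP_RE.match(line) or not entries:
            entries.append(line)
        else:
            # No timestamp: treat as the tail of the previous entry.
            entries[-1] += " " + line
    return entries


def state_changes(lines):
    """Return the state entries whose current and next sched state differ."""
    changes = []
    for entry in unwrap(lines):
        m = STATE_RE.match(entry)
        if m and m.group("cur") != m.group("next"):
            changes.append(m.groupdict())
    return changes


if __name__ == "__main__":
    # Three entries copied from the log above, two of them wrapped.
    sample = """\
19-Nov-2009 13:18:21 [PrimeGrid] [cpu_sched_debug] psp_llr_40746063_5 sched
state 1 next 1 task state 0
19-Nov-2009 13:18:21 [s...@home] [cpu_sched_debug]
12au09ac.4517.22976.13.10.34_3 sched state 1 next 2 task state 0
19-Nov-2009 13:18:21 [---] [cpu_sched_debug] enforce_schedule: end
"""
    for c in state_changes(sample.splitlines()):
        print(c["stamp"], c["project"], c["task"], c["cur"], "->", c["next"])
```

Run over the snapshot above, the only entry that should be reported is the s...@h task at 13:18:21 (state 1 next 2), which lines up with the "Restarting task" message at the same timestamp.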