There is a problem with setting a floor below which the DCF is not moved at all: there have been projects with run times of only a few seconds. There is already a slowdown in the reduction of the DCF (or there was the last time I looked at the code, anyway) if the run time is less than 10% of the expected run time.
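
A minimal sketch of that kind of damped update may make the behaviour easier to discuss. It is not copied from the client: the function names, the 1.1 and 0.1 thresholds, and the 0.1/0.01 smoothing weights are all assumptions chosen for illustration. It only shows the shape of the rule, where DCF rises quickly, falls slowly, and falls more slowly still for results far shorter than expected.

```python
def update_dcf(dcf, elapsed, estimate_uncorrected):
    """Return the new DCF after one finished task (illustrative sketch only).

    dcf                  -- current duration correction factor (> 0)
    elapsed              -- observed run time in seconds
    estimate_uncorrected -- estimated run time in seconds before DCF is applied
    """
    if dcf <= 0 or estimate_uncorrected <= 0 or elapsed < 0:
        raise ValueError("dcf and estimate must be positive, elapsed non-negative")

    raw_ratio = elapsed / estimate_uncorrected  # DCF that would have made the estimate exact
    adj_ratio = raw_ratio / dcf                 # run time relative to the corrected estimate

    # Assumed policy: underestimates are dangerous, so follow them immediately.
    if adj_ratio > 1.1:
        return raw_ratio

    # Assumed policy: move down slowly, and ten times more slowly when the
    # task took under 10% of the expected run time.
    weight = 0.01 if adj_ratio < 0.1 else 0.1
    return (1.0 - weight) * dcf + weight * raw_ratio


def dcf_after_failures(n, dcf=1.0, elapsed=10.0, estimate_uncorrected=3600.0):
    """Apply n identical short results in a row and return the resulting DCF."""
    for _ in range(n):
        dcf = update_dcf(dcf, elapsed, estimate_uncorrected)
    return dcf
```

With these assumed numbers, a single 10-second result against a one-hour estimate barely moves DCF, but `dcf_after_failures(200)` ends up well below 0.2. The damping delays a runaway; it does not prevent one.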
One possibility is to not count the run time at all towards DCF if the task generates an error that is not just a "short task" error like the s...@h -9 "error".

Another thing that would help this and the case of a run of s...@h -9 tasks would be keeping the average DCF for all tasks for an application. This would tend to generate a very good answer eventually, but it might be very wrong near the beginning if the user project started with a hundred or so short tasks.

jm7

<boinc_dev-boun...@ssl.berkeley.edu> wrote on 12/17/2009 09:12:42 PM:

> Pappa <m...@geeksamazing.com>
> Sent by: <boinc_dev-boun...@ssl.berkeley.edu>
>
> 12/17/2009 09:12 PM
>
> To
>
> "'David Anderson'" <da...@ssl.berkeley.edu>
>
> cc
>
> "'Stephen Maclagan'" <stephen.macla...@tiscali.co.uk>,
> boinc_dev@ssl.berkeley.edu, boinc_al...@ssl.berkeley.edu
>
> Subject
>
> Re: [boinc_dev] [boinc_alpha] Maximum time Exceeded on Hybrid ATI
> Astropulse app
>
> I am typing fast before a reboot as I am out of computer resources.
>
> Currently running 6.10.24 (shortly, I know I need to load 6.10.25)
>
> This goes along with the discussion about debts. I was in a situation where
> on one host which does Seti GPU and NFS CPU (project pairing on an AMD
> Host). I ran into a situation where due to the Seti VLAR WU's on GPU, it
> exhausted "all" the computer resources. What happened NEXT on the next WU
> It errored! No big deal As the CPU run time was about 10+ seconds the
> scheduler then adjusts DCF (down) eventually more was requested. Then with
> more errors it needs even more work. As things continued to error with "CPU
> run" time DCF went even further down. The Runaway condition continued until
> Seti Main seeing the Errors told "quota" to shut the machine off. No sanity
> check was in place for did the WU complete successfully after some CPU run
> time as reported to the scheduler before adjusting DCF. As the scheduler
> reported the "completion" (success or failure aside) it needed more work to
> continue.
> DCF continued to go down to request even more work. Then even more
> work (death spiral). The End result was that I was stopped from getting more
> work (quota), and had around 200 WU's in the cache.
>
> The machine has an accurate flops guesstimate. As a function of the extended
> run time of a VLAR on completion is drives DCF up which is good (sorta but
> not the issue) but as I had not rebooted to manually clear computer
> resources. Which caused the problem. So with the ATI Apps in Beta it is
> still indicative of the issue.
>
> I set NNT, rebooted and started to let things work themselves out. I am now
> down to about 50 WU's and very large Negative Debt.
>
> 17-Dec-2009 17:02:10 [---] [wfd] target work buffer: 8640.00 + 86400.00 sec
> 17-Dec-2009 17:02:10 [s...@home] [wfd] NVIDIA GPU: fetch share 0.00 LTD -225286.66 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
> 17-Dec-2009 17:02:10 [s...@home] [wfd] overall LTD -2474223.61
>
> The only suggestion I could make is "success or failure aside" is if CPU
> runtime is less than some value "do not adjust DCF." Application
> initialization requires some finite value of CPU runtime (CPU or GPU). Even
> if the Application runtime is one minute, you have a basic value resource
> type aside. It becomes a Sanity Check. Then you could popup a Dialog Box
> (only one please) telling the user there is a problem with Boinc and their
> selected project(s).
>
> Snippet from the log
>
> 17-Dec-2009 17:02:10 [---] [wfd] ------- start work fetch state -------
> 17-Dec-2009 17:02:10 [---] [wfd] target work buffer: 8640.00 + 86400.00 sec
> 17-Dec-2009 17:02:10 [---] [wfd] CPU: shortfall 0.00 nidle 0.00 saturated 109814.95 busy 0.00 RS fetchable 100.00 runnable 100.00
> 17-Dec-2009 17:02:10 [boincsimap] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [einst...@home] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [...@home] [wfd] CPU: fetch share 1.00 LTD 0.00 backoff dt 0.00 int 0.00
> 17-Dec-2009 17:02:10 [BOINC alpha test] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [milky...@home] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [nque...@home Project] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (master fetch pending) (comm deferred) (no new tasks) (too many uploads)
> 17-Dec-2009 17:02:10 [s...@home] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [s...@home Beta Test] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks) (blocked by prefs)
> 17-Dec-2009 17:02:10 [spinhe...@home] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [GPUGRID] [wfd] CPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks) (blocked by prefs)
> 17-Dec-2009 17:02:10 [---] [wfd] NVIDIA GPU: shortfall 0.00 nidle 0.00 saturated 254092.10 busy 0.00 RS fetchable 100.00 runnable 100.00
> 17-Dec-2009 17:02:10 [boincsimap] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [einst...@home] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [...@home] [wfd] NVIDIA GPU: fetch share 1.00 LTD 0.00 backoff dt 0.00 int 0.00
> 17-Dec-2009 17:02:10 [BOINC alpha test] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [milky...@home] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [nque...@home Project] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (master fetch pending) (comm deferred) (no new tasks) (too many uploads)
> 17-Dec-2009 17:02:10 [s...@home] [wfd] NVIDIA GPU: fetch share 0.00 LTD -225286.66 backoff dt 0.00 int 0.00 (no new tasks) (overworked)
> 17-Dec-2009 17:02:10 [s...@home Beta Test] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [spinhe...@home] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [GPUGRID] [wfd] NVIDIA GPU: fetch share 0.00 LTD 0.00 backoff dt 0.00 int 0.00 (no new tasks)
> 17-Dec-2009 17:02:10 [boincsimap] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [einst...@home] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [...@home] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [BOINC alpha test] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [milky...@home] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [nque...@home Project] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [s...@home] [wfd] overall LTD -2474223.61
> 17-Dec-2009 17:02:10 [s...@home Beta Test] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [spinhe...@home] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [GPUGRID] [wfd] overall LTD 0.00
> 17-Dec-2009 17:02:10 [---] [wfd] ------- end work fetch state -------
> 17-Dec-2009 17:02:10 [---] [wfd] No project chosen for work fetch
> 17-Dec-2009 17:02:29 [boincsimap] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [einst...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [...@home] [debt] CPU LTD 2.24 delta 2.24 (1.00*114.40 - 112.16)
> 17-Dec-2009 17:02:29 [BOINC alpha test] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [milky...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [nque...@home Project] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [s...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [s...@home Beta Test] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [spinhe...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [GPUGRID] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [---] [debt] CPU LTD: adding offset -2.24
> 17-Dec-2009 17:02:29 [...@home] [std_debug] CPU STD delta 2.24 (1.00*114.40 - 112.16)
> 17-Dec-2009 17:02:29 [...@home] [std_debug] CPU STD 0.00
> 17-Dec-2009 17:02:29 [boincsimap] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [einst...@home] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [...@home] [debt] NVIDIA GPU LTD 28.04 delta 28.04 (0.50*56.08 - 0.00)
> 17-Dec-2009 17:02:29 [BOINC alpha test] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [milky...@home] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [nque...@home Project] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [s...@home] [debt] NVIDIA GPU LTD -225314.70 delta -28.04 (0.50*56.08 - 56.08)
> 17-Dec-2009 17:02:29 [s...@home Beta Test] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [spinhe...@home] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [GPUGRID] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:29 [---] [debt] NVIDIA GPU LTD: adding offset -28.04
> 17-Dec-2009 17:02:29 [s...@home] [std_debug] NVIDIA GPU STD delta 0.00 (1.00*56.08 - 56.08)
> 17-Dec-2009 17:02:29 [s...@home] [std_debug] NVIDIA GPU STD 0.00
> 17-Dec-2009 17:02:35 [boincsimap] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [einst...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [...@home] [debt] CPU LTD 0.24 delta 0.24 (1.00*12.27 - 12.03)
> 17-Dec-2009 17:02:35 [BOINC alpha test] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [milky...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [nque...@home Project] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [s...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [s...@home Beta Test] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [spinhe...@home] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [GPUGRID] [debt] CPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [---] [debt] CPU LTD: adding offset -0.24
> 17-Dec-2009 17:02:35 [...@home] [std_debug] CPU STD delta 0.24 (1.00*12.27 - 12.03)
> 17-Dec-2009 17:02:35 [...@home] [std_debug] CPU STD 0.00
> 17-Dec-2009 17:02:35 [boincsimap] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [einst...@home] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [...@home] [debt] NVIDIA GPU LTD 3.01 delta 3.01 (0.50*6.02 - 0.00)
> 17-Dec-2009 17:02:35 [BOINC alpha test] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [milky...@home] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [nque...@home Project] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [s...@home] [debt] NVIDIA GPU LTD -225345.74 delta -3.01 (0.50*6.02 - 6.02)
> 17-Dec-2009 17:02:35 [s...@home Beta Test] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [spinhe...@home] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [GPUGRID] [debt] NVIDIA GPU ineligible; LTD 0.00
> 17-Dec-2009 17:02:35 [---] [debt] NVIDIA GPU LTD: adding offset -3.01
> 17-Dec-2009 17:02:35 [s...@home] [std_debug] NVIDIA GPU STD delta 0.00 (1.00*6.02 - 6.02)
> 17-Dec-2009 17:02:35 [s...@home] [std_debug] NVIDIA GPU STD 0.00
>
>
> -----Original Message-----
> From: David Anderson [mailto:da...@ssl.berkeley.edu]
> Sent: Thursday, December 17, 2009 3:16 PM
> To: Pappa
> Cc: 'Stephen Maclagan'; boinc_al...@ssl.berkeley.edu; boinc_dev@ssl.berkeley.edu
> Subject: Re: [boinc_alpha] Maximum time Exceeded on Hybrid ATI Astropulse app
>
> We'll reduce the scheduler's FLOPS estimate.
> Currently the estimate is (peak GPU FLOPS)/5.
> Does anyone have a suggestion for what it should be?
> Seems like it should reflect both CPU and GPU speed.
>
> -- David
>
> Pappa wrote:
> > The other side effect that has not been fully explored, there were several
> > machines that received over a hundred AP WU's to be errored out. The thought
> > is if it can not determine a proper estimate of run time from the GPU flops
> > and DCF. Only Quota will stop the runaway host.
> >
> > http://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=42925
> > http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=42925&offset=100&show_names=0&state=5
> >
> > http://setiweb.ssl.berkeley.edu/beta/show_host_detail.php?hostid=40712
> > http://setiweb.ssl.berkeley.edu/beta/results.php?hostid=40712&offset=140&show_names=0&state=5
> >
> >
> > -----Original Message-----
> > From: boinc_alpha-boun...@ssl.berkeley.edu
> > [mailto:boinc_alpha-boun...@ssl.berkeley.edu] On Behalf Of Stephen Maclagan
> > Sent: Thursday, December 17, 2009 1:23 PM
> > To: boinc_al...@ssl.berkeley.edu
> > Subject: [boinc_alpha] Maximum time Exceeded on Hybrid ATI Astropulse app
> >
> > Raistmer's Hybrid ATI Astropulse app has now made it to Seti Beta as a Stock
> > app, it does only some of it's Calculations on the GPU,
> > with most of it being done on the CPU, some CPU's have been historically
> > poor at doing Astropulse because of their small L2 Caches, ie AMD chips,
> > while the C2D with the Larger caches have been a lot faster, at moment we
> > are starting to seeing some of the AMD's running into maximum time exceeded,
> > because it'll be GPU flops that taken into account when the tasks get
> > aborted,
> > There's also an i7 920 with two HD5800's also running into maximum time
> > exceeded as well, because it has the newest and fastest ATI cards out,
> > while two other i7 920's with lower Spec GPU's can manage to finish the
> > tasks O.K,
> >
> > This was cured in Boinc 6.10.14 with:
> >
> > > - client: if anonymous platform description (app_info.xml) doesn't specify
> > > FLOPS for a GPU app, assume that it runs at CPU peak speed rather than GPU
> > > peak speed. Better to be conservative, otherwise job might be aborted due
> > > to time limit exceeded.
> >
> > How can it be cured again, now the Hybrid ATI Astropulse app is no longer
> > using an app_info?
> > All the hosts getting aborted tasks are running 6.10.18, and most of the
> > rest are 6.10.18 or newer.
> >
> > See this post for lots of info:
> >
> > http://setiweb.ssl.berkeley.edu/beta/forum_thread.php?id=1601&nowrap=true#38531
> >
> > Claggy
> > _______________________________________________
> > boinc_alpha mailing list
> > boinc_al...@ssl.berkeley.edu
> > http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_alpha
> > To unsubscribe, visit the above URL and
> > (near bottom of page) enter your email address.
>
> _______________________________________________
> boinc_dev mailing list
> boinc_dev@ssl.berkeley.edu
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.