Re: [boinc_dev] BOINC 7.0.64 weirdness?

2013-06-05 Thread dball
Hello,

I've seen this on 7.1.1 as well. The machine was attached to a mix of CPU-only and GPU projects. It ran normally for a long time, but later I switched to the BOINC Manager and found it looping: it skipped the CPU-only projects and looped through the 4 GPU projects, asking for 0 seconds of CPU and 0 seconds of GPU work. I stopped and restarted BOINC to get it to stop looping and asking for 0 seconds of work. It only happened that one time.

David Ball


 Hi!

 FWIW, we get reports of this as well on Einstein@Home; this one is for BOINC 7.0.62:

 http://einstein.phys.uwm.edu/forum_thread.php?id=10134&nowrap=true#124926

 On the server side it appears to be a request for 0 seconds of CPU and 0 seconds of GPU work

 Confirmed. We see this in the scheduler log:

 2013-06-03 10:46:02.4045 [PID=30856] Request: [USER#x] [HOST#6119565] [IP xxx.xxx.xxx.38] client 7.0.62
 2013-06-03 10:46:02.4187 [PID=30856] [send] effective_ncpus 1 max_jobs_on_host_cpu 99 max_jobs_on_host 99
 2013-06-03 10:46:02.4187 [PID=30856] [send] effective_ngpus 1 max_jobs_on_host_gpu 99
 2013-06-03 10:46:02.4187 [PID=30856] [send] Not using matchmaker scheduling; Not using EDF sim
 2013-06-03 10:46:02.4187 [PID=30856] [send] CPU: req 0.00 sec, 0.00 instances; est delay 0.00
 2013-06-03 10:46:02.4187 [PID=30856] [send] CUDA: req 0.00 sec, 0.00 instances; est delay 0.00
 2013-06-03 10:46:02.4187 [PID=30856] [send] work_req_seconds: 0.00 secs
 2013-06-03 10:46:02.4187 [PID=30856] [send] available disk 23.89 GB, work_buf_min 0
 2013-06-03 10:46:02.4187 [PID=30856] [send] active_frac 0.99 on_frac 0.44 DCF 1.226839
 2013-06-03 10:46:02.4222 [PID=30856] Sending reply to [HOST#6119565]: 0 results, delay req 60.00
 2013-06-03 10:46:02.4225 [PID=30856] Scheduler ran 0.024 seconds

 The polling interval seems to be once per minute.

 Cheers
 HBE


 -
 Heinz-Bernd Eggenstein
 Max Planck Institute for Gravitational Physics
 Callinstrasse 38
 D-30167 Hannover,  Germany
 Tel.: +49-511-762-19466 (Room 037)



 From:   Eric J Korpela korp...@ssl.berkeley.edu
 To:     boinc_dev@ssl.berkeley.edu
 Date:   06/03/2013 07:17 PM
 Subject:[boinc_dev] BOINC 7.0.64 weirdness?
 Sent by:boinc_dev boinc_dev-boun...@ssl.berkeley.edu



 Some BOINC v7 clients are getting into a weird state where they contact the server every few minutes to request no work. I haven't been able to reproduce it, but people have reported that it goes away when they select "Read config file", even if they don't have a config file.

 Here's a not very detailed log that someone sent me. On the server side it appears to be a request for 0 seconds of CPU and 0 seconds of GPU work (essentially the same as requesting an update when no work is required).
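
 (For reference, such a request shows up in the client's sched_request_*.xml with zeroed work-request fields, roughly like the following; I'm recalling the tag names from memory, so treat this as approximate:

    <work_req_seconds>0.000000</work_req_seconds>
    <cpu_req_secs>0.000000</cpu_req_secs>
    <cpu_req_instances>0.000000</cpu_req_instances>

 which matches the "req 0.00 sec, 0.00 instances" lines in the scheduler log quoted earlier in this thread.)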

 5/31/2013 10:30:23 AM | SETI@home | Starting task 23jn12ab.24163.21032.12.11.50_1 using setiathome_enhanced version 609 (cuda23) in slot 2
 5/31/2013 10:30:25 AM | SETI@home | Started upload of 23oc12ac.25717.18472.12.11.50_0_0
 5/31/2013 10:30:28 AM | SETI@home | Finished upload of 23oc12ac.25717.18472.12.11.50_0_0
 5/31/2013 11:31:14 AM | SETI@home | Computation for task 23jn12ab.24163.21032.12.11.50_1 finished
 5/31/2013 11:31:14 AM | SETI@home | Starting task 23oc12ac.25646.13973.15.12.45_0 using setiathome_v7 version 700 (cuda32) in slot 2
 5/31/2013 11:31:18 AM | SETI@home | Started upload of 23jn12ab.24163.21032.12.11.50_1_0
 5/31/2013 11:31:25 AM | SETI@home | Finished upload of 23jn12ab.24163.21032.12.11.50_1_0
 5/31/2013 11:31:28 AM | SETI@home | Sending scheduler request: To fetch work.
 5/31/2013 11:31:28 AM | SETI@home | Reporting 2 completed tasks
 5/31/2013 11:31:28 AM | SETI@home | Not requesting tasks
 5/31/2013 11:31:30 AM | SETI@home | Scheduler request completed
 5/31/2013 11:33:26 AM | SETI@home | Computation for task 26mr10ab.24819.17826.5.11.138_0 finished
 5/31/2013 11:33:26 AM | SETI@home | Starting task 23oc12ac.25646.13973.15.12.40_0 using setiathome_v7 version 700 in slot 1
 5/31/2013 11:33:29 AM | SETI@home | Started upload of 26mr10ab.24819.17826.5.11.138_0_0
 5/31/2013 11:33:32 AM | SETI@home | Finished upload of 26mr10ab.24819.17826.5.11.138_0_0
 5/31/2013 11:36:35 AM | SETI@home | Sending scheduler request: To fetch work.
 5/31/2013 11:36:35 AM | SETI@home | Reporting 1 completed tasks
 5/31/2013 11:36:35 AM | SETI@home | Not requesting tasks
 5/31/2013 11:36:37 AM | SETI@home | Scheduler request completed
 5/31/2013 1:28:09 PM | SETI@home | Sending scheduler request: To fetch work.
 5/31/2013 1:28:09 PM | SETI@home | Not requesting tasks
 5/31/2013 1:28:11 PM | SETI@home | Scheduler request completed
 5/31/2013 1:33:15 PM | SETI@home | Sending scheduler request: To fetch work.
 5/31/2013 1:33:15 PM | SETI@home | Not requesting tasks
 5/31/2013 1:33:19 PM | SETI@home | Scheduler request completed
 5/31/2013 1:38:23 PM | SETI@home | Sending 

[boinc_dev] Problems with 7.1.1 work fetch on projects set to No new tasks

2013-05-21 Thread dball
Yesterday, I upgraded to 7.1.1 and had to set the Constellation project to "no new tasks". I would have suspended it, but its work units take about 10 hours and it only had about 2 hours to go on the one work unit it had.

Sometime late yesterday or overnight, it finished that work unit. However, even with "no new tasks" set, it fetched another work unit at 11:11:41. I've included the relevant section of stdoutdae.txt below, with enough before and after the work fetch so you can see that it definitely had "no new tasks" set.

FYI, the reason I set Constellation to "no new tasks" was that after the upgrade to 7.1.1, it started executing Constellation work units as NCI again, which it had not done in 7.0.64, although it had this same problem in some earlier versions of 7.0. Constellation is not NCI, so this caused an extra work unit to be executed on my C2D E6420, which slowed down other work units and anything else running on the system.

I aborted the new Constellation work unit it fetched at 11:11:41 and suspended Constellation now that it has no work units.

Here is the section of stdoutdae.txt where it fetched a work unit while set to "no new tasks":


21-May-2013 11:11:34 [---] [work_fetch] --- start work fetch state ---
21-May-2013 11:11:34 [---] [work_fetch] target work buffer: 83808.00 + 11232.00 sec
21-May-2013 11:11:34 [---] [work_fetch] --- project states ---
21-May-2013 11:11:34 [The Lattice Project] [work_fetch] REC 0.000 prio 0.00 can't req work: suspended via Manager
21-May-2013 11:11:34 [superlinkattechnion] [work_fetch] REC 0.000 prio 0.00 can't req work: suspended via Manager
21-May-2013 11:11:34 [MindModeling@Beta] [work_fetch] REC 0.000 prio 0.00 can't req work: suspended via Manager
21-May-2013 11:11:34 [Constellation] [work_fetch] REC 24.318 prio -0.00 can't req work: no new tasks requested via Manager
21-May-2013 11:11:34 [Docking] [work_fetch] REC 70.270 prio -0.053371 can req work
21-May-2013 11:11:34 [malariacontrol.net] [work_fetch] REC 35.233 prio -0.053520 can req work
21-May-2013 11:11:34 [rosetta@home] [work_fetch] REC 28.369 prio -0.055149 can req work
21-May-2013 11:11:34 [correlizer] [work_fetch] REC 27.663 prio -0.055464 can req work
21-May-2013 11:11:34 [eon2] [work_fetch] REC 14.738 prio -0.055967 can req work
21-May-2013 11:11:34 [World Community Grid] [work_fetch] REC 275.469 prio -0.059589 can req work
21-May-2013 11:11:34 [NumberFields@home] [work_fetch] REC 15.186 prio -0.060085 can req work
21-May-2013 11:11:34 [SZTAKI Desktop Grid] [work_fetch] REC 15.104 prio -0.063915 can req work
21-May-2013 11:11:34 [boincsimap] [work_fetch] REC 32.650 prio -0.070539 can req work
21-May-2013 11:11:34 [ibercivis] [work_fetch] REC 14.749 prio -0.074577 can req work
21-May-2013 11:11:34 [fightmalaria@home] [work_fetch] REC 10.813 prio -0.082123 can req work
21-May-2013 11:11:34 [Asteroids@home] [work_fetch] REC 16.048 prio -0.093301 can req work
21-May-2013 11:11:34 [Milkyway@Home] [work_fetch] REC 11.449 prio -0.180597 can req work
21-May-2013 11:11:34 [NFS@Home] [work_fetch] REC 28.673 prio -0.217775 can req work
21-May-2013 11:11:34 [LHC@home 1.0] [work_fetch] REC 35.037 prio -0.266106 can req work
21-May-2013 11:11:34 [Poem@Home] [work_fetch] REC 4056.596 prio -3.851261 can req work
21-May-2013 11:11:34 [SETI@home] [work_fetch] REC 2544.520 prio -5.105209 can req work
21-May-2013 11:11:34 [Einstein@Home] [work_fetch] REC 4200.380 prio -8.210770 can req work
21-May-2013 11:11:34 [PrimeGrid] [work_fetch] REC 1403.029 prio -10.714000 can req work
21-May-2013 11:11:34 [---] [work_fetch] --- state for CPU ---
21-May-2013 11:11:34 [---] [work_fetch] shortfall 9834.29 nidle 0.00 saturated 85205.71 busy 0.00
21-May-2013 11:11:34 [The Lattice Project] [work_fetch] fetch share 0.000
21-May-2013 11:11:34 [superlinkattechnion] [work_fetch] fetch share 0.000
21-May-2013 11:11:34 [MindModeling@Beta] [work_fetch] fetch share 0.000
21-May-2013 11:11:34 [Constellation] [work_fetch] fetch share 0.000
21-May-2013 11:11:34 [Docking] [work_fetch] fetch share 0.124
21-May-2013 11:11:34 [malariacontrol.net] [work_fetch] fetch share 0.062
21-May-2013 11:11:34 [rosetta@home] [work_fetch] fetch share 0.050
21-May-2013 11:11:34 [correlizer] [work_fetch] fetch share 0.050
21-May-2013 11:11:34 [eon2] [work_fetch] fetch share 0.025
21-May-2013 11:11:34 [World Community Grid] [work_fetch] fetch share 0.497
21-May-2013 11:11:34 [NumberFields@home] [work_fetch] fetch share 0.025
21-May-2013 11:11:34 [SZTAKI Desktop Grid] [work_fetch] fetch share 0.025
21-May-2013 11:11:34 [boincsimap] [work_fetch] fetch share 0.050
21-May-2013 11:11:34 [ibercivis] [work_fetch] fetch share 0.025
21-May-2013 11:11:34 [fightmalaria@home] [work_fetch] fetch share 0.012
21-May-2013 11:11:34 [Asteroids@home] [work_fetch] fetch share 0.025
21-May-2013 11:11:34 [Milkyway@Home] [work_fetch] fetch share 0.006
21-May-2013 11:11:34 [NFS@Home] [work_fetch] fetch share 0.012
21-May-2013 

[boinc_dev] 7.1.1 also getting new work for suspended project

2013-05-21 Thread dball
I suspended Constellation, since "no new tasks" wasn't preventing BOINC from requesting new work units from it, and just noticed that it's still requesting work units from Constellation. Here's the section of stdoutdae.txt where it got another WU.

5/21/2013 2:29:52 PM |  | [work_fetch] --- start work fetch state ---
5/21/2013 2:29:52 PM |  | [work_fetch] target work buffer: 83808.00 + 11232.00 sec
5/21/2013 2:29:52 PM |  | [work_fetch] --- project states ---
5/21/2013 2:29:52 PM | The Lattice Project | [work_fetch] REC 0.000 prio 0.00 can't req work: suspended via Manager
5/21/2013 2:29:52 PM | superlinkattechnion | [work_fetch] REC 0.000 prio 0.00 can't req work: suspended via Manager
5/21/2013 2:29:52 PM | MindModeling@Beta | [work_fetch] REC 0.000 prio 0.00 can't req work: suspended via Manager
5/21/2013 2:29:52 PM | Constellation | [work_fetch] REC 24.280 prio 0.00 can't req work: suspended via Manager
5/21/2013 2:29:52 PM | rosetta@home | [work_fetch] REC 28.099 prio -0.065670 can req work
5/21/2013 2:29:52 PM | correlizer | [work_fetch] REC 27.875 prio -0.066564 can req work
5/21/2013 2:29:52 PM | eon2 | [work_fetch] REC 14.597 prio -0.066898 can req work
5/21/2013 2:29:52 PM | World Community Grid | [work_fetch] REC 276.058 prio -0.067595 can req work
5/21/2013 2:29:52 PM | Docking | [work_fetch] REC 71.034 prio -0.069294 can req work
5/21/2013 2:29:52 PM | NumberFields@home | [work_fetch] REC 15.042 prio -0.071349 can req work
5/21/2013 2:29:52 PM | SZTAKI Desktop Grid | [work_fetch] REC 14.960 prio -0.075118 can req work
5/21/2013 2:29:52 PM | malariacontrol.net | [work_fetch] REC 34.898 prio -0.077509 can req work
5/21/2013 2:29:52 PM | boincsimap | [work_fetch] REC 32.339 prio -0.082647 can req work
5/21/2013 2:29:52 PM | ibercivis | [work_fetch] REC 15.922 prio -0.088790 can req work
5/21/2013 2:29:52 PM | fightmalaria@home | [work_fetch] REC 10.710 prio -0.098163 can req work
5/21/2013 2:29:52 PM | Asteroids@home | [work_fetch] REC 15.895 prio -0.105204 can req work
5/21/2013 2:29:52 PM | Milkyway@Home | [work_fetch] REC 11.340 prio -0.215259 can req work
5/21/2013 2:29:52 PM | NFS@Home | [work_fetch] REC 28.400 prio -0.260310 can req work
5/21/2013 2:29:52 PM | LHC@home 1.0 | [work_fetch] REC 34.703 prio -0.318082 can req work
5/21/2013 2:29:52 PM | Poem@Home | [work_fetch] REC 4017.988 prio -4.603479 can req work
5/21/2013 2:29:52 PM | SETI@home | [work_fetch] REC 2618.643 prio -6.264621 can't req work: scheduler RPC backoff (backoff: 297.68 sec)
5/21/2013 2:29:52 PM | Einstein@Home | [work_fetch] REC 4160.403 prio -9.768530 can req work
5/21/2013 2:29:52 PM | PrimeGrid | [work_fetch] REC 1389.675 prio -12.795318 can req work
5/21/2013 2:29:52 PM |  | [work_fetch] --- state for CPU ---
5/21/2013 2:29:52 PM |  | [work_fetch] shortfall 3133.46 nidle 0.00 saturated 91906.54 busy 0.00
5/21/2013 2:29:52 PM | The Lattice Project | [work_fetch] fetch share 0.000
5/21/2013 2:29:52 PM | superlinkattechnion | [work_fetch] fetch share 0.000
5/21/2013 2:29:52 PM | MindModeling@Beta | [work_fetch] fetch share 0.000
5/21/2013 2:29:52 PM | Constellation | [work_fetch] fetch share 0.000
5/21/2013 2:29:52 PM | rosetta@home | [work_fetch] fetch share 0.050
5/21/2013 2:29:52 PM | correlizer | [work_fetch] fetch share 0.050
5/21/2013 2:29:52 PM | eon2 | [work_fetch] fetch share 0.025
5/21/2013 2:29:52 PM | World Community Grid | [work_fetch] fetch share 0.497
5/21/2013 2:29:52 PM | Docking | [work_fetch] fetch share 0.124
5/21/2013 2:29:52 PM | NumberFields@home | [work_fetch] fetch share 0.025
5/21/2013 2:29:52 PM | SZTAKI Desktop Grid | [work_fetch] fetch share 0.025
5/21/2013 2:29:52 PM | malariacontrol.net | [work_fetch] fetch share 0.062
5/21/2013 2:29:52 PM | boincsimap | [work_fetch] fetch share 0.050
5/21/2013 2:29:52 PM | ibercivis | [work_fetch] fetch share 0.025
5/21/2013 2:29:52 PM | fightmalaria@home | [work_fetch] fetch share 0.012
5/21/2013 2:29:52 PM | Asteroids@home | [work_fetch] fetch share 0.025
5/21/2013 2:29:52 PM | Milkyway@Home | [work_fetch] fetch share 0.006
5/21/2013 2:29:52 PM | NFS@Home | [work_fetch] fetch share 0.012
5/21/2013 2:29:52 PM | LHC@home 1.0 | [work_fetch] fetch share 0.012
5/21/2013 2:29:52 PM | Poem@Home | [work_fetch] fetch share 0.000 (blocked by prefs)
5/21/2013 2:29:52 PM | SETI@home | [work_fetch] fetch share 0.000 (blocked by prefs)
5/21/2013 2:29:52 PM | Einstein@Home | [work_fetch] fetch share 0.000 (blocked by prefs)
5/21/2013 2:29:52 PM | PrimeGrid | [work_fetch] fetch share 0.000 (blocked by prefs)
5/21/2013 2:29:52 PM |  | [work_fetch] --- state for NVIDIA ---
5/21/2013 2:29:52 PM |  | [work_fetch] shortfall 5949.70 nidle 0.00 saturated 89090.30 busy 0.00
5/21/2013 2:29:52 PM | The Lattice Project | [work_fetch] fetch share 0.000 (no apps)
5/21/2013 2:29:52 PM | superlinkattechnion | [work_fetch] fetch share 0.000 (blocked by configuration file)
5/21/2013 2:29:52 PM | MindModeling@Beta | [work_fetch] 

Re: [boinc_dev] [boinc_alpha] 7.1.1 also getting new work for suspended project

2013-05-21 Thread dball
 David,

 For both your No New Tasks and Project Suspended scenarios where BOINC still fetched work: did you manually click the Update button in both of those scenarios?


No, IIRC, although when I aborted the WU and updated the project to turn it in, I believe it got another WU then.

In fact, I had loaded 7.1.1 on a different quad yesterday (one that doesn't use a GPU), and when I flipped over to look at it (they're on the same KVM, so they share the screen and keyboard, but each has its own mouse), it had gotten work for 2 projects that were suspended on it. Einstein and PrimeGrid were the projects, IIRC. I unsuspended them since they didn't have large shares anyway.

On the C2D machine, which has a GPU, I finally had to remove the Constellation project. It was still getting work while it was suspended, even though I had set the resource share to 0.001.

BTW, on the 3 test machines (2 Vista32 and 1 Vista64), I downloaded BOINC 7.1.1 on each machine. I didn't move it between machines because I didn't have shares set up for that. Each machine downloaded its own separate copy of 7.1.1, so even if one machine somehow got a bad copy, the other machines shouldn't have had the same problem. The 3rd test machine (the only one running 64-bit Vista) didn't have any projects on it that were suspended or NNW, so it didn't show the work fetch problem, although it does seem to have tried to make sure it had at least one job from each project (see next paragraph).

7.1.1 is behaving differently on work fetch. 7.0.64 seemed to just grab all the work it needed from the current priority project, while 7.1.1 seems to be trying to get at least one work unit from each project it is attached to. ISTR David Anderson saying something about the new work fetch simulator on the BOINC alpha website doing that for some reason, so I guess 7.1.1 has the same logic in it.

David Ball

___
boinc_dev mailing list
boinc_dev@ssl.berkeley.edu
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address.


Re: [boinc_dev] GPU Plan Classes

2013-04-16 Thread dball
Have a little bit of additional info. Apparently the HD 7600M series and below are slightly updated versions of Turks and Caicos. It appears the HD 7400M series is an updated version of Caicos called Seymour (40nm, VLIW5, 160 stream processors), and the HD 7500M and 7600M series are an updated version of Turks called Thames (40nm, VLIW5, 480 stream processors). The 7500M series has a 64-bit memory interface and the 7600M series has a 128-bit memory interface.

References:

http://news.softpedia.com/news/AMD-Clarifies-Radeon-HD-7000M-Notebook-Strategy-246594.shtml

http://semiaccurate.com/2011/12/07/amd-launches-three-low-end-hd7000m-gpus/

Regards,

David Ball



 As of BOINC 7.0.65, the BOINC client assumed that any CAL-capable GPU was also OpenCL-capable, and so reported it to the server that way. I have since checked in code that corrects this assumption and should report correctly to the server.

 Because of the limited information available from CAL and OpenCL, it is not possible to do this perfectly in situations where one computer has a mix of GPUs, some of which are OpenCL-capable and some of which are not, but I think my new code should work in _almost_ every case.

 I did a bunch of internet research, and I believe Jon Sonntag's approach similarly should work in _almost_ every case. Unfortunately, some AMD target values include GPUs that are OpenCL-capable and ones that are not. Here is what I wrote earlier to David and Rom:

 The information on AMD's web site is hard to find, and very incomplete and ambiguous. They seem to be pushing only their version 2.8 SDK, which supports OpenCL 1.2, and give a list of compatible GPUs, but it is unclear whether this means that:
 [1] their latest software does not support older GPUs at all, or
 [2] it supports CAL but not OpenCL on their older GPUs, or
 [3] the newer GPUs are needed for newer OpenCL features, but older GPUs are still supported for the features which were present in older versions of OpenCL.

 I suspect it means [2] based on what Jord has written me.

 Because of all the confusion, and because the only information CAL gives us about GPU model numbers is the CALtargetEnum value, I compiled a list, as best I could, mapping CALtargetEnum to GPU model and OpenCL capability. I got my information from the tables in http://en.wikipedia.org/wiki/Comparison_of_ATI_graphics_processing_units.

 This correlates model numbers and other info with the engineering code names listed in cal_boinc.h and the switch/case statement in COPROC_ATI::get() in gpu_amd.cpp. I also found a newer version of cal.h with a more recent listing of CALtargetEnum values at http://gpuocelot.googlecode.com/svn/trunk/ocelot/ocelot/cal/include/cal.h.

 Finally, I made the somewhat questionable assumption that if a table section makes no mention of OpenCL, then those GPU models have no OpenCL support.

 Here is my list, which I am sure is not perfect:

 Based on http://en.wikipedia.org/wiki/Comparison_of_ATI_graphics_processing_units

 HD 83xx - 84xx models include DirectX 11, OpenGL 4.2 and OpenCL 1.1[19]


 case CAL_TARGET_600,/** R600 GPU ISA */
   ATI Radeon HD 2900 (RV600)
   Radeon HD 2900 GT   Nov 6, 2007 OpenCL: NO
   Radeon HD 2900 Pro  Sep 25, 2007OpenCL: NO

 case CAL_TARGET_610,/** RV610 GPU ISA */
   ATI Radeon HD 2300/2400/3200/4200 (RV610)
   Radeon HD 2350  Jun 28, 2007OpenCL: NO
   Radeon HD 2400 Pro  Jun 28, 2007OpenCL: NO
   Radeon HD 2400 XT   Jun 28, 2007OpenCL: NO
   Mobility Radeon HD 2400 May 14, 2007OpenCL: NO
   Mobility Radeon HD 2400 XT  May 14, 2007OpenCL: NO
   Radeon 3000 Graphics (760G Chipset) 2009OpenCL: NO
   Radeon 3100 Graphics (780V Chipset) Jan 23, 2008OpenCL: NO
   Radeon HD 3200 Graphics (780G Chipset)  Jan 23, 2008OpenCL: NO

 case CAL_TARGET_630,/** RV630 GPU ISA */
   ATI Radeon HD 2600 (RV630)
   Radeon HD 2600 Pro  Jun 28, 2007OpenCL: NO
   Radeon HD 2600 XT   Jun 28, 2007OpenCL: NO
   Mobility Radeon HD 2600 May 14, 2007OpenCL: NO
   Mobility Radeon HD 2600 XT  May 14, 2007OpenCL: NO
   Mobility Radeon HD 2700 December 12, 2007   OpenCL: NO
   Radeon HD 3650  Jan 23, 2008 (RV635)OpenCL: NO
   All-In-Wonder HD 3650   Jun 28, 2008 (RV635)OpenCL: NO
   Mobility Radeon HD 3650 January 7, 2008 OpenCL: NO
   Mobility Radeon HD 3670 January 7, 2008 OpenCL: NO

 case CAL_TARGET_670,/** RV670 GPU ISA */
   ATI Radeon HD 3800 (RV670)
   FireStream 9170 November 8, 2007OpenCL 1.0 *
   Radeon HD 3850  Nov 19, 2007OpenCL: NO
   Radeon HD 3870  Nov 19, 2007
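
 In code, a mapping like this might be expressed as a small helper along the following lines (a hypothetical sketch, not the actual COPROC_ATI::get() code; the enum values follow cal.h / cal_boinc.h, and the capability guesses follow the table above):

    // Hypothetical sketch: guess OpenCL capability from the CAL target.
    // The real enum lives in cal.h / cal_boinc.h; a few values repeated here.
    enum CALtargetEnum {
        CAL_TARGET_600, CAL_TARGET_610, CAL_TARGET_630, CAL_TARGET_670
        // ... newer targets omitted
    };

    bool target_has_opencl(CALtargetEnum target) {
        switch (target) {
        case CAL_TARGET_600:   // R600: Radeon HD 2900 series - no OpenCL
        case CAL_TARGET_610:   // RV610: HD 2300/2400/3200/4200 - no OpenCL
        case CAL_TARGET_630:   // RV630: HD 2600/3600 series - no OpenCL
            return false;
        case CAL_TARGET_670:   // RV670: mixed (FireStream 9170 has OpenCL 1.0,
            return false;      // HD 3850/3870 do not), so "no" is the safer guess
        default:
            return true;       // later targets assumed OpenCL-capable
        }
    }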

Re: [boinc_dev] The reason for a local DCF.

2013-04-03 Thread dball
We need some simple way for a project admin to tell the server that the estimate for a batch of jobs should be adjusted. The server should update the estimate in the unsent jobs in that batch, and somehow pass that info along to clients that already have jobs in that batch the next time they contact the server.

One of the projects using the new version of the server (version 701, IIRC) that sends <dont_use_dcf/> as part of the sched reply had the estimates suddenly drop to 1/10th of their correct value. The project admin said he was changing the estimated times of WUs and there was one 0 missing for that batch of work units. These WUs continued to be sent out for several days before another batch started and the time estimates went back to reality.

David Ball

 I can't speak specifically for TrainWreck@home, but I think you'll find that if it's running generic BOINC server code that's less than three years old (and if it's telling the client to turn off DCF, I think it must be), then the project server does *NOT* calculate its own DCF.

 I invite you to review what happened to DCF in sched_send.cpp (server code), in

 http://boinc.berkeley.edu/trac/changeset/1d765245ed6ea666a46b2b5878371c4183accbeb/boinc-v2/sched/sched_send.cpp





 From:   McLeod, John john.mcl...@sap.com
 To:     Richard Haselgrove r.haselgr...@btopenworld.com; boinc_dev@ssl.berkeley.edu
 Sent:   Wednesday, 3 April 2013, 15:43
 Subject: Re: [boinc_dev] The reason for a local DCF.

Currently the server calculates its own DCF, and when asked for 43200 seconds of work it would inflate the fpops number to account for the difference. This would mean that the work received would have a sort-of-correct value for time, before being inflated again by the client's DCF calculation.

No, this is a startup issue, but it can happen any time:

1) A new project is joined.

2) A new application is pushed down.

3) A new dataset that has a greatly different run time than expected is pushed down.

A possible way out: if a project has "do not use DCF" set, modify the meaning of this somewhat. Instead of ignoring the DCF entirely, add a DCF modifier to each task of a project, which is 1/DCF at the time of acceptance of the task (this counteracts the fact that the DCF is calculated twice, once at the server and once at the client). Each time the DCF is used to calculate the remaining time to run, multiply by this value. When the DCF for the project is recalculated, recalculate as normal, ignoring this modifier. This will eventually have the DCF stabilize near 1, and allow the server to calculate what the fpops ought to be while keeping the client responsive to massive miscalculations in initial state.
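
In code, the proposal might look something like this (a sketch with hypothetical field and function names, not existing BOINC code):

   // Sketch of the proposed per-task DCF modifier (hypothetical names).
   struct Task {
       double estimated_flops;   // server's fpops estimate for the task
       double dcf_modifier;      // set to 1.0 / project_dcf when the task is accepted
   };

   // Remaining-time estimate: the modifier cancels the DCF already baked
   // into the estimate at acceptance, so the task is not inflated twice.
   double est_runtime_remaining(const Task& t, double flops_per_sec,
                                double project_dcf) {
       return (t.estimated_flops / flops_per_sec) * project_dcf * t.dcf_modifier;
   }

A task accepted when DCF was 19.27 gets modifier 1/19.27, so its estimate is initially unchanged (19.27 * 1/19.27 = 1), while later movement in the project's DCF still adjusts it.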

From: Richard Haselgrove [mailto:r.haselgr...@btopenworld.com]
Sent: Wednesday, April 03, 2013 10:22 AM
To: McLeod, John; boinc_dev@ssl.berkeley.edu
Subject: Re: [boinc_dev] The reason for a local DCF.

Fully agreed. But remember that you have to follow the logic and also reinstate the DCF code on that project's server.

Say you set work fetch limits of 0.5 days minimum and 0.5 days additional, i.e. a target work buffer of 43200.00 + 43200.00 sec.

Once TrainWreck@home (eventually) becomes the highest priority project and your client issues a work request, it will request 43200 seconds of work.

The *server*, which currently ignores DCF in its calculations, will still use the 1 hr 17 min estimate (4620 seconds). The server will assign 10 jobs to fill the request.

Once those 10 jobs arrive at the client, they will be re-estimated by the client using DCF, which by then will be about 19.27.

And your client will announce that it has received 10 days 7 hours of new work. And no doubt panic.

[I am assuming that TrainWreck@home's previous batch of work for this application was correctly estimated, and that John has volunteered for TrainWreck@home for long enough to have an established and stable APR for HitTheBuffers v1.01]
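
Working through Richard's numbers as a back-of-the-envelope check (my arithmetic, not output from any BOINC tool):

   server estimate per job:  1 h 17 min = 4620 s (DCF ignored by the server)
   jobs sent to fill a 43200 s request:  ceil(43200 / 4620) = 10
   client DCF after a 24 h 44 min actual run:  89040 / 4620 = ~19.27
   client's re-estimate:  10 * 4620 * 19.27 = ~890,000 s = ~10 days 7 hours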


From: McLeod, John john.mcl...@sap.com
To: boinc_dev@ssl.berkeley.edu
Sent: Wednesday, 3 April 2013, 13:40
Subject: [boinc_dev] The reason for a local DCF.

I am currently watching a train wreck that would not be happening if DCF was turned on for a particular project.

The initial estimate is a wall time of one hour seventeen minutes. The actual wall time is twenty-four hours 44 minutes. The problem is that work fetch and the scheduler do not know that the problem exists for tasks #2 through #20, and are downloading work from other projects, not realizing that the saturated time is 20 days and not 20 hours.

Re: [boinc_dev] Recommend not detect 8400 GS video

2013-03-17 Thread dball
OK, forget about it then. I only use that system for crunching, so I just told it not to recognize the card. It's a 256MB card; I thought it was on the motherboard at first and was concerned about losing the whole system. In researching temps for it, I did find people talking about screen blanking or thermal cutoff at 110 C. Now that I know it's an actual card, I'll just replace it with something more modern that has good temperature control and uses very little power. Going from 80nm to 28nm is a big step. Apparently, these 8400 GS chips are quite variable, and this machine has an overheating chip in it or a bad cooling solution. I also saw mention of some versions of the driver causing the chip to go into thermal runaway.

I might even replace the system. It's the original Core 2 Quad and isn't upgradable to even a later C2Q. A current Ivy Bridge 22nm Intel i3 could probably match it for about a third the power of a Q6600. Some or all of the Ivy Bridge parts can run OpenCL, and Haswell is due out later this year, which is supposed to have up to a 5x improvement on the graphics, depending on model, and includes the AVX2 extensions with larger FP units on the CPU cores IIRC (could be Broadwell that has those). APUs are eating much of the discrete graphics card market. Now that ST-E (sp??) has got FD-SOI working and GlobalFoundries has licensed it, expect some improvements from AMD APUs too. Now, if they can just solve the power supply problem for EUV lithography, they'll be set for 10nm and below. It's scary how they power EUV: think very high powered lasers vaporizing a stream of metal droplets, and missing the droplet a lot of the time.

Anyway, since it seems that the chip failing will not take out the motherboard, and most people's cards are better than mine or have already been replaced, don't worry about it.

Sorry to have bothered you about it.

David Ball



 I would be opposed to BOINC telling me I can't use my 8400 GS because it might overheat. If it is decided that it is a good idea and they should be banned, then BOINC should not allow crunching on laptops, cell phones, etc., because they also tend to overheat. Or, we let users decide whether to crunch or not, and allow them to tune the GPU apps such that they can run at whatever temperatures they are comfortable with.

 Jon Sonntag




[boinc_dev] Recommend not detect 8400 GS video

2013-03-16 Thread dball
While testing an HP m9047c (completely stock hardware, never overclocked) for BOINC alpha, I upgraded some drivers, and somehow 7.0.56 of the BOINC client started detecting that it could run OpenCL jobs on the machine's NVIDIA GeForce 8400 GS (256MB, driver 314.07), which had previously gone undetected until the series of driver updates. I was surprised that, with so little memory, Seti assigned it 2 AstroPulse v6 v6.04 (opencl_nvidia_100) tasks.

Seti machine: 4719778
http://setiathome.berkeley.edu/show_host_detail.php?hostid=4719778

Fortunately, through just blind good luck, I was on the machine when the huge Seti download finally finished and watched to see how it did. It was working OK in the BOINC Manager, but I decided to see what was happening with GPU-Z. It was reaching over 90% GPU utilization and about 48% memory bandwidth utilization. However, after watching the temperature of the GPU chip climb through 107 degrees C, I suspended GPU processing and set the no_gpus flag in cc_config.xml. I aborted the running job, and the second job aborted with a status of 201 (0xc9) EXIT_MISSING_COPROC. I restarted the BOINC client.
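
For anyone wanting to do the same, the no_gpus flag goes in the options section of cc_config.xml:

   <cc_config>
      <options>
         <no_gpus>1</no_gpus>
      </options>
   </cc_config>

The client picks this up on restart, or via the Manager's "Read config file" command.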

Old nVidia chips are known in the trade press for having problems at high temps because of a mismatch in internal thermal-expansion properties, resulting in breakage. I know this was mentioned for the 65nm and 55nm chips in a 2008 article, but I don't know about these chips, which are 80nm IIRC. You can read some reprints of the bumpgate articles starting at the address below.

http://semiaccurate.com/2010/07/11/why-nvidias-chips-are-defective/

If it were me, I'd refuse to let the BOINC client recognize these chips as usable GPUs.

David Ball




Re: [boinc_dev] Preferences Override

2013-03-06 Thread dball
Hello,

I think max_cpus is deprecated and has been replaced by max_ncpus_pct,
which is the percentage of the total cores to use.

If you have 4 cores and you only want to use 1 core, try modifying global_prefs_override.xml to include

<max_cpus>1</max_cpus>
<max_ncpus_pct>25.0</max_ncpus_pct>

and leaving the rest of the file intact. Then reload it.
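
A complete minimal override file along those lines would look like this (assuming the usual <global_preferences> root element; I'm quoting the structure from memory, so double-check against your own file):

   <global_preferences>
      <max_cpus>1</max_cpus>
      <max_ncpus_pct>25.0</max_ncpus_pct>
   </global_preferences>

To reload it without restarting, use the Manager's "Read local prefs file" command, or:

   boinccmd --read_global_prefs_override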

BTW, it seems to round down. On my dual-core machine I can set it to 99.0 percent and it will only use 1 CPU core. On your 4-core machine, anything from 25.0 to 49.0 should just use 1 core.
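
The round-down presumably comes from integer truncation, along these lines (my assumption about the logic, not the client's actual code):

   // Assumed rounding: truncate toward zero, but keep at least 1 core.
   int usable_cpus(int ncpus, double max_ncpus_pct) {
       int n = (int)(ncpus * max_ncpus_pct / 100.0);   // 2 * 0.99 -> 1.98 -> 1
       return (n < 1) ? 1 : n;
   }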

Hope this helps,

David Ball

On 3/6/2013 3:10 PM, Jöbstl, Emanuel wrote:
 Hello David and John, thank you for your fast replies.

 I investigated the configuration files located in C:\ProgramData\Boinc, and also validated that read_global_prefs_override is set.

 In my config files (global_prefs and global_prefs_override), max_cpus is set to 1, but still there are four tasks running, using all the processor cores.

 What could cause this issue?

 I attached the configuration files and the output of boinccmd.exe --get_tasks.

 with best regards,
 Emi
 
 From: boinc_dev [boinc_dev-boun...@ssl.berkeley.edu] on behalf of David Anderson [da...@ssl.berkeley.edu]
 Sent: Tuesday, 5 March 2013, 22:01
 To: boinc_dev@ssl.berkeley.edu
 Subject: Re: [boinc_dev] Preferences Override

 Here's how things are supposed to work:

 global_prefs.xml contains preferences from the project server.

 global_prefs_override.xml contains preferences set locally.
 It's written by the set_global_prefs_override() GUI RPC.
 As the name implies, values specified here override
 those in global_prefs.xml.

 Note: after calling set_global_prefs_override(),
 you must call read_global_prefs_override() to have the
 new preferences take effect.

 -- David

 On 05-Mar-2013 12:02 PM, Jöbstl, Emanuel wrote:
 Hello BOINC Devs,

 Again, I need some help:

 1) I noticed that my local BOINC preferences are being overwritten with the preferences from the project server. Is there any way to avoid this, or do I have to change the preferences on the server too? I am setting the preferences with a set_global_prefs_override GUI RPC call.

 2) Is there any way to detect that a task has been finished on the client side (using the GUI RPC)?

 with best regards and thanks, Emanuel

