I think this might be a thread (as opposed to application) priority issue.

Compare these two Process Explorer screenshots:
http://img834.imageshack.us/img834/9496/einsteincudapriority.png
http://img703.imageshack.us/img703/3087/seticudapriority.png

(both taken from the same computer, during the same BOINC session)

The worker thread for the Einstein application is running at priority one, 
whereas the equivalent thread for SETI is running at priority six. (The main 
application thread is running at priority six in both cases.)
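
As a rough sketch of where those two numbers could come from: on Windows, the
base priority that Process Explorer shows is derived from the process priority
class combined with the per-thread priority level. The enum and function names
below are hypothetical stand-ins, not the Win32 constants themselves, and the
mapping only covers the two low priority classes relevant here. It assumes the
values in Microsoft's published scheduling priority table. I am also assuming
that both applications run in the "below normal" class, which I have not
verified for either binary.

```cpp
#include <cassert>

// Hypothetical names for illustration. The real Win32 constants are
// IDLE_PRIORITY_CLASS / BELOW_NORMAL_PRIORITY_CLASS and THREAD_PRIORITY_*.
enum class ProcessClass { Idle, BelowNormal };
enum class ThreadLevel {
    Idle, Lowest, BelowNormal, Normal, AboveNormal, Highest, TimeCritical
};

// Base priority for a thread in one of the two low priority classes.
int base_priority(ProcessClass pc, ThreadLevel tl) {
    // In every non-realtime class, the idle level pins the thread to 1
    // and the time-critical level pins it to 15.
    if (tl == ThreadLevel::Idle) return 1;
    if (tl == ThreadLevel::TimeCritical) return 15;
    // A thread at the normal level gets the base value of its class.
    const int normal = (pc == ProcessClass::Idle) ? 4 : 6;
    // The remaining levels are offsets of -2..+2 around that value.
    const int offset =
        static_cast<int>(tl) - static_cast<int>(ThreadLevel::Normal);
    return normal + offset;
}

// The two values seen in the screenshots, under the assumption above.
void check_reported_priorities() {
    // Main thread in both applications, and the SETI worker thread.
    assert(base_priority(ProcessClass::BelowNormal, ThreadLevel::Normal) == 6);
    // Einstein worker thread: consistent with the idle thread level.
    assert(base_priority(ProcessClass::BelowNormal, ThreadLevel::Idle) == 1);
}
```

If that reading is right, the Einstein worker thread would have been set to the
idle thread level (for example through SetThreadPriority), while the SETI
worker thread was left at the normal level of its process.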

The SETI application in the screenshot is the original 
setiathome_6.08_windows_intelx86__cuda.exe written by NVIDIA for SETI's CUDA 
launch in January 2009, so NVIDIA should be able to explain how to work round 
the discrepancy. Note that thread priority is (AFAIK) a Windows-only concept, 
so this probably won't help with your Linux/Mac issues.
  ----- Original Message ----- 
  From: Oliver Bock 
  To: David Anderson ; boinc_dev ; Boinc Projects 
  Sent: Monday, December 20, 2010 11:12 AM
  Subject: [boinc_dev] CUDA task scheduling


  Hi everyone,

  We just deployed a new CUDA application (called BRP3) as part of the
  einst...@home project. This app uses up to roughly 75% of a GPU and 3-30% of
  a CPU, depending on the GPU model/performance. Thus our scheduler
  currently issues these tasks with the following settings:

  hu.avg_ncpus = 0.2
  hu.ncudas = 1

  Please note that BOINC (e.g. sched/sched_customize) revision 22832 is
  used in this case.

  The problem is that with the settings above BOINC starts CUDA tasks in
  addition to CPU tasks that already occupy all existing CPU cores. This
  means on a system having four CPU cores and two CUDA devices, four CPU
  tasks and two CUDA tasks are launched. Although this behavior is
  intended, it doesn't really work out for us because the performance of
  the CUDA tasks is degraded significantly - GPU usage goes down to less
  than 10%, increasing the runtime by the same factor. Although the CUDA
  tasks run with slightly higher priority (below normal on Windows) than
  the CPU tasks (low on Windows) they are limited by the already
  fully-occupied CPU cores which are still required for up to 30% of the
  computation.
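
  The arithmetic behind the four-core, two-GPU case can be sketched as follows.
  This is an illustrative model only and not BOINC's scheduling code: the
  function names are made up, and it assumes each CPU task wants one full core
  while each CUDA task wants a fixed fraction of one.

```cpp
#include <cassert>

// Illustrative model only; this is not BOINC's scheduling code.
// Returns the number of cores' worth of CPU time requested when every CPU
// task wants a full core and every GPU task wants `cpu_per_gpu_task` cores.
double cpu_demand(int cpu_tasks, int gpu_tasks, double cpu_per_gpu_task) {
    assert(cpu_tasks >= 0 && gpu_tasks >= 0 && cpu_per_gpu_task >= 0.0);
    return cpu_tasks + gpu_tasks * cpu_per_gpu_task;
}

// True when the requested CPU time exceeds the available cores.
bool oversubscribed(int cores, int cpu_tasks, int gpu_tasks,
                    double cpu_per_gpu_task) {
    return cpu_demand(cpu_tasks, gpu_tasks, cpu_per_gpu_task) > cores;
}
```

  With four cores, four CPU tasks and two CUDA tasks at avg_ncpus = 0.2, the
  model gives a demand of 4.4 cores, and about 4.6 if each CUDA task really
  needs 30% of a core. Either way the CUDA tasks have to compete with the CPU
  tasks for time slices.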

  Since we couldn't yet release a Linux or Mac OS version we don't know
  whether this is a Windows time-slicing issue or not. Are there any other
  projects running CUDA tasks in a comparable way?

  The only workaround in sight would be to acquire a full CPU core once
  again but that's certainly not ideal.

  Any ideas are welcome!


  Cheers,
  Oliver
  _______________________________________________
  boinc_dev mailing list
  boinc_dev@ssl.berkeley.edu
  http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
  To unsubscribe, visit the above URL and
  (near bottom of page) enter your email address.