Footnote to this: I got the GPUGrid task to run by suspending all three of 
the running SETI tasks together with a multiselect/click, then resuming 
them straight away, in case I forgot later.

Sure enough, once the first GPUGrid task finished, the part-run tasks were 
picked for resumption, and I had to repeat the process to get the second 
GPUGrid task to start.

----- Original Message ----- 
The new-ish Fermi class of NVidia GPUs has much better hardware support for
multitasking than its predecessors. I don't know of any project that is
using this officially so far, perhaps because Fermi-class GPUs are still
expensive and comparatively rare, but prices are falling and sales are
increasing - they will become widespread eventually.

Given the extensive use of third-party applications at SETI, it is
inevitable that experimentation has taken place. Empirically, several
commentators have found the same answer: the SETI Fermi application (as
supplied by NVidia itself) runs most productively when three instances are
scheduled to run concurrently.

But with the current FIFO scheduling and no pre-emption, this leads to
problems when running, as BOINC is designed to do, multiple projects.

Consider http://a.imageshack.us/img441/9485/notaskswitchwithcuda.png

The three SETI tasks finish asynchronously. Each time one exits, 0.34 GPUs
become available: but that's not enough to launch the GPUGrid tasks ahead of
them in the FIFO queue. So BOINC deals yet another SETI (or SETI Beta, which
I've set up similarly) task from the bottom of the pack. Presumably, this
would continue indefinitely, until either fractional GPU tasks from all
other projects ran dry, or imminent deadline pressure forced GPUGrid into
'High Priority'.
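A toy model makes the starvation mechanism concrete. This is my own sketch, not BOINC code: the 0.33/1.0 GPU fractions, the queue contents, and the "launch the first queued task that fits, skipping any that don't" rule are all assumptions drawn from the behaviour described above.

```python
from collections import deque

GPU_TOTAL = 1.0
USAGE = {"SETI": 0.33, "GPUGrid": 1.0}  # assumed fractional GPU usage

def run_simulation(steps=6):
    # FIFO queue: the two GPUGrid tasks were fetched first, then a long
    # supply of SETI (and SETI Beta) tasks behind them.
    queue = deque(["GPUGrid", "GPUGrid"] + ["SETI"] * 20)
    running = ["SETI", "SETI", "SETI"]  # three SETI instances on the GPU
    log = []
    for _ in range(steps):
        running.pop(0)  # one running task finishes (asynchronously)
        free = GPU_TOTAL - sum(USAGE[t] for t in running)
        # FIFO pass without pre-emption: launch the first queued task that
        # fits in the free fraction; larger tasks are skipped, not waited for.
        for task in list(queue):
            if USAGE[task] <= free + 1e-9:
                queue.remove(task)
                running.append(task)
                break
        log.append(list(running))
    return log

for state in run_simulation():
    print(state)
```

Each finishing SETI task frees 0.34 GPUs, which is never enough for a 1.0-GPU GPUGrid task, so another 0.33-GPU SETI task is launched instead and GPUGrid never reaches the device.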

This is analogous to the situation we saw with AQUA and multi-threaded CPU
applications, where the MT app had a tendency to hog the CPU and keep out
other projects. That's been sorted now: this one hasn't.

I'm sure the BOINC client will complete the work before deadline (although
I've intervened manually, and these tasks won't get a chance to hang around
that long). But that isn't the point.

The science behind GPUGrid requires that tasks be returned in a timely
fashion. Earlier results are required to generate the starting conditions
for later jobs. Any scientific results of value will depend on a long chain
of job - process - result - new job - process - result - new job..., and so
on. Although they set a deadline of up to five days, so that slower and
part-time GPUs can participate, they prefer results back within 24 hours if
possible. FIFO scheduling without allowance for fractional usage is
preventing this.

The basic plumbing for task-switch GPU scheduling was in place as far back
as last December, with the introduction of cuda_short_term_debt and
ati_short_term_debt (see for example changeset 19898). Is there any chance
of returning to this functional area of BOINC development, before the next
quantum leap in technology overtakes us?
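For what it's worth, the debt idea itself is simple enough to sketch. This is my reading of the concept only, not BOINC's actual implementation: each project accrues debt in proportion to its resource share minus the GPU time it actually received, and the scheduler would prefer the project with the highest debt. Function and variable names here are hypothetical.

```python
def update_debts(debts, shares, usage, dt):
    """Accumulate per-project scheduling debt over an interval of dt seconds.

    debts, shares, usage are dicts keyed by project name; usage gives the
    GPU seconds each project actually consumed during the interval.
    """
    total_share = sum(shares.values())
    for p in debts:
        expected = dt * shares[p] / total_share  # entitled GPU seconds
        debts[p] += expected - usage.get(p, 0.0)
    # Normalize so debts sum to zero across projects.
    offset = sum(debts.values()) / len(debts)
    for p in debts:
        debts[p] -= offset
    return debts

debts = {"SETI": 0.0, "GPUGrid": 0.0}
shares = {"SETI": 100, "GPUGrid": 100}
# Over one hour SETI got all the GPU time and GPUGrid got none:
update_debts(debts, shares, {"SETI": 3600.0}, dt=3600.0)
print(debts)  # GPUGrid's debt is now the higher; it should run next
```

A scheduler that consulted debts like these at each task-switch point, instead of dealing strictly from the FIFO queue, would pull GPUGrid in as soon as its debt overtook SETI's.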


_______________________________________________
boinc_dev mailing list
[email protected]
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and
(near bottom of page) enter your email address. 
