I agree that we need to time-slice GPU jobs; I'll get to this next month. -- David
On 07-Aug-2010 7:45 AM, Richard Haselgrove wrote:
> The new-ish Fermi class of NVidia GPUs has much better hardware support
> for multitasking than its predecessors. I don't know of any project that
> is using this officially so far, perhaps because Fermi-class GPUs are
> still expensive and comparatively rare, but prices are falling and sales
> increasing - they will become widespread eventually.
>
> Given the extensive use of third-party applications at SETI, it is
> inevitable that experimentation has taken place. Empirically, several
> commentators have reached the same answer: the SETI Fermi application (as
> supplied by NVidia itself) runs most productively when three instances
> are scheduled to run concurrently.
>
> But with the current FIFO scheduling and no pre-emption, this leads to
> problems when running multiple projects, as BOINC is designed to do.
>
> Consider http://a.imageshack.us/img441/9485/notaskswitchwithcuda.png
>
> The three SETI tasks finish asynchronously. Each time one exits, 0.34
> GPUs become available: but that's not enough to launch the GPUGrid tasks
> ahead of them in the FIFO queue. So BOINC deals yet another SETI (or SETI
> Beta, which I've set up similarly) task from the bottom of the pack.
> Presumably, this would continue indefinitely, until either fractional-GPU
> tasks from all other projects ran dry, or imminent deadline pressure
> forced GPUGrid into 'High Priority'.
>
> This is analogous to the situation we saw with AQUA and multi-threaded
> CPU applications, where the MT app had a tendency to hog the CPU and
> keep out other projects. That's been sorted now: this one hasn't.
>
> I'm sure the BOINC client will complete the work before deadline
> (although I've intervened manually, and these tasks won't get a chance
> to hang around that long). But that isn't the point.
>
> The science behind GPUGrid requires that tasks be returned in a timely
> fashion. Earlier results are required to generate the starting
> conditions for later jobs. Any scientific result of value will depend on
> a long chain of job - process - result - new job - process - result -
> new job..., and so on. Although they allow a deadline of up to five
> days, to let slower and part-time GPUs participate, they prefer results
> back within 24 hours if possible. FIFO scheduling without allowance for
> fractional usage is preventing this.
>
> The basic plumbing for task-switch GPU scheduling was in place as far
> back as last December, with the introduction of cuda_short_term_debt and
> ati_short_term_debt (see for example changeset 19898). Is there any
> chance of returning to this functional area of BOINC development, before
> the next quantum leap in technology overtakes us?
>
> _______________________________________________
> boinc_dev mailing list
> [email protected]
> http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
> To unsubscribe, visit the above URL and
> (near bottom of page) enter your email address.
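The starvation Richard describes can be reproduced with a toy model: a FIFO queue scanned first-fit, so a freed fraction of a GPU is always refilled by a small task further down the queue while a whole-GPU task at the head waits. This is an illustrative sketch only - the task names, the 0.33 GPU fraction (chosen so three instances fit on one GPU; the email quotes 0.34 free), and the scheduler loop are assumptions, not the actual BOINC client code:

```python
def start_tasks(gpu_free, queue, running):
    """First-fit over a FIFO queue: launch each queued task whose GPU
    fraction fits in what's currently free; larger tasks nearer the
    head of the queue are skipped rather than waited for."""
    for task in list(queue):
        _name, frac = task
        if frac <= gpu_free + 1e-9:
            queue.remove(task)
            running.append(task)
            gpu_free -= frac
    return gpu_free

# One physical GPU, three SETI instances at 0.33 GPUs each already running.
gpu_free = 0.01
running = [("SETI", 0.33)] * 3
# FIFO order: two whole-GPU GPUGrid tasks ahead of a pile of SETI tasks.
queue = [("GPUGrid", 1.0), ("GPUGrid", 1.0)] + [("SETI", 0.33)] * 10

# One SETI task finishes and frees its fraction...
running.remove(("SETI", 0.33))
gpu_free += 0.33
# ...but 0.34 free is not enough for GPUGrid, so the scheduler deals
# another SETI task from further down the pack instead:
gpu_free = start_tasks(gpu_free, queue, running)

print([name for name, _ in running])  # three SETI tasks again
print(queue[0])                       # GPUGrid still stuck at the head
```

A debt-aware scheduler of the kind the cuda_short_term_debt plumbing points toward would instead notice that GPUGrid is owed time and hold the freed fraction idle (or preempt at a checkpoint) until a whole GPU is available, rather than always backfilling with the smallest fitting task.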
