This thread has lost the main developer Ed ... cc'ed along with the PI. Ed and CS, I will forward a few messages on this thread.
On Thu, Dec 12, 2013 at 6:29 PM, Dominic Meiser <[email protected]> wrote:

> Hi Karli,
>
> On 12/12/2013 02:50 PM, Karl Rupp wrote:
>
>> Hmm, this does not sound like something I would consider a good fit for
>> GPUs. With 16 MPI processes you have additional congestion of the one or
>> two GPUs per node, so you would have to rethink the solution procedure as
>> a whole.
>
> Are you sure about that for Titan? Supposedly the K20X's can deal with
> multiple MPI processes hitting a single GPU pretty well using Hyper-Q. Paul
> has seen pretty good speed up with small GPU kernels simply by
> over-subscribing each GPU with 4 MPI processes.
>
> See here:
>
> http://blogs.nvidia.com/blog/2012/08/23/unleash-legacy-mpi-codes-with-keplers-hyper-q/
>
> Cheers,
> Dominic
>
> --
> Dominic Meiser
> Tech-X Corporation
> 5621 Arapahoe Avenue
> Boulder, CO 80303
> USA
> Telephone: 303-996-2036
> Fax: 303-448-7756
> www.txcorp.com
