On 12/12/2013 04:38 PM, Paul Mullowney wrote:
Provided you have a good parallel sparse direct solve for a single SM,
you could unleash 32 direct solves (or perhaps 16) which run
concurrently on the K20x. One only needs to set an environment
variable to use Hyper-Q.
On Titan all you need to do is
  $ export CRAY_CUDA_PROXY=1
See here:
https://www.olcf.ornl.gov/tutorials/cuda-proxy-managing-gpu-context/
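To make that concrete, a minimal Titan batch-script fragment might look like the sketch below. This is a config fragment, not a tested script: the node/rank counts and the `./my_solver` binary name are illustrative assumptions, and the `aprun` flags should be adjusted to your actual layout.

```shell
#PBS -l nodes=2
#PBS -l walltime=00:10:00

# Enable the Cray CUDA proxy (Hyper-Q) so multiple MPI ranks
# can share each node's K20X without serializing GPU contexts.
export CRAY_CUDA_PROXY=1

# Oversubscribe the GPU: 16 ranks per node, all hitting one K20X.
# -n = total ranks, -N = ranks per node (hypothetical values).
aprun -n 32 -N 16 ./my_solver
```

Without `CRAY_CUDA_PROXY=1`, the ranks' CUDA contexts would be time-sliced on the GPU rather than running concurrently.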
Cheers,
Dominic
I don't know of any good parallel sparse direct solver for small systems.
-Paul
On Thu, Dec 12, 2013 at 4:29 PM, Dominic Meiser <[email protected]> wrote:
Hi Karli,
On 12/12/2013 02:50 PM, Karl Rupp wrote:
Hmm, this does not sound like something I would consider a good
fit for GPUs. With 16 MPI processes you have additional
congestion of the one or two GPUs per node, so you would have to
rethink the solution procedure as a whole.
Are you sure about that for Titan? Supposedly the K20X's can deal
with multiple MPI processes hitting a single GPU pretty well using
Hyper-Q. Paul has seen pretty good speed up with small GPU kernels
simply by over-subscribing each GPU with 4 MPI processes.
See here:
http://blogs.nvidia.com/blog/2012/08/23/unleash-legacy-mpi-codes-with-keplers-hyper-q/
Cheers,
Dominic
--
Dominic Meiser
Tech-X Corporation
5621 Arapahoe Avenue
Boulder, CO 80303
USA
Telephone: 303-996-2036
Fax: 303-448-7756
www.txcorp.com