Hi guys,

> We use external libraries such as MAGMA and cuSPARSE. It looks like they
> use the runtime API, as you mentioned above. At the moment, the conflict
> is between the two instances of PETSc that we run (one each for real and
> complex). We are planning to write some code in CUDA and will use the
> driver API if need be.

Does moving to the driver API look like something that can be included
in PETSc 3.5?

I can implement the lazy instantiation mechanism in early February.

As for the CUDA driver API, this depends on the external libraries. PETSc's goal should be to stay away from low-level GPU fiddling as much as possible and only provide the glue needed to leverage existing GPU libraries in an MPI context. I still need to check whether Thrust and CUSP can deal with such context information; if not, then we have no other option but to rely on cudaSetDevice() anyway...

Best regards,
Karli




On Mon, Jan 20, 2014 at 11:27 AM, Dominic Meiser <[email protected]> wrote:

    Hi Jed, Harshad,

    A different solution to the problem of PETSc and a user code
    stepping on each other's toes with cudaSetDevice might be to use the
    CUDA driver API for device selection rather than the runtime API. If
    we explicitly managed a PETSc CUDA context using the driver API, we
    could control which devices are being used without interfering with
    the mechanisms used by other parts of a client code for CUDA device
    selection (e.g. cudaSetDevice). PETSc's device management would be
    completely decoupled from the rest of an application.

    Of course this approach can be combined with lazy initialization as
    proposed by Karl. Whenever the first device function is called we
    create the PETSc CUDA context. The advantages of lazy initialization
    mentioned by Karl and Jed ensue (e.g. ability to run on machines
    without GPUs provided one is not using GPU functionality).

    Another advantage of a solution using the driver API is that device
    and context management would be very similar between CUDA and OpenCL
    backends.

    I realize that this proposal might be impractical as a near-term
    solution since it involves a pretty major refactor of the CUDA
    context infrastructure. Furthermore, as far as I can tell,
    third-party libraries that we rely on (e.g. CUSP and cuSPARSE)
    assume the runtime API. Perhaps these difficulties can be overcome?

    A possible near-term solution would be to turn this around and have
    applications with advanced device selection requirements use the
    driver API. Harshad, I'm not familiar with your code, but would it
    be possible for you to use the driver API on your end to avoid
    conflicts with the cudaSetDevice calls inside PETSc?

    Cheers,
    Dominic


    On 01/14/2014 09:27 AM, Harshad Sahasrabudhe wrote:

        Hi Jed,

        Some time back, we talked about an interface that could handle
        other libraries calling cudaSetDevice simultaneously with
        PETSc; for example, in our case, two different instances of
        PETSc calling cudaSetDevice.

        > Sure, but how will we actually share the device between
        > libraries?  What if the other library was not PETSc, but
        > something else, and they also called cudaSetDevice, but with
        > a different default mapping strategy?

        > We need an interface that handles this case.

        Do we already have any solution for this? If not, can we start
        looking at this case?

        Thanks,
        Harshad



    --
    Dominic Meiser
    Tech-X Corporation
    5621 Arapahoe Avenue
    Boulder, CO 80303
    USA
    Telephone: 303-996-2036
    Fax: 303-448-7756
    www.txcorp.com


