On Wed, Sep 17, 2008 at 10:51 PM, Ahmed El Zein <ahmed at azein.com> wrote:
> On Wed, 2008-09-17 at 21:33 -0500, Barry Smith wrote:
>> Ahmed,
>>
>>     This is very cool.
>>
>> On Sep 17, 2008, at 8:05 PM, Ahmed El Zein wrote:
>>
>> > Harald,
>> > I am working on implementing SpMV on an NVIDIA GPU with CUDA for SML
>> > applications (just about finished, actually).
>> >
>> > As the SML application I am planning to base my work on uses PETSc,
>> > I have written some functions that convert AIJ matrices and Vectors
>> > to single precision, copy them to the GPU, multiply them, and copy
>> > the result back to the host. I would be happy to share them with you
>> > if you want.
>>
>> I think that to really make the GPU a large step forward in
>> performance, the Vecs need to be kept on the GPU and only transported
>> back to the main CPU when absolutely needed. For example, consider a
>> KSP solver like CG: it would run on the main CPU, but each Vec is
>> actually just a handle for the true vector entries that live in GPU
>> memory. A call to VecAXPY(), for example, would pass the scalar and
>> the two Vec handles down to the GPU, where the actual axpy is
>> performed. With this paradigm the only values passed back to the main
>> CPU are scalars. This is why I think this work has to be done only on
>> the latest GPU systems with lots of memory.
>>
> You are right! That is what I do for the SML application. I convert and
> copy the matrix to the GPU and then iteratively send a new vector to
> the GPU for multiplication. In fact, unless the matrix will be reused
> at least 4 times, there is no performance gain!
>
> The 8800 GTX has 768 MB of memory, but you can have multiple GPUs
> running on your machine and split your data amongst them to effectively
> get more memory.
>
> What I was thinking of was to add GPU pointers to a PETSc Mat or Vec
> object with an optional shadow parameter. If shadow is enabled, the
> host will keep a copy of what is on the GPU in main memory. That way,
> if there are changes to the original matrix:
> 1. It might be less expensive to make the modification on the host.
> 2. It might be possible to update the matrix on the GPU by sending
>    diffs, as copying the data back and forth is the most expensive
>    operation.
>
> It would also allow the use of that matrix on both the host and the
> GPU, to offer maximum flexibility. Matrix assembly would be done on the
> host anyway.
>
> (I am not sure whether the above points are loads of rubbish or not,
> but I think there are many options to be considered in a PETSc GPU
> implementation.)
>
> A question that I had regarding the PETSc code when I was thinking
> about this was: you have the SeqAIJ matrix type and the MPIAIJ type
> built around it (or that is what I understand from the code). So
> basically you implement the SeqAIJ type for the GPU and you get the MPI
> type for free?
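For concreteness, here is a minimal CUDA sketch of the kind of GPU-resident
Vec handle Barry describes above: the host holds only a handle, the entries
stay in GPU memory, and an axpy is carried out by a kernel so that no vector
data crosses the bus. The names VecGPU, VecGPUCreate, and VecGPUAXPY are made
up for illustration; they are not PETSc API.

#include <cuda_runtime.h>

/* Hypothetical handle: the host never stores the vector entries, only the
 * length and a device pointer (single precision, as on the 8800 GTX
 * discussed above). */
typedef struct {
    int    n;    /* local length */
    float *d_x;  /* entries live in GPU memory */
} VecGPU;

__global__ void axpy_kernel(int n, float alpha, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += alpha * x[i];
}

static VecGPU VecGPUCreate(int n)
{
    VecGPU v;
    v.n = n;
    cudaMalloc((void **)&v.d_x, n * sizeof(float));
    return v;
}

/* y <- alpha*x + y, performed entirely on the GPU; only the scalar alpha is
 * passed down, and nothing is copied back to the host. */
static void VecGPUAXPY(VecGPU y, float alpha, VecGPU x)
{
    int threads = 256;
    int blocks  = (y.n + threads - 1) / threads;
    axpy_kernel<<<blocks, threads>>>(y.n, alpha, x.d_x, y.d_x);
}

A KSP such as CG running on the host would then operate only on such handles;
the only values returning across the bus would be scalars such as dot
products and norms, as Barry suggests.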
Yes, that is true. However, note that in the MPI step you will need a
gather operation to get the second matrix multiply to work.

   Matt

> Ahmed
>
>> Barry
>>
>> > While this would be outside the scope of my MPhil, I would be very
>> > interested in helping to add GPU support for PETSc. I have not yet
>> > had any experience with CTM for programming ATI GPUs, but I believe
>> > there would not be a huge difference.
>> >
>> > I have access to a GeForce 8800 GTX GPU (single precision only) at
>> > the ANU. I have been talking with my supervisor about getting a
>> > GTX 280 or GTX 260 (which support double precision), but I don't
>> > know if we will be getting one.
>> >
>> > Anyway, I would like to help. So if anyone would like to start
>> > thinking about how this would best be implemented, I am
>> > available. :)
>> >
>> > Ahmed
>> >
>> > On Wed, 2008-09-17 at 08:24 -0700, Harald Pfeiffer wrote:
>> >> Hi,
>> >>
>> >> do you know whether there are any efforts to run PETSc on GPUs
>> >> (graphical processing units)?
>> >>
>> >> Thanks,
>> >> Harald

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
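To make the gather step concrete, here is a rough CUDA sketch of how an
MPIAIJ-style multiply is structured when the matrix blocks live on the GPU:
the local (diagonal-block) SpMV needs no communication, but the off-diagonal
multiply needs the off-process vector entries, which must be gathered first
(in PETSc this is the VecScatter performed inside the parallel MatMult). The
CSRGPU layout and the gather_offprocess_entries() helper are hypothetical
names used only for illustration.

#include <cuda_runtime.h>

typedef struct {
    int    nrows;
    int   *d_rowptr;  /* device CSR row pointers, length nrows+1 */
    int   *d_colind;  /* device CSR column indices */
    float *d_val;     /* device CSR nonzero values */
} CSRGPU;

/* y = A*x (accumulate == 0) or y += A*x (accumulate != 0); one thread per row. */
__global__ void csr_spmv_kernel(CSRGPU A, const float *x, float *y, int accumulate)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= A.nrows) return;
    float sum = accumulate ? y[row] : 0.0f;
    for (int k = A.d_rowptr[row]; k < A.d_rowptr[row + 1]; k++)
        sum += A.d_val[k] * x[A.d_colind[k]];
    y[row] = sum;
}

/* Stub for the gather: collect the off-process entries of x that the
 * off-diagonal block needs and place them in d_x_ghost on the GPU. A real
 * version would combine MPI messages with cudaMemcpy; this is only a
 * placeholder so the sketch is self-contained. */
static void gather_offprocess_entries(float *d_x_ghost) { (void)d_x_ghost; }

static void matmult_mpiaij_gpu(CSRGPU A_diag, CSRGPU A_offdiag,
                               const float *d_x_local, float *d_x_ghost,
                               float *d_y)
{
    int threads = 256;
    int blocks  = (A_diag.nrows + threads - 1) / threads;

    /* Local (diagonal-block) multiply: no communication needed. */
    csr_spmv_kernel<<<blocks, threads>>>(A_diag, d_x_local, d_y, 0);

    /* Gather remote vector entries before the second multiply. */
    gather_offprocess_entries(d_x_ghost);

    /* Off-diagonal multiply against the gathered ghost values. */
    csr_spmv_kernel<<<blocks, threads>>>(A_offdiag, d_x_ghost, d_y, 1);
}

Only the relatively small set of ghost values has to move between processes
and onto the GPU; the matrix blocks themselves stay resident, which is
consistent with Ahmed's observation that the host-device copies are the most
expensive part.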
