On Wed, Sep 17, 2008 at 10:51 PM, Ahmed El Zein <ahmed at azein.com> wrote:
> On Wed, 2008-09-17 at 21:33 -0500, Barry Smith wrote:
>> Ahmed,
>>
>>     This is very cool.
>>
>> On Sep 17, 2008, at 8:05 PM, Ahmed El Zein wrote:
>>
>> > Harald,
>> > I am working on implementing SpMV on an NVIDIA GPU with CUDA for SML
>> > applications (just about finished, actually).
>> >
>> > As the SML application I am planning to base my work on uses PETSc,
>> > I have written some functions that convert AIJ matrices and Vectors
>> > to single precision, copy them to the GPU, multiply them, and copy
>> > the result back to the host. I would be happy to share them with you
>> > if you want.
>>
>> I think that to really make the GPU a large step forward in
>> performance, the Vecs need to be kept on the GPU and only transported
>> back to the main CPU when absolutely needed. For example, consider a
>> KSP solver like CG: it would run on the main CPU, but each Vec is
>> actually just a handle for the true vector entries that live in GPU
>> memory. A call to VecAXPY(), for example, would pass the scalar and
>> the two Vec handles down to the GPU, where the actual axpy is
>> performed. With this paradigm the only values passed back to the main
>> CPU are scalars. This is why I think this work has to be done only on
>> the latest GPU systems with lots of memory.
>>
> You are right! That is what I do for the SML application. I convert and
> copy the matrix to the GPU and then iteratively send a new vector to
> the GPU for multiplication. In fact, unless the matrix will be reused
> at least 4 times, there is no performance gain!
>
> The 8800 GTX has 768 MB of memory, but you can have multiple GPUs
> running on your machine and split your data amongst them to effectively
> get more memory.
>
> What I was thinking of was to add GPU pointers to a PETSc Mat or Vec
> object with an optional shadow parameter. If shadow is enabled, the
> host will keep a copy of what is on the GPU in main memory. That way,
> if there are changes to the original matrix:
> 1. It might be less expensive to make the modification on the host.
> 2. It might be possible to update the matrix on the GPU by sending
>    diffs, as copying the data back and forth is the most expensive
>    operation.
>
> It would also allow the use of that matrix on both the host and the
> GPU, to offer maximum flexibility. Matrix assembly would be done on the
> host anyway.
>
> (I am not sure whether the above points are loads of rubbish or not,
> but I think there are many options to be considered in a PETSc GPU
> implementation.)
>
> A question that I had regarding the PETSc code when I was thinking
> about this was: you have the SeqAIJ matrix type and the MPIAIJ type
> built around it (or that is what I understand from the code). So
> basically you implement the SeqAIJ type for the GPU and you get the MPI
> type for free?
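For concreteness, here is a minimal CUDA sketch of the kind of GPU-resident
Vec handle Barry describes above: the host holds only a handle, the entries
stay in GPU memory, and an axpy is carried out by a kernel so that no vector
data crosses the bus. The names VecGPU, VecGPUCreate, and VecGPUAXPY are made
up for illustration; they are not PETSc API.

#include <cuda_runtime.h>

/* Hypothetical handle: the host never stores the vector entries, only the
 * length and a device pointer (single precision, as on the 8800 GTX
 * discussed above). */
typedef struct {
    int    n;    /* local length */
    float *d_x;  /* entries live in GPU memory */
} VecGPU;

__global__ void axpy_kernel(int n, float alpha, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] += alpha * x[i];
}

static VecGPU VecGPUCreate(int n)
{
    VecGPU v;
    v.n = n;
    cudaMalloc((void **)&v.d_x, n * sizeof(float));
    return v;
}

/* y <- alpha*x + y, performed entirely on the GPU; only the scalar alpha is
 * passed down, and nothing is copied back to the host. */
static void VecGPUAXPY(VecGPU y, float alpha, VecGPU x)
{
    int threads = 256;
    int blocks  = (y.n + threads - 1) / threads;
    axpy_kernel<<<blocks, threads>>>(y.n, alpha, x.d_x, y.d_x);
}

A KSP such as CG running on the host would then operate only on such handles;
the only values returning across the bus would be scalars such as dot
products and norms, as Barry suggests.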
Yes, that is true. However, note that in the MPI step you will need a
gather operation to get the second matrix multiply to work.

   Matt

> Ahmed
>
>> Barry
>>
>> > While this would be outside the scope of my MPhil, I would be very
>> > interested in helping to add GPU support for PETSc. I have not yet
>> > had any experience with CTM for programming ATI GPUs, but I believe
>> > there would not be a huge difference.
>> >
>> > I have access to a GeForce 8800 GTX GPU (single precision only) at
>> > the ANU. I have been talking with my supervisor about getting a
>> > GTX 280 or GTX 260 (which support double precision), but I don't
>> > know if we will be getting one.
>> >
>> > Anyway, I would like to help. So if anyone would like to start
>> > thinking about how this would best be implemented, I am
>> > available. :)
>> >
>> > Ahmed
>> >
>> > On Wed, 2008-09-17 at 08:24 -0700, Harald Pfeiffer wrote:
>> >> Hi,
>> >>
>> >> do you know whether there are any efforts to run PETSc on GPUs
>> >> (graphical processing units)?
>> >>
>> >> Thanks,
>> >> Harald

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which
their experiments lead.
-- Norbert Wiener
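To make the gather step concrete, here is a rough CUDA sketch of how an
MPIAIJ-style multiply is structured when the matrix blocks live on the GPU:
the local (diagonal-block) SpMV needs no communication, but the off-diagonal
multiply needs the off-process vector entries, which must be gathered first
(in PETSc this is the VecScatter performed inside the parallel MatMult). The
CSRGPU layout and the gather_offprocess_entries() helper are hypothetical
names used only for illustration.

#include <cuda_runtime.h>

typedef struct {
    int    nrows;
    int   *d_rowptr;  /* device CSR row pointers, length nrows+1 */
    int   *d_colind;  /* device CSR column indices */
    float *d_val;     /* device CSR nonzero values */
} CSRGPU;

/* y = A*x (accumulate == 0) or y += A*x (accumulate != 0); one thread per row. */
__global__ void csr_spmv_kernel(CSRGPU A, const float *x, float *y, int accumulate)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= A.nrows) return;
    float sum = accumulate ? y[row] : 0.0f;
    for (int k = A.d_rowptr[row]; k < A.d_rowptr[row + 1]; k++)
        sum += A.d_val[k] * x[A.d_colind[k]];
    y[row] = sum;
}

/* Stub for the gather: collect the off-process entries of x that the
 * off-diagonal block needs and place them in d_x_ghost on the GPU. A real
 * version would combine MPI messages with cudaMemcpy; this is only a
 * placeholder so the sketch is self-contained. */
static void gather_offprocess_entries(float *d_x_ghost) { (void)d_x_ghost; }

static void matmult_mpiaij_gpu(CSRGPU A_diag, CSRGPU A_offdiag,
                               const float *d_x_local, float *d_x_ghost,
                               float *d_y)
{
    int threads = 256;
    int blocks  = (A_diag.nrows + threads - 1) / threads;

    /* Local (diagonal-block) multiply: no communication needed. */
    csr_spmv_kernel<<<blocks, threads>>>(A_diag, d_x_local, d_y, 0);

    /* Gather remote vector entries before the second multiply. */
    gather_offprocess_entries(d_x_ghost);

    /* Off-diagonal multiply against the gathered ghost values. */
    csr_spmv_kernel<<<blocks, threads>>>(A_offdiag, d_x_ghost, d_y, 1);
}

Only the relatively small set of ghost values has to move between processes
and onto the GPU; the matrix blocks themselves stay resident, which is
consistent with Ahmed's observation that the host-device copies are the most
expensive part.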
