Could you please check the file attached to this email? It contains the source code and the -log_summary output from an execution of MatMult.
When I run ex131 with the parameters -vec_type cuda and -mat_type seqaijcuda,

  mpiexec -n 1 ./ex131 -f ../matbinary.ex -vec 0 -mat_type seqaijcuda -vec_type cuda -log_summary

it fails because of CUDA Error 4; see MatMultKO.log.

When I run the same program without the -vec_type cuda parameter, only with -mat_type seqaijcuda, it runs OK:

  mpiexec -n 1 ./ex131 -f ../matbinary.ex -vec 0 -mat_type seqaijcuda -log_summary

See MatMltOK.log.

When I run without -mat_type seqaijcuda, only with -vec_type cuda, it fails again:

  terminate called after throwing an instance of 'thrust::system::system_error'
    what():  invalid argument
  --------------------------------------------------------------------------
  mpiexec noticed that process rank 0 with PID 3755 on node desktop exited
  on signal 6 (Aborted).
  --------------------------------------------------------------------------

Could you please give me some comments on that?

On 2010-12-13, Mon at 07:37 +0000, Matthew Knepley wrote:
> Yes, it should run on the GPU. Check an example, like ex19.
>
>    Matt
>
> On Mon, Dec 13, 2010 at 7:29 AM, Jakub Pola <jakub.pola at gmail.com> wrote:
>     Hi,
>
>     Does the MatMult function run on the GPU? When I prepared a program
>     which just executes this function with the parameters -vec_type cuda
>     and -mat_type seqaijcuda, I haven't seen any VecCUDACopyTo entry in
>     the summary log.
>
>     On 2010-12-11, Sat at 11:50 -0600, Barry Smith wrote:
>
> >     To answer this you need to understand that PETSc copies vectors and
> > matrices to GPU memory "on demand" (that is, exactly when they are first
> > needed on the GPU, and not before), and once it has copied something to
> > the GPU it keeps track of it and will NOT copy it down again if it is
> > already there.
> >
> >     Hence in your run below, yes, it includes the copy time down.
> >     But note that ONE multiply on the GPU is absurd; it does not make
> > sense to copy a matrix down to the GPU and then do ONE multiply with it.
> > Thus I NEVER do "standalone" benchmarking where a single kernel is called
> > by itself once; the time results are useless. Always run a FULL
> > application with -log_summary, for example in this case a full KSPSolve()
> > that requires a bunch of iterations. Then you can look at the performance
> > of each kernel. The reason to do it this way is that the numbers can be
> > very different, and what matters is runs in APPLICATIONS, so that is what
> > should be measured.
> >
> >     If, say, you run KSP with 20 iterations, then the time to copy the
> > matrix down to the GPU is amortized over those 20 iterations and thus may
> > be OK. You should see the flop rate for the MatMult() go up in this case.
> >
> >     You may have noticed we have a log entry for VecCopyToGPU(); we will
> > be adding one for matrices as well, so you will be able to see how long
> > the copy takes. Note that the copy time is still counted in the MatMult()
> > time if the first copy of the matrix to the GPU is triggered by the
> > MatMult(). You can subtract the copy time from the mult time to get the
> > per-multiply time; this corresponds to the multiply time in the limit of
> > a single copy down and many, many multiplies on the GPU.
> >
> >    Barry
> >
> >
> > On Dec 11, 2010, at 8:32 AM, Jakub Pola wrote:
> >
> > > Hello again,
> > >
> > > I compiled one of the examples. I used a sparse matrix called
> > > 02-raefsky3, with -vec_type cuda and -mat_type seqaijcuda.
> > >
> > > In the summary of the operations performed by the program there is
> > >
> > > MatMult  1 1.0 2.0237e-02 1.0 2.98e+06 1.0 0.0e+00 0.0e+00 0.0e+00  2 100  0  0  0   2 100  0  0  0   147
> > >
> > > Does the time for MatMult include the memory transfer for loading the
> > > matrix into GPU memory, or just the exact computation time?
> > >
> > > Thanks in advance.
> > > Kuba.
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which
> their experiments lead.
>    -- Norbert Wiener

-------------- next part --------------
A non-text attachment was scrubbed...
Name: tests.zip
Type: application/zip
Size: 4031 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20101213/3d1963ed/attachment.zip>
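[Editorial illustration, not part of the original thread.] The "copy on demand" behavior Barry describes can be mimicked with a toy sketch in plain Python; this is not PETSc's actual implementation, just the bookkeeping pattern: the first use on the GPU triggers the transfer, and a cached handle prevents repeated copies.

```python
# Toy sketch (not PETSc code) of "copy on demand": the matrix is moved to
# the GPU only when first needed there, and a validity check ensures it is
# never copied down again while it is already resident.

class LazyGPUMatrix:
    def __init__(self, host_data):
        self.host_data = host_data
        self.gpu_data = None        # nothing on the GPU yet
        self.copies_to_gpu = 0      # counts actual host-to-GPU transfers

    def _ensure_on_gpu(self):
        if self.gpu_data is None:               # first GPU use triggers it
            self.gpu_data = list(self.host_data)  # stand-in for cudaMemcpy
            self.copies_to_gpu += 1

    def mat_mult(self, x):
        self._ensure_on_gpu()       # so the copy lands inside the first MatMult
        return [v * x for v in self.gpu_data]   # toy "multiply"

A = LazyGPUMatrix([1.0, 2.0, 3.0])
for _ in range(20):
    A.mat_mult(2.0)
print(A.copies_to_gpu)  # 1 -- copied once, reused for all 20 multiplies
```

This is why a -log_summary from a run with a single MatMult charges the whole transfer to that one event.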
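[Editorial illustration, not part of the original thread.] Barry's amortization argument can also be put into toy numbers; the timings below are invented for illustration (real figures come from -log_summary), but the arithmetic is the point: with the copy triggered by the first MatMult, the average per-multiply time approaches the pure multiply time as the iteration count grows.

```python
# Amortizing a one-time host-to-GPU copy over n multiplies.
# Timings are made-up illustration values, not measurements.

copy_time = 18.0e-3   # one-time cost to copy the matrix to the GPU (s)
mult_time = 0.4e-3    # cost of one MatMult once the matrix is resident (s)

def avg_matmult_time(n_mults):
    """Average per-multiply time when the first MatMult triggers the copy."""
    return (copy_time + n_mults * mult_time) / n_mults

print(avg_matmult_time(1))     # copy dominates: 18.4 ms per "multiply"
print(avg_matmult_time(20))    # a 20-iteration KSP solve: 1.3 ms each
print(avg_matmult_time(1000))  # limit: approaches the 0.4 ms mult time
```

Equivalently, subtracting the copy time from the total MatMult time, as Barry suggests, recovers the per-multiply cost directly.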
