On Aug 31, 2010, at 3:36 PM, Matthew Knepley wrote:

> On Tue, Aug 31, 2010 at 7:17 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> On Aug 31, 2010, at 3:14 PM, Keita Teranishi wrote:
>
>> Does this PETSc use timers from CUDA?
>
> No, didn't know there were timers in CUDA.
>
> Yes, I use them when I really want to know how well I am utilizing the board,
> vs. how much improvement overall I can expect in the code. When compared with
> PETSc timers, they can give us an idea of the transfer overhead, which I do in
> my GPU FEM code.

We have essentially no transfer in this example. It takes zero percent of the time.

Barry

>
> Matt
>
> We actually want to use the real world timers because each method is
> actually a call on the CPU so real world time is what matters.
>
> Barry
>
>>
>> ================================
>> Keita Teranishi
>> Scientific Library Group
>> Cray, Inc.
>> keita at cray.com
>> ================================
>>
>> From: petsc-dev-bounces at mcs.anl.gov [mailto:petsc-dev-bounces at mcs.anl.gov] On Behalf Of Barry Smith
>> Sent: Tuesday, August 31, 2010 2:03 PM
>> To: For users of the development version of PETSc
>> Subject: Re: [petsc-dev] [GPU] Performance of ex19
>>
>>
>> Your MatMult is now slower. Are your results reproducible? If you run 5
>> times, how similar are they?
>>
>> Barry
>>
>> On Aug 31, 2010, at 2:57 PM, Keita Teranishi wrote:
>>
>>
>> VecDot              2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> VecMDot          2024 1.0 1.1560e+00 1.0 2.54e+09 1.0 0.0e+00 0.0e+00 0.0e+00  18  29   0   0   0  32  29   0   0   0  2201
>> VecNorm          2096 1.0 3.5999e-01 1.0 1.68e+08 1.0 0.0e+00 0.0e+00 0.0e+00   6   2   0   0   0  10   2   0   0   0   466
>> VecScale         2092 1.0 2.1599e-01 1.0 8.37e+07 1.0 0.0e+00 0.0e+00 0.0e+00   3   1   0   0   0   6   1   0   0   0   387
>> VecCopy          2072 1.0 5.5997e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1   0   0   0   0   2   0   0   0   0     0
>> VecSet             70 1.0 8.0004e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> VecAXPY           108 1.0 2.7999e-02 1.0 8.64e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   1   0   0   0   0   309
>> VecWAXPY           68 1.0 7.9999e-03 1.0 2.72e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0   340
>> VecMAXPY         2092 1.0 5.8399e-01 1.0 2.71e+09 1.0 0.0e+00 0.0e+00 0.0e+00   9  31   0   0   0  16  31   0   0   0  4634
>> VecScatterBegin     5 1.0 4.0002e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> VecReduceArith      2 1.0 3.9999e-03 1.0 1.60e+05 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0    40
>> VecReduceComm       1 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> VecCUDACopyTo      10 1.0 3.9999e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> VecCUDACopyFrom     5 1.0 4.0002e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> SNESSolve           1 1.0 3.6119e+00 1.0 8.87e+09 1.0 0.0e+00 0.0e+00 0.0e+00  56 100   0   0   0 100 100   0   0   0  2456
>> SNESLineSearch      2 1.0 4.0002e-03 1.0 5.49e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0  1374
>> SNESFunctionEval    3 1.0 4.0002e-03 1.0 2.52e+06 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0   630
>> SNESJacobianEval    2 1.0 3.1199e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00   5   0   0   0   0   9   0   0   0   0   123
>> KSPGMRESOrthog   2024 1.0 1.7120e+00 1.0 5.09e+09 1.0 0.0e+00 0.0e+00 0.0e+00  26  57   0   0   0  47  57   0   0   0  2972
>> KSPSetup            2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> KSPSolve            2 1.0 3.2919e+00 1.0 8.83e+09 1.0 0.0e+00 0.0e+00 0.0e+00  51  99   0   0   0  91  99   0   0   0  2681
>> PCSetUp             2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> PCApply          2024 1.0 4.7998e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   1   0   0   0   0   1   0   0   0   0     0
>> MatMult          2092 1.0 8.9998e-01 1.0 3.32e+09 1.0 0.0e+00 0.0e+00 0.0e+00  14  37   0   0   0  25  37   0   0   0  3689
>> MatAssemblyBegin    2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> MatAssemblyEnd      2 1.0 1.2000e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> MatZeroEntries      2 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0     0
>> MatFDColorApply     2 1.0 3.1199e-01 1.0 3.85e+07 1.0 0.0e+00 0.0e+00 0.0e+00   5   0   0   0   0   9   0   0   0   0   123
>> MatFDColorFunc     42 1.0 7.9999e-03 1.0 3.53e+07 1.0 0.0e+00 0.0e+00 0.0e+00   0   0   0   0   0   0   0   0   0   0  4410
>>
>>
>
>
> --
> What most experimenters take for granted before they begin their experiments
> is infinitely more interesting than any results to which their experiments
> lead.
> -- Norbert Wiener
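
The two claims in this thread (that transfers take essentially zero percent of the time, and that MatMult has slowed down) can be checked against the quoted log with a little arithmetic. The sketch below is only a rough check, not part of PETSc. It assumes the standard log summary column layout, in which the fourth column is the maximum time in seconds, the sixth is the flop count, and the last is the Mflop/s rate computed as flops / time / 1e6 on a single process. The dictionary and helper names are made up for illustration; the numbers are copied from the table above.

```python
# Values copied from the log summary quoted in this thread.
# Assumed layout: event name -> (time in seconds, flop count).
events = {
    "VecMDot": (1.1560e+00, 2.54e+09),
    "VecMAXPY": (5.8399e-01, 2.71e+09),
    "MatMult": (8.9998e-01, 3.32e+09),
    "KSPSolve": (3.2919e+00, 8.83e+09),
    "SNESSolve": (3.6119e+00, 8.87e+09),
    "VecCUDACopyTo": (3.9999e-03, 0.0),
    "VecCUDACopyFrom": (4.0002e-03, 0.0),
}


def mflops(name):
    """Recompute the Mflop/s rate of one event from its time and flop count."""
    time_s, flops = events[name]
    return flops / time_s / 1e6


def transfer_fraction():
    """Fraction of SNESSolve wall-clock time spent in CPU<->GPU vector copies."""
    copy_time = events["VecCUDACopyTo"][0] + events["VecCUDACopyFrom"][0]
    return copy_time / events["SNESSolve"][0]


for name in ("VecMDot", "VecMAXPY", "MatMult", "KSPSolve"):
    print(f"{name:10s} {mflops(name):8.0f} Mflop/s")
print(f"CPU<->GPU copies: {100 * transfer_fraction():.2f}% of SNESSolve time")
```

The recomputed rates agree with the logged ones to within a fraction of a percent; the small differences come from the flop counts being printed to three significant digits. The two copy events together come to roughly 0.2% of the SNESSolve time, which is consistent with the "zero percent" remark, although with each copy event near 4 ms the figure is only a rough estimate.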
