Hello, Thank you for your answers. I am working with Dave May on this topic.
Still running src/ksp/ksp/tutorials/ex34 with the same options reported by Dave, I added the option -log_view_gpu_time. The log now reports GPU flop/s instead of nans. However, I have trouble understanding the numbers reported in the log (file attached).

1. The numbers reported for Total Mflop/s and GPU Mflop/s differ even when 100% of the work is supposed to be done on the GPU.
2. The numbers reported for GPU Mflop/s are always higher than the numbers reported for Total Mflop/s.

As I understand it, Total Mflop/s should be the sum of the GPU and CPU flop/s. If the GPU does 100% of the work, why do the GPU Mflop/s and Total Mflop/s columns report different numbers, and why is GPU Mflop/s always higher than Total Mflop/s? Or am I missing something?

Thank you for your attention.

Anthony Jourdon

On Sat, Jan 20, 2024 at 02:25, Barry Smith <[email protected]> wrote:

> Nans indicate we do not have valid computational times for these
> operations; think of them as Not Available. Providing valid times for the
> "inner" operations listed with Nans requires inaccurate times (higher) for
> the outer operations, since extra synchronization between the CPU and GPU
> must be done to get valid times for the inner operations. We opted to have
> the best valid times for the outer operations since those times reflect the
> time of the application.
>
> On Jan 19, 2024, at 12:35 PM, Dave May <[email protected]> wrote:
> >
> > Hi all,
> >
> > I am trying to understand the logging information associated with the
> > %flops-performed-on-the-gpu reported by -log_view when running
> > src/ksp/ksp/tutorials/ex34
> > with the following options
> > -da_grid_x 192
> > -da_grid_y 192
> > -da_grid_z 192
> > -dm_mat_type seqaijhipsparse
> > -dm_vec_type seqhip
> > -ksp_max_it 10
> > -ksp_monitor
> > -ksp_type richardson
> > -ksp_view
> > -log_view
> > -mg_coarse_ksp_max_it 2
> > -mg_coarse_ksp_type richardson
> > -mg_coarse_pc_type none
> > -mg_levels_ksp_type richardson
> > -mg_levels_pc_type none
> > -options_left
> > -pc_mg_levels 3
> > -pc_mg_log
> > -pc_type mg
> >
> > This config is not intended to actually solve the problem, rather it is
> > a stripped down set of options designed to understand what parts of the
> > smoothers are being executed on the GPU.
> >
> > With respect to the log file attached, my first set of questions related
> > to the data reported under "Event Stage 2: MG Apply".
> >
> > [1] Why is the log littered with nan's?
> >
> > * I don't understand how and why "GPU Mflop/s" should be reported as nan
> > when a value is given for "GPU %F" (see MatMult for example).
> >
> > * For events executed on the GPU, I assume the column "Time (sec)"
> > relates to "CPU execute time", this would explain why we see a nan in
> > "Time (sec)" for MatMult.
> > If my assumption is correct, how should I interpret the column "Flop
> > (Max)" which is showing 1.92e+09?
> > I would assume if "Time (sec)" relates to the CPU then "Flop (Max)"
> > should also relate to CPU and GPU flops would be logged in "GPU Mflop/s"
> >
> > [2] More curious is that within "Event Stage 2: MG Apply" KSPSolve,
> > MGSmooth Level 0, MGSmooth Level 1, MGSmooth Level 2 all report "GPU %F"
> > as 93. I believe this value should be 100 as the smoother (and coarse
> > grid solver) are configured as richardson(2)+none and thus should run
> > entirely on the GPU.
> > Furthermore, when one inspects all events listed under "Event Stage 2:
> > MG Apply" those events which do flops correctly report "GPU %F" as 100.
> > And the events showing "GPU %F" = 0 such as
> > MatHIPSPARSCopyTo, VecCopy, VecSet, PCApply, DCtxSync
> > don't do any flops (on the CPU or GPU) - which is also correct (although
> > non GPU events should show nan??)
> >
> > Hence I am wondering what is the explanation for the missing 7% from
> > "GPU %F" for KSPSolve and MGSmooth {0,1,2}??
> >
> > Does anyone understand this -log_view, or can explain to me how to
> > interpret it?
> >
> > It could simply be that:
> > a) something is messed up with -pc_mg_log
> > b) something is messed up with the PETSc build
> > c) I am putting too much faith in -log_view and should profile the code
> > differently.
> >
> > Either way I'd really like to understand what is going on.
> >
> >
> > Cheers,
> > Dave
> >
> > <ex34_192_mg_seqhip_richardson_pcnone.o5748667>
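
To make my question about Total Mflop/s and GPU Mflop/s concrete, here is a small Python sketch of how I would expect the two columns to relate. The flop count is the "Flop (Max)" value Dave quoted for MatMult; both timings are made-up values, and the formulas are only my assumption about what the columns mean, not something taken from the PETSc source.

```python
# Hypothetical illustration: the two timings are invented, not read from the log.

def mflops(flop: float, seconds: float) -> float:
    """Rate in Mflop/s for `flop` floating-point operations done in `seconds`."""
    return flop / seconds / 1.0e6

gpu_flop = 1.92e9    # "Flop (Max)" quoted for MatMult in Dave's message
cpu_flop = 0.0       # "GPU %F" is 100, so no flops are attributed to the CPU
event_time = 2.0e-2  # hypothetical elapsed time of the whole event (s)
gpu_time = 1.5e-2    # hypothetical time spent only inside GPU kernels (s)

# What I expected: both rates divide by the same elapsed time, so they agree.
total_same_clock = mflops(cpu_flop + gpu_flop, event_time)
gpu_same_clock = mflops(gpu_flop, event_time)

# Assumption that would explain the log: the GPU rate divides by a shorter,
# GPU-only time, which makes it larger than the total rate.
gpu_own_clock = mflops(gpu_flop, gpu_time)

print(f"Total Mflop/s (event time):   {total_same_clock:.0f}")
print(f"GPU Mflop/s   (event time):   {gpu_same_clock:.0f}")
print(f"GPU Mflop/s   (GPU-only time): {gpu_own_clock:.0f}")
```

With a shared elapsed time the two rates are identical. They only separate, with the GPU rate on top, if the GPU column is divided by a smaller time than the total column. Is that what the log is doing?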
ex34_192_mg_seqhip_richardson_pcnone_gpulog.out
