It is a great convenience to be able to switch between the debug and the optimized build with a single line of bash, just by changing PETSC_ARCH.
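(As an illustration only, a minimal sketch of such a switch, assuming the standard PETSc application makefile setup; the target name myapp is a placeholder, and the two arch names are the ones Barry suggests below.)

    # choose which PETSc build to compile and link against via PETSC_ARCH
    make PETSC_ARCH=arch-debug myapp   # debugging build, for correctness checks
    make PETSC_ARCH=arch-opt   myapp   # optimized build, for -log_summary timings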
I have done this on my computer and everything is fine. It seems PETSc performs additional checks that require extra communication in VecAXPY/VecAYPX as well as in MatMultTranspose when built with the debug option. In the optimized build it does not perform any communication in the kernels I mentioned.

On Mon, Jan 6, 2014 at 8:03 PM, Satish Balay <[email protected]> wrote:

> You should be able to grab PETSC_ARCH/conf/reconfigure_PETSC_ARCH.py
> from the current install [on the supercomputer] - modify the required
> options - and rerun this script to get an equivalent build.
>
> Satish
>
> On Sat, 4 Jan 2014, Barry Smith wrote:
>
> >    You need to build PETSc twice in total. Use two different PETSC_ARCH,
> > for example arch-debug and arch-opt, then just switch between the
> > PETSC_ARCH depending on what you are doing.
> >
> >    If you have any trouble installing PETSc just send the configure.log
> > and make.log to [email protected] and we’ll help you get it
> > installed.
> >
> >    Barry
> >
> > On Jan 4, 2014, at 6:18 PM, R. Oğuz Selvitopi <[email protected]> wrote:
> >
> > > And if I want to run my code without debugging mode on, I have to
> > > build PETSc from the beginning with the corresponding configure options,
> > > right?
> > >
> > > That is some headache for me, as I am using PETSc on a supercomputer
> > > and it is built with the default option, where debug mode is on.
> > >
> > > On Sun, Jan 5, 2014 at 1:52 AM, Barry Smith <[email protected]> wrote:
> > >
> > >    The debug version does extra reductions for error checking; you
> > > should never look at -log_summary with a debug build.
> > >
> > >    Normally MatMult() for MPIAIJ matrices has nearest-neighbor
> > > communication, so neither MatMult() nor MatMultTranspose() has global
> > > reductions, but if some scatters involve all entries then VecScatter does
> > > use global reductions in those cases. You could do a slimmed-down run on
> > > 2 processes with one process in the debugger and put a breakpoint in
> > > MPI_Allreduce() and MPI_Reduce() to see when it is being triggered inside
> > > MatMultTranspose(). Use -start_in_debugger -debugger_nodes 0
> > >
> > >    Barry
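(For reference, a slimmed-down run of that sort might be launched like the sketch below; the executable name ./solver and the mpiexec launcher are placeholders, while the options and breakpoints are the ones Barry names above.)

    # 2 MPI processes, rank 0 started under a debugger, logging summary enabled
    mpiexec -n 2 ./solver -log_summary -start_in_debugger -debugger_nodes 0

    # then, in the debugger attached to rank 0 (gdb syntax):
    #   break MPI_Allreduce
    #   break MPI_Reduce
    #   continue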
> > > On Jan 4, 2014, at 1:32 PM, R. Oğuz Selvitopi <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > I am trying to understand the output generated by PETSc with the
> > > > -log_summary option.
> > > >
> > > > Using PetscLogStageRegister/PetscLogStagePush/PetscLogStagePop I
> > > > want to find out if there exists unnecessary communication in my code.
> > > > My problem is with understanding the number of reductions performed.
> > > >
> > > > I have a solver whose stages are logged, and in the summary of stages
> > > > output I get
> > > >
> > > > Summary of Stages:  ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
> > > >                       Avg     %Total     Avg      %Total   counts   %Total     Avg       %Total    counts    %Total
> > > >  4:        Solver: 6.5625e-04   4.3%  4.2000e+02   59.7%  1.600e+01  23.2%  3.478e+00    14.5%   8.000e+00    5.3%
> > > >
> > > > where it seems I have 8 reduction operations performed. But in the
> > > > details of the stage events, I get:
> > > >
> > > > --- Event Stage 4: Solver
> > > >
> > > > Event               Count      Time (sec)     Flops                              --- Global ---  --- Stage ---   Total
> > > >                    Max Ratio  Max      Ratio  Max   Ratio  Mess    Avg len Reduct  %T %f %M %L %R  %T %f %M %L %R Mflop/s
> > > > ------------------------------------------------------------------------------------------------------------------------
> > > > MatMult                1 1.0 1.2207e-04 1.8  4.10e+01 4.1  8.0e+00 1.5e+01 0.0e+00  1 16 12  7  0  15 27  50  50  0     1
> > > > MatMultTranspose       1 1.0 1.2112e-04 1.0  4.60e+01 3.8  8.0e+00 1.5e+01 2.0e+00  1 18 12  7  1  18 30  50  50 25     1
> > > > VecDot                 3 1.0 2.6989e-04 1.2  2.90e+01 2.6  0.0e+00 0.0e+00 3.0e+00  2 12  0  0  2  36 20   0   0 38     0
> > > > VecSet                 2 1.0 8.1062e-06 1.5  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0   0   0  0     0
> > > > VecAXPY                2 1.0 3.7909e-05 1.3  2.00e+01 2.0  0.0e+00 0.0e+00 0.0e+00  0  9  0  0  0   5 15   0   0  0     2
> > > > VecAYPX                1 1.0 5.0068e-06 1.2  1.00e+01 1.7  0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   1  8   0   0  0     6
> > > > VecScatterBegin        2 1.0 7.2956e-05 2.4  0.00e+00 0.0  1.6e+01 1.5e+01 0.0e+00  0  0 23 14  0   6  0 100 100  0     0
> > > > VecScatterEnd          2 1.0 9.5129e-05 2.2  0.00e+00 0.0  0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0  10  0   0   0  0     0
> > > >
> > > > It seems there are only 5 reductions.
> > > >
> > > > But when I detail my log stages, it shows that the VecAXPY/VecAYPX
> > > > operations require reductions as well (I have two VecAXPY and a single
> > > > VecAYPX, so 5 + 3 = 8). (I have not included those logs here.)
> > > >
> > > > Normally these two operations should not require any reductions at
> > > > all, as opposed to VecDot.
> > > >
> > > > Do VecAXPY/VecAYPX require reductions? Is it because PETSc is
> > > > compiled with the debugging option, so that it performs additional checks
> > > > that perform reductions?
> > > >
> > > > Which is the correct number of reductions in the above statistics, 5 or 8?
> > > >
> > > > Moreover, why does MatMult require no reduction whereas
> > > > MatMultTranspose requires two of them?
> > > >
> > > > Thanks in advance.
> > >
> > > --
> > > ---------
> > > Oguz.
>

--
---------
Oguz.
