PCSetUpOnBlocks() calls KSPSetUp() on each block which calls PCSetUp() on 
each block. This is why you are getting the confusing logging results. So the "
extra" KSPSetUp() is the one on the blocks.

> On Oct 5, 2021, at 12:02 PM, Karthikeyan Chockalingam - STFC UKRI 
> <[email protected]> wrote:
> 
> Hi Matt,
>  
> I have a couple of questions;
>  
> Weather I run on single core or on multiple cores I find that KSPSetUP 
> (ksp/tutorial/ex.45.c) is always called twice. But why? Is setup not done 
> once?
> I find PCSetUpOnBlocks is calling PCSetup and not the other way around. Can 
> you shed some light? The preconditioner used is block jacobi.
>  
> Thanks,
> Karthik.
>  
> From: Matthew Knepley <[email protected] <mailto:[email protected]>>
> Date: Sunday, 3 October 2021 at 12:54
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
> <[email protected] 
> <mailto:[email protected]>>
> Cc: "[email protected] <mailto:[email protected]>" 
> <[email protected] <mailto:[email protected]>>
> Subject: Re: [petsc-users] (percent time in this phase)
>  
> On Sun, Oct 3, 2021 at 5:43 AM Karthikeyan Chockalingam - STFC UKRI 
> <[email protected] 
> <mailto:[email protected]>> wrote:
> Hi Matt,
>  
> Thank you. The flamegraph tool is helpful.
> Please find the attached screen shoot and foo.txt which generated that graph 
> (using https://www.speedscope.app <https://www.speedscope.app/>).
> I find the following call sequence from the graph
> KSPSolve -> PCApply -> KSPSolve -> PCApply -> MatSolve
>  
> I have a couple of questions
> The KSPSolve time listed in the file using -log_summary (or -log_view), is it 
> the time taken by the first KSPSolve (in the above call sequence)?
> Yes. When calls are nested, we just do not record the time for the nested 
> call in log_view.
>  
>  
> What is the unit of measurement in flamegrah?
> I believe it is microseconds, but I am not sure.
>  
>   Thanks,
>  
>      Matt
>  
>  
> Thanks,
> Karthik.
>  
> From: Matthew Knepley <[email protected] <mailto:[email protected]>>
> Date: Friday, 1 October 2021 at 14:51
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
> <[email protected] 
> <mailto:[email protected]>>
> Cc: Barry Smith <[email protected] <mailto:[email protected]>>, 
> "[email protected] <mailto:[email protected]>" 
> <[email protected] <mailto:[email protected]>>
> Subject: Re: [petsc-users] (percent time in this phase)
>  
> On Thu, Sep 30, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI 
> <[email protected] 
> <mailto:[email protected]>> wrote:
> When comparing the MatSolve data for
>  
> GPU
>  
> MatSolve             341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 
> 0.0e+00 36 34  0  0  0  36 34  0  0  0  4536    4538      1 5.34e+02    0 
> 0.00e+00 100
> MatCUSPARSSolAnl       2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  9  0  0  0  0   9  0  0  0  0     0       0      0 0.00e+00    0 
> 0.00e+00  0
>  
> and CPU
>  
> MatSolve             352 1.0 1.3553e+02 1.0 1.02e+11 1.0 0.0e+00 0.0e+00 
> 0.0e+00 35 34  0  0  0  35 34  0  0  0  4489
>  
> the time spent is almost the same for this preconditioner.  Look like 
> MatCUSPARSSolAnl is called only twice (since I am running on two cores) 
>  
> mpirun -n 2 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511  
> -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type 
> bjacobi -ksp_monitor
>  
> So would it be fair to assume MatCUSPARSSolAnl  is not accounted for in 
> MatSolve and it is an exclusive event?
> KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 
> 100 %
>  
> I am getting so old. We have a different kind of log output if you are really 
> concerned about inclusion. You can run with
>  
>   -log_view :foo.txt:ascii_flamegraph
>  
> and then there are tools for plotting that output, described here
>  
>   https://firedrakeproject.org/optimising.html 
> <https://firedrakeproject.org/optimising.html>
>  
> This output _guarantees_ strict inclusion, so you will not have the problems 
> you have above adding things up.
>  
>   Thanks,
>  
>     Matt
>  
> Best,
> Karthik.
>  
>  
> From: Matthew Knepley <[email protected] <mailto:[email protected]>>
> Date: Wednesday, 29 September 2021 at 16:29
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
> <[email protected] 
> <mailto:[email protected]>>
> Cc: Barry Smith <[email protected] <mailto:[email protected]>>, 
> "[email protected] <mailto:[email protected]>" 
> <[email protected] <mailto:[email protected]>>
> Subject: Re: [petsc-users] %T (percent time in this phase)
>  
> On Wed, Sep 29, 2021 at 10:18 AM Karthikeyan Chockalingam - STFC UKRI 
> <[email protected] 
> <mailto:[email protected]>> wrote:
> Thank you!
>  
> Just to summarize
>  
> KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 
> 100 %
>  
> You didn’t happen to mention how MatCUSPARSSolAnl is accounted for? Am I 
> right in accounting for it as above?
>  
> I am not sure.I thought it might be the GPU part of MatSolve(). I will have 
> to look in the code. I am not as familiar with the GPU part.
>  
> MatCUSPARSSolAnl       2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  9  0  0  0  0   9  0  0  0  0     0       0      0 0.00e+00    0 
> 0.00e+00  0
>  
> Finally, I believe the vector events, VecNorn, VecTDot, VecAXPY, and VecAYPX 
> are mutually exclusive?
>  
> Yes.
>  
>   Thanks,
>  
>      Matt
>  
> Best,
>  
> Karthik.
>  
> From: Matthew Knepley <[email protected] <mailto:[email protected]>>
> Date: Wednesday, 29 September 2021 at 11:58
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
> <[email protected] 
> <mailto:[email protected]>>
> Cc: Barry Smith <[email protected] <mailto:[email protected]>>, 
> "[email protected] <mailto:[email protected]>" 
> <[email protected] <mailto:[email protected]>>
> Subject: Re: [petsc-users] %T (percent time in this phase)
>  
> On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI 
> <[email protected] 
> <mailto:[email protected]>> wrote:
> Thank you Mathew. Now, it is all making sense to me.
>  
> From data file ksp_ex45_N511_gpu_2.txt
>  
> KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%).
>  
> However, you said “So an iteration would mostly consist of MatMult + PCApply, 
> with some vector work”
>  
> 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than one 
> process and using Block-Jacobi . Half the time is spent in the solve (53%)
>  
> KSPSetUp               2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
> 2.4e+01  0  0  0  0  1   0  0  0  0  1     0       0      0 0.00e+00    0 
> 0.00e+00  0
> KSPSolve               1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 
> 4.4e+03 53100100100 95  53100100100 96 10881   11730   1022 6.40e+03 1021 
> 8.17e-03 100
>  
> 
> 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which is 
> all setup of the individual blocks, and this is all used by the numerical ILU 
> factorization.
>  
> PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 
> 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 1.3 
> 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 6.93e+03 0 
> 0.00e+00 0
> MatLUFactorNum         1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 
> 0.0e+00 14  0  0  0  0  14  0  0  0  0    63       0      2 6.93e+03    0 
> 0.00e+00  0
> MatILUFactorSym        1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 
> 0.0e+00  1  0  0  0  0   1  0  0  0  0     0       0      0 0.00e+00    0 
> 0.00e+00  0
>  
> 3) The preconditioner application takes 37% of the time, which is all solving 
> the factors and recorded in MatSolve(). Matrix multiplication takes 4%.
>  
> PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 0 0 
> 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100
> MatSolve             341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 
> 0.0e+00 36 34  0  0  0  36 34  0  0  0  4536    4538      1 5.34e+02    0 
> 0.00e+00 100
> MatMult              341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 
> 2.0e+00  4 34100100  0   4 34100100  0 54801   66441      2 5.86e+03    0 
> 0.00e+00 100
>  
> 4) The significant vector time is all in norms (11%) since they are really 
> slow on the GPU.
>  
> 
> VecNorm              342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 
> 6.8e+02 11  5  0  0 15  11  5  0  0 15  1466   196884      0 0.00e+00  342 
> 2.74e-03 100
> VecTDot              680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 
> 1.4e+03  1 10  0  0 29   1 10  0  0 29 106079   133922      0 0.00e+00  680 
> 5.44e-03 100
> VecAXPY              681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 
> 0.0e+00  1 11  0  0  0   1 11  0  0  0 56728   58367    682 5.34e+02    0 
> 0.00e+00 100
> VecAYPX              339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 
> 0.0e+00  1  5  0  0  0   1  5  0  0  0 34136   34153    339 2.71e-03    0 
> 0.00e+00 100
>  
> So the solve time is:
>  
>   53% ~ 37% + 4% + 11%
>  
> and the setup time is about 16%. I was wrong about the SetUp time being 
> included, as it is outside the event:
>  
>   
> https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852
>  
> <https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852>
>  
> It looks like the remainder of the time (23%) is spent preallocating the 
> matrix.
>  
>   Thanks,
>  
>      Matt
>  
> The MalMult event is 4 %. How does this event figure into the above equation; 
> if preconditioning (MatMult + PCApply) is included in KSPSolve?
>  
> Best,
> Karthik.
>  
> From: Matthew Knepley <[email protected] <mailto:[email protected]>>
> Date: Wednesday, 29 September 2021 at 10:58
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
> <[email protected] 
> <mailto:[email protected]>>
> Cc: Barry Smith <[email protected] <mailto:[email protected]>>, 
> "[email protected] <mailto:[email protected]>" 
> <[email protected] <mailto:[email protected]>>
> Subject: Re: [petsc-users] %T (percent time in this phase)
>  
> On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI 
> <[email protected] 
> <mailto:[email protected]>> wrote:
> That was helpful. I would like to provide some additional details of my run 
> on cpus and gpus. Please find the following attachments:
>  
> graph.pdf a plot showing overall time and various petsc events.
> ksp_ex45_N511_cpu_6.txt data file of the log_summary 
> ksp_ex45_N511_gpu_2.txt data file of the log_summary 
>  
> I used the following petsc options for cpu
>  
> mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 
> -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi 
> -ksp_monitor
>  
> and for gpus
>  
> mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511  
> -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type 
> bjacobi -ksp_monitor
>  
> to run the following problem
>  
> https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html 
> <https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html>
>  
> From the above code, I see is there no individual function called KSPSetUp(), 
> so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, 
> kSPSetComputeOperators all are timed together as KSPSetUp. For this example, 
> is KSPSetUp time and KSPSolve time mutually exclusive?
>  
> No, KSPSetUp() will be contained in KSPSolve() if it is called automatically.
>  
> In your response you said that
>  
>    “PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends 
> on how much of the preconditioner construction can take place early, so 
> depends exactly on the preconditioner used.”
>  
> I don’t see a explicit call to PCSetUp() or  PCApply() in ex45; so for this 
> particular preconditioner (bjacobi) how can I tell how they are timed?
>  
> They are all inside KSPSolve(). If you have a preconditioned linear solve, 
> the oreconditioning happens during the iteration. So an iteration would mostly
> consist of MatMult + PCApply, with some vector work.
>  
> I am hoping to time KSP solving and preconditioning mutually exclusively.
>  
> I am not sure that concept makes sense here. See above.
>  
>   Thanks,
>  
>      Matt
>  
>  
> Kind regards,
> Karthik.
>  
>  
> From: Barry Smith <[email protected] <mailto:[email protected]>>
> Date: Tuesday, 28 September 2021 at 19:19
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
> <[email protected] 
> <mailto:[email protected]>>
> Cc: "[email protected] <mailto:[email protected]>" 
> <[email protected] <mailto:[email protected]>>
> Subject: Re: [petsc-users] %T (percent time in this phase)
>  
>  
>  
> 
> On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI 
> <[email protected] 
> <mailto:[email protected]>> wrote:
>  
> Thanks for Barry for your response.
>  
> I was just benchmarking the problem with various preconditioner on cpu and 
> gpu. I understand, it is not possible to get mutually exclusive timing.
> However, can you tell if KSPSolve time includes both PCSetup and PCApply? And 
> if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and 
> PCApply.
>  
>    If you do not call KSPSetUp() separately from KSPSolve() then its time is 
> included with KSPSolve().
>  
>    PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on 
> how much of the preconditioner construction can take place early, so depends 
> exactly on the preconditioner used.
>  
>    So yes the answer is not totally satisfying. The one thing I would 
> recommend is to not call KSPSetUp() directly and then KSPSolve() will always 
> include the total time of the solve plus all setup time. PCApply will contain 
> all the time to apply the preconditioner but may also include some setup time.
>  
>   Barry
>  
>  
> Best,
> Karthik.
>  
>  
>  
>  
> From: Barry Smith <[email protected] <mailto:[email protected]>>
> Date: Tuesday, 28 September 2021 at 16:56
> To: "Chockalingam, Karthikeyan (STFC,DL,HC)" 
> <[email protected] 
> <mailto:[email protected]>>
> Cc: "[email protected] <mailto:[email protected]>" 
> <[email protected] <mailto:[email protected]>>
> Subject: Re: [petsc-users] %T (percent time in this phase)
>  
>  
>  
> 
> On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI 
> <[email protected] 
> <mailto:[email protected]>> wrote:
>  
> Hello,
>  
> I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson 
> problem. I noticed from the output from using the flag -log_summary that for 
> various events their respective %T (percent time in this phase) do not add up 
> to 100 but rather exceeds 100. So, I gather there is some overlap among these 
> events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and 
> PCSolve. Is it possible to get a mutually exclusive %T or Time for these 
> individual events? I have attached  the log_summary output file from my run 
> for your reference.
>  
>  
>   For nested solvers it is tricky to get the times to be mutually exclusive 
> because some parts of the building of the preconditioner is for some 
> preconditioners delayed until the solve has started. 
>  
>   It looks like you are using the default preconditioner options which for 
> this example are taking more or less no time since so many iterations are 
> needed. It is best to use -pc_type mg to use geometric multigrid on this 
> problem.
>  
>   Barry
>  
>  
>  
> 
> Thanks!
> Karthik.
>  
> This email and any attachments are intended solely for the use of the named 
> recipients. If you are not the intended recipient you must not use, disclose, 
> copy or distribute this email or any of its attachments and should notify the 
> sender immediately and delete this email from your system. UK Research and 
> Innovation (UKRI) has taken every reasonable precaution to minimise risk of 
> this email or any attachments containing viruses or malware but the recipient 
> should carry out its own virus and malware checks before opening the 
> attachments. UKRI does not accept any liability for any losses or damages 
> which the recipient may sustain due to presence of any viruses. 
> <ksp_ex45_N511_cpu_6.txt>
>  
> 
> 
>  
> --
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
> 
>  
> --
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
> 
>  
> --
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
> 
>  
> --
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>
> 
>  
> -- 
> What most experimenters take for granted before they begin their experiments 
> is infinitely more interesting than any results to which their experiments 
> lead.
> -- Norbert Wiener
>  
> https://www.cse.buffalo.edu/~knepley/ <http://www.cse.buffalo.edu/~knepley/>

Reply via email to