> On May 7, 2018, at 10:34 AM, NAN ZHAO <[email protected]> wrote:
>
> Dear all,
>
> I am trying to integrate PETSc into a legacy FEM code, but I have had some
> trouble getting the performance of MatSetValues to match the old subroutines
> in the sequential implementation:
>
> 1. I use MatSetValues per element, adding the matrix values to each row and
> column the element touches, and find it really slow compared with my old
> subroutine, which adds the values directly into a 1-D array storing the matrix
> in CRS format. For a case with 12K unknowns, my old subroutine takes several
> seconds, but MatSetValues takes around 50 seconds to finish the matrix
> calculation part... Did I do something wrong? I do have preallocation giving
> the nonzeros for each row of this MATSEQAIJ matrix...
50 seconds means something is incorrect with the preallocation:
http://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly
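In case it helps, here is a minimal sketch in C of how the preallocation is
normally set up before the element loop; the AssembleExample() wrapper, n, and
the per-row estimate nz are just placeholders, and with correct preallocation
the MatSetValues() calls should be a negligible fraction of the total time:

#include <petscmat.h>

/* Assemble an n x n sequential AIJ matrix with proper preallocation.
   n and nz are placeholders; an exact per-row count in an nnz[] array
   passed instead of NULL is even better than a uniform estimate. */
PetscErrorCode AssembleExample(PetscInt n, PetscInt nz, Mat *A)
{
  PetscErrorCode ierr;

  ierr = MatCreate(PETSC_COMM_SELF, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, n, n, n, n);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATSEQAIJ);CHKERRQ(ierr);
  ierr = MatSeqAIJSetPreallocation(*A, nz, NULL);CHKERRQ(ierr);
  /* Make any new malloc during assembly an error, so an underestimated
     preallocation shows up immediately instead of silently slowing assembly */
  ierr = MatSetOption(*A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);CHKERRQ(ierr);

  /* element loop goes here, one call per element:
     ierr = MatSetValues(*A, ne, rows, ne, cols, ke, ADD_VALUES);CHKERRQ(ierr);
     where rows/cols are the element's global dof indices and ke is the
     ne*ne element matrix stored by rows */

  ierr = MatAssemblyBegin(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(*A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  return 0;
}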
>
> 2. I switched to MatCreateSeqAIJWithArrays, and the performance seems to be
> OK. I do not understand the difference; does MatCreateSeqAIJWithArrays call
> MatSetValues internally?
No. The i, j, and a arrays are not copied by this routine; PETSc uses exactly
the arrays you provide, which is why it is fast.
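For what it's worth, a minimal sketch of typical usage, assuming the legacy
code already holds 0-based CRS arrays ia (the n+1 row starts), ja (column
indices), and va (values); the WrapCRS() name and argument names are just
placeholders:

#include <petscmat.h>

/* Wrap existing 0-based CRS arrays (ia: n+1 row starts, ja: column indices,
   va: values) in a PETSc matrix without copying them. The arrays must stay
   allocated and structurally unchanged for the lifetime of the Mat. */
PetscErrorCode WrapCRS(PetscInt n, PetscInt *ia, PetscInt *ja, PetscScalar *va, Mat *A)
{
  PetscErrorCode ierr;
  ierr = MatCreateSeqAIJWithArrays(PETSC_COMM_SELF, n, n, ia, ja, va, A);CHKERRQ(ierr);
  return 0;
}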
> Or is it just the difference between INSERT_VALUES and ADD_VALUES?
Insert vs add shouldn't make a performance difference.
>
>
> 3. I want to know the converged rtol, atol of a KSPSolve; how do I do it?
If you want to know the convergence criteria used by the convergence test,
you can call KSPGetTolerances().
You can run with -ksp_monitor to see the norm of the (preconditioned)
residual at each iteration or -ksp_monitor_true_residual to see the norm of the
non-preconditioned residual.
You can use KSPSetResidualHistory() to have the residual norm saved at each
iteration; then, after KSPSolve(), you can access the array (via
KSPGetResidualHistory()) to see the final residual norm.
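A small sketch of how those calls fit together after KSPSolve(); the
ReportConvergence() wrapper and the printing are just illustration, and the
residual-history loop only prints something if KSPSetResidualHistory() was
called before the solve:

#include <petscksp.h>

/* Call after KSPSolve(): print the tolerances the convergence test used,
   the iteration count, and the final (by default preconditioned) residual
   norm. If KSPSetResidualHistory(ksp, NULL, PETSC_DECIDE, PETSC_TRUE) was
   called before the solve, also print the full residual history. */
PetscErrorCode ReportConvergence(KSP ksp)
{
  PetscErrorCode ierr;
  PetscReal      rtol, abstol, dtol, rnorm, *hist;
  PetscInt       maxits, its, nhist, i;

  ierr = KSPGetTolerances(ksp, &rtol, &abstol, &dtol, &maxits);CHKERRQ(ierr);
  ierr = KSPGetIterationNumber(ksp, &its);CHKERRQ(ierr);
  ierr = KSPGetResidualNorm(ksp, &rnorm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF,
           "rtol %g  atol %g  its %D  final residual norm %g\n",
           (double)rtol, (double)abstol, its, (double)rnorm);CHKERRQ(ierr);

  ierr = KSPGetResidualHistory(ksp, &hist, &nhist);CHKERRQ(ierr);
  for (i = 0; i < nhist; i++) {
    ierr = PetscPrintf(PETSC_COMM_SELF, "  it %D  norm %g\n", i, (double)hist[i]);CHKERRQ(ierr);
  }
  return 0;
}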
>
> 4. I want to do a parallel implementation of this too, but I am worried about
> the performance of MatSetValues; should I use MatCreateMPIAIJWithArrays?
First you need to get the preallocation working for the sequential case, then
change to parallel.
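When you do move to parallel, the eventual preallocation call looks roughly
like the sketch below (MATMPIAIJ with MatMPIAIJSetPreallocation(); nlocal,
d_nz, and o_nz are placeholders for the locally owned row count and the
diagonal/off-diagonal nonzero estimates):

#include <petscmat.h>

/* Parallel analogue of the sequential preallocation: each process gives
   estimates (or exact per-row counts instead of the NULLs) for the diagonal
   block (columns owned by this process) and the off-diagonal block. */
PetscErrorCode PreallocateParallel(MPI_Comm comm, PetscInt nlocal,
                                   PetscInt d_nz, PetscInt o_nz, Mat *A)
{
  PetscErrorCode ierr;

  ierr = MatCreate(comm, A);CHKERRQ(ierr);
  ierr = MatSetSizes(*A, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE);CHKERRQ(ierr);
  ierr = MatSetType(*A, MATMPIAIJ);CHKERRQ(ierr);
  ierr = MatMPIAIJSetPreallocation(*A, d_nz, NULL, o_nz, NULL);CHKERRQ(ierr);
  /* MatSetValues() with global indices then works exactly as in the
     sequential case; off-process entries are communicated during
     MatAssemblyBegin/End */
  return 0;
}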
>
> Thanks,
>
> Nan