> On May 7, 2018, at 10:34 AM, NAN ZHAO <[email protected]> wrote:
> 
> Dear all,
> 
> I am trying to integrate PETSc into a legacy FEM code, but I am having 
> trouble getting the performance of MatSetValues to match the old subroutines 
> in the sequential implementation:
> 
> 1. I call MatSetValues once per element to add the element's values to each 
> row and column it touches, and find it really slow compared with my old 
> subroutine, which adds the values directly to a 1-D array stored in CRS format. 
> For a case with 12K unknowns, my old subroutine takes several seconds, but 
> MatSetValues takes around 50 seconds to finish the matrix calculation 
> part. Did I do something wrong? I do have preallocation giving the non-zeros 
> for each row in this MATSEQ matrix.

    Taking 50 seconds for 12K unknowns means something is incorrect with the 
preallocation; see 
http://www.mcs.anl.gov/petsc/documentation/faq.html#efficient-assembly
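
    To make the preallocation point concrete, here is a minimal sketch in plain C 
(no PETSc calls) of one way to compute the exact number of nonzeros per row from 
the element connectivity. It assumes one unknown per node and 0-based node 
numbers; count_row_nonzeros and its arguments are hypothetical names, not 
anything from PETSc or from the code discussed here.

```c
#include <assert.h>
#include <stdlib.h>

/* Count the distinct column indices in each row of the assembled matrix,
 * assuming one unknown per node. elems holds nelem*npe 0-based node numbers
 * (npe = nodes per element). On success nnz[r] is the number of nonzeros in
 * row r and 0 is returned; -1 is returned if memory allocation fails. */
int count_row_nonzeros(int nnodes, int nelem, int npe,
                       const int *elems, int *nnz)
{
    assert(nnodes > 0 && nelem >= 0 && npe > 0 && elems != NULL && nnz != NULL);

    /* start[r]..start[r+1] delimits scratch space for row r, sized by the
     * upper bound of npe columns for every element that touches node r. */
    int *start = calloc((size_t)nnodes + 1, sizeof *start);
    if (start == NULL) return -1;
    for (int k = 0; k < nelem * npe; ++k) start[elems[k] + 1] += npe;
    for (int r = 0; r < nnodes; ++r) start[r + 1] += start[r];

    int *cols = malloc(((size_t)start[nnodes] + 1) * sizeof *cols);
    if (cols == NULL) { free(start); return -1; }

    for (int r = 0; r < nnodes; ++r) nnz[r] = 0;
    for (int e = 0; e < nelem; ++e) {
        const int *en = elems + (size_t)e * npe;
        for (int a = 0; a < npe; ++a) {
            int row = en[a];
            int *rc = cols + start[row];
            for (int b = 0; b < npe; ++b) {
                /* Linear search: append column en[b] only if it is new. */
                int k = 0;
                while (k < nnz[row] && rc[k] != en[b]) ++k;
                if (k == nnz[row]) rc[nnz[row]++] = en[b];
            }
        }
    }
    free(cols);
    free(start);
    return 0;
}
```

    An array like nnz (converted to PetscInt) is the kind of per-row count that 
MatSeqAIJSetPreallocation() accepts. Running with -info and checking the 
reported number of mallocs during MatSetValues() is one way to see whether the 
counts were sufficient.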

> 
> 2. I switched to MatCreateSeqAIJWithArrays, and the performance seems to be 
> OK.  I do not understand the difference, does MatCreateSeqAIJWithArrays call 
> MatSetValues internally?

   The i, j, and a arrays are not copied by this routine. PETSc uses exactly 
the arrays you provided, without copying, which is why it is fast.
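
   For reference, the i, j, and a arrays are the usual compressed-row arrays: 
row offsets, 0-based column indices, and values. The sketch below is plain C, 
not PETSc code, and csr_matvec is a hypothetical name. It shows how a 
matrix-vector product can read such arrays in place, which illustrates why no 
copy or per-entry insertion is needed once the data is already in this layout.

```c
#include <assert.h>
#include <stddef.h>

/* y = A*x for an n-row matrix in compressed-row form:
 * entries of row r are at positions i[r] .. i[r+1]-1, j holds their 0-based
 * column indices, and a holds the matching values. Nothing is copied. */
void csr_matvec(int n, const int *i, const int *j, const double *a,
                const double *x, double *y)
{
    assert(n >= 0 && i != NULL && j != NULL && a != NULL);
    assert(x != NULL && y != NULL);
    for (int r = 0; r < n; ++r) {
        double sum = 0.0;
        for (int k = i[r]; k < i[r + 1]; ++k) sum += a[k] * x[j[k]];
        y[r] = sum;
    }
}
```

   For the 3x3 tridiagonal matrix with 2 on the diagonal and -1 off it, the 
arrays would be i = {0, 2, 5, 7}, j = {0, 1, 0, 1, 2, 1, 2}, and 
a = {2, -1, -1, 2, -1, -1, 2}. Because the arrays are used in place, they have 
to stay valid for as long as the matrix that wraps them is in use.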

> or it is just the difference with INSERT_VALUES vs ADD_VALUES?

    Insert vs add shouldn't make a performance difference. 
> 
> 
> 3. I want to know the converged rtol, atol of a KSPSolve; how do I do it?

    If you want to know the convergence criteria used by the test, you can call 
KSPGetTolerances().

    You can run with -ksp_monitor to see the norm of the (preconditioned) 
residual at each iteration or -ksp_monitor_true_residual to see the norm of the 
non-preconditioned residual.

    You can use KSPSetResidualHistory() to have the residual norm saved at each 
iteration; then, after KSPSolve(), you can access the array to see the final 
residual norm.
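
    As a rough illustration of how the tolerances and a saved residual history 
relate, the plain C sketch below checks a history array against a test of the 
form rnorm <= max(rtol * rnorm0, abstol). It is not PETSc code: 
first_converged is a hypothetical helper, and taking hist[0] as the reference 
norm rnorm0 is an assumption (the solver's own test may use a different 
reference norm, such as the norm of the right-hand side).

```c
#include <assert.h>
#include <stddef.h>

/* Return the index of the first residual norm in hist[0..n-1] that satisfies
 * hist[k] <= max(rtol * hist[0], abstol), or -1 if none does.
 * hist[0] is assumed to be the reference (initial) residual norm. */
int first_converged(const double *hist, int n, double rtol, double abstol)
{
    assert(hist != NULL && n > 0 && rtol >= 0.0 && abstol >= 0.0);
    double tol = rtol * hist[0];
    if (abstol > tol) tol = abstol;   /* tol = max(rtol * hist[0], abstol) */
    for (int k = 0; k < n; ++k)
        if (hist[k] <= tol) return k;
    return -1;
}
```

    The last entry of the history divided by the first then gives the relative 
reduction actually achieved, which can be compared with the rtol in use.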


> 
> 4. I want to do a parallel implementation of this too, but I am worried about 
> the performance of MatSetValues. Should I use MatCreateMPIAIJWithArrays?

    First you need to get the preallocation working for the sequential case, 
then change to parallel. 
> 
> Thanks,
> 
> Nan
