This usually happens if you use the wrong mpiexec, i.e. one that belongs to a
different MPI installation than the one PETSc was built with.

Use the mpiexec from the MPI you built PETSc with.
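
As a rough way to confirm this, a small shell helper can summarize the
-ksp_view output. This is only a sketch: the function name ksp_view_summary is
made up for illustration, and it assumes every run prints a header of the form
"KSP Object: N MPI processes" at the start of a line, as in the output quoted
below. If PETSc was configured with --download-mpich, the matching launcher is
normally installed as $PETSC_DIR/$PETSC_ARCH/bin/mpiexec.

```shell
# Summarize -ksp_view output read from stdin (hypothetical helper).
# Prints two numbers:
#   1) how many "KSP Object:" headers were seen
#   2) the process count reported in the first header
ksp_view_summary() {
    awk '/^KSP Object:/ { n++; if (n == 1) p = $3 }
         END { print n + 0, p + 0 }'
}
```

For example, piping "mpiexec -n 2 ./ex2 -pc_type lu -ksp_view" into
ksp_view_summary and getting "2 1" means two independent single-process runs
(the symptom of a mismatched mpiexec), while "1 2" is what a real two-process
run should report.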

Satish

On Fri, 7 Aug 2015, [email protected] wrote:

> Hong,
> 
> Running example 2 with the command line given below gives me two uniprocessor 
> runs!?
> 
> $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist 
> -ksp_view
> KSP Object: 1 MPI processes
>   type: gmres
>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
> Orthogonalization with no iterative refinement
>     GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=0.000138889, absolute=1e-50, divergence=10000
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: lu
>     LU: out-of-place factorization
>     tolerance for zero pivot 2.22045e-14
>     matrix ordering: nd
>     factor fill ratio given 0, needed 0
>       Factored matrix follows:
>         Mat Object:         1 MPI processes
>           type: seqaij
>           rows=56, cols=56
>           package used to perform factorization: superlu_dist
>           total: nonzeros=0, allocated nonzeros=0
>           total number of mallocs used during MatSetValues calls =0
>             SuperLU_DIST run parameters:
>               Process grid nprow 1 x npcol 1
>               Equilibrate matrix TRUE
>               Matrix input mode 0
>               Replace tiny pivots TRUE
>               Use iterative refinement FALSE
>               Processors in row 1 col partition 1
>               Row permutation LargeDiag
>               Column permutation METIS_AT_PLUS_A
>               Parallel symbolic factorization FALSE
>               Repeated factorization SamePattern_SameRowPerm
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
>     type: seqaij
>     rows=56, cols=56
>     total: nonzeros=250, allocated nonzeros=280
>     total number of mallocs used during MatSetValues calls =0
>       not using I-node routines
> Norm of error 5.21214e-15 iterations 1
> KSP Object: 1 MPI processes
>   type: gmres
>     GMRES: restart=30, using Classical (unmodified) Gram-Schmidt 
> Orthogonalization with no iterative refinement
>     GMRES: happy breakdown tolerance 1e-30
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=0.000138889, absolute=1e-50, divergence=10000
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: 1 MPI processes
>   type: lu
>     LU: out-of-place factorization
>     tolerance for zero pivot 2.22045e-14
>     matrix ordering: nd
>     factor fill ratio given 0, needed 0
>       Factored matrix follows:
>         Mat Object:         1 MPI processes
>           type: seqaij
>           rows=56, cols=56
>           package used to perform factorization: superlu_dist
>           total: nonzeros=0, allocated nonzeros=0
>           total number of mallocs used during MatSetValues calls =0
>             SuperLU_DIST run parameters:
>               Process grid nprow 1 x npcol 1
>               Equilibrate matrix TRUE
>               Matrix input mode 0
>               Replace tiny pivots TRUE
>               Use iterative refinement FALSE
>               Processors in row 1 col partition 1
>               Row permutation LargeDiag
>               Column permutation METIS_AT_PLUS_A
>               Parallel symbolic factorization FALSE
>               Repeated factorization SamePattern_SameRowPerm
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
>     type: seqaij
>     rows=56, cols=56
>     total: nonzeros=250, allocated nonzeros=280
>     total number of mallocs used during MatSetValues calls =0
>       not using I-node routines
> Norm of error 5.21214e-15 iterations 1
> 
> Mahir
> 
> From: Hong [mailto:[email protected]]
> Sent: den 6 augusti 2015 16:36
> To: Ülker-Kaustell, Mahir
> Cc: Hong; Xiaoye S. Li; PETSc users list
> Subject: Re: [petsc-users] SuperLU MPI-problem
> 
> Mahir:
> 
> I have been using PETSC_COMM_WORLD.
> 
> What do you get by running a petsc example, e.g.,
> petsc/src/ksp/ksp/examples/tutorials
> mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist 
> -ksp_view
> 
> KSP Object: 2 MPI processes
>   type: gmres
> ...
> 
> Hong
> 
> From: Hong [mailto:[email protected]]
> Sent: den 5 augusti 2015 17:11
> To: Ülker-Kaustell, Mahir
> Cc: Hong; Xiaoye S. Li; PETSc users list
> Subject: Re: [petsc-users] SuperLU MPI-problem
> 
> Mahir:
> As you noticed, you ran the code in serial mode, not parallel.
> Check the input communicator in your code, e.g., what input communicator do 
> you use in
> KSPCreate(comm,&ksp)?
> 
> I have added an error flag to the superlu_dist interface (released version). 
> When the user passes '-mat_superlu_dist_parsymbfact'
> in serial mode, this option is ignored with a warning.
> 
> Hong
> 
> Hong,
> 
> If I set parsymbfact:
> 
> $ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu 
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput 
> DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view
> Invalid ISPEC at line 484 in file get_perm_c.c
> Invalid ISPEC at line 484 in file get_perm_c.c
> -------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> --------------------------------------------------------------------------
> mpiexec detected that one or more processes exited with non-zero status, thus 
> causing
> the job to be terminated. The first process to do so was:
> 
>   Process name: [[63679,1],0]
>   Exit code:    255
> --------------------------------------------------------------------------
> 
> Since the program does not finish the call to KSPSolve(), we do not get any 
> information about the KSP from -ksp_view.
> 
> If I do not set it, I get a serial run even if I specify -n 2:
> 
> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu 
> -pc_factor_mat_solver_package superlu_dist -ksp_view
> …
> KSP Object: 1 MPI processes
>   type: preonly
>   maximum iterations=10000, initial guess is zero
>   tolerances:  relative=1e-05, absolute=1e-50, divergence=10000
>   left preconditioning
>   using NONE norm type for convergence test
> PC Object: 1 MPI processes
>   type: lu
>     LU: out-of-place factorization
>     tolerance for zero pivot 2.22045e-14
>     matrix ordering: nd
>     factor fill ratio given 0, needed 0
>       Factored matrix follows:
>         Mat Object:         1 MPI processes
>           type: seqaij
>           rows=954, cols=954
>           package used to perform factorization: superlu_dist
>           total: nonzeros=0, allocated nonzeros=0
>           total number of mallocs used during MatSetValues calls =0
>             SuperLU_DIST run parameters:
>               Process grid nprow 1 x npcol 1
>               Equilibrate matrix TRUE
>               Matrix input mode 0
>               Replace tiny pivots TRUE
>               Use iterative refinement FALSE
>               Processors in row 1 col partition 1
>               Row permutation LargeDiag
>               Column permutation METIS_AT_PLUS_A
>               Parallel symbolic factorization FALSE
>               Repeated factorization SamePattern_SameRowPerm
>   linear system matrix = precond matrix:
>   Mat Object:   1 MPI processes
>     type: seqaij
>     rows=954, cols=954
>     total: nonzeros=34223, allocated nonzeros=34223
>     total number of mallocs used during MatSetValues calls =0
>       using I-node routines: found 668 nodes, limit used is 5
> 
> I am running PETSc via Cygwin on a windows machine.
> When I installed PETSc the tests with different numbers of processes ran well.
> 
> Mahir
> 
> 
> From: Hong [mailto:[email protected]]
> Sent: den 3 augusti 2015 19:06
> To: Ülker-Kaustell, Mahir
> Cc: Hong; Xiaoye S. Li; PETSc users list
> Subject: Re: [petsc-users] SuperLU MPI-problem
> 
> Mahir,
> 
> 
> I have not used …parsymbfact in sequential runs or set matinput=GLOBAL for 
> parallel runs.
> 
> If I use 2 processors, the program runs if I use 
> -mat_superlu_dist_parsymbfact=1:
> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu 
> -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL 
> -mat_superlu_dist_parsymbfact=1
> 
> The incorrect option '-mat_superlu_dist_parsymbfact=1' is not recognized and 
> is therefore ignored, so your code runs without parsymbfact.
> 
> Please run it with '-ksp_view' and see what
> 'SuperLU_DIST run parameters:' are being used, e.g.
> petsc/src/ksp/ksp/examples/tutorials (maint)
> $ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist 
> -mat_superlu_dist_parsymbfact=1 -ksp_view
> 
> ...
>   SuperLU_DIST run parameters:
>               Process grid nprow 2 x npcol 1
>               Equilibrate matrix TRUE
>               Matrix input mode 1
>               Replace tiny pivots TRUE
>               Use iterative refinement FALSE
>               Processors in row 2 col partition 1
>               Row permutation LargeDiag
>               Column permutation METIS_AT_PLUS_A
>               Parallel symbolic factorization FALSE
>               Repeated factorization SamePattern_SameRowPerm
> 
> I do not understand why your code uses matrix input mode = global.
> 
> Hong
> 
> 
> 
> From: Hong [mailto:[email protected]]
> Sent: den 3 augusti 2015 16:46
> To: Xiaoye S. Li
> Cc: Ülker-Kaustell, Mahir; Hong; PETSc users list
> 
> Subject: Re: [petsc-users] SuperLU MPI-problem
> 
> Mahir,
> 
> Sherry found the culprit. I can reproduce it:
> petsc/src/ksp/ksp/examples/tutorials
> mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist 
> -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact
> 
> Invalid ISPEC at line 484 in file get_perm_c.c
> Invalid ISPEC at line 484 in file get_perm_c.c
> -------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> ...
> 
> The PETSc-superlu_dist interface sets matinput=DISTRIBUTED as the default 
> when using more than one process.
> Did you either use '-mat_superlu_dist_parsymbfact' for a sequential run or 
> set matinput=GLOBAL for a parallel run?
> 
> I'll add an error flag for these use cases.
> 
> Hong
> 
> On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li <[email protected]> wrote:
> I think I know the problem. Since zdistribute.c is called, I guess you are 
> using the global (replicated) matrix input interface, pzgssvx_ABglobal().  
> This interface does not allow you to use parallel symbolic factorization 
> (since the matrix is centralized).
> 
> That's why you get the following error:
> Invalid ISPEC at line 484 in file get_perm_c.c
> 
> You need to use the distributed matrix input interface pzgssvx() (without 
> ABglobal).
> 
> Sherry
> 
> 
> On Mon, Aug 3, 2015 at 5:02 AM, [email protected] <[email protected]> wrote:
> Hong and Sherry,
> 
> I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:
> 
> If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid 
> ISPEC at line 484 in file get_perm_c.c
> If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the program 
> crashes with:  Calloc fails for SPA dense[]. at line 438 in file zdistribute.c
> 
> Mahir
> 
> From: Hong [mailto:[email protected]]
> Sent: den 30 juli 2015 02:58
> To: Ülker-Kaustell, Mahir
> Cc: Xiaoye Li; PETSc users list
> 
> Subject: Fwd: [petsc-users] SuperLU MPI-problem
> 
> Mahir,
> 
> Sherry fixed several bugs in superlu_dist-v4.1.
> The current petsc-release interfaces with superlu_dist-v4.0.
> We do not know whether the reported issue (attached below) has been resolved 
> or not. If not, can you test it with the latest superlu_dist-v4.1?
> 
> Here is how to do it:
> 1. download superlu_dist v4.1
> 2. remove existing PETSC_ARCH directory, then configure petsc with
> '--download-superlu_dist=superlu_dist_4.1.tar.gz'
> 3. build petsc
> 
> Let us know if the issue remains.
> 
> Hong
> 
> 
> ---------- Forwarded message ----------
> From: Xiaoye S. Li <[email protected]>
> Date: Wed, Jul 29, 2015 at 2:24 PM
> Subject: Fwd: [petsc-users] SuperLU MPI-problem
> To: Hong Zhang <[email protected]>
> Hong,
> I am cleaning the mailbox, and saw this unresolved issue.  I am not sure 
> whether the new fix to parallel symbolic factorization solves the problem.  
> What bothers me is that he is getting the following error:
> 
> Invalid ISPEC at line 484 in file get_perm_c.c
> This has nothing to do with my bug fix.
> Shall we ask him to try the new version, or try to get his matrix?
> Sherry
> 
> 
> ---------- Forwarded message ----------
> From: [email protected] <[email protected]>
> Date: Wed, Jul 22, 2015 at 1:32 PM
> Subject: RE: [petsc-users] SuperLU MPI-problem
> To: Hong <[email protected]>, "Xiaoye S. Li" <[email protected]>
> Cc: petsc-users <[email protected]>
> The 1000 was just a conservative guess. The number of non-zeros per row is in 
> the tens in general but certain constraints lead to non-diagonal streaks in 
> the sparsity-pattern.
> Is it the reordering of the matrix that is killing me here? How can I set 
> options.ColPerm?
> 
> If I use -mat_superlu_dist_parsymbfact, the program crashes with
> 
> Invalid ISPEC at line 484 in file get_perm_c.c
> -------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch 
> system) has told this process to end
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
> find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: --------------------- Error Message 
> --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
> trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015
> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk 
> Wed Jul 22 21:59:23 2015
> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 
> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ 
> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 
> --with-scalar-type=complex --download-fblaspack --download-mpich 
> --download-scalapack --download-mumps --download-metis --download-parmetis 
> --download-superlu --download-superlu_dist --download-fftw
> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> [unset]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> 
> If I use -mat_superlu_dist_parsymbfact=1, the program crashes (somewhat 
> later) with
> 
> Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c
> col block 3006 -------------------------------------------------------
> Primary job  terminated normally, but 1 process returned
> a non-zero exit code.. Per user-direction, the job has been aborted.
> -------------------------------------------------------
> col block 1924 [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch 
> system) has told this process to end
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
> find memory corruption errors
> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [0]PETSC ERROR: to get more information on the crash.
> [0]PETSC ERROR: --------------------- Error Message 
> --------------------------------------------------------------
> [0]PETSC ERROR: Signal received
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for 
> trouble shooting.
> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015
> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk 
> Wed Jul 22 21:59:58 2015
> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 
> PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ 
> --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 
> --with-scalar-type=complex --download-fblaspack --download-mpich 
> --download-scalapack --download-mumps --download-metis --download-parmetis 
> --download-superlu --download-superlu_dist --download-fftw
> [0]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> [unset]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
> [0]PETSC ERROR: 
> ------------------------------------------------------------------------
> 
> 
> /Mahir
> 
> 
> From: Hong [mailto:[email protected]]
> Sent: den 22 juli 2015 21:34
> To: Xiaoye S. Li
> Cc: Ülker-Kaustell, Mahir; petsc-users
> 
> Subject: Re: [petsc-users] SuperLU MPI-problem
> 
> In the PETSc/superlu_dist interface, we set the default
> 
> options.ParSymbFact = NO;
> 
> When the user raises the flag "-mat_superlu_dist_parsymbfact",
> we set
> 
>     options.ParSymbFact = YES;
>     options.ColPerm     = PARMETIS;   /* in v2.2, PARMETIS is forced for 
> ParSymbFact regardless of user ordering setting */
> 
> We do not change anything else.
> 
> Hong
> 
> On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li <[email protected]> wrote:
> I am trying to understand your problem. You said you are solving Navier's 
> equation (elastodynamics) in the frequency domain, using finite element 
> discretization.  I wonder why you have about 1000 nonzeros per row.  Usually, 
> in matrices arising from PDE discretizations, the number of nonzeros per row 
> is in the tens (even for 3D problems), not in the thousands.  So, your matrix 
> is quite a bit denser than many sparse matrices we deal with.
> 
> The number of nonzeros in the L and U factors is much more than that in the 
> original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as 
> bad as 50-100x fill ratio for 3D.  But since your matrix starts much denser 
> (i.e., the underlying graph has many connections), it may not lend itself to 
> any good ordering strategy to preserve sparsity of L and U; that is, the L 
> and U fill ratio may be large.
> 
> I don't understand why you get the following error when you use
> '-mat_superlu_dist_parsymbfact'.
> 
> Invalid ISPEC at line 484 in file get_perm_c.c
> 
> Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc.
> 
> Hong -- in order to use parallel symbolic factorization, is it sufficient to 
> specify only
> '-mat_superlu_dist_parsymbfact'
> ?  (The default is to use sequential symbolic factorization.)
> 
> 
> Sherry
> 
> On Wed, Jul 22, 2015 at 9:11 AM, [email protected] <[email protected]> wrote:
> Thank you for your reply.
> 
> As you have probably figured out already, I am not a computational scientist. 
> I am a researcher in civil engineering (railways for high-speed traffic), 
> trying to produce some, from my perspective, fairly large parametric studies 
> based on finite element discretizations.
> 
> I am working in a Windows-environment and have installed PETSc through Cygwin.
> Apparently, there is no support for Valgrind in this OS.
> 
> If I have understood you correctly, the memory issues are related to SuperLU 
> and, given my background, there is not much I can do. Is this correct?
> 
> 
> Best regards,
> Mahir
> 
> ______________________________________________
> Mahir Ülker-Kaustell, Kompetenssamordnare, Brokonstruktör, Tekn. Dr, Tyréns AB
> 010 452 30 82, [email protected]
> ______________________________________________
> 
> -----Original Message-----
> From: Barry Smith [mailto:[email protected]]
> Sent: den 22 juli 2015 02:57
> To: Ülker-Kaustell, Mahir
> Cc: Xiaoye S. Li; petsc-users
> Subject: Re: [petsc-users] SuperLU MPI-problem
> 
> 
>    Run the program under valgrind 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the 
> option -mat_superlu_dist_parsymbfact, I get many scary memory problems, some 
> involving, for example, ddist_psymbtonum (pdsymbfact_distdata.c:1332)
> 
>    Note that I consider it unacceptable for running programs to EVER use 
> uninitialized values; until these are all cleaned up I won't trust any runs 
> like this.
> 
>   Barry
> 
> 
> 
> 
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053)
> ==42050==    by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285)
> ==42050==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a stack allocation
> ==42050==    at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96)
> ==42050==
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651)
> ==42050==    by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903)
> ==42050==    by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944)
> ==42050==    by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107)
> ==42050==    by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285)
> ==42050==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a stack allocation
> ==42050==    at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96)
> ==42050==
> ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==42049==    at 0x102DA1C3A: writev (in 
> /usr/lib/system/libsystem_kernel.dylib)
> ==42049==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
> ==42049==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
> ==42049==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
> ==42049==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
> ==42049==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
> ==42049==    by 0x10277656E: MPI_Isend (isend.c:125)
> ==42049==    by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)
> ==42049==    by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)
> ==42049==    by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)
> ==42049==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
> ==42049==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
> ==42049==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42049==  Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 
> alloc'd
> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42049==    by 0x1020EB90C: gk_malloc (memory.c:147)
> ==42049==    by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)
> ==42048==    at 0x102DA1C3A: writev (in 
> /usr/lib/system/libsystem_kernel.dylib)
> ==42048==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
> ==42049==    by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)
> ==42049==    by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)
> ==42048==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
> ==42048==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
> ==42048==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
> ==42049==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42048==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
> ==42048==    by 0x10277656E: MPI_Isend (isend.c:125)
> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)
> ==42048==    by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)
> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)
> ==42048==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
> ==42048==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42049==    by 0x100001B3C: main (in ./ex19)
> ==42049==  Uninitialised value was created by a heap allocation
> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42049==    by 0x1020EB90C: gk_malloc (memory.c:147)
> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
> ==42048==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10211C50B: libmetis__imalloc (gklib.c:24)
> ==42049==    by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)
> ==42049==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42049==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
> ==42049==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==  Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 
> alloc'd
> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42049==    by 0x100001B3C: main (in ./ex19)
> ==42049==
> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42048==    by 0x1020EB90C: gk_malloc (memory.c:147)
> ==42048==    by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)
> ==42048==    by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)
> ==42048==    by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)
> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
> ==42048==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==  Uninitialised value was created by a heap allocation
> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42048==    by 0x1020EB90C: gk_malloc (memory.c:147)
> ==42048==    by 0x10211C50B: libmetis__imalloc (gklib.c:24)
> ==42048==    by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)
> ==42048==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
> ==42048==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
> ==42048==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==
> ==42048== Syscall param write(buf) points to uninitialised byte(s)
> ==42048==    at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib)
> ==42048==    by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525)
> ==42048==    by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86)
> ==42048==    by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257)
> ==42048==    by 0x10293ADBA: MPID_Send (mpid_send.c:130)
> ==42048==    by 0x10277A1FA: MPI_Send (send.c:127)
> ==42048==    by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299)
> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==  Address 0x104810704 is on thread 1's stack
> ==42048==  in frame #3, created by MPIDI_CH3_EagerContigShortSend 
> (ch3u_eager.c:218)
> ==42048==  Uninitialised value was created by a heap allocation
> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42048==    by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185)
> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480)
> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a stack allocation
> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
> ==42050==
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x102744E43: MPI_Alltoallv (alltoallv.c:490)
> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a stack allocation
> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
> ==42050==
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497)
> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a stack allocation
> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
> ==42050==
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512)
> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a stack allocation
> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
> ==42050==
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92)
> ==42050==    by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343)
> ==42050==    by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380)
> ==42050==    by 0x10274541B: MPI_Alltoallv (alltoallv.c:531)
> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a stack allocation
> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
> ==42050==
> ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s)
> ==42050==    at 0x102DA1C3A: writev (in 
> /usr/lib/system/libsystem_kernel.dylib)
> ==42050==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
> ==42050==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
> ==42050==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
> ==42050==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
> ==42050==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
> ==42050==    by 0x10277656E: MPI_Isend (isend.c:125)
> ==42050==    by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201)
> ==42050==    by 0x10151ECBF: pdgstrf (pdgstrf.c:1082)
> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 
> alloc'd
> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42050==    by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)
> ==42050==    by 0x10151DA7D: pdgstrf (pdgstrf.c:735)
> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a heap allocation
> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42050==    by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)
> ==42050==    by 0x10151DA7D: pdgstrf (pdgstrf.c:735)
> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==
> ==42048== Conditional jump or move depends on uninitialised value(s)
> ==42048==    at 0x10151F141: pdgstrf (pdgstrf.c:1139)
> ==42048==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==  Uninitialised value was created by a heap allocation
> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42048==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
> ==42048==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==
> ==42049== Conditional jump or move depends on uninitialised value(s)
> ==42049==    at 0x10151F141: pdgstrf (pdgstrf.c:1139)
> ==42049==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42049==    by 0x100001B3C: main (in ./ex19)
> ==42049==  Uninitialised value was created by a heap allocation
> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42049==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42049==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
> ==42049==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42049==    by 0x100001B3C: main (in ./ex19)
> ==42049==
> ==42048== Conditional jump or move depends on uninitialised value(s)
> ==42048==    at 0x101520054: pdgstrf (pdgstrf.c:1429)
> ==42048==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42049== Conditional jump or move depends on uninitialised value(s)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==  Uninitialised value was created by a heap allocation
> ==42049==    at 0x101520054: pdgstrf (pdgstrf.c:1429)
> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42049==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42048==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
> ==42048==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42049==    by 0x100001B3C: main (in ./ex19)
> ==42049==  Uninitialised value was created by a heap allocation
> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42048==    by 0x100001B3C: main (in ./ex19)
> ==42048==
> ==42049==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42049==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
> ==42049==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42049==    by 0x100001B3C: main (in ./ex19)
> ==42049==
> ==42050== Conditional jump or move depends on uninitialised value(s)
> ==42050==    at 0x10151FDE6: pdgstrf (pdgstrf.c:1382)
> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==  Uninitialised value was created by a heap allocation
> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
> ==42050==    by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389)
> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST 
> (superlu_dist.c:414)
> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
> ==42050==    by 0x100001B3C: main (in ./ex19)
> ==42050==
> 
> 
> > On Jul 20, 2015, at 12:03 PM, [email protected] wrote:
> >
> > Ok. So I have been creating the full factorization on each process. That 
> > gives me some hope!
> >
> > I followed your suggestion and tried to use the runtime option 
> > ‘-mat_superlu_dist_parsymbfact’.
> > However, now the program crashes with:
> >
> > Invalid ISPEC at line 484 in file get_perm_c.c
> >
> > And so on…
> >
> > According to the SuperLU manual, I should give the option either YES or 
> > NO; however, -mat_superlu_dist_parsymbfact YES makes the program crash in 
> > the same way as above.
> > Also, I can't find any reference to -mat_superlu_dist_parsymbfact in the 
> > PETSc documentation.
> >
> > Mahir
> >
> > Mahir Ülker-Kaustell, Competence Coordinator, Bridge Designer, PhD, 
> > Tyréns AB
> > 010 452 30 82, [email protected]
> >
> > From: Xiaoye S. Li [mailto:[email protected]]
> > Sent: 20 July 2015 18:12
> > To: Ülker-Kaustell, Mahir
> > Cc: Hong; petsc-users
> > Subject: Re: [petsc-users] SuperLU MPI-problem
> >
> > The default SuperLU_DIST setting is to use serial symbolic factorization. 
> > Therefore, what matters is: how much memory do you have per MPI task?
> >
> > The code failed to malloc memory during redistribution of matrix A to the 
> > {L\U} data structure (using the result of the serial symbolic 
> > factorization).
> >
> > You can use parallel symbolic factorization with the runtime option: 
> > '-mat_superlu_dist_parsymbfact'
> >
> > Sherry Li
> >
> >
> > On Mon, Jul 20, 2015 at 8:59 AM, [email protected] wrote:
> > Hong:
> >
> > Previous experiences with this equation have shown that it is very 
> > difficult to solve it iteratively. Hence the use of a direct solver.
> >
> > The large test problem I am trying to solve has slightly less than 10^6 
> > degrees of freedom. The matrices are derived from finite elements so they 
> > are sparse.
> > The machine I am working on has 128 GB of RAM. I have estimated the memory 
> > needed to be less than 20 GB, so if the solver needs twice or even three 
> > times as much, it should still work well. Or have I completely 
> > misunderstood something here?
> >
> > Mahir
> >
> >
> >
> > From: Hong [mailto:[email protected]]
> > Sent: 20 July 2015 17:39
> > To: Ülker-Kaustell, Mahir
> > Cc: petsc-users
> > Subject: Re: [petsc-users] SuperLU MPI-problem
> >
> > Mahir:
> > Direct solvers consume a large amount of memory. I suggest trying the 
> > following:
> >
> > 1. A sparse iterative solver, if [-omega^2M + K] is not too 
> > ill-conditioned. You may test it using the small matrix.
> >
> > 2. Incrementally increase your matrix sizes. Try different matrix orderings.
> > Do you get a memory crash in the first symbolic factorization?
> > In your case, the matrix data structure stays the same when omega changes, 
> > so you only need to do one symbolic factorization and reuse it.
> >
> > 3. Use a machine with more memory.
> >
> > Hong
> >
> > Dear Petsc-Users,
> >
> > I am trying to use PETSc to solve a set of linear equations arising from 
> > Navier's equation (elastodynamics) in the frequency domain.
> > The frequency dependency of the problem requires that the system
> >
> >                              [-omega^2M + K]u = F
> >
> > where M and K are constant, square, positive definite matrices (mass and 
> > stiffness respectively) is solved for each frequency omega of interest.
> > K is a complex matrix, including material damping.
> >
> > I have written a PETSc program which solves this problem for a small (1000 
> > degrees of freedom) test problem on one or several processors, but it keeps 
> > crashing when I try it on my full-scale (on the order of 10^6 degrees of 
> > freedom) problem.
> >
> > The program crashes at KSPSetUp() and, from what I can see in the error 
> > messages, it appears to consume too much memory.
> >
> > I would guess that similar problems have come up on this mailing list, so I 
> > am hoping that someone can push me in the right direction…
> >
> > Mahir
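
For readers following the thread, here is a minimal sketch of the frequency sweep described in the original question: solving [-omega^2 M + K]u = F once per frequency. It is only an illustration, not the poster's program. The 2-DOF matrices, the loss factor eta, the load F and the frequencies are all made-up values, and a hand-written 2x2 Cramer's rule stands in for the direct solver. The point it tries to show is the one made in the reply above: the nonzero pattern of the system matrix is the same for every omega, which is why a symbolic factorization can in principle be computed once and reused.

```python
# Toy frequency sweep for [-omega^2 M + K] u = F.
# All numbers below are hypothetical; a real run would use the PETSc matrices.

def dynamic_stiffness(M, K, omega):
    """Assemble A(omega) = -omega^2 M + K entry by entry."""
    n = len(M)
    return [[-omega**2 * M[i][j] + K[i][j] for j in range(n)] for i in range(n)]

def pattern(A):
    """Set of (row, col) positions that hold a nonzero entry."""
    return {(i, j) for i, row in enumerate(A) for j, a in enumerate(row) if a != 0}

def solve_2x2(A, F):
    """Cramer's rule; stands in for the direct solver on this toy problem."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(F[0] * A[1][1] - A[0][1] * F[1]) / det,
            (A[0][0] * F[1] - F[0] * A[1][0]) / det]

eta = 0.02                       # assumed material loss factor (makes K complex)
M = [[2.0, 0.0], [0.0, 1.0]]     # assumed mass matrix
K = [[3.0 * (1 + 1j * eta), -1.0 * (1 + 1j * eta)],
     [-1.0 * (1 + 1j * eta), 1.0 * (1 + 1j * eta)]]  # assumed damped stiffness
F = [1.0, 0.0]                   # assumed load vector
omegas = [0.5, 1.0, 2.0]         # assumed frequencies of interest

# The pattern of A(omega) can never exceed the union of the patterns of M and K.
union = pattern(M) | pattern(K)

solutions = []
for omega in omegas:
    A = dynamic_stiffness(M, K, omega)
    assert pattern(A) <= union   # same structure at every frequency
    u = solve_2x2(A, F)
    solutions.append(u)
    print(omega, u)
```

Only the numerical values of A change from one frequency to the next, so in a sparse direct solver the ordering and symbolic analysis depend on information that is fixed across the whole sweep.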