Mahir,

Sherry found the culprit. I can reproduce it:

petsc/src/ksp/ksp/examples/tutorials
mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact

Invalid ISPEC at line 484 in file get_perm_c.c
Invalid ISPEC at line 484 in file get_perm_c.c
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
...

The PETSc-superlu_dist interface sets matinput=DISTRIBUTED as the default when using more than one process.
Did you either use '-mat_superlu_dist_parsymbfact' for a sequential run or set matinput=GLOBAL for a parallel run?

I'll add an error flag for these use cases.

Hong

On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li <[email protected]> wrote:

> I think I know the problem. Since zdistribute.c is called, I guess you
> are using the global (replicated) matrix input interface,
> pzgssvx_ABglobal(). This interface does not allow you to use parallel
> symbolic factorization (since the matrix is centralized).
>
> That's why you get the following error:
> Invalid ISPEC at line 484 in file get_perm_c.c
>
> You need to use the distributed matrix input interface pzgssvx() (without
> ABglobal)
>
> Sherry
>
>
> On Mon, Aug 3, 2015 at 5:02 AM, [email protected] <[email protected]> wrote:
>
>> Hong and Sherry,
>>
>> I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:
>>
>> If I use -mat_superlu_dist_parsymbfact, the program crashes with: Invalid
>> ISPEC at line 484 in file get_perm_c.c
>>
>> If I use -mat_superlu_dist_parsymbfact=1 or leave this flag out, the
>> program crashes with: Calloc fails for SPA dense[]. at line 438 in file
>> zdistribute.c
>>
>> Mahir
>>
>> *From:* Hong [mailto:[email protected]]
>> *Sent:* 30 July 2015 02:58
>> *To:* Ülker-Kaustell, Mahir
>> *Cc:* Xiaoye Li; PETSc users list
>> *Subject:* Fwd: [petsc-users] SuperLU MPI-problem
>>
>> Mahir,
>>
>> Sherry fixed several bugs in superlu_dist-v4.1.
>>
>> The current petsc-release interfaces with superlu_dist-v4.0.
>>
>> We do not know whether the reported issue (attached below) has been
>> resolved or not. If not, can you test it with the latest superlu_dist-v4.1?
>>
>> Here is how to do it:
>>
>> 1. download superlu_dist v4.1
>> 2. remove existing PETSC_ARCH directory, then configure petsc with
>>    '--download-superlu_dist=superlu_dist_4.1.tar.gz'
>> 3. build petsc
>>
>> Let us know if the issue remains.
>>
>> Hong
>>
>>
>> ---------- Forwarded message ----------
>> From: *Xiaoye S. Li* <[email protected]>
>> Date: Wed, Jul 29, 2015 at 2:24 PM
>> Subject: Fwd: [petsc-users] SuperLU MPI-problem
>> To: Hong Zhang <[email protected]>
>>
>> Hong,
>>
>> I am cleaning the mailbox, and saw this unresolved issue. I am not sure
>> whether the new fix to parallel symbolic factorization solves the problem.
>> What bothers me is that he is getting the following error:
>>
>> Invalid ISPEC at line 484 in file get_perm_c.c
>>
>> This has nothing to do with my bug fix.
>>
>> Shall we ask him to try the new version, or try to get his matrix?
>>
>> Sherry
>>
>>
>> ---------- Forwarded message ----------
>> From: *[email protected] <[email protected]>* <[email protected]>
>> Date: Wed, Jul 22, 2015 at 1:32 PM
>> Subject: RE: [petsc-users] SuperLU MPI-problem
>> To: Hong <[email protected]>, "Xiaoye S. Li" <[email protected]>
>> Cc: petsc-users <[email protected]>
>>
>> The 1000 was just a conservative guess. The number of non-zeros per row
>> is in the tens in general but certain constraints lead to non-diagonal
>> streaks in the sparsity-pattern.
>>
>> Is it the reordering of the matrix that is killing me here? How can I set
>> options.ColPerm?
>>
>> If I use -mat_superlu_dist_parsymbfact the program crashes with
>>
>> Invalid ISPEC at line 484 in file get_perm_c.c
>> -------------------------------------------------------
>> Primary job terminated normally, but 1 process returned
>> a non-zero exit code.. Per user-direction, the job has been aborted.
>> -------------------------------------------------------
>> [0]PETSC ERROR: ------------------------------------------------------------------------
>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>> [0]PETSC ERROR: to get more information on the crash.
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Signal received
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015
>> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015
>> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw
>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> [unset]: aborting job:
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>
>> If I use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat
>> later) with
>>
>> Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c
>> col block 3006 -------------------------------------------------------
>> Primary job terminated normally, but 1 process returned
>> a non-zero exit code.. Per user-direction, the job has been aborted.
>> -------------------------------------------------------
>> col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------
>> [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end
>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
>> [0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
>> [0]PETSC ERROR: to get more information on the crash.
>> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
>> [0]PETSC ERROR: Signal received
>> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
>> [0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015
>> [0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015
>> [0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw
>> [0]PETSC ERROR: #1 User provided function() line 0 in unknown file
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> [unset]: aborting job:
>> application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0
>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>
>> /Mahir
>>
>>
>> *From:* Hong [mailto:[email protected]]
>> *Sent:* 22 July 2015 21:34
>> *To:* Xiaoye S. Li
>> *Cc:* Ülker-Kaustell, Mahir; petsc-users
>> *Subject:* Re: [petsc-users] SuperLU MPI-problem
>>
>> In the PETSc/superlu_dist interface, we set the default
>>
>> options.ParSymbFact = NO;
>>
>> When the user raises the flag "-mat_superlu_dist_parsymbfact",
>> we set
>>
>> options.ParSymbFact = YES;
>> options.ColPerm = PARMETIS; /* in v2.2, PARMETIS is forced for
>> ParSymbFact regardless of user ordering setting */
>>
>> We do not change anything else.
>>
>> Hong
>>
>> On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li <[email protected]> wrote:
>>
>> I am trying to understand your problem. You said you are solving Navier's
>> equation (elastodynamics) in the frequency domain, using finite element
>> discretization. I wonder why you have about 1000 nonzeros per row.
>> Usually in many PDE discretized matrices, the number of nonzeros per row is
>> in the tens (even for 3D problems), not in the thousands. So, your matrix
>> is quite a bit denser than many sparse matrices we deal with.
>>
>> The number of nonzeros in the L and U factors is much more than that in
>> the original matrix A -- typically we see 10-20x fill ratio for 2D, or can be
>> as bad as 50-100x fill ratio for 3D. But since your matrix starts much
>> denser (i.e., the underlying graph has many connections), it may not lend
>> itself to any good ordering strategy to preserve sparsity of L and U; that is,
>> the L and U fill ratio may be large.
>>
>> I don't understand why you get the following error when you use
>> '-mat_superlu_dist_parsymbfact'.
>>
>> Invalid ISPEC at line 484 in file get_perm_c.c
>>
>> Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc.
>>
>> Hong -- in order to use parallel symbolic factorization, is it
>> sufficient to specify only
>> '-mat_superlu_dist_parsymbfact'
>> ? (the default is to use sequential symbolic factorization.)
>>
>> Sherry
>>
>> On Wed, Jul 22, 2015 at 9:11 AM, [email protected] <[email protected]> wrote:
>>
>> Thank you for your reply.
>>
>> As you have probably figured out already, I am not a computational
>> scientist. I am a researcher in civil engineering (railways for high-speed
>> traffic), trying to produce some, from my perspective, fairly large
>> parametric studies based on finite element discretizations.
>>
>> I am working in a Windows-environment and have installed PETSc through
>> Cygwin.
>> Apparently, there is no support for Valgrind in this OS.
>>
>> If I have understood you correctly, the memory issues are related to
>> SuperLU and given my background, there is not much I can do. Is this
>> correct?
>>
>> Best regards,
>> Mahir
>>
>> ______________________________________________
>> Mahir Ülker-Kaustell, Kompetenssamordnare, Brokonstruktör, Tekn. Dr,
>> Tyréns AB
>> 010 452 30 82, [email protected]
>> ______________________________________________
>>
>>
>> -----Original Message-----
>> From: Barry Smith [mailto:[email protected]]
>> Sent: 22 July 2015 02:57
>> To: Ülker-Kaustell, Mahir
>> Cc: Xiaoye S. Li; petsc-users
>> Subject: Re: [petsc-users] SuperLU MPI-problem
>>
>>
>> Run the program under valgrind
>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I
>> use the option -mat_superlu_dist_parsymbfact I get many scary memory
>> problems some involving for example ddist_psymbtonum
>> (pdsymbfact_distdata.c:1332)
>>
>> Note that I consider it unacceptable for running programs to EVER use
>> uninitialized values; until these are all cleaned up I won't trust any runs
>> like this.
>>
>> Barry
>>
>>
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053)
>> ==42050==    by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285)
>> ==42050==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651)
>> ==42050==    by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903)
>> ==42050==    by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944)
>> ==42050==    by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107)
>> ==42050==    by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285)
>> ==42050==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96)
>> ==42050==
>> ==42049== Syscall param writev(vector[...]) points to uninitialised byte(s)
>> ==42049==    at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib)
>> ==42049==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
>> ==42049==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
>> ==42049==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
>> ==42049==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
>> ==42049==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
>> ==42049==    by 0x10277656E: MPI_Isend (isend.c:125)
>> ==42049==    by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)
>> ==42049==    by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)
>> ==42049==    by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)
>> ==42049==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42049==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42049==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
>> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048== Syscall param writev(vector[...]) points to uninitialised byte(s)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==  Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd
>> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42049==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42049==    by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)
>> ==42048==    at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib)
>> ==42048==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
>> ==42049==    by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)
>> ==42049==    by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)
>> ==42048==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
>> ==42048==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
>> ==42048==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
>> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42049==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
>> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
>> ==42048==    by 0x10277656E: MPI_Isend (isend.c:125)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)
>> ==42048==    by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)
>> ==42048==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42048==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==  Uninitialised value was created by a heap allocation
>> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42049==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42048==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42049==    by 0x10211C50B: libmetis__imalloc (gklib.c:24)
>> ==42049==    by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)
>> ==42049==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42049==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42049==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
>> ==42049==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==  Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42048==    by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)
>> ==42048==    by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)
>> ==42048==    by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)
>> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42048==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x1020EB90C: gk_malloc (memory.c:147)
>> ==42048==    by 0x10211C50B: libmetis__imalloc (gklib.c:24)
>> ==42048==    by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)
>> ==42048==    by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)
>> ==42048==    by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)
>> ==42048==    by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)
>> ==42048==    by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==
>> ==42048== Syscall param write(buf) points to uninitialised byte(s)
>> ==42048==    at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib)
>> ==42048==    by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525)
>> ==42048==    by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86)
>> ==42048==    by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257)
>> ==42048==    by 0x10293ADBA: MPID_Send (mpid_send.c:130)
>> ==42048==    by 0x10277A1FA: MPI_Send (send.c:127)
>> ==42048==    by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Address 0x104810704 is on thread 1's stack
>> ==42048==  in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42048==    by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185)
>> ==42048==    by 0x101501192: pdgssvx (pdgssvx.c:934)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102744E43: MPI_Alltoallv (alltoallv.c:490)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Conditional jump or move depends on uninitialised value(s)
>> ==42050==    at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92)
>> ==42050==    by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343)
>> ==42050==    by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380)
>> ==42050==    by 0x10274541B: MPI_Alltoallv (alltoallv.c:531)
>> ==42050==    by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)
>> ==42050==    by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)
>> ==42050==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a stack allocation
>> ==42050==    at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)
>> ==42050==
>> ==42050== Syscall param writev(vector[...]) points to uninitialised byte(s)
>> ==42050==    at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib)
>> ==42050==    by 0x10296A0DC: MPL_large_writev (mplsock.c:32)
>> ==42050==    by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)
>> ==42050==    by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)
>> ==42050==    by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)
>> ==42050==    by 0x102939531: MPID_Isend (mpid_isend.c:138)
>> ==42050==    by 0x10277656E: MPI_Isend (isend.c:125)
>> ==42050==    by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201)
>> ==42050==    by 0x10151ECBF: pdgstrf (pdgstrf.c:1082)
>> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd
>> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42050==    by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)
>> ==42050==    by 0x10151DA7D: pdgstrf (pdgstrf.c:735)
>> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==  Uninitialised value was created by a heap allocation
>> ==42050==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42050==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42050==    by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)
>> ==42050==    by 0x10151DA7D: pdgstrf (pdgstrf.c:735)
>> ==42050==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42050==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42050==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42050==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42050==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42050==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42050==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42050==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42050==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42050==    by 0x100001B3C: main (in ./ex19)
>> ==42050==
>> ==42048== Conditional jump or move depends on uninitialised value(s)
>> ==42048==    at 0x10151F141: pdgstrf (pdgstrf.c:1139)
>> ==42048==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42048==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
>> ==42048==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==
>> ==42049== Conditional jump or move depends on uninitialised value(s)
>> ==42049==    at 0x10151F141: pdgstrf (pdgstrf.c:1139)
>> ==42049==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==  Uninitialised value was created by a heap allocation
>> ==42049==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42049==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42049==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
>> ==42049==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42049==    by 0x100001B3C: main (in ./ex19)
>> ==42049==
>> ==42048== Conditional jump or move depends on uninitialised value(s)
>> ==42048==    at 0x101520054: pdgstrf (pdgstrf.c:1429)
>> ==42048==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049== Conditional jump or move depends on uninitialised value(s)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x100001B3C: main (in ./ex19)
>> ==42048==  Uninitialised value was created by a heap allocation
>> ==42049==    at 0x101520054: pdgstrf (pdgstrf.c:1429)
>> ==42048==    at 0x1000183B1: malloc (vg_replace_malloc.c:303)
>> ==42048==    by 0x10153B704: superlu_malloc_dist (memory.c:108)
>> ==42049==    by 0x1015019A5: pdgssvx (pdgssvx.c:1069)
>> ==42049==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42048==    by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)
>> ==42048==    by 0x1015018C2: pdgssvx (pdgssvx.c:1057)
>> ==42048==    by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)
>> ==42049==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42049==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42048==    by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)
>> ==42048==    by 0x100F09F2C: PCSetUp_LU (lu.c:152)
>> ==42049==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42049==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42049==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42048==    by 0x100FF9036: PCSetUp (precon.c:982)
>> ==42048==    by 0x1010F54EB: KSPSetUp (itfunc.c:332)
>> ==42048==    by 0x1010F7985: KSPSolve (itfunc.c:546)
>> ==42049==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42049==    by 0x1011C49B7: SNESSolve (snes.c:3906)
>> ==42048==    by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)
>> ==42048==    by
0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== Uninitialised value was created by a heap allocation >> ==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42048== by 0x100001B3C: main (in ./ex19) >> ==42048== >> ==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332) >> ==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057) >> ==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42049== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42049== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42049== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42049== by 0x100001B3C: main (in ./ex19) >> ==42049== >> ==42050== Conditional jump or move depends on uninitialised value(s) >> ==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382) >> ==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== Uninitialised value was created by a heap allocation >> ==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303) >> ==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108) >> ==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389) >> ==42050== by 0x1015018C2: pdgssvx 
(pdgssvx.c:1057) >> ==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST >> (superlu_dist.c:414) >> ==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946) >> ==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152) >> ==42050== by 0x100FF9036: PCSetUp (precon.c:982) >> ==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332) >> ==42050== by 0x1010F7985: KSPSolve (itfunc.c:546) >> ==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233) >> ==42050== by 0x1011C49B7: SNESSolve (snes.c:3906) >> ==42050== by 0x100001B3C: main (in ./ex19) >> ==42050== >> >> >> > On Jul 20, 2015, at 12:03 PM, [email protected] wrote: >> > >> > Ok. So I have been creating the full factorization on each process. >> That gives me some hope! >> > >> > I followed your suggestion and tried to use the runtime option >> ‘-mat_superlu_dist_parsymbfact’. >> > However, now the program crashes with: >> > >> > Invalid ISPEC at line 484 in file get_perm_c.c >> > >> > And so on… >> > >> > According to the SuperLU manual, I should give the option either YES or NO; >> however, -mat_superlu_dist_parsymbfact YES makes the program crash in the >> same way as above. >> > Also, I can’t find any reference to -mat_superlu_dist_parsymbfact in the >> PETSc documentation. >> > >> > Mahir >> > >> > Mahir Ülker-Kaustell, Kompetenssamordnare, Brokonstruktör, Tekn. Dr, >> Tyréns AB >> > 010 452 30 82, [email protected] >> > >> > From: Xiaoye S. Li [mailto:[email protected]] >> > Sent: 20 July 2015 18:12 >> > To: Ülker-Kaustell, Mahir >> > Cc: Hong; petsc-users >> > Subject: Re: [petsc-users] SuperLU MPI-problem >> > >> > The default SuperLU_DIST setting is to use serial symbolic factorization. >> Therefore, what matters is how much memory you have per MPI task. >> > >> > The code failed to malloc memory during redistribution of matrix A to the >> {L\U} data structure (using the result of the serial symbolic factorization).
>> > >> > You can use parallel symbolic factorization via the runtime option: >> '-mat_superlu_dist_parsymbfact' >> > >> > Sherry Li >> > >> > >> > On Mon, Jul 20, 2015 at 8:59 AM, [email protected] < >> [email protected]> wrote: >> > Hong: >> > >> > Previous experience with this equation has shown that it is very >> difficult to solve iteratively. Hence the use of a direct solver. >> > >> > The large test problem I am trying to solve has slightly less than 10^6 >> degrees of freedom. The matrices are derived from finite elements, so they >> are sparse. >> > The machine I am working on has 128 GB of RAM. I have estimated the memory >> needed to be less than 20 GB, so if the solver needs twice or even three times >> as much, it should still work well. Or have I completely misunderstood >> something here? >> > >> > Mahir >> > >> > >> > >> > From: Hong [mailto:[email protected]] >> > Sent: 20 July 2015 17:39 >> > To: Ülker-Kaustell, Mahir >> > Cc: petsc-users >> > Subject: Re: [petsc-users] SuperLU MPI-problem >> > >> > Mahir: >> > Direct solvers consume a large amount of memory. I suggest trying the >> following: >> > >> > 1. A sparse iterative solver if [-omega^2M + K] is not too >> ill-conditioned. You may test it using the small matrix. >> > >> > 2. Incrementally increase your matrix sizes. Try different matrix >> orderings. >> > Do you get the memory crash in the first symbolic factorization? >> > In your case, the matrix data structure stays the same when omega changes, so >> you only need to do one matrix symbolic factorization and reuse it. >> > >> > 3. Use a machine with more memory. >> > >> > Hong >> > >> > Dear Petsc-Users, >> > >> > I am trying to use PETSc to solve a set of linear equations arising >> from Navier's equation (elastodynamics) in the frequency domain.
>> > The frequency dependency of the problem requires that the system >> > >> > [-omega^2M + K]u = F >> > >> > where M and K are constant, square, positive definite matrices (mass >> and stiffness respectively) is solved for each frequency omega of interest. >> > K is a complex matrix, including material damping. >> > >> > I have written a PETSc program which solves this problem for a small >> (1000 degrees of freedom) test problem on one or several processors, but it >> keeps crashing when I try it on my full-scale (on the order of 10^6 degrees >> of freedom) problem. >> > >> > The program crashes at KSPSetUp() and, from what I can see in the error >> messages, it appears to consume too much memory. >> > >> > I would guess that similar problems have come up on this mailing list, so >> I am hoping that someone can push me in the right direction… >> > >> > Mahir
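
Hong's second point in the thread rests on the observation that the nonzero pattern of [-omega^2M + K] is the union of the patterns of M and K, so it does not change with omega. A minimal sketch of that observation in plain Python follows. The dict-of-entries storage, the `assemble` helper, and the small M and K values are all made up for illustration; none of them come from the thread or from PETSc.

```python
# Sketch: the nonzero pattern of A(omega) = -omega^2 * M + K is independent of omega.
# Matrices are stored as {(row, col): value} dicts. M and K below are hypothetical
# 3x3 examples (K is complex, mimicking material damping), not data from the thread.

def assemble(M, K, omega):
    """Return A = -omega^2 * M + K on the union of the patterns of M and K.

    Every position in the union is kept, even if its value happens to be zero,
    so the pattern is a structural property and not a numerical one.
    """
    pattern = sorted(set(M) | set(K))
    return {ij: -omega**2 * M.get(ij, 0.0) + K.get(ij, 0.0) for ij in pattern}

# Hypothetical diagonal mass matrix.
M = {(0, 0): 2.0, (1, 1): 2.0, (2, 2): 1.0}

# Hypothetical tridiagonal complex stiffness matrix.
K = {
    (0, 0): 4.0 + 0.1j, (0, 1): -2.0,
    (1, 0): -2.0, (1, 1): 4.0 + 0.1j, (1, 2): -2.0,
    (2, 1): -2.0, (2, 2): 2.0 + 0.1j,
}

# Assemble the system for several frequencies and compare the patterns.
patterns = [frozenset(assemble(M, K, omega)) for omega in (0.5, 1.0, 3.0)]
same_pattern = all(p == patterns[0] for p in patterns)
print("pattern identical for all omega:", same_pattern)
```

Because the pattern is fixed, the ordering and symbolic factorization only depend on it and can be computed once; only the numeric factorization has to be repeated for each frequency. How that reuse is requested from PETSc or SuperLU_DIST is not shown here.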
