Hi,
I am seeing problems when trying to build the petsc-dev code. My configure line is below; it is the same one that worked for 2.3.2-p10. I tried with both MKL 9 and MKL 10 and got the same errors: references to undefined symbols. Please let me know if you have any experience with this issue or suggestions for resolving it.

Thanks,
Ying

./config/configure.py \
  --with-batch=1 \
  --with-clanguage=C++ \
  --with-vendor-compilers=intel \
  '--CXXFLAGS=-g -gcc-name=/usr/intel/pkgs/gcc/4.2.2/bin/g++ -gcc-version=420 ' \
  '--LDFLAGS=-L/usr/lib64 -L/usr/intel/pkgs/gcc/4.2.2/lib -ldl -lpthread -Qlocation,ld,/usr/intel/pkgs/gcc/4.2.2/x86_64-suse-linux/bin -L/usr/intel/pkgs/icc/10.1.008e/lib -lirc' \
  --with-cxx=$ICCDIR/bin/icpc \
  --with-fc=$IFCDIR/bin/ifort \
  --with-mpi-compilers=0 \
  --with-mpi-shared=0 \
  --with-debugging=yes \
  --with-mpi=yes \
  --with-mpi-include=$MPIDIR/include \
  --with-mpi-lib=\[$MPIDIR/lib64/libmpi.a,$MPIDIR/lib64/libmpiif.a,$MPIDIR/lib64/libmpigi.a\] \
  --with-blas-lapack-lib=\[$MKLLIBDIR/libguide.so,$MKLLIBDIR/libmkl_lapack.so,$MKLLIBDIR/libmkl_solver.a,$MKLLIBDIR/libmkl.so\] \
  --with-scalapack=yes \
  --with-scalapack-include=$MKLDIR/include \
  --with-scalapack-lib=$MKLLIBDIR/libmkl_scalapack.a \
  --with-blacs=yes \
  --with-blacs-include=$MKLDIR/include \
  --with-blacs-lib=$MKLLIBDIR/libmkl_blacs_intelmpi_lp64.a \
  --with-umfpack=1 \
  --with-umfpack-lib=\[$UMFPACKDIR/UMFPACK/Lib/libumfpack.a,$UMFPACKDIR/AMD/Lib/libamd.a\] \
  --with-umfpack-include=$UMFPACKDIR/UMFPACK/Include \
  --with-parmetis=1 \
  --with-parmetis-dir=$PARMETISDIR \
  --with-mumps=1 \
  --download-mumps=$PETSC_DIR/externalpackages/MUMPS_4.6.3.tar.gz \
  --with-superlu_dist=1 \
  --download-superlu_dist=$PETSC_DIR/externalpackages/superlu_dist_2.0.tar.gz

....
/nfs/pdx/proj/dt/pdx_sde02/x86-64_linux26/petsc/petsc-dev/conftest.c:7: undefined reference to `f2cblaslapack311_id_'
/p/dt/sde/tools/x86-64_linux26/mkl/10.0.2.018/lib/em64t/libguide.so: undefined reference to `pthread_atfork'
....
--------------------------------------------------------------------------------------
You set a value for --with-blas-lapack-lib=<lib>, but ['/p/dt/sde/tools/x86-64_linux26/mkl/10.0.2.018/lib/em64t/libguide.so', '/p/dt/sde/tools/x86-64_linux26/mkl/10.0.2.018/lib/em64t/libmkl_lapack.so', '/p/dt/sde/tools/x86-64_linux26/mkl/10.0.2.018/lib/em64t/libmkl_solver.a', '/p/dt/sde/tools/x86-64_linux26/mkl/10.0.2.018/lib/em64t/libmkl.so'] cannot be used
*********************************************************************************

-----Original Message-----
From: Barry Smith [mailto:bsmith at mcs.anl.gov]
Sent: Thursday, October 09, 2008 12:39 PM
To: Rhew, Jung-hoon
Cc: PETSc-Maint Smith; Linton, Tom; Cea, Stephen M; Stettler, Mark
Subject: Re: [PETSC #18391] PETSc crash with memory allocation in ILU preconditioning

We don't have all the code just right to use those packages with 64 bit integers. I will try to get them all working by Monday and will let you know my progress. To use them you will need to be using petsc-dev http://www-unix.mcs.anl.gov/petsc/petsc-as/developers/index.html so you can switch to that now if you are not yet using it in preparation for my updates.

   Barry

On Oct 9, 2008, at 12:52 PM, Rhew, Jung-hoon wrote:

> Hi,
>
> I found that the root cause of the malloc error was that our PETSc
> library had been compiled without the 64 bit flag on. Thus, PetscInt
> was defined as "int" instead of "long long", and for large problems
> the memory allocation requires memory beyond the maximum of int and
> causes integer overflow.
>
> But when I tried to build using the 64 bit flag
> (--with-64-bit-indices=1), all files associated with the external
> libraries (such as UMFPACK and MUMPS) built with PETSc started
> failing in compilation, mainly due to the incompatibility between
> "int" in those libraries and "long long" in PETSc.
>
> I wonder if you can let us know how to resolve this conflict when
> building PETSc with 64 bit.
> The brute force way is to change the source code of those libraries
> where the conflicts occur, but I wonder if there is a neater way of
> doing this.
>
> Thanks.
> jr
>
> Example:
> libfast in: /nfs/ltdn/disks/td_disk49/usr.cdmg/jrhew/work/mds_work/PETSC/mypetsc-2.3.2-p10/src/mat/impls/aij/seq/umfpack
>
> umfpack.c(154): error: a value of type "PetscInt={long long} *"
> cannot be used to initialize an entity of type "int *"
>   int m=A->rmap.n,n=A->cmap.n,*ai=mat->i,*aj=mat->j,status,*ra,idx;
>
>
> -----Original Message-----
> From: Barry Smith [mailto:bsmith at mcs.anl.gov]
> Sent: Tuesday, October 07, 2008 6:15 PM
> To: Rhew, Jung-hoon
> Cc: petsc-maint at mcs.anl.gov; Linton, Tom; Cea, Stephen M; Stettler, Mark
> Subject: Re: [PETSC #18391] PETSc crash with memory allocation in ILU preconditioning
>
>
> During the symbolic phase of ILU(N) there is no way to know in
> advance how many new nonzeros are needed in the factored version
> over the original matrix (this is true for LU too). We handle this
> by starting with a certain amount of memory, and then if that is not
> enough for the symbolic factor we double the memory allocated, copy
> the values over from the old copy of the symbolic factor (what has
> been computed so far), and then free the old copy.
>
> To avoid this "memory doubling" (which is not super memory
> efficient) you can use the option -mat_factor_fill or
> PCFactorSetFill() to set slightly more than the "correct" value;
> then only a single malloc is needed and you can do larger problems.
>
> Of course, the question is "what value should I use for fill"?
> There is no formula; if there were, we would use it automatically.
> So the only way I know is to run smaller problems and get a feel
> for what the ratio should be for your larger problem.
> Run with -info | grep pc_factor_fill and it will tell you what
> "you should have used".
>
> Hope this helps,
>
> Barry
>
>
> On Oct 7, 2008, at 5:46 PM, Rhew, Jung-hoon wrote:
>
>> Hi,
>>
>> 1. I ran it on a 64-bit machine with 32GB of physical memory but it
>> still crashed. At the crash, the peak memory was 17GB so there was
>> plenty of memory left. This is why I don't think the simulation
>> needed full 32GB + swap space more than 64GB.
>>
>> 2. The problem size is too big for a direct solver as it can easily
>> go beyond 32GB. Actually, we use MUMPS for smaller problems.
>>
>> 3. ILUN is the most robust preconditioner we found for our
>> production simulation so we want to stick to it.
>>
>> I think I'll send a test case that reproduces the problem.
>>
>> -----Original Message-----
>> From: knepley at gmail.com [mailto:knepley at gmail.com] On Behalf Of Matthew Knepley
>> Sent: Tuesday, October 07, 2008 2:21 PM
>> To: Rhew, Jung-hoon
>> Cc: PETSC Maintenance
>> Subject: Re: [PETSC #18391] PETSc crash with memory allocation in ILU preconditioning
>>
>> It's not hard for ILU(k) to run out of the 32-bit limit for large
>> matrices. I would recommend
>>
>> 1) Using a 64-bit machine with more memory
>>
>> 2) Trying a sparse direct solver like MUMPS
>>
>> 3) Trying another preconditioner, which is of course problem
>> dependent
>>
>> Thanks,
>>
>> Matt
>>
>> On Tue, Oct 7, 2008 at 4:03 PM, Rhew, Jung-hoon
>> <jung-hoon.rhew at intel.com> wrote:
>>> Dear PETSc team,
>>>
>>> We use PETSc as a linear solver library in our tool, and in some
>>> test cases using the ILU(N) preconditioner we have problems with
>>> memory. I'm not sending our matrix at this time since it is huge,
>>> but if you think it is needed, I'll send it to you.
>>>
>>> Thanks for your help in advance.
>>>
>>> Log file is attached.
>>> OS: suse 64bit sles9
>>>
>>> 2.6.5-7.276.PTF.196309.1-smp #1 SMP Mon Jul 24 10:45:31 UTC 2006 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> PETSc ver: petsc-2.3.2-p10
>>> MPI implementation: Intel MPI based on MPICH2 and MVAPICH2
>>> Compiler: GCC 4.2.2
>>> Probable PETSc component: n/a
>>> Problem Description
>>>
>>> Solver setting: BCGSL (L=2) and ILU(N=2)
>>>
>>> -ksp_rtol=1e-14
>>> -ksp_type=bcgsl
>>> -ksp_bcgsl_ell=2
>>> -pc_factor_levels=2
>>> -pc_factor_reuseordering
>>> -pc_factor_zeropivot=0.0
>>> -pc_type=ilu
>>> -pc_factor_fill=2
>>> -pc_factor_mat_ordering_type=rcm
>>>
>>> malloc crash: sparse matrix size ~ 500K by 500K with NNZ ~ 0.002%
>>> (full error message is attached.)
>>>
>>> In the debugger, symbolic ILU requires memory beyond the max int.
>>> At line 1089 in aijfact.c, len becomes -2147483648 as
>>> (bi[n])*sizeof(PetscScalar) > max int.
>>>
>>> len = (bi[n])*sizeof(PetscScalar);
>>>
>>> Then, it causes the following malloc error in subsequent function
>>> calls (the call stack is also in the attached error message).
>>>
>>> [0]PETSC ERROR: --------------------- Error Message ------------------------------------
>>> [0]PETSC ERROR: Out of memory. This could be due to allocating
>>> [0]PETSC ERROR: too large an object or bleeding by not properly
>>> [0]PETSC ERROR: destroying unneeded objects.
>>> [0]PETSC ERROR: Memory allocated -2147483648 Memory used by process -2147483648
>>> [0]PETSC ERROR: Try running with -malloc_dump or -malloc_log for info.
>>> [0]PETSC ERROR: Memory requested 18446744071912865792!
>>> [0]PETSC ERROR: ------------------------------------------------------------------------
>>> [0]PETSC ERROR: Petsc Release Version 2.3.2, Patch 10, Wed Mar 28 19:13:22 CDT 2007 HG revision: d7298c71db7f5e767f359ae35d33cab3bed44428
>>>
>>> Possibly relevant symptom: iterative solver with ILU(N) consumes
>>> more memory than direct solver as N gets larger (>5) although the
>>> matrix is not big enough to cause malloc crash like the above.
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
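
For readers following the overflow reported at the bottom of this thread, here is a minimal C sketch of the arithmetic. It is not PETSc code. It assumes sizeof(PetscScalar) is 8 bytes; the function names and the nonzero counts used in the note after the code are hypothetical, picked only because they reproduce the figures quoted in the report. The wraparound is computed in 64-bit arithmetic so the sketch itself never performs a signed overflow, which is undefined behavior in C.

```c
#include <assert.h>
#include <stdint.h>

/* Byte count computed safely in 64-bit arithmetic: nnz * scalar_size. */
int64_t bytes_needed(int64_t nnz, int64_t scalar_size)
{
    assert(nnz >= 0 && scalar_size > 0);
    return nnz * scalar_size;
}

/* The value a 32-bit signed int would hold after two's-complement
   wraparound, obtained by keeping only the low 32 bits of the input. */
int64_t wrap_to_int32(int64_t value)
{
    uint32_t low = (uint32_t)(uint64_t)value;
    return (low & 0x80000000u) ? (int64_t)low - 4294967296LL : (int64_t)low;
}

/* What an allocator sees when that negative value is converted to a
   64-bit unsigned size. */
uint64_t as_unsigned_size(int64_t wrapped)
{
    return (uint64_t)wrapped;
}
```

With a hypothetical 268435456 nonzeros, bytes_needed gives 2147483648 and wrap_to_int32 gives -2147483648, the value of len seen in the debugger. With a hypothetical 312285184 nonzeros, the wrapped value is -1796685824, and as_unsigned_size turns that into 18446744071912865792, the "Memory requested" figure in the log. Doing the multiplication in a 64-bit type, as in bytes_needed, avoids the wrap.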

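
The "memory doubling" that Barry describes above can be pictured with a small C simulation. This is only a sketch of the general grow-and-copy idea, not PETSc's actual implementation: the struct, the function name, and the assumption that the old and new buffers coexist until the copy finishes are all illustrative. It counts how many times the buffer is doubled and the largest number of entries held at one time.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int    regrowths;     /* how many times the buffer was doubled       */
    size_t final_cap;     /* capacity, in entries, once growth stops     */
    size_t peak_entries;  /* most entries alive at once (old + new copy) */
} GrowthStats;

/* Start with initial_cap entries and double until `needed` entries fit. */
GrowthStats simulate_doubling(size_t initial_cap, size_t needed)
{
    GrowthStats s = {0, initial_cap, initial_cap};
    assert(initial_cap > 0);
    while (s.final_cap < needed) {
        size_t new_cap = 2 * s.final_cap;
        /* While values are copied over, both buffers are allocated. */
        if (s.final_cap + new_cap > s.peak_entries)
            s.peak_entries = s.final_cap + new_cap;
        s.final_cap = new_cap;
        s.regrowths++;
    }
    return s;
}
```

For example, starting from 2000 entries when 7000 are eventually needed takes two doublings and briefly holds 12000 entries, while starting from 7100 needs no regrowth and never holds more than 7100. That is the motivation for setting the fill slightly above the value that -info reports.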