I don't see anything obviously wrong with this build. I guess the other thing to do is to build a debug version of PETSc on the machine and run the code in a debugger to determine the problem. [I believe there is a way to debug on BGL..]
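For example, a debug build could reuse the configure script quoted below with only the debugging-related options changed. A minimal, untested sketch - every option elided here is exactly as in that script, and the PETSC_ARCH name is made up:

******************************************************************************************
#!/usr/bin/env python
# Sketch of a debug variant of bgl-ibm-goto_lapack.py: -g instead of -O2,
# and --with-debugging=1 so PETSc's error checking and symbols are built in.
configure_options = [
  '--with-cc=/contrib/bgl/bin/mpxlc',
  '--with-cxx=/contrib/bgl/bin/mpxlC',
  '--with-fc=/contrib/bgl/bin/mpxlf -qnosave',
  # ... all library, batch, sizeof, and --download options exactly as in
  # the script quoted below ...
  '-COPTFLAGS=-g -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
  '-CXXOPTFLAGS=-g -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
  '-FOPTFLAGS=-g -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
  '--with-debugging=1',               # debug checks and symbols
  '-PETSC_ARCH=bgl-ibm-goto-g_440d',  # hypothetical name, kept distinct from the -O build
]

if __name__ == '__main__':
  import sys, os
  sys.path.insert(0, os.path.abspath('config'))
  import configure
  configure.petsc_configure(configure_options)
******************************************************************************************

With such a build the stack trace would show exact line numbers, and the -start_in_debugger / -on_error_attach_debugger options suggested in the error message below become more useful.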
Satish

On Tue, 25 Jan 2011, Rongliang Chen wrote:

> Hi Balay,
>
> Thank you for your reply.
> I have checked my code with valgrind on my own computer and there is no
> problem. But when I run my code on the IBM Blue Gene/L with
> "-sub_pc_factor_mat_solver_package superlu", it has this problem. Since
> there is no valgrind on the IBM Blue Gene/L, I cannot test my code with
> valgrind there.
>
> But if I use PETSc's default LU factorization, there is no such problem,
> so I suspect that there is some problem with my PETSc installation.
> Can you help me check whether my installation is correct?
> The details of the installation follow; the configure.log and make.log
> are attached.
>
> Installing PETSc on IBM Blue Gene/L:
>
> 1. patch -p0 < /contrib/bgl/petsc/petsc-3.0.0-p4/petsc-3.0.0-p4.patch
> 2. ./config/bgl-ibm-goto_lapack.py, where "bgl-ibm-goto_lapack.py" is:
>
> ******************************************************************************************
> #!/usr/bin/env python
> #
> # BGL has broken 'libc' dependencies. The option 'LIBS' is used to
> # work around this problem.
> #
> # LIBS="-lc -lnss_files -lnss_dns -lresolv"
> #
> # Another workaround is to modify the mpicc/mpif77 scripts and make them
> # link with the corresponding compilers and these additional
> # libraries. The following tarball has the modified compiler scripts:
> #
> # ftp://ftp.mcs.anl.gov/pub/petsc/tmp/petsc-bgl-tools.tar.gz
> #
> configure_options = [
>   '--with-cc=/contrib/bgl/bin/mpxlc',
>   '--with-cxx=/contrib/bgl/bin/mpxlC',
>   '--with-fc=/contrib/bgl/bin/mpxlf -qnosave',
>   '--with-mpi-dir=/bgl/BlueLight/ppcfloor/bglsys',  # required by BLACS to get mpif.h
>   '--with-lapack-lib=/contrib/bgl/lib/liblapack440.a',
>   '--with-blas-lib=/contrib/bgl/lib/libblas440.a',
>   # '--with-blas-lapack-lib=-L/contrib/bgl/lib -llapack440 -L/contrib/bgl/lib -lgoto',
>
>   '--with-is-color-value-type=short',
>   '--with-shared=0',
>
>   '-COPTFLAGS=-O2 -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
>   '-CXXOPTFLAGS=-O2 -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
>   '-FOPTFLAGS=-O2 -qbgl -qarch=440d -qtune=440 -qmaxmem=-1',
>   '--with-debugging=0',
>
>   # the following option gets automatically enabled on BGL with IBM compilers:
>   # '--with-fortran-kernels=bgl'
>
>   '--with-x=0',
>   '--with-x11=0',
>   '--with-batch=1',
>   '--with-memcmp-ok',
>   '--sizeof-char=1',
>   '--sizeof-void-p=4',
>   '--sizeof-short=2',
>   '--sizeof-int=4',
>   '--sizeof-long=4',
>   '--sizeof-size-t=4',
>   '--sizeof-long-long=8',
>   '--sizeof-float=4',
>   '--sizeof-double=8',
>   '--bits-per-byte=8',
>   '--sizeof-MPI-Comm=4',
>   '--sizeof-MPI-Fint=4',
>   '--have-mpi-long-double=1',
>
>   '--download-superlu=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/superlu_4.0-March_7_2010.tar.gz',
>   '--download-superlu_dist=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/SuperLU_DIST_2.4-hg-v2.tar.gz',
>   '--download-parmetis=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/ParMetis-dev-p3.tar.gz',
>   '--download-scalapack=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/scalapack.tgz',
>   '--download-blacs=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/blacs-dev.tar.gz',
>   '--download-f-blas-lapack=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/fblaslapack-3.1.1.tar.gz',
>   '--download-mumps=/home/rchen/soft/petsc-3.1-p7-nodebug/externalpackages/MUMPS_4.9.2.tar.gz',
>
>   # '--download-f-blas-lapack=1',
>   # '--download-hypre=1',
>   # '--download-spooles=1',
>   # '--download-superlu=1',
>   # '--download-parmetis=1',
>   # '--download-superlu_dist=1',
>   # '--download-blacs=1',
>
>   '-PETSC_ARCH=bgl-ibm-goto-O3_440d'
> ]
>
> if __name__ == '__main__':
>   import sys, os
>   sys.path.insert(0, os.path.abspath('config'))
>   import configure
>   configure.petsc_configure(configure_options)
>
> # Extra options used for testing locally
> test_options = []
> ******************************************************************************************
>
> 3. cqsub -n 1 -t 20 -O conftest -q debug ./conftest
> 4. ./reconfigure.py
> 5. make all
>
> Thank you!
>
> Best,
>
> Rongliang
>
> ----------------------------------------------------------------------
>
> > Message: 1
> > Date: Mon, 24 Jan 2011 16:06:22 -0600 (CST)
> > From: Satish Balay <balay at mcs.anl.gov>
> > Subject: Re: [petsc-users] Problem on LU factorization
> > To: PETSc users list <petsc-users at mcs.anl.gov>
> > Message-ID: <alpine.LFD.2.02.1101241605370.2510 at localhost6.localdomain6>
> > Content-Type: TEXT/PLAIN; charset=US-ASCII
> >
> > On Mon, 24 Jan 2011, Matthew Knepley wrote:
> >
> > > > When I use superlu with the command line option
> > > > "-sub_pc_factor_mat_solver_package superlu", it said
> > > >
> > > > "[43]PETSC ERROR: ------------------------------------------------------------------------
> > > > [43]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> > > > probably memory access out of range
> > > > [43]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> > > > [43]PETSC ERROR: or see
> > > > http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
> > > > [43]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
> > > > OS X to find memory corruption errors
> > > > [43]PETSC ERROR: likely location of problem given in stack below
> > > > [43]PETSC ERROR: --------------------- Stack Frames ------------------------------------
> > > > [43]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,
> > > > [43]PETSC ERROR:       INSTEAD the line number of the start of the function
> > > > [43]PETSC ERROR:       is given.
> > > > [43]PETSC ERROR: [43] MatLUFactorNumeric_SuperLU line 121 src/mat/impls/aij/seq/superlu/superlu.c
> > > > [43]PETSC ERROR: [43] MatLUFactorNumeric line 2575 src/mat/interface/matrix.c
> > > > ............................
> > > > "
> > >
> > > Please confirm that you have the latest patch level. If so, send the
> > > matrix in PETSc binary format to petsc-maint at mcs.anl.gov along with
> > > the precise solver options and the output of -ksp_view.
> >
> > More likely there is memory corruption somewhere - you should run this
> > code with valgrind to weed out such issues..
> >
> > Satish
> >
> > ------------------------------
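A footnote on the "matrix in PETSc binary format" request quoted above: in C this is PetscViewerBinaryOpen() followed by MatView() on the assembled matrix, and the file can be read back with MatLoad(). A minimal petsc4py sketch of the same idea - petsc4py is an assumption here (it is not part of the BGL install described above) and 'matrix.dat' is a placeholder name:

******************************************************************************************
# Minimal sketch: dump a Mat in PETSc binary format so it can be sent to
# petsc-maint and reloaded elsewhere with MatLoad().
from petsc4py import PETSc

A = PETSc.Mat().createAIJ([4, 4])  # stand-in for the application's matrix
A.setUp()
for i in range(4):
    A.setValue(i, i, 2.0)          # trivial diagonal entries, demo only
A.assemble()

viewer = PETSc.Viewer().createBinary('matrix.dat', mode='w')
A.view(viewer)                     # writes A in PETSc binary format
viewer.destroy()
******************************************************************************************

Run on one process, this leaves matrix.dat ready to attach for a standalone reproduction.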
