It is not a problem with calling MatLoad() twice. The file has one matrix, but it is loaded twice.
Replacing pc with ksp, the code runs fine.
The error occurs when PCSetUp_LU() is called with SAME_NONZERO_PATTERN.
I'll further look at it later.

Hong
________________________________________
From: Zhang, Hong
Sent: Friday, October 21, 2016 8:18 PM
To: Barry Smith; petsc-users
Subject: RE: [petsc-users] SuperLU_dist issue in 3.7.4

I am investigating it. The file has two matrices. The code takes following steps:

PCCreate(PETSC_COMM_WORLD, &pc);

MatCreate(PETSC_COMM_WORLD,&A);
MatLoad(A,fd);
PCSetOperators(pc,A,A);
PCSetUp(pc);

MatCreate(PETSC_COMM_WORLD,&A);
MatLoad(A,fd);
PCSetOperators(pc,A,A);
PCSetUp(pc); //crash here with np=2, superlu_dist, not with mumps/superlu or superlu_dist np=1

Hong
________________________________________
From: Barry Smith [bsm...@mcs.anl.gov]
Sent: Friday, October 21, 2016 5:59 PM
To: petsc-users
Cc: Zhang, Hong
Subject: Re: [petsc-users] SuperLU_dist issue in 3.7.4

> On Oct 21, 2016, at 5:16 PM, Satish Balay <ba...@mcs.anl.gov> wrote:
>
> The issue with this test code is - using MatLoad() twice [with the
> same object - without destroying it]. Not sure if thats supporsed to
> work..

If the file has two matrices in it then yes a second call to MatLoad() with the same matrix should just load in the second matrix from the file correctly. Perhaps we need a test in our test suite just to make sure that works.

Barry

>
> Satish
>
> On Fri, 21 Oct 2016, Hong wrote:
>
>> I can reproduce the error on a linux machine with petsc-maint. It crashes
>> at 2nd solve, on both processors:
>>
>> Program received signal SIGSEGV, Segmentation fault.
>> 0x00007f051dc835bd in pdgsequ (A=0x1563910, r=0x176dfe0, c=0x178f7f0,
>> rowcnd=0x7fffcb8dab30, colcnd=0x7fffcb8dab38, amax=0x7fffcb8dab40,
>> info=0x7fffcb8dab4c, grid=0x1563858)
>> at
>> /sandbox/hzhang/petsc/arch-linux-gcc-gfortran/externalpackages/git.superlu_dist/SRC/pdgsequ.c:182
>> 182 c[jcol] = SUPERLU_MAX( c[jcol], fabs(Aval[j]) * r[irow]
>> );
>>
>> The version of superlu_dist:
>> commit 0b5369f304507f1c7904a913f4c0c86777a60639
>> Author: Xiaoye Li <x...@lbl.gov>
>> Date: Thu May 26 11:33:19 2016 -0700
>>
>> rename 'struct pair' to 'struct superlu_pair'.
>>
>> Hong
>>
>> On Fri, Oct 21, 2016 at 5:36 AM, Anton Popov <po...@uni-mainz.de> wrote:
>>
>>>
>>> On 10/19/2016 05:22 PM, Anton Popov wrote:
>>>
>>> I looked at each valgrind-complained item in your email dated Oct. 11.
>>> Those reports are really superficial; I don't see anything wrong with
>>> those lines (mostly uninitialized variables) singled out. I did a few
>>> tests with the latest version in github, all went fine.
>>>
>>> Perhaps you can print your matrix that caused problem, I can run it using
>>> your matrix.
>>>
>>> Sherry
>>>
>>> Hi Sherry,
>>>
>>> I finally figured out a minimalistic setup (attached) that reproduces the
>>> problem.
>>>
>>> I use petsc-maint:
>>>
>>> git clone -b maint https://bitbucket.org/petsc/petsc.git
>>>
>>> and configure it in the debug mode without optimization using the options:
>>>
>>> --download-superlu_dist=1 \
>>> --download-superlu_dist-commit=origin/maint \
>>>
>>> Compile the test, assuming PETSC_DIR points to the described petsc
>>> installation:
>>>
>>> make ex16
>>>
>>> Run with:
>>>
>>> mpirun -n 2 ./ex16 -f binaryoutput -pc_type lu
>>> -pc_factor_mat_solver_package superlu_dist
>>>
>>> Matrix partitioning between the processors will be completely the same as
>>> in our code (hard-coded).
>>>
>>> I factorize the same matrix twice with the same PC object. Remarkably it
>>> runs fine for the first time, but fails for the second.
>>>
>>> Thank you very much for looking into this problem.
>>>
>>> Cheers,
>>> Anton
>>>
>>
>