Thanks for looking into this, Junchao. I guess the next step is for me to build PETSc with the same configuration as yours and see if that works.
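For concreteness, the configure line I plan to mirror (the flags are copied from Junchao's Linux build quoted below; only the source directory and PETSC_ARCH name would be my own choices):

  ./configure --with-debugging --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort \
      --COPTFLAGS="-g -O0" --FOPTFLAGS="-g -O0" --CXXOPTFLAGS="-g -O0" \
      --PETSC_ARCH=linux-host-dbg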
Regards,
Manav

> On Aug 20, 2020, at 10:45 PM, Junchao Zhang <junchao.zh...@gmail.com> wrote:
>
> Manav,
> I downloaded your petsc_mat.tgz but could not reproduce the problem on
> either Linux or Mac. I used the petsc commit id df0e4300 you mentioned.
> On Linux, I have openmpi-4.0.2 + gcc-8.3.0, and petsc is configured with
>   --with-debugging --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort
>   --COPTFLAGS="-g -O0" --FOPTFLAGS="-g -O0" --CXXOPTFLAGS="-g -O0"
>   --PETSC_ARCH=linux-host-dbg
> On Mac, I have mpich-3.3.1 + clang-11.0.0-apple, and petsc is configured with
>   --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort
>   --with-ctable=0 COPTFLAGS="-O0 -g" CXXOPTFLAGS="-O0 -g"
>   PETSC_ARCH=mac-clang-dbg
>
> mpirun -n 8 ./test
> rank: 1 : stdout.processor.1
> rank: 4 : stdout.processor.4
> rank: 0 : stdout.processor.0
> rank: 5 : stdout.processor.5
> rank: 6 : stdout.processor.6
> rank: 7 : stdout.processor.7
> rank: 3 : stdout.processor.3
> rank: 2 : stdout.processor.2
> rank: 1 : Beginning reading nnz...
> rank: 4 : Beginning reading nnz...
> rank: 0 : Beginning reading nnz...
> rank: 5 : Beginning reading nnz...
> rank: 7 : Beginning reading nnz...
> rank: 2 : Beginning reading nnz...
> rank: 3 : Beginning reading nnz...
> rank: 6 : Beginning reading nnz...
> rank: 5 : Finished reading nnz
> rank: 5 : Beginning mat preallocation...
> rank: 3 : Finished reading nnz
> rank: 3 : Beginning mat preallocation...
> rank: 4 : Finished reading nnz
> rank: 4 : Beginning mat preallocation...
> rank: 7 : Finished reading nnz
> rank: 7 : Beginning mat preallocation...
> rank: 1 : Finished reading nnz
> rank: 1 : Beginning mat preallocation...
> rank: 0 : Finished reading nnz
> rank: 0 : Beginning mat preallocation...
> rank: 2 : Finished reading nnz
> rank: 2 : Beginning mat preallocation...
> rank: 6 : Finished reading nnz
> rank: 6 : Beginning mat preallocation...
> rank: 5 : Finished preallocation
> rank: 5 : Beginning reading and setting matrix values...
> rank: 1 : Finished preallocation
> rank: 1 : Beginning reading and setting matrix values...
> rank: 7 : Finished preallocation
> rank: 7 : Beginning reading and setting matrix values...
> rank: 2 : Finished preallocation
> rank: 2 : Beginning reading and setting matrix values...
> rank: 4 : Finished preallocation
> rank: 4 : Beginning reading and setting matrix values...
> rank: 0 : Finished preallocation
> rank: 0 : Beginning reading and setting matrix values...
> rank: 3 : Finished preallocation
> rank: 3 : Beginning reading and setting matrix values...
> rank: 6 : Finished preallocation
> rank: 6 : Beginning reading and setting matrix values...
> rank: 1 : Finished reading and setting matrix values
> rank: 1 : Beginning mat assembly...
> rank: 5 : Finished reading and setting matrix values
> rank: 5 : Beginning mat assembly...
> rank: 4 : Finished reading and setting matrix values
> rank: 4 : Beginning mat assembly...
> rank: 2 : Finished reading and setting matrix values
> rank: 2 : Beginning mat assembly...
> rank: 3 : Finished reading and setting matrix values
> rank: 3 : Beginning mat assembly...
> rank: 7 : Finished reading and setting matrix values
> rank: 7 : Beginning mat assembly...
> rank: 6 : Finished reading and setting matrix values
> rank: 6 : Beginning mat assembly...
> rank: 0 : Finished reading and setting matrix values
> rank: 0 : Beginning mat assembly...
> rank: 1 : Finished mat assembly
> rank: 3 : Finished mat assembly
> rank: 7 : Finished mat assembly
> rank: 0 : Finished mat assembly
> rank: 5 : Finished mat assembly
> rank: 2 : Finished mat assembly
> rank: 4 : Finished mat assembly
> rank: 6 : Finished mat assembly
>
> --Junchao Zhang
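For anyone mapping the log phases to code: below is a minimal sketch of the MPIAIJ preallocate / MatSetValues / MatAssembly sequence the messages above trace. It is not Manav's actual test (whose sizes, nnz counts, and values come from the per-rank index files); n and the entries set here are illustrative assumptions, chosen so each rank also stashes one off-process value and exercises the communication path where the hang was reported.

  #include <petscmat.h>

  int main(int argc,char **argv)
  {
    Mat            A;
    PetscInt       n = 1000, first, last, i;
    PetscErrorCode ierr;

    ierr = PetscInitialize(&argc,&argv,NULL,NULL); if (ierr) return ierr;

    /* "Beginning mat preallocation...": create an MPIAIJ matrix and
       preallocate; in the real test the per-row counts come from the
       nnz files read in the previous phase */
    ierr = MatCreate(PETSC_COMM_WORLD,&A);CHKERRQ(ierr);
    ierr = MatSetSizes(A,PETSC_DECIDE,PETSC_DECIDE,n,n);CHKERRQ(ierr);
    ierr = MatSetType(A,MATMPIAIJ);CHKERRQ(ierr);
    ierr = MatMPIAIJSetPreallocation(A,2,NULL,1,NULL);CHKERRQ(ierr);

    /* "Beginning reading and setting matrix values...": each rank adds
       its own diagonal entries plus one entry owned by the next rank,
       so MatAssemblyBegin() has off-process values to communicate */
    ierr = MatGetOwnershipRange(A,&first,&last);CHKERRQ(ierr);
    for (i = first; i < last; i++) {
      ierr = MatSetValue(A,i,i,1.0,ADD_VALUES);CHKERRQ(ierr);
    }
    ierr = MatSetValue(A,last % n,last % n,1.0,ADD_VALUES);CHKERRQ(ierr);

    /* "Beginning mat assembly...": the step that hangs in Manav's runs */
    ierr = MatAssemblyBegin(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }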
> On Thu, Aug 20, 2020 at 5:29 PM Junchao Zhang <junchao.zh...@gmail.com> wrote:
>
> I will have a look and report back to you. Thanks.
> --Junchao Zhang
>
> On Thu, Aug 20, 2020 at 5:23 PM Manav Bhatia <bhatiama...@gmail.com> wrote:
>
> I have created a standalone test that demonstrates the problem at my end.
> I have stored the indices, etc. from my problem in a text file for each
> rank, which I use to initialize the matrix.
> Please note that the test is specifically for 8 ranks.
>
> The .tgz file is on my Google Drive:
> https://drive.google.com/file/d/1R-WjS36av3maXX3pUyiR3ndGAxteTVj-/view?usp=sharing
>
> This contains a README file with instructions on running. Please note
> that the work directory needs the index files.
>
> Please let me know if I can provide any further information.
>
> Thank you all for your help.
>
> Regards,
> Manav
>
>> On Aug 20, 2020, at 12:54 PM, Jed Brown <j...@jedbrown.org> wrote:
>>
>> Matthew Knepley <knep...@gmail.com> writes:
>>
>>> On Thu, Aug 20, 2020 at 11:09 AM Manav Bhatia <bhatiama...@gmail.com> wrote:
>>>
>>>> On Aug 20, 2020, at 8:31 AM, Stefano Zampini <stefano.zamp...@gmail.com> wrote:
>>>>
>>>> Can you add an MPI_Barrier before
>>>>
>>>>   ierr = MatAssemblyBegin(aij->A,mode);CHKERRQ(ierr);
>>>>
>>>> With an MPI_Barrier before this function call:
>>>> - three of the processes have already hit this barrier,
>>>> - the other 5 are inside MatStashScatterGetMesg_Private ->
>>>>   MatStashScatterGetMesg_BTS -> MPI_Waitsome (2 processes) /
>>>>   MPI_Waitall (3 processes)
>>
>> This is not itself evidence of inconsistent state. You can use
>>
>>   -build_twosided allreduce
>>
>> to avoid the nonblocking sparse algorithm.
>>
>>> Okay, you should run this with -matstash_legacy just to make sure it is
>>> not a bug in your MPI implementation. But it looks like there is
>>> inconsistency in the parallel state. This can happen because we have a
>>> bug, or it could be that you called a collective operation on a subset
>>> of the processes. Is there any way you could cut down the example (say
>>> put all 1s in the matrix, etc.) so that you could give it to us to run?
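For anyone reproducing the experiments above: Stefano's barrier check amounts to instrumenting the quoted call site inside PETSc's MPIAIJ assembly code, roughly as in the sketch below. Here aij and mode are the locals named in the quoted line, and mat is assumed to be the enclosing function's Mat argument.

  /* barrier so every rank is known to reach the same point before the
     local sub-matrix assembly is started */
  ierr = MPI_Barrier(PetscObjectComm((PetscObject)mat));CHKERRQ(ierr);
  ierr = MatAssemblyBegin(aij->A,mode);CHKERRQ(ierr);

Jed's and Matt's suggestions, by contrast, are runtime options and need no code change, e.g.:

  mpirun -n 8 ./test -build_twosided allreduce
  mpirun -n 8 ./test -matstash_legacy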