> On 1 Mar 2021, at 6:29 AM, Zhang, Hong <[email protected]> wrote:
>
> Pierre,
> This is a bug in MatMatMultSymbolic_MPIAIJ_MPIDense() during optimization of
> block column size of B. Run your code with
> '-matmatmult_Bbn 1', the infinite loop should not occur.

Thanks Hong, I can confirm this option makes the more complex use case run smoothly as well.

> I'll try to figure out a fix tomorrow.

Great.

Thanks,
Pierre

> Hong
>
> From: Zhang, Hong <[email protected]>
> Sent: Sunday, February 28, 2021 11:05 PM
> To: Pierre Jolivet <[email protected]>; For users of the development version of
> PETSc <[email protected]>; Zhang, Hong <[email protected]>
> Subject: Re: [petsc-dev] Infinite loop in A*B
>
> The infinite loop in MatMatMultNumeric_MPIAIJ_MPIDense()
>   for (i=0; i<BN; i+=n) {
>   }
> is caused by n=contents->workB->cmap->n=0 (line 590 in mpimatmatmult.c)
> Hong
>
> From: petsc-dev <[email protected]> on behalf of Zhang, Hong via
> petsc-dev <[email protected]>
> Sent: Sunday, February 28, 2021 10:33 PM
> To: Pierre Jolivet <[email protected]>; For users of the development version of
> PETSc <[email protected]>
> Subject: Re: [petsc-dev] Infinite loop in A*B
>
> I can reproduce the hang with
> mpiexec -n 2 ./matmatmult
>
> It seems in an infinite loop of calling MatDensePlaceArray() from
>
> #0  MatDensePlaceArray (mat=0xda5c50, array=0xd15e60)
>     at /home/hongsu/soft/petsc/src/mat/impls/dense/mpi/mpidense.c:2047
> #1  0x00007fa0d13bf4f7 in MatDenseGetSubMatrix_SeqDense (A=0xcfb2b0, cbegin=0,
>     cend=0, v=0xd90370)
>     at /home/hongsu/soft/petsc/src/mat/impls/dense/seq/dense.c:2997
> #2  0x00007fa0d13c574e in MatDenseGetSubMatrix (A=0xcfb2b0, cbegin=0, cend=0,
>     v=0xd90370)
>     at /home/hongsu/soft/petsc/src/mat/impls/dense/seq/dense.c:3371
> #3  0x00007fa0d13db5ce in MatDenseGetSubMatrix_MPIDense (A=0xca5250, cbegin=0,
>     cend=0, v=0x7ffe87d41de0)
>     at /home/hongsu/soft/petsc/src/mat/impls/dense/mpi/mpidense.c:1835
> #4  0x00007fa0d13c574e in MatDenseGetSubMatrix (A=0xca5250, cbegin=0, cend=0,
>     v=0x7ffe87d41de0)
>     at /home/hongsu/soft/petsc/src/mat/impls/dense/seq/dense.c:3371
> #5  0x00007fa0d179c2fa in MatMatMultNumeric_MPIAIJ_MPIDense (A=0xc55490,
>     B=0xca5250, C=0xd282b0)
>     at /home/hongsu/soft/petsc/src/mat/impls/aij/mpi/mpimatmatmult.c:593
> #6  0x00007fa0d1181331 in MatProductNumeric_AB (mat=0xd282b0)
>     at /home/hongsu/soft/petsc/src/mat/interface/matproduct.c:567
> #7  0x00007fa0d1182c14 in MatProductNumeric (mat=0xd282b0)
>     at /home/hongsu/soft/petsc/src/mat/interface/matproduct.c:679
> #8  0x00007fa0d115ef69 in MatProduct_Private (A=0xc55490, B=0xca5250,
>     scall=MAT_INITIAL_MATRIX, fill=-2, ptype=MATPRODUCT_AB, C=0x7ffe87d42018)
>     at /home/hongsu/soft/petsc/src/mat/interface/matrix.c:9405
> #9  0x00007fa0d115f274 in MatMatMult (A=0xc55490, B=0xca5250,
>     scall=MAT_INITIAL_MATRIX, fill=-2, C=0x7ffe87d42018)
>     at /home/hongsu/soft/petsc/src/mat/interface/matrix.c:9445
> #10 0x000000000040130a in main (argc=2, argv=0x7ffe87d42108) at ex1.c:20
>
> I'll try to figure out what is going on. If anyone has a clue, please help.
> The above stack comes from 'release' branch.
> Hong
>
> From: petsc-dev <[email protected]> on behalf of Pierre Jolivet
> <[email protected]>
> Sent: Sunday, February 28, 2021 4:17 PM
> To: For users of the development version of PETSc <[email protected]>
> Subject: [petsc-dev] Infinite loop in A*B
>
> Hello,
> The following MWE loops indefinitely for MPI_Comm_size in {2; 3}.
> Nothing fancy, just MatAIJ and MatDense.
> The problem is either in MatMPIDenseScatter() or
> MatMatMultSymbolic_MPIAIJ_MPIDense(), I believe, so if someone familiar with
> those routines can figure out a hot fix, I’m all ears.
> I could of course switch to a MatMult(), but the same infinite loop happens
> in another more complex code with
> A = rows=8, cols=35212
> B = rows=35212, cols=9
> So I’ll need a fix eventually.
>
> Thanks,
> Pierre
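
For readers following the diagnosis above, here is a minimal standalone sketch of why a loop of the form `for (i=0; i<BN; i+=n)` never terminates when `n` is 0, and one way such a sweep could be guarded. It is not PETSc code and not the fix that was eventually applied; `count_column_blocks` and `clamp_block_width` are hypothetical helper names made up for illustration, and plain `long` stands in for `PetscInt`.

```c
/* Hypothetical helper, not a PETSc routine: count how many column blocks of
 * width n are needed to sweep BN columns, mirroring the loop
 *     for (i = 0; i < BN; i += n) { ... }
 * quoted in the thread. Returns -1 instead of spinning forever when the
 * step is not positive. */
long count_column_blocks(long BN, long n)
{
    long i, blocks = 0;

    if (BN <= 0) return 0;  /* nothing to sweep */
    if (n <= 0) return -1;  /* i += 0 would never advance, so report an error */

    for (i = 0; i < BN; i += n) blocks++;
    return blocks;
}

/* Hypothetical guard: clamp a proposed block width to [1, BN] (or to 1 when
 * BN is not positive) so that the sweep always makes progress. */
long clamp_block_width(long BN, long n)
{
    if (n < 1) n = 1;
    if (BN > 0 && n > BN) n = BN;
    return n;
}
```

With B having 9 columns as in the larger case mentioned in the thread, a width of 0 is rejected by `count_column_blocks`, while clamping it first gives a width of 1 and a sweep of 9 blocks. Forcing a width of 1 appears to be what the `-matmatmult_Bbn 1` workaround does, though that reading is an assumption based on the option name and Hong's description.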
