On Fri, Jan 20, 2012 at 9:44 AM, Wen Jiang <jiangwen84 at gmail.com> wrote:
> Hi Barry,
>
> Thanks for your suggestion. I just added MatSetOption(mat,
> MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE) to my code, but I did not get
> any error information regarding bad allocation, and my code is stuck
> there. I attached the output file below. Thanks.

Run with -start_in_debugger and get a stack trace. Note that your stashes
are enormous. You might consider calling MatAssemblyBegin/End(A,
MAT_FLUSH_ASSEMBLY) periodically during assembly.

   Matt

> [0] VecAssemblyBegin_MPI(): Stash has 210720 entries, uses 12 mallocs.
> [0] VecAssemblyBegin_MPI(): Block-Stash has 0 entries, uses 0 mallocs.
> [5] MatAssemblyBegin_MPIAIJ(): Stash has 4806656 entries, uses 8 mallocs.
> [6] MatAssemblyBegin_MPIAIJ(): Stash has 5727744 entries, uses 9 mallocs.
> [4] MatAssemblyBegin_MPIAIJ(): Stash has 5964288 entries, uses 9 mallocs.
> [7] MatAssemblyBegin_MPIAIJ(): Stash has 7408128 entries, uses 9 mallocs.
> [3] MatAssemblyBegin_MPIAIJ(): Stash has 8123904 entries, uses 9 mallocs.
> [2] MatAssemblyBegin_MPIAIJ(): Stash has 11544576 entries, uses 10 mallocs.
> [0] MatStashScatterBegin_Private(): No of messages: 1
> [0] MatStashScatterBegin_Private(): Mesg_to: 1: size: 107888648
> [0] MatAssemblyBegin_MPIAIJ(): Stash has 13486080 entries, uses 10 mallocs.
> [1] MatAssemblyBegin_MPIAIJ(): Stash has 16386048 entries, uses 10 mallocs.
> [7] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2514194 used
> [7] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [7] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294
> [7] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode routines
> [7] PetscCommDuplicate(): Using internal PETSc communicator 47582902893600 339106512
> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2514537 used
> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294
> [0] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines
> [0] PetscCommDuplicate(): Using internal PETSc communicator 46968795675680 536030192
> [0] MatSetUpMultiply_MPIAIJ(): Using block index set to define scatter
> [6] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2499938 used
> [6] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [6] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294
> [6] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode routines
> [6] PetscCommDuplicate(): Using internal PETSc communicator 47399146302496 509504096
> [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 11390 X 11390; storage space: 0 unneeded,2525390 used
> [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294
> [5] Mat_CheckInode(): Found 11390 nodes out of 11390 rows. Not using Inode routines
> [5] PetscCommDuplicate(): Using internal PETSc communicator 47033309994016 520223440
> [1] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2500281 used
> [1] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [1] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294
> [1] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines
> [1] PetscCommDuplicate(): Using internal PETSc communicator 47149241441312 163068544
> [2] MatAssemblyEnd_SeqAIJ(): Matrix size: 11391 X 11391; storage space: 0 unneeded,2525733 used
> [2] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
> [2] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 294
> [2] Mat_CheckInode(): Found 11391 nodes out of 11391 rows. Not using Inode routines
> [2] PetscCommDuplicate(): Using internal PETSc communicator 47674980494368 119371056
>
>>> Since my code never finishes, I cannot get the summary file by adding
>>> -log_summary. Is there any other way to get the summary file?
>>
>> My guess is that you are running a larger problem on this system and
>> your preallocation for the matrix is wrong, while in the small run you
>> sent the preallocation is correct.
>>
>> Usually the only thing that causes it to take forever is not the
>> parallel communication but the preallocation. After you create the
>> matrix and set its preallocation, call
>> MatSetOption(mat, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE); then run.
>> It will stop with an error message if the preallocation is wrong.
>>
>>    Barry
>>
>>> BTW, my code runs without any problem on a shared-memory desktop with
>>> any number of processes.

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead. -- Norbert Wiener
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20120120/97da42b5/attachment-0001.htm>
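[Editor's illustration] The two suggestions in the thread — trapping incorrect preallocation with MAT_NEW_NONZERO_ALLOCATION_ERR and flushing the stash periodically with MAT_FLUSH_ASSEMBLY so it does not grow to millions of entries — might look like the following in a user assembly loop. This is a hedged sketch, not code from the thread: the function name, the row loop, and the flush interval of 1000 rows are illustrative assumptions, and the MatSetValues call is elided.

```c
/* Illustrative sketch only: AssembleWithFlush, nrows, and the flush
   interval are assumptions, not code from the thread. */
#include <petscmat.h>

PetscErrorCode AssembleWithFlush(Mat mat, PetscInt rstart, PetscInt rend)
{
  PetscErrorCode ierr;
  PetscInt       i;

  PetscFunctionBeginUser;
  /* Error out immediately if an insertion exceeds the preallocation,
     instead of silently mallocing (which is what makes assembly crawl). */
  ierr = MatSetOption(mat, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);CHKERRQ(ierr);

  for (i = rstart; i < rend; ++i) {
    /* ... MatSetValues(mat, ...) for row i ... */
    if ((i - rstart) % 1000 == 999) {
      /* Flush assembly: communicate stashed off-process entries now,
         but allow further insertions afterwards. */
      ierr = MatAssemblyBegin(mat, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
      ierr = MatAssemblyEnd(mat, MAT_FLUSH_ASSEMBLY);CHKERRQ(ierr);
    }
  }
  /* Final assembly once all values are set. */
  ierr = MatAssemblyBegin(mat, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(mat, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}
```

Note that a flush is collective, so every rank must call it the same number of times; with irregular per-rank workloads the interval has to be chosen accordingly.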
