Distributing the elements over the processes purely by element number is not automatically safe; whether it works depends on the mesh. Especially with irregular meshes, such as those created by Triangle or Gmsh, such a distribution will not reduce the amount of communication and may even increase it.
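As a rough illustration of this point, here is a toy Python sketch. It is not PETSc or ParMetis code; the grid mesh, the helper names `quad_mesh` and `off_process_fraction`, and the random relabelling of nodes (an exaggerated stand-in for the ordering an irregular mesh generator might produce) are all assumptions made for the example. It splits elements and matrix rows into equal contiguous chunks by number and counts how many element-matrix rows would have to be sent to another process.

```python
import random

def quad_mesh(nx, ny):
    """Connectivity of an nx-by-ny grid of 4-node quads, nodes numbered row by row."""
    elems = []
    for j in range(ny):
        for i in range(nx):
            n0 = j * (nx + 1) + i
            elems.append((n0, n0 + 1, n0 + nx + 2, n0 + nx + 1))
    return elems

def off_process_fraction(elems, n_nodes, n_procs):
    """Fraction of element-matrix rows owned by a process other than the one
    computing the element.

    Elements are split into equal contiguous chunks by element number, and
    nodes (matrix rows) into equal contiguous chunks by node number.
    """
    e_chunk = -(-len(elems) // n_procs)   # ceiling division
    n_chunk = -(-n_nodes // n_procs)
    off = total = 0
    for e, nodes in enumerate(elems):
        elem_rank = e // e_chunk
        for n in nodes:
            total += 1
            if n // n_chunk != elem_rank:
                off += 1
    return off / total

nx = ny = 40
n_nodes = (nx + 1) * (ny + 1)
regular = quad_mesh(nx, ny)

# Hypothetical "irregular" numbering: relabel the nodes at random.
rng = random.Random(0)
perm = list(range(n_nodes))
rng.shuffle(perm)
irregular = [tuple(perm[n] for n in nodes) for nodes in regular]

frac_regular = off_process_fraction(regular, n_nodes, 10)
frac_irregular = off_process_fraction(irregular, n_nodes, 10)
print(f"regular numbering:   {frac_regular:.2f} of rows off-process")
print(f"irregular numbering: {frac_irregular:.2f} of rows off-process")
```

With the row-by-row numbering only the rows near chunk boundaries are off-process; with the scrambled numbering most of them are, which is the situation a partitioner is meant to avoid.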
There are mature and well-tested partitioning tools available that can divide your mesh into regional partitions. We use Metis/ParMetis. I believe PETSc uses PTScotch. This is an extra step, but it will reduce the communication volume considerably.

Cheers
Lukas

On 2/27/17, Mark Adams <[email protected]> wrote:
> Another approach that might be simple, if you have the metadata for the
> entire mesh locally, is set up a list of elements that your local matrix
> block-rows/vertices touch but going over all the elements and test if any
> of its vertices i are: if (i >= start && i < end) list.append(i). Just
> compute and assemble those elements and tell PETSc to
> ignore-off-processor-entries. No communication, redundant local work, some
> setup code and cost.
>
> On Sun, Feb 26, 2017 at 11:37 PM, Fangbo Wang <[email protected]> wrote:
>
>> I got my finite element mesh from a commercial finite element software
>> ABAQUS. I simply draw the geometry of the model in the graphical
>> interface and assign element types and material properties to different
>> parts of the model, ABAQUS will automatically output the element and
>> node information of the model.
>>
>> Suppose I have 1000 elements in my model and 10 MPI processes,
>> #1 to #100 local element matrices will be computed in MPI process 0;
>> #101 to #200 local element matrices will be computed in MPI process 1;
>> #201 to #300 local element matrices will be computed in MPI process 2;
>> ..........
>> #901 to #1000 local element matrices will be computed in MPI process 9;
>>
>> However, I might get a lot of global matrix indices which I need to send
>> to other processors due to the degree of freedom ordering in the finite
>> element model.
>>
>> This is what I did according to my understanding of finite element and
>> what I have seen.
>> Do you have some nice libraries or packages that can be easily used in
>> scientific computing environment?
>>
>> Thank you very much!
>>
>> Fangbo Wang
>>
>> On Sun, Feb 26, 2017 at 11:15 PM, Barry Smith <[email protected]> wrote:
>>
>>> > On Feb 26, 2017, at 10:04 PM, Fangbo Wang <[email protected]> wrote:
>>> >
>>> > My problem is a solid mechanics problem using finite element method to
>>> > discretize the model (a 30mX30mX30m soil domain with a building
>>> > structure on top).
>>> >
>>> > I am not manually deciding which MPI process compute which matrix
>>> > entries. Because I know Petsc can automatically communicate between
>>> > these processors.
>>> > I am just asking each MPI process generate certain number of matrix
>>> > entries regardless of which process will finally store them.
>>>
>>>   The standard way to handle this for finite elements is to partition the
>>> elements among the processes and then partition the nodes (rows of the
>>> degrees of freedom) subservient to the partitioning of the elements.
>>> Otherwise most of the matrix (or vector) entries must be communicated and
>>> this is not scalable.
>>>
>>>   So how are you partitioning the elements (for matrix stiffness
>>> computations) and the nodes between processes?
>>>
>>> > Actually, I constructed another matrix with same size but generating
>>> > much less entries, and the code worked. However, it gets stuck when I
>>> > generate more matrix entries.
>>> >
>>> > thank you very much! Any suggestion is highly appreciated.
>>> >
>>> > BTW, what is the meaning of "[4] MatCheckCompressedRow(): Found the
>>> > ratio (num_zerorows 0)/(num_localrows 96812) < 0.6. Do not use
>>> > CompressedRow routines."? I know compressed row format is commonly used
>>> > for sparse matrix, why don't use compressed row routines here?
>>>
>>>   This is not important.
>>>
>>> > Thanks,
>>> >
>>> > Fangbo Wang
>>> >
>>> > On Sun, Feb 26, 2017 at 10:42 PM, Barry Smith <[email protected]> wrote:
>>> >
>>> > How are you generating the matrix entries in parallel? In general you
>>> > can generate any matrix entries on any MPI process and they will be
>>> > automatically transferred to the MPI process that owns the entries.
>>> > BUT if a huge number of matrix entries are computed on one process and
>>> > need to be communicated to another process this may cause gridlock with
>>> > MPI. Based on the huge size of messages from process 12 it looks like
>>> > this is what is happening in your code.
>>> >
>>> > Ideally most matrix entries are generated on the process they are
>>> > stored and hence this gridlock does not happen.
>>> >
>>> > What type of discretization are you using? Finite differences, finite
>>> > element, finite volume, spectral, something else? How are you deciding
>>> > which MPI process should compute which matrix entries? Once we
>>> > understand this we may be able to suggest a better way to compute the
>>> > entries.
>>> >
>>> > Barry
>>> >
>>> > Under normal circumstances 1.3 million unknowns is not a large
>>> > parallel matrix, there may be special features of your matrix that is
>>> > making this difficult.
>>> >
>>> > > On Feb 26, 2017, at 9:30 PM, Fangbo Wang <[email protected]> wrote:
>>> > >
>>> > > Hi,
>>> > >
>>> > > I construct a big matrix which is 1.3million by 1.3million which is
>>> > > using approximately 100GB memory. I have a computer with 500GB memory.
>>> > >
>>> > > I run the Petsc program and it get stuck when finally assembling the
>>> > > matrix. The program is using around 200GB memory only. However, the
>>> > > program just get stuck there. Here is the output message when it gets
>>> > > stuck.
>>> > > .
>>> > > .
>>> > > previous outputs not shown here
>>> > > .
>>> > > [12] MatStashScatterBegin_Ref(): No of messages: 15
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 0: size: 271636416 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 1: size: 328581552 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 2: size: 163649328 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 3: size: 95512224 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 4: size: 317711616 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 5: size: 170971776 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 6: size: 254000064 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 7: size: 163146720 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 8: size: 345150048 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 9: size: 163411584 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 10: size: 428874816 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 11: size: 739711296 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 13: size: 435247344 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 14: size: 435136752 bytes
>>> > > [12] MatStashScatterBegin_Ref(): Mesg_to: 15: size: 346167552 bytes
>>> > > [14] MatAssemblyBegin_MPIAIJ(): Stash has 263158893 entries, uses 14 mallocs.
>>> > > [8] MatAssemblyBegin_MPIAIJ(): Stash has 286768572 entries, uses 14 mallocs.
>>> > > [12] MatAssemblyBegin_MPIAIJ(): Stash has 291181818 entries, uses 14 mallocs.
>>> > > [13] MatStashScatterBegin_Ref(): No of messages: 15
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 0: size: 271636416 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 1: size: 271636416 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 2: size: 220594464 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 3: size: 51041952 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 4: size: 276201408 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 5: size: 256952256 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 6: size: 198489024 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 7: size: 218657760 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 8: size: 219686880 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 9: size: 288874752 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 10: size: 428874816 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 11: size: 172579968 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 12: size: 639835680 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 14: size: 270060144 bytes
>>> > > [13] MatStashScatterBegin_Ref(): Mesg_to: 15: size: 511244160 bytes
>>> > > [13] MatAssemblyBegin_MPIAIJ(): Stash has 268522881 entries, uses 14 mallocs.
>>> > > [5] MatAssemblyEnd_SeqAIJ(): Matrix size: 96812 X 96812; storage space: 89786788 unneeded,7025212 used
>>> > > [5] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>>> > > [5] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
>>> > > [5] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 96812) < 0.6. Do not use CompressedRow routines.
>>> > > [5] MatSeqAIJCheckInode(): Found 32271 nodes of 96812. Limit used: 5. Using Inode routines
>>> > > [4] MatAssemblyEnd_SeqAIJ(): Matrix size: 96812 X 96812; storage space: 89841924 unneeded,6970076 used
>>> > > [4] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
>>> > > [4] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 81
>>> > > [4] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 96812) < 0.6. Do not use CompressedRow routines.
>>> > > [4] MatSeqAIJCheckInode(): Found 32272 nodes of 96812. Limit used: 5. Using Inode routines
>>> > >
>>> > > stuck here!!!!
>>> > >
>>> > > Any one have ideas on this? Thank you very much!
>>> > >
>>> > > Fangbo Wang
>>> > >
>>> > > --
>>> > > Fangbo Wang, PhD student
>>> > > Stochastic Geomechanics Research Group
>>> > > Department of Civil, Structural and Environmental Engineering
>>> > > University at Buffalo
>>> > > Email: [email protected]
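
Mark Adams's suggestion quoted earlier in this thread (each process builds a list of the elements that touch its own rows, assembles only those, and drops contributions to rows it does not own) can be sketched in plain Python as below. This is only a toy under stated assumptions: the two-node "spring" elements, the dict used as a matrix, and the helper names `elements_touching_owned_rows` and `assemble_rows` are hypothetical and not part of PETSc. In a real PETSc code the range `[start, end)` would come from `MatGetOwnershipRange`, and the dropping of off-process rows corresponds to setting `MAT_IGNORE_OFF_PROC_ENTRIES` with `MatSetOption`.

```python
from collections import defaultdict

def elements_touching_owned_rows(elements, start, end):
    """Indices of elements with at least one vertex in the owned range [start, end)."""
    return [e for e, verts in enumerate(elements)
            if any(start <= v < end for v in verts)]

def assemble_rows(elements, element_ids, keep_row):
    """Sum 2-node element matrices [[1, -1], [-1, 1]] into a dict keyed by
    (row, col), keeping only the rows for which keep_row(row) is true."""
    A = defaultdict(float)
    for e in element_ids:
        a, b = elements[e]
        for r, c, v in ((a, a, 1.0), (a, b, -1.0), (b, a, -1.0), (b, b, 1.0)):
            if keep_row(r):
                A[(r, c)] += v
    return dict(A)

# Hypothetical mesh: a closed chain of 6 nodes, and the row ranges of 3 processes.
elements = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
ranges = [(0, 2), (2, 4), (4, 6)]

# Reference: assemble everything in one place.
serial = assemble_rows(elements, range(len(elements)), lambda r: True)

# Each "process" assembles only its own rows, with no data exchanged.
parallel = {}
computed = 0
for start, end in ranges:
    local_ids = elements_touching_owned_rows(elements, start, end)
    computed += len(local_ids)
    local = assemble_rows(elements, local_ids,
                          lambda r, lo=start, hi=end: lo <= r < hi)
    parallel.update(local)   # row blocks are disjoint, so no key collides

print("matches serial assembly:", parallel == serial)
print("element computations:", computed, "instead of", len(elements))
```

The second printed line shows the trade-off Mark mentions: elements that straddle two row ranges are computed on both processes, so communication is exchanged for redundant local work.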
