Back of the envelope: (Assuming you're using HEX27 elements... but the analysis won't be off by much if you're using HEX20)
~60 first-order nodes in each direction: 216,000 first-order nodes
~120 second-order nodes in each direction: 1,728,000 second-order nodes
One first-order variable: 216,000 degrees of freedom (DoFs)
Three second-order variables: 5,184,000 DoFs
Total DoFs: 5,400,000

So... one solution/Krylov vector will be ~50MB (8 bytes * total DoFs). By default (if you're using PETSc with GMRES) you'll get at least 30 Krylov vectors, plus 3 or so in libMesh for storing the current, old and older solutions... let's call it "40" vectors (to account for some overhead, temporary copies of things, etc.)... so a total of about 2GB of RAM just for solution vectors.

Harder to calculate (but we can still ballpark it) is the amount of RAM the Jacobian matrix will take up. Let's go with the worst case: any interior vertex degree of freedom will have 5^3 = 125 nonzeros per second-order variable (375 total) and 27 more for the linear variable, so ~400 nonzeros per row. That means the Jacobian will take up about as much memory as 400 solution vectors: ~20GB.

Depending on what preconditioner you choose you might even end up with a _copy_ of that Jacobian. Using something like ILU with limited fill you won't have much memory overhead... but using one of the HYPRE preconditioners will net you a full copy. To hedge our bets here I'm going to add a 25% memory allowance for the preconditioner: 5GB.

So... the Jacobian matrix, preconditioner and solution vectors together come to about 27GB of RAM.

Now... let's look at the Mesh:

1,728,000 total nodes * 3 doubles (to store the coordinates) * 8 bytes: 42MB
Each element holds pointers to its 27 nodes: 60^3 elements * 27 * 8 bytes: 47MB
Total: ~100MB

There will be more memory than just this, though... the Mesh also stores quite a bit of information about degrees of freedom, etc. Let's just multiply by 3 for a safe number: 300MB. The actual number will be different... but it will be on this order of magnitude.

Here's the final tally:

Solution vectors, matrix and preconditioner: ~27GB
Mesh: ~300MB

As you can see... the Mesh is NOT your problem. You don't need to worry about the memory the Mesh uses unless you have tens of millions (or even hundreds of millions) of elements; 216,000 elements is not anywhere close.

If you want to run the problem size you've proposed you're either going to need a beefy workstation or, even better, a small cluster. I generally recommend keeping about 20,000 DoFs per processor... so you could scale this problem all the way out to ~270 processors. The memory for the solution vectors, matrix and preconditioner will more or less "scale" (i.e. it will be distributed across all of the processors), while the memory for the mesh is fixed when using SerialMesh/ReplicatedMesh (so you'll have ~300MB on each MPI process that won't shrink as you spread the problem out).

Hope that helps...

Derek
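P.S. If you want to redo this estimate for a different mesh size, the whole thing fits in a few lines. Here's a rough Python sketch of the same arithmetic (order-of-magnitude only; the 40-vector count and the ~400 nonzeros/row are the same worst-case guesses as above, and it comes out a bit under the rounded-up numbers because nothing gets rounded up along the way):

    # Back-of-envelope memory estimate for a 60x60x60 HEX27 mesh with three
    # second-order variables (u, v, w) and one first-order variable (p).
    GB = 1024.0**3

    n = 60                                   # elements per side
    first_order_nodes  = n**3                # ~216,000
    second_order_nodes = (2 * n)**3          # ~1,728,000

    dofs = 1 * first_order_nodes + 3 * second_order_nodes   # ~5.4M total

    vector_bytes = 8 * dofs                  # one solution/Krylov vector (~43MB)
    n_vectors    = 40                        # ~30 GMRES vectors + libMesh solutions + slop
    vectors_gb   = n_vectors * vector_bytes / GB

    nonzeros_per_row = 400                   # worst-case interior-vertex estimate
    matrix_gb  = nonzeros_per_row * vector_bytes / GB
    precond_gb = 0.25 * matrix_gb            # 25% hedge for the preconditioner

    mesh_bytes = second_order_nodes * 3 * 8  # node coordinates (~42MB)
    mesh_bytes += n**3 * 27 * 8              # element -> node pointers (~47MB)
    mesh_gb = 3 * mesh_bytes / GB            # x3 for DoF maps and other bookkeeping

    print("Vectors:             %5.1f GB" % vectors_gb)
    print("Matrix:              %5.1f GB" % matrix_gb)
    print("Preconditioner:      %5.1f GB" % precond_gb)
    print("Mesh (per MPI rank): %5.2f GB" % mesh_gb)
    print("Max useful procs at 20k DoFs each: %d" % (dofs // 20000))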
On Wed, Jun 8, 2016 at 5:56 PM Cody Permann <codyperm...@gmail.com> wrote:

> On Wed, Jun 8, 2016 at 3:41 PM Xujun Zhao <xzha...@gmail.com> wrote:
>
> > Hi Cody,
> >
> > This sounds like the mesh data keeps a copy on each processor, but the
> > matrices and vectors are still stored distributedly. Is it correct?
>
> Yes
>
> > I have a 3D Stokes problem with a 60x60x60 mesh, 2nd order elements for
> > velocity u, v, w, and first order for pressure p. Totally about 2.9M DoFs.
> > This can run with 1, 2 and 3 CPUs. However, if I use 4 CPUs, the program
> > crashed with a segmentation fault as follows:
> >
> > If I run a smaller system, e.g. 25x25x25, it still works for 4 CPUs. Do
> > you think this is caused by memory due to the mesh duplication?
>
> That's a good size problem for a single machine. You may very well be
> running out of memory here. I'd suggest that you open up another window
> and watch the memory usage for your smaller problem, then scale it up and
> watch it grow. You can always try switching to "DistributedMesh" to see if
> that helps. It will a little, but it probably won't make as big of a
> difference as you might expect. It might be time to distribute your
> problem to a few nodes.
>
> Cody
>
> > ===================================================================================
> > =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > =   PID 23903 RUNNING AT b461
> > =   EXIT CODE: 9
> > =   CLEANING UP REMAINING PROCESSES
> > =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > ===================================================================================
> > YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Killed (signal 9)
> > This typically refers to a problem with your application.
> > Please see the FAQ page for debugging suggestions
> >
> > On Wed, Jun 8, 2016 at 4:17 PM, Cody Permann <codyperm...@gmail.com> wrote:
> >
> >> That's right!
> >>
> >> This is the classic space versus time tradeoff. In the bigger scheme of
> >> things, using a little more memory is usually fine on a modern system.
> >> The SerialMesh (now called ReplicatedMesh) is quite a bit faster. I think
> >> the general consensus is: use ReplicatedMesh until you are truly memory
> >> constrained AND you know that the bulk of the memory is in your mesh and
> >> not your matrices and vectors and everything else.
> >>
> >> Cody
> >>
> >> On Wed, Jun 8, 2016 at 2:40 PM Xujun Zhao <xzha...@gmail.com> wrote:
> >>
> >>> Hi all,
> >>>
> >>> I am curious about SerialMesh running with multiple CPUs. If I have 1
> >>> node with 16 cores on the cluster, will "mpirun -n 16" lead to 16 copies
> >>> of SerialMesh? If so, it looks like running on multiple CPUs will require
> >>> more memory??
> >>>
> >>> Thanks.
> >>> Xujun