On 9/3/21 7:49 PM, Nicholas Yue wrote:
The file system I am mounting via NFS is an ordinary Linux file system; it is
not an HPC parallel file system such as Lustre.
I tried commenting out the check-pointing call, as you suggested, and was
able to run the code on 4 nodes (each with 4 cores); it finished very quickly.
My mpirun command line looks like this:
  mpirun --host pc1,pc2,pc3,pc4 --mca btl_tcp_if_include 192.168.0.0/24 \
      --mca btl tcp,self \
      /nfs/systems/dealii/head-bost_1_70_0/examples/step-69/step-69.release
It is unlikely that I will have the resources to set up a Lustre-like parallel
file system. Do you have any additional suggestions that might allow me to
enable check-pointing?

You don't need something like Lustre. There are a number of ways to make
MPI I/O work over NFS, for example the approach described here:
https://docs.huihoo.com/mpich/mpichman-chp4/node60.htm
I bet you can also find other approaches if you search for "MPI I/O" and "NFS".
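
As one illustration of what such an approach can involve: the ROMIO
documentation (ROMIO is the MPI-IO implementation used by MPICH) notes that
MPI-IO on NFS needs working fcntl file locking and an NFS mount with attribute
caching disabled (the "noac" mount option). The following is only a minimal
Python sketch for checking the second point on each node. It assumes a Linux
system that exposes /proc/mounts, and the function name is hypothetical, not
part of deal.II or any MPI library.

```python
from pathlib import Path


def nfs_mounts_missing_noac(mounts_text):
    """Return mount points of NFS file systems whose options lack 'noac'.

    mounts_text is expected to be in the /proc/mounts format:
    device, mount point, file system type, options, dump, pass.
    """
    missing = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        if fs_type in ("nfs", "nfs4") and "noac" not in options.split(","):
            missing.append(mount_point)
    return missing


if __name__ == "__main__":
    mounts_file = Path("/proc/mounts")  # assumption: Linux
    if mounts_file.exists():
        for mount_point in nfs_mounts_missing_noac(mounts_file.read_text()):
            print(f"{mount_point}: NFS mount without 'noac'")
```

Running this on pc1 through pc4 would list any NFS mount that still has
attribute caching enabled. Whether remounting with "noac" is sufficient for
your setup is something you would have to test, and it is likely to slow down
other accesses to that mount.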
Best
W.
--
------------------------------------------------------------------------
Wolfgang Bangerth email: [email protected]
www: http://www.math.colostate.edu/~bangerth/