You can also change the location of the temporary files with the following MCA option: -mca orte_tmpdir_base /some/place
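As a rough illustration of the option above, here is a minimal Python sketch that picks a directory with enough free space and builds the matching mpirun command line. The helper names (`pick_tmpdir`, `mpirun_command`) and the candidate directory list are hypothetical; the 134217728-byte requirement is simply the size reported in the error messages quoted in this thread, and the check only looks at the node where the script runs, so the compute nodes may still differ.

```python
import shutil

# Size of the shared-memory backing file reported in the quoted errors (128 MiB).
REQUIRED_BYTES = 134217728


def pick_tmpdir(candidates, required=REQUIRED_BYTES):
    """Return the first existing directory with at least `required` free bytes, or None."""
    for path in candidates:
        try:
            free = shutil.disk_usage(path).free
        except OSError:
            continue  # directory is missing or not accessible
        if free >= required:
            return path
    return None


def mpirun_command(tmpdir, nprocs, program):
    """Build an mpirun argument list that points orte_tmpdir_base at `tmpdir`."""
    return ["mpirun", "--mca", "orte_tmpdir_base", tmpdir, "-np", str(nprocs), program]


if __name__ == "__main__":
    # Hypothetical candidates; adjust to the file systems available on your nodes.
    tmpdir = pick_tmpdir(["/tmp", "/dev/shm", "/some/place"])
    if tmpdir is None:
        print("no candidate directory has enough free space")
    else:
        print(" ".join(mpirun_command(tmpdir, 16, "xhpl")))
```

The script only prints the command rather than launching it, so you can check the chosen directory before running the job.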
ompi_info --param all all -l 9 | grep tmp
    MCA orte: parameter "orte_tmpdir_base" (current value: "", data source: default, level: 9 dev/all, type: string)
    MCA orte: parameter "orte_local_tmpdir_base" (current value: "", data source: default, level: 9 dev/all, type: string)
    MCA orte: parameter "orte_remote_tmpdir_base" (current value: "", data source: default, level: 9 dev/all, type: string)

--
Aurélien Bouteiller ~~ https://icl.cs.utk.edu/~bouteill/

> On 23 May 2015, at 03:55, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
>
> Bill,
>
> the root cause is likely that there is not enough free space in /tmp.
>
> the simplest, but slowest, option is to run mpirun --mca btl tcp ...
> if you cannot make enough space under /tmp (maybe you run diskless)
> there are some options to create these kinds of files under /dev/shm
>
> Cheers,
>
> Gilles
>
>
> On Saturday, May 23, 2015, Lane, William <william.l...@cshs.org> wrote:
> I've compiled the linpack benchmark using the openMPI 1.8.5 libraries
> and include files on CentOS 6.4.
>
> I've tested the binary on the one Intel node (some sort of 4-core Xeon)
> and it runs, but when I try to run it on any of the old Sunfire opteron
> compute nodes it appears to hang (although top indicates CPU and memory
> usage) and eventually terminates by itself. I'm also getting the
> following openMPI error messages/warnings:
>
> mpirun -np 16 --report-bindings --hostfile hostfile --prefix /hpc/apps/mpi/openmpi/1.8.5-dev --mca btl_tcp_if_include eth0 xhpl
>
> [cscld1-0-6:24370] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-3:24734] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-7:25152] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-4:18079] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-8:21443] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-2:19704] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-5:13481] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-0:21884] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1:24240] 7 more processes have sent help message help-opal-shmem-mmap.txt / target full
>
> Note that these errors also occur when I try to run the linpack
> benchmark on a single node.
>
> Does anyone know what's going on here? Google came up w/nothing and I
> have no idea what a BTL coordinating structure is.
>
> -Bill L.
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: http://www.open-mpi.org/community/lists/users/2015/05/26907.php