You can also change the location of the temporary files with the following MCA option:
-mca orte_tmpdir_base /some/place

ompi_info --param all all -l 9 | grep tmp
                MCA orte: parameter "orte_tmpdir_base" (current value: "", data source: default, level: 9 dev/all, type: string)
                MCA orte: parameter "orte_local_tmpdir_base" (current value: "", data source: default, level: 9 dev/all, type: string)
                MCA orte: parameter "orte_remote_tmpdir_base" (current value: "", data source: default, level: 9 dev/all, type: string)
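
As a rough sketch of how one might pick a value for orte_tmpdir_base, the snippet below looks for a writable directory with enough free space and prints the matching mpirun command instead of running it. The helper name pick_tmpdir, the candidate directories, and the 131072 KiB threshold (the 134217728 bytes from the error message, divided by 1024) are assumptions for illustration only; the space actually needed on a node may well be larger.

```shell
#!/bin/sh
# Hypothetical helper (not part of Open MPI): print the first candidate
# directory that exists, is writable, and has at least $1 KiB free.
pick_tmpdir() {
    need_kb=$1
    shift
    for dir in "$@"; do
        [ -d "$dir" ] || continue
        [ -w "$dir" ] || continue
        # df -P gives POSIX-format output; on line 2, column 4 is the
        # available space in KiB (because of -k).
        avail_kb=$(df -Pk "$dir" 2>/dev/null | awk 'NR == 2 { print $4 }')
        if [ "${avail_kb:-0}" -ge "$need_kb" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# 134217728 bytes / 1024 = 131072 KiB; treat this as a lower-bound guess.
# The candidate list is an assumption; adjust it for your nodes.
if base=$(pick_tmpdir 131072 /dev/shm /tmp /var/tmp); then
    # Only print the command; the job itself is not launched here.
    echo "mpirun -mca orte_tmpdir_base $base -np 16 xhpl"
else
    echo "no candidate directory has enough free space" >&2
fi
```

Note that this only checks the node it runs on; the compute nodes would need the same check, since the free space there is what matters.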

--
Aurélien Bouteiller ~~ https://icl.cs.utk.edu/~bouteill/

> On 23 May 2015, at 03:55, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:
> 
> Bill,
> 
> the root cause is likely that there is not enough free space in /tmp.
> 
> the simplest, but slowest, option is to run mpirun --mca btl tcp ...
> if you cannot make enough space under /tmp (maybe you run diskless),
> there are some options to create these kinds of files under /dev/shm
> 
> Cheers,
> 
> Gilles
> 
> 
> On Saturday, May 23, 2015, Lane, William <william.l...@cshs.org> wrote:
> I've compiled the linpack benchmark using openMPI 1.8.5 libraries
> and include files on CentOS 6.4.
> 
> I've tested the binary on the one Intel node (some
> sort of 4-core Xeon) and it runs, but when I try to run it on any of
> the old Sunfire opteron compute nodes it appears to hang (although
> top indicates CPU and memory usage) and eventually terminates
> by itself. I'm also getting the following openMPI error messages/warnings:
> 
> mpirun -np 16 --report-bindings --hostfile hostfile --prefix /hpc/apps/mpi/openmpi/1.8.5-dev --mca btl_tcp_if_include eth0 xhpl
> 
> [cscld1-0-6:24370] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-3:24734] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-7:25152] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-4:18079] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-8:21443] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-2:19704] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-5:13481] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1-0-0:21884] create_and_attach: unable to create shared memory BTL coordinating structure :: size 134217728
> [cscld1:24240] 7 more processes have sent help message help-opal-shmem-mmap.txt / target full
> 
> Note that these errors also occur when I try to run the linpack benchmark on a
> single node.
> 
> Does anyone know what's going on here? Google came up w/nothing and I have no
> idea what a BTL coordinating structure is.
> 
> -Bill L.
> 
> Link to this post: http://www.open-mpi.org/community/lists/users/2015/05/26907.php
