What file system are you running your code on? And is the same directory
shared across all nodes? I have seen this error when users try to use a
non-shared directory for MPI I/O operations (e.g. /tmp, which is a separate
node-local drive/folder on each node).
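
One quick way to check this is to write a marker file from one node and see
whether every other node can see it. The following is only a rough sketch:
the function names and the TESTDIR variable are made up for illustration and
are not part of Open MPI, and the mpirun line in the comment assumes the same
${hostlist} machinefile used for the benchmark run.

```shell
#!/bin/sh
# Hypothetical helpers for checking whether a directory is shared.
# Names are illustrative only; nothing here is an Open MPI tool.

# Create a uniquely named marker file in the given directory and
# print its full path on stdout.
write_marker() {
    dir=$1
    marker="$dir/.shared_check.$$"
    echo "written on $(uname -n)" > "$marker" || return 1
    echo "$marker"
}

# Report whether the current host can see the marker file.
# Returns non-zero if the file is not visible.
check_marker() {
    if [ -f "$1" ]; then
        echo "$(uname -n): visible"
    else
        echo "$(uname -n): MISSING"
        return 1
    fi
}

# Possible use (assumption: TESTDIR is the directory the benchmark writes to):
#   marker=$(write_marker "$TESTDIR")
#   mpirun -machinefile ${hostlist} --map-by node -np <number of nodes> \
#       ls -l "$marker"
# Any node where ls fails does not share that directory.
```

If the file is missing on some nodes, the directory is node-local and the
test files need to go on a file system that all nodes mount.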

Thanks
Edgar

-----Original Message-----
From: users <users-boun...@lists.open-mpi.org> On Behalf Of bend linux4ms.net 
via users
Sent: Tuesday, November 2, 2021 3:33 PM
To: Open MPI Open MPI <users@lists.open-mpi.org>
Cc: bend linux4ms.net <b...@linux4ms.net>
Subject: [OMPI users] mca_sharedfp_lockfile issues

Ok, I got more issues. Maybe someone on the list can help me:

Open MPI version: 4.1.1, downloaded from the GitHub source and compiled on
CentOS 8.4 using GCC 8.4.1. Configured with:

./configure --enable-shared --enable-static \
   --without-tm \
   --enable-mpi-cxx \
   --enable-wrapper-runpath \
   --enable-mpirun-prefix-by-default \
   --enable-mpi-thread-multiple \
   --enable-mpi-fortran=yes \
   --prefix=/p/app/compilers/mpi/openmpi/4.1.1 2>&1 | tee config.log

Intel HPC system, 850 nodes trying to launch IOR benchmark.

Top portion of the mpi command:
-------------------------------------------------------------------------------------

export OMPI_MCA_btl_openib_allow_ib=1
export OMPI_MCA_btl_openib_if_include="mlx5_0:1"

mpirun -machinefile ${hostlist} \
   --mca opal_common_ucx_opal_mem_hooks 1 \
    \
   -np ${NP} \
   --map-by node \
   -N ${rpn} \
   -vv \
---------------------------------------------------------------------------------------------------------
I am getting the message
<node name:pid> [##] mca_sharedfp_lockedfile_file_open: Error during file open
on all the nodes.

I've tried it both with and without --mca sharedfp lockedfile, and I still get
the errors.

What have I done wrong?

Thanks ..

Ben Duncan - 


