You can pick one test, make it standalone, and open an issue on GitHub.
How does (vanilla) Open MPI compare to your vendor's Open MPI-based library?
Cheers,
Gilles
On Wed, Jan 11, 2023 at 10:20 PM Dave Love via users <users@lists.open-mpi.org> wrote:
> Gilles Gouaillardet via user
Hi Eric,
Currently, Open MPI does not provide specific support for CephFS.
MPI-IO is implemented either by ROMIO (imported from MPICH; it does not
support CephFS today)
or by the "native" ompio component (which also does not support CephFS today).
A proof of concept for CephFS in ompio might
Hi,
Simply add
btl = tcp,self
If the openib error message persists, try also adding
osc_rdma_btls = ugni,uct,ucp
or simply
osc = ^rdma
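These settings go in an MCA parameters file so they apply to every mpirun invocation; a minimal sketch, assuming the standard per-user location:

```
# $HOME/.openmpi/mca-params.conf
btl = tcp,self
# only if the openib error message persists:
osc = ^rdma
```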
Cheers,
Gilles
On 11/29/2022 5:16 PM, Gestió Servidors via users wrote:
Hi,
If I run “mpirun --mca btl tcp,self --mca allow_ib 0 -n 12
Arham,
It should be balanced: by default, Open MPI maps tasks to NUMA packages
round robin.
you can
mpirun --report-bindings -n 28 true
to have Open MPI report the bindings
or
mpirun --tag-output -n 28 grep Cpus_allowed_list /proc/self/status
to have each task report which physical CPUs it is allowed to run on
Chris,
Did you double-check that libopen-rte.so.40 and libopen-pal.so.40 are installed
in /mnt/software/o/openmpi/4.1.4-ct-test/lib?
If they are not present, your install is broken and you should try
to reinstall it.
Cheers,
Gilles
On Sat, Nov 5, 2022 at 3:42 AM Chris Taylor via users <
"--machinefile" (vs. a copy-and-pasted "em dash")?
--
Jeff Squyres
jsquy...@cisco.com
From: users on behalf of Gilles Gouaillardet via users
Sent: Sunday, November 13, 2022 9:18 PM
To: Open MPI Users
Cc: Gilles Gouaillardet
Subject: Re: [OMPI users
There is a typo in your command line:
you should use --mca (minus minus) instead of -mca.
Also, you can try --machinefile instead of -machinefile.
Cheers,
Gilles
There are not enough slots available in the system to satisfy the 2
slots that were requested by the application:
–mca
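The `–mca` in the quoted error hints at the root cause: a copy-pasted en dash is a single multi-byte UTF-8 character, not two ASCII hyphens, so mpirun does not parse it as an option prefix. A quick way to check what was actually typed (a sketch for any POSIX shell):

```shell
# Two ASCII hyphens occupy 2 bytes
printf '%s' '--' | wc -c
# An en dash (U+2013) pasted from rich text occupies 3 bytes in UTF-8
printf '%s' '–' | wc -c
```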
On Mon, Nov
Arun,
First, Open MPI selects a pml for **all** the MPI tasks (for example,
pml/ucx or pml/ob1).
Then, if pml/ob1 ends up being selected, a btl component (e.g. btl/uct,
btl/vader) is used for each pair of MPI tasks
(tasks on the same node will use btl/vader, tasks on different nodes will
use
Rob,
Do you invoke mpirun from **inside** the container?
IIRC, mpirun is generally invoked from **outside** the container, could
you try this if not already the case?
The error message is from SLURM, so this is really a SLURM vs
Singularity issue.
What if you
srun -N 2 -n 2 hostname
Todd,
Similar issues were also reported when there is Network Address Translation
(NAT) between hosts; that occurred when using a kvm/qemu virtual
machine running on the same host.
First you need to list the available interfaces on both nodes. Then try
to restrict to a single interface that is
Luis,
That can happen if a component is linked with libnuma.so:
Open MPI will fail to open it and try to fall back on another one.
You can run ldd on the mca_*.so components in the /.../lib/openmpi directory
to figure out which ones are using libnuma.so and assess whether it is needed.
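That check can be scripted; a sketch, where `check_numa_deps` is a hypothetical helper name and the argument is your /.../lib/openmpi directory:

```shell
# List the mca_*.so components in a directory that link against libnuma.so
check_numa_deps() {
    for f in "$1"/mca_*.so; do
        [ -e "$f" ] || continue          # skip if the glob matched nothing
        if ldd "$f" 2>/dev/null | grep -q 'libnuma\.so'; then
            echo "$f"                    # this component pulls in libnuma
        fi
    done
    return 0
}
```

For example, `check_numa_deps /usr/lib64/openmpi` (the path is illustrative) prints each offending component, one per line.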
Cheers,
Open MPI 1.6.5 is an antique version, and you should not expect any support
for it.
Instead, I suggest you try the latest one, rebuild your app, and try again.
FWIW, that kind of error occurs when the MPI library does not match mpirun.
That can happen when mpirun and libmpi.so come from different
Aziz,
When using direct run (e.g. srun), Open MPI has to interact with SLURM.
This is typically achieved via PMI2 or PMIx.
You can
srun --mpi=list
to list the available options on your system
if PMIx is available, you can
srun --mpi=pmix ...
if only PMI2 is available, you need to make sure Open
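On the build side, Slurm's PMI2 library must be compiled into Open MPI for `srun --mpi=pmi2` to work; a sketch for the 4.x series, where /opt/slurm and /opt/openmpi are hypothetical install prefixes:

```shell
# point configure at the Slurm PMI headers/libraries (4.x series flag)
./configure --prefix=/opt/openmpi --with-pmi=/opt/slurm
make -j 8 install

# then launch directly:
srun --mpi=pmi2 -n 4 ./my_mpi_app
```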
Kurt,
I think Joachim was also asking for the command line used to launch your
application.
Since you are using Slurm and MPI_Comm_spawn(), it is important to
understand whether you are using mpirun or srun
FWIW, --mpi=pmix is a srun option. You can srun --mpi=list to find the
available
Christof,
Open MPI switching to the internal PMIx is a bug I addressed in
https://github.com/open-mpi/ompi/pull/11704
Feel free to manually download and apply the patch, you will then need
recent autotools and run
./autogen.pl --force
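The patch steps above can be sketched as follows; the GitHub .patch URL pattern is standard, but verify the patch applies cleanly to your source tree:

```shell
# inside the extracted Open MPI source tree
curl -L https://github.com/open-mpi/ompi/pull/11704.patch | patch -p1
# regenerate configure (requires recent autotools)
./autogen.pl --force
```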
Another option is to manually edit the configure file
Look
Hi,
Please open a GitHub issue at https://github.com/open-mpi/ompi/issues and
provide the requested information.
Cheers,
Gilles
On Sat, Jan 27, 2024 at 12:04 PM Kook Jin Noh via users <users@lists.open-mpi.org> wrote:
> Hi,
>
>
>
> I’m installing OpenMPI 5.0.1 on Archlinux 6.7.1. Everything
Hi,
please open an issue on GitHub at https://github.com/open-mpi/ompi/issues
and provide the requested information.
If the compilation failed when configured with --enable-debug, please share
the logs.
The name of the WRF subroutine suggests the crash might occur in
MPI_Comm_split();
if so,
Christopher,
I do not think Open MPI explicitly asks SLURM which cores have been
assigned on each node.
So if you are planning to run multiple jobs on the same node, your best bet
is probably to have SLURM
use cpusets.
Cheers,
Gilles
On Sat, Feb 24, 2024 at 7:25 AM Christopher Daley via users
Greg,
If Open MPI was built with UCX, your jobs will likely use UCX (and the
shared memory provider) even if running on a single node.
You can
mpirun --mca pml ob1 --mca btl self,sm ...
if you want to avoid using UCX.
What is a typical mpirun command line used under the hood by your "make
test"?
Hi,
Is there any reason why you do not build the latest 5.0.2 package?
Anyway, the issue could be related to an unknown filesystem.
Do you get a meaningful error if you manually run
/.../test/util/opal_path_nfs?
If not, can you share the output of
mount | cut -f3,5 -d' '
Cheers,
Gilles
On