Hello,


I am using ompi-v1.8 and have come across the following error:



--------------------------------------------------------------------------
Open MPI tried to bind a new process, but something went wrong.  The
process was killed without launching the target application.  Your job
will now abort.

  Local host:        vegas17
  Application name:  trivial/test_get__trivial/c_hello
  Error message:     hwloc_set_cpubind returned "Error" for bitmap "0,16"
  Location:          odls_default_module.c:551
--------------------------------------------------------------------------



This happens when running a trivial test with the following command line:

mpirun --map-by node --bind-to core -display-map -np 2 -mca pml ob1 …/trivial/test_get__trivial/c_hello



Changing the binding policy from core to none (--bind-to none) seems to
eliminate the error.

The error occurs only on the nodes that have GPUs in them. Running the same
command line on non-GPU nodes produces no error.

I’m using Slurm to allocate the nodes.
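
One check that might help narrow this down is whether the CPUs named in the error (bitmap "0,16") are actually in the set the job is allowed to run on inside the Slurm allocation. The sketch below is only a diagnostic idea, not a confirmed explanation: it assumes the bitmap lists OS CPU indexes, that the node runs Linux, and that the script is started inside the same allocation (for example through srun). The helper names parse_cpu_list and disallowed_cpus are made up for illustration.

```python
import os


def parse_cpu_list(text):
    """Parse a list-style CPU string such as "0,16" or "0-3,16" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range such as "0-3" covers both endpoints.
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def disallowed_cpus(requested, allowed):
    """Return the requested CPUs that are missing from the allowed set, sorted."""
    return sorted(set(requested) - set(allowed))


if __name__ == "__main__":
    # Bitmap taken from the error message; assumed to be OS CPU indexes.
    requested = parse_cpu_list("0,16")
    # CPUs the current process may run on (Linux only).
    allowed = os.sched_getaffinity(0)
    print("allowed CPUs:", sorted(allowed))
    print("requested but not allowed:", disallowed_cpus(requested, allowed))
```

If the second line printed on a GPU node is non-empty, the allocation may not include the CPUs the binding asks for; if it is empty, the cause is presumably elsewhere.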



Has anyone seen this issue, or does anyone know what is wrong here?



Thanks,

Alina.
