Try adding --hetero-nodes to your mpirun command line.
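
By default, mpirun assumes that every node has the same hardware topology and computes the binding from that. The --hetero-nodes flag tells it that nodes may differ, so each node reports its own topology. If the GPU nodes expose a different set of cores (for example, because of how Slurm allocates them), a bitmap such as "0,16" may not be valid there. This is a guess at the cause, not something confirmed on vegas17.

A minimal sketch of the command from the report with the flag added follows. The application path is elided in the original mail and is left as "…" here; APP is only a placeholder variable name.

```shell
# Path to the test binary. The leading part is elided ("…") in the original
# report, so it stays elided here; fill in the real path before running.
APP='…/trivial/test_get__trivial/c_hello'

# Options exactly as in the report, with --hetero-nodes added up front.
set -- mpirun --hetero-nodes --map-by node --bind-to core \
       -display-map -np 2 -mca pml ob1 "$APP"

# Print the assembled command for review; it is not executed here.
printf '%s\n' "$*"
```

Once the path is filled in, replacing the final printf with "$@" runs the command. If the error persists, --bind-to none remains a workaround, as noted in the report.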

On Sep 14, 2014, at 8:25 AM, Alina Sklarevich <ali...@dev.mellanox.co.il> wrote:

> Hello,
> 
>  
> I am using ompi-v1.8 and have come across the following error:
> 
>  
> --------------------------------------------------------------------------
> Open MPI tried to bind a new process, but something went wrong.  The
> process was killed without launching the target application.  Your job
> will now abort.
> 
>   Local host:        vegas17
>   Application name:  trivial/test_get__trivial/c_hello
>   Error message:     hwloc_set_cpubind returned "Error" for bitmap "0,16"
>   Location:          odls_default_module.c:551
> --------------------------------------------------------------------------
> 
>  
> This happens when running a trivial test with the following command line:
> 
> mpirun --map-by node --bind-to core -display-map -np 2 -mca pml ob1 …/trivial/test_get__trivial/c_hello
> 
>  
> Changing the binding policy from core to none (--bind-to none) seems to 
> eliminate the error.
> 
> The error occurs only on nodes that have GPUs in them. Running the same 
> command line on non-GPU nodes produces no error.
> 
> I’m using Slurm to allocate the nodes.
> 
>  
> Has anyone seen this issue, or does anyone know what’s wrong here?
> 
>  
> Thanks,
> 
> Alina.
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2014/09/15824.php
