On Mon, Nov 12, 2018 at 8:08 AM Andrei Berceanu
<andreicberce...@gmail.com> wrote:
> Running a CUDA+MPI application on a node with 2 K80 GPUs, I get the following 
> warnings:
> --------------------------------------------------------------------------
> WARNING: There is at least non-excluded one OpenFabrics device found,
> but there are no active ports detected (or Open MPI was unable to use
> them).  This is most certainly not what you wanted.  Check your
> cables, subnet manager configuration, etc.  The openib BTL will be
> ignored for this job.
>   Local host: gpu01
> --------------------------------------------------------------------------
> [gpu01:107262] 1 more process has sent help message help-mpi-btl-openib.txt / 
> no active ports found
> [gpu01:107262] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
> help / error messages
> Any idea of what is going on and how I can fix this?
> I am using OpenMPI 3.1.2.

looks like openmpi found an OpenFabrics device (e.g. an infiniband
card) in the compute node you're using, but none of its ports are
active/usable

as for a fix, it depends.

if you have an IB card, should it be active?  if so, you'd have to
check the connections (cables, subnet manager, as the warning says)
to see why it's disabled
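
a rough sketch of how to look at the port state, assuming the
libibverbs utilities (which provide ibv_devinfo) are installed on
gpu01; the helper name ib_port_summary is made up for this example.
a port that reports PORT_ACTIVE is usable, anything else (typically
PORT_DOWN or PORT_INIT) would explain the warning:

```shell
# Filter ibv_devinfo-style output on stdin down to device, port and state lines.
# (ib_port_summary is a hypothetical helper, not part of any package.)
ib_port_summary() {
    grep -E 'hca_id|port:|state:'
}

# Only query the hardware if the tool is actually installed.
if command -v ibv_devinfo >/dev/null 2>&1; then
    ibv_devinfo 2>&1 | ib_port_summary \
        || echo "no device/port lines in ibv_devinfo output"
else
    echo "ibv_devinfo not found; install the libibverbs utilities to check port state"
fi
```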

if not, you can tell openmpi to disregard the IB ports, which will
clear the warning, but that might mean you're using a slower
interface for message passing
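
one way to do that is to exclude the openib BTL through the "btl" MCA
parameter, either in the environment or on the mpirun command line.
this is only a sketch: ./my_cuda_app and the process count are
placeholders, not taken from your setup. since the warning already
says the openib BTL is ignored for the job, this should only silence
the message rather than change which transport gets used:

```shell
# Exclude the openib BTL for every Open MPI job launched from this shell.
# The leading ^ means "use all BTLs except the ones listed".
export OMPI_MCA_btl="^openib"

# Equivalent per-job form on the command line
# (./my_cuda_app and -np 2 are placeholders):
#   mpirun --mca btl ^openib -np 2 ./my_cuda_app
```

with all ranks on a single node the messages typically go over shared
memory anyway, so the slower-interface concern mostly matters once
the job spans several nodes.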