Re: [OMPI users] rdmacm and udcm failure in 2.0.1 on RoCE

2016-12-15 Thread Dave Turner
Nathan:  Thanks for providing the debug flags.  I've attached the
output (NetPIPE.debug1) which basically shows that for RoCE the
udcm_component_query() will always fail.  Can someone confirm that
udcm is not supported for RoCE?  When I change the test to force its
use, it still does not work (NetPIPE.debug2).

[hero35][[38845,1],0][connect/btl_openib_connect_udcm.c:452:udcm_component_query]
UD CPC only supported on InfiniBand; skipped on mlx4_0:1
[hero35][[38845,1],0][connect/btl_openib_connect_udcm.c:501:udcm_component_query]
unavailable for use on mlx4_0:1; skipped

from btl_openib_connect_udcm.c

 438 static int udcm_component_query(mca_btl_openib_module_t *btl,
 439                                 opal_btl_openib_connect_base_module_t **cpc)
 440 {
 441     udcm_module_t *m = NULL;
 442     int rc = OPAL_ERR_NOT_SUPPORTED;
 443
 444     do {
 445         /* If we do not have struct ibv_device.transport_device, then
 446            we're in an old version of OFED that is IB only (i.e., no
 447            iWarp), so we can safely assume that we can use this CPC. */
 448 #if defined(HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE) && HAVE_DECL_IBV_LINK_LAYER_ETHERNET
 449         if (BTL_OPENIB_CONNECT_BASE_CHECK_IF_NOT_IB(btl)) {
 450             BTL_VERBOSE(("UD CPC only supported on InfiniBand; skipped on %s:%d",
 451                          ibv_get_device_name(btl->device->ib_dev),
 452                          btl->port_num));
 453             break;
 454         }
 455 #endif

from base.h

#ifdef OPAL_HAVE_RDMAOE
#define BTL_OPENIB_CONNECT_BASE_CHECK_IF_NOT_IB(btl)   \
(((IBV_TRANSPORT_IB != ((btl)->device->ib_dev->transport_type)) || \
(IBV_LINK_LAYER_ETHERNET == ((btl)->ib_port_attr.link_layer))) ?   \
true : false)
#else
#define BTL_OPENIB_CONNECT_BASE_CHECK_IF_NOT_IB(btl)   \
((IBV_TRANSPORT_IB != ((btl)->device->ib_dev->transport_type)) ?   \
true : false)
#endif

So for RoCE the transport is InfiniBand and the link layer is Ethernet,
which makes NOT_IB() true, meaning that udcm is evidently not supported
for RoCE.  udcm definitely fails under 1.10.4 for RoCE in our tests.
That means we need rdmacm to work, which it evidently does not at the
moment for 2.0.1.  Could someone please verify that rdmacm is not
currently working in 2.0.1?  If so, I assume that 2.0.1 has not been
successfully tested on RoCE?
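
To make the reasoning about the base.h macro easy to check in isolation,
here is a minimal sketch of the same boolean logic. It does not include
<infiniband/verbs.h>; the demo_* enums and function names are stand-ins
invented for this illustration, not Open MPI or libibverbs identifiers.

```c
#include <stdbool.h>

/* Stand-ins for the libibverbs transport and link-layer constants, so the
   sketch compiles without <infiniband/verbs.h>. Names are hypothetical. */
enum demo_transport  { DEMO_TRANSPORT_IB, DEMO_TRANSPORT_IWARP };
enum demo_link_layer { DEMO_LINK_LAYER_INFINIBAND, DEMO_LINK_LAYER_ETHERNET };

/* Same logic as BTL_OPENIB_CONNECT_BASE_CHECK_IF_NOT_IB when
   OPAL_HAVE_RDMAOE is defined: a port counts as "not IB" if the transport
   is not IB, or if the link layer is Ethernet. */
bool demo_check_if_not_ib(enum demo_transport transport,
                          enum demo_link_layer link_layer)
{
    return transport != DEMO_TRANSPORT_IB ||
           link_layer == DEMO_LINK_LAYER_ETHERNET;
}

/* Same logic as the #else branch (OPAL_HAVE_RDMAOE not defined): only the
   transport type is inspected, so the link layer plays no role. */
bool demo_check_if_not_ib_no_rdmaoe(enum demo_transport transport)
{
    return transport != DEMO_TRANSPORT_IB;
}
```

With these stand-ins, an IB transport on an Ethernet link layer (the RoCE
case) makes demo_check_if_not_ib() return true, so the port is skipped,
while the same transport on an InfiniBand link layer returns false.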

   Dave



> --
>
> Message: 1
> Date: Wed, 14 Dec 2016 21:12:16 -0700
> From: Nathan Hjelm 
> To: drdavetur...@gmail.com, Open MPI Users 
> Subject: Re: [OMPI users] rdmacm and udcm failure in 2.0.1 on RoCE
> Message-ID: <32528c5d-14bc-42ce-b19a-684b81801...@me.com>
> Content-Type: text/plain; charset=utf-8
>
> Can you configure with --enable-debug and run with --mca btl_base_verbose
> 100 and provide the output? It may indicate why neither udcm nor rdmacm are
> available.
>
> -Nathan
>
>
> > On Dec 14, 2016, at 2:47 PM, Dave Turner  wrote:
> >
> > --------------------------------------------------------------------------
> > No OpenFabrics connection schemes reported that they were able to be
> > used on a specific port.  As such, the openib BTL (OpenFabrics
> > support) will be disabled for this port.
> >
> >   Local host:   elf22
> >   Local device: mlx4_2
> >   Local port:   1
> >   CPCs attempted:   rdmacm, udcm
> > --------------------------------------------------------------------------
> >
> > We have had no problems using 1.10.4 on RoCE but 2.0.1 fails to
> > find either connection manager.  I've read that rdmacm may have
> > issues under 2.0.1 so udcm may be the only one working.  Are there
> > any known issues with that on RoCE?  Or does this just mean we
> > don't have RoCE configured correctly?
> >
> >   Dave Turner
> >
> > --
> > Work: davetur...@ksu.edu (785) 532-7791
> >  2219 Engineering Hall, Manhattan KS  66506
> > Home:drdavetur...@gmail.com
> >   cell: (785) 770-5929
> > ___
> > users mailing list
> > users@lists.open-mpi.org
> > https://rfd.newmexicoconsortium.org/mailman/listinfo/users
>
> --
Work: davetur...@ksu.edu (785) 532-7791
 2219 Engineering Hall, Manhattan KS  66506
Home:drdavetur...@gmail.com
  cell: (785) 770-5929


NetPIPE.debug1
Description: Binary data


NetPIPE.debug2
Description: Binary data
___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

[OMPI users] install OpenMPI on CentOS in HPC

2016-12-15 Thread Mahmoud MIRZAEI
Dear all,

Could you please let me know whether there is a recommended procedure for
installing OpenMPI on CentOS on an HPC system?

Thanks.
Mahmoud

Re: [OMPI users] How to yield CPU more when not computing (was curious behavior during wait for broadcast: 100% cpu)

2016-12-15 Thread Thomas Röhl


On 14.12.2016 08:00, Andreas Schäfer wrote:

On 14:24 Mon 12 Dec , Dave Love wrote:

Andreas Schäfer  writes:


Yes, as root, and there are N different systems to at least provide
unprivileged read access on HPC systems, but that's a bit different, I
think.

LIKWID[1] uses a daemon to provide limited RW access to MSRs for
applications. I wouldn't wonder if support for this was added to
LIKWID by RRZE.

Yes, that's one of the N I had in mind; others provide Linux modules.

From a system manager's point of view it's not clear what the
implications of the unprivileged access are, or even how much it really
helps.  I've seen enough setups suggested for HPC systems in areas I
understand (and used by vendors) which allow privilege escalation more
or less trivially, maybe without any real operational advantage.  If
it's clearly safe and helpful then great, but I couldn't assess that.

I think LIKWID's access daemon is specifically designed to provide a
safe way of giving limited access to MSRs. I'm cc'ing Thomas Röhl as
he knows more about this.

As Andreas stated, the access daemon was written to provide a rather safe
method for users to access the MSRs. It opens a UNIX socket to
communicate with the actual application. The lists of allowed registers
are compiled into the daemon, so they cannot be changed from the
outside and users are limited to the allowed registers. The code was
checked by an IT security team and all recommendations were integrated,
but there may still be other bugs (as in any other code).
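
The idea of a compiled-in allow-list can be sketched in a few lines. This
is only an illustration of the technique, not the daemon's actual code:
the function name and the choice of registers are assumptions made for
the example, and the real lists in LIKWID differ.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative allow-list only; NOT the list LIKWID actually ships.
   Being a static const array, it is fixed at compile time and cannot be
   altered by a client talking to the daemon. */
static const uint32_t demo_allowed_msrs[] = {
    0x10,   /* IA32_TIME_STAMP_COUNTER */
    0xE7,   /* IA32_MPERF */
    0xE8,   /* IA32_APERF */
    0x1A0,  /* IA32_MISC_ENABLE */
};

/* Return true only if the requested register is on the allow-list.
   A daemon would call this before touching the MSR on a client's behalf. */
bool demo_msr_is_allowed(uint32_t reg)
{
    size_t n = sizeof(demo_allowed_msrs) / sizeof(demo_allowed_msrs[0]);
    for (size_t i = 0; i < n; i++) {
        if (demo_allowed_msrs[i] == reg) {
            return true;
        }
    }
    return false;
}
```

Any request for a register that is not in the array is rejected before
the hardware is touched, which is what limits users to the allowed set.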


If a user wants to dig deep into his/her code or control the behavior of
a machine, providing user access to MSRs is really helpful. The
LIKWID suite contains some examples, such as controlling CPU frequencies,
(de)activating various hardware prefetchers or configuring the power
budget.


For system managers, user access to MSRs can be a real pain because
all MSRs need to be checked before/after a user's work to hand the
system over in a consistent state to the next user. Moreover, both kernel
modules and a privilege-escalating daemon commonly reduce security,
and that reduction must be weighed against the possible advantages. My
experience shows that system managers don't want to load third-party
kernel modules on their in-production systems (as long as there is no
big company behind them), but they also don't trust a suid-root daemon
such as LIKWID's.


For a runtime management system such as OpenMPI, the integration of libcap is
probably the safest way to access the MSRs. You don't need a daemon
and the application keeps running with common user privileges. The
handling of libcap can be somewhat annoying and was Linux distribution
dependent at the time I checked it (some worked, some did not, some showed
completely undefined behavior).
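
For reference, the access that a capability-enabled process would perform
itself is small. The following is a minimal sketch, assuming a Linux
system with the msr kernel module loaded, a device node such as
/dev/cpu/0/msr that the user may open, and a process holding the
CAP_SYS_RAWIO capability. The function name is made up for this example
and the libcap calls for acquiring the capability are deliberately left
out.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Read one 64-bit MSR through a Linux msr device node such as
   /dev/cpu/0/msr. Returns 0 on success and -1 on any failure (module not
   loaded, missing file permission or capability, unreadable register).
   Assumes a 64-bit off_t so that every register number fits. */
int demo_read_msr(const char *msr_path, uint32_t reg, uint64_t *value)
{
    int fd = open(msr_path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    /* The msr driver interprets the file offset as the register number. */
    if (lseek(fd, (off_t)reg, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    /* Each register is transferred as exactly 8 bytes. */
    ssize_t got = read(fd, value, sizeof(*value));
    close(fd);
    return got == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A caller would pass the device path for the CPU of interest and a
register number, for example demo_read_msr("/dev/cpu/0/msr", 0x10, &v),
and must treat a return value of -1 as "no access" rather than as a
hardware error.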


Cheers,
Thomas

--
M.Sc. Thomas Roehl, HPC Services
Friedrich-Alexander-Universitaet Erlangen-Nuernberg
Regionales RechenZentrum Erlangen (RRZE)
Martensstrasse 1, 91058 Erlangen, Germany
Tel. +49 9131 85-20800
mailto:thomas.ro...@rrze.fau.de
http://www.hpc.rrze.uni-erlangen.de/
