Re: [OMPI users] rdmacm and udcm failure in 2.0.1 on RoCE
Nathan:

Thanks for providing the debug flags. I've attached the output (NetPIPE.debug1), which shows that for RoCE udcm_component_query() will always fail. Can someone verify that udcm really is not supported on RoCE? When I change the test to force its use, it does not work (NetPIPE.debug2).

    [hero35][[38845,1],0][connect/btl_openib_connect_udcm.c:452:udcm_component_query] UD CPC only supported on InfiniBand; skipped on mlx4_0:1
    [hero35][[38845,1],0][connect/btl_openib_connect_udcm.c:501:udcm_component_query] unavailable for use on mlx4_0:1; skipped

From btl_openib_connect_udcm.c (lines 438-455):

    static int udcm_component_query(mca_btl_openib_module_t *btl,
                                    opal_btl_openib_connect_base_module_t **cpc)
    {
        udcm_module_t *m = NULL;
        int rc = OPAL_ERR_NOT_SUPPORTED;

        do {
            /* If we do not have struct ibv_device.transport_device, then
               we're in an old version of OFED that is IB only (i.e., no
               iWarp), so we can safely assume that we can use this CPC. */
    #if defined(HAVE_STRUCT_IBV_DEVICE_TRANSPORT_TYPE) && HAVE_DECL_IBV_LINK_LAYER_ETHERNET
            if (BTL_OPENIB_CONNECT_BASE_CHECK_IF_NOT_IB(btl)) {
                BTL_VERBOSE(("UD CPC only supported on InfiniBand; skipped on %s:%d",
                             ibv_get_device_name(btl->device->ib_dev),
                             btl->port_num));
                break;
            }
    #endif

From base.h:

    #ifdef OPAL_HAVE_RDMAOE
    #define BTL_OPENIB_CONNECT_BASE_CHECK_IF_NOT_IB(btl)                       \
        (((IBV_TRANSPORT_IB != ((btl)->device->ib_dev->transport_type)) ||     \
          (IBV_LINK_LAYER_ETHERNET == ((btl)->ib_port_attr.link_layer))) ?     \
         true : false)
    #else
    #define BTL_OPENIB_CONNECT_BASE_CHECK_IF_NOT_IB(btl)                       \
        ((IBV_TRANSPORT_IB != ((btl)->device->ib_dev->transport_type)) ?       \
         true : false)
    #endif

For RoCE the transport is InfiniBand and the link layer is Ethernet, so NOT_IB() evaluates to true, meaning that udcm is evidently not supported on RoCE. udcm definitely fails under 1.10.4 for RoCE in our tests.
That means we need rdmacm to work, which it evidently does not at the moment in 2.0.1. Could someone please verify that rdmacm is not currently working in 2.0.1? If so, I assume that 2.0.1 has not been successfully tested on RoCE?

                 Dave

> --
>
> Message: 1
> Date: Wed, 14 Dec 2016 21:12:16 -0700
> From: Nathan Hjelm
> To: drdavetur...@gmail.com, Open MPI Users
> Subject: Re: [OMPI users] rdmacm and udcm failure in 2.0.1 on RoCE
> Message-ID: <32528c5d-14bc-42ce-b19a-684b81801...@me.com>
> Content-Type: text/plain; charset=utf-8
>
> Can you configure with --enable-debug and run with --mca btl_base_verbose
> 100 and provide the output? It may indicate why neither udcm nor rdmacm are
> available.
>
> -Nathan
>
> > On Dec 14, 2016, at 2:47 PM, Dave Turner wrote:
> >
> > --
> > No OpenFabrics connection schemes reported that they were able to be
> > used on a specific port. As such, the openib BTL (OpenFabrics
> > support) will be disabled for this port.
> >
> >   Local host:     elf22
> >   Local device:   mlx4_2
> >   Local port:     1
> >   CPCs attempted: rdmacm, udcm
> > --
> >
> > We have had no problems using 1.10.4 on RoCE but 2.0.1 fails to
> > find either connection manager. I've read that rdmacm may have
> > issues under 2.0.1 so udcm may be the only one working. Are there
> > any known issues with that on RoCE? Or does this just mean we
> > don't have RoCE configured correctly?
> >
> > Dave Turner

--
Work: davetur...@ksu.edu (785) 532-7791
      2219 Engineering Hall, Manhattan KS 66506
Home: drdavetur...@gmail.com
cell: (785) 770-5929

Attachments: NetPIPE.debug1 (binary data), NetPIPE.debug2 (binary data)

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
[OMPI users] install OpenMPI on CentOS in HPC
Dear all,

Could you please let me know whether there is a procedure for installing Open MPI on CentOS on an HPC system?

Thanks.
Mahmoud
Re: [OMPI users] How to yield CPU more when not computing (was curious behavior during wait for broadcast: 100% cpu)
On 14.12.2016 08:00, Andreas Schäfer wrote:
> On 14:24 Mon 12 Dec , Dave Love wrote:
>> Andreas Schäfer writes:
>>
>>>> Yes, as root, and there are N different systems to at least provide
>>>> unprivileged read access on HPC systems, but that's a bit different,
>>>> I think.
>>>
>>> LIKWID[1] uses a daemon to provide limited RW access to MSRs for
>>> applications. I wouldn't wonder if support for this was added to
>>> LIKWID by RRZE.
>>
>> Yes, that's one of the N I had in mind; others provide Linux modules.
>>
>> From a system manager's point of view it's not clear what are the
>> implications of the unprivileged access, or even how much it really
>> helps. I've seen enough setups suggested for HPC systems in areas I
>> understand (and used by vendors) which allow privilege escalation more
>> or less trivially, maybe without any real operational advantage. If
>> it's clearly safe and helpful then great, but I couldn't assess that.
>
> I think LIKWID's access daemon is specifically designed to provide a
> safe way of giving limited access to MSRs. I'm cc'ing Thomas Röhl as
> he knows more about this.

As Andreas stated, the access daemon was written to provide a reasonably safe way for users to access the MSRs. It opens a UNIX socket to communicate with the actual application. The lists of allowed registers are compiled into the daemon, so they cannot be changed from the outside and users are limited to the allowed registers. The code was checked by an IT security team and all of their recommendations were integrated, but there may still be other bugs (as in any other code).

If users want to dig deep into their code or control the behavior of a machine, giving them access to MSRs is really helpful. The LIKWID suite contains some examples, such as controlling CPU frequencies, (de)activating various hardware prefetchers, or configuring the power budget.

For system managers, user access to MSRs can be a real pain, because all MSRs need to be checked before and after a user's work so that the system is handed to the next user in a consistent state.
Moreover, both kernel modules and a privilege-escalating daemon commonly reduce security, and that reduction must be weighed against the possible advantages. In my experience, system managers don't want to load third-party kernel modules on their production systems (unless there is a big company behind them), but they also don't trust a suid-root daemon such as LIKWID's.

For a runtime management system such as Open MPI, integrating libcap is probably the safest way to get access to the MSRs. You don't need a daemon, and the application keeps running with ordinary user privileges. Handling libcap can be somewhat annoying, and it was Linux-distribution dependent at the time I checked it (some distributions worked, some did not, and some showed completely undefined behavior).

Cheers,
Thomas

--
M.Sc. Thomas Roehl, HPC Services
Friedrich-Alexander-Universitaet Erlangen-Nuernberg
Regionales RechenZentrum Erlangen (RRZE)
Martensstrasse 1, 91058 Erlangen, Germany
Tel. +49 9131 85-20800
mailto:thomas.ro...@rrze.fau.de
http://www.hpc.rrze.uni-erlangen.de/