[OMPI users] How to use OPENMPI with different Service Level in Infiniband Virtual Lane?

2020-02-24 Thread Kihang Youn via users


Hello,

I am searching for an option to apply a different service level (SL) to
InfiniBand communication.
For example, in Intel MPI the environment variable "DAPL_IB_SERVICE_LEVEL" can
change the SL.
I found the runtime options "btl_openib_ib_service_level" and "UCX_IB_SL" on the
FAQ pages.
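
If I understand the FAQ correctly, they would be used roughly like this (an
untested sketch on my side; the SL value 3 is only a placeholder):

$ mpirun --mca btl_openib_ib_service_level 3 -np 16 ./a.out   # openib BTL
$ UCX_IB_SL=3 mpirun --mca pml ucx -np 16 ./a.out             # UCX PML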
Please let me know whether these are the right options, or if there is a
similar environment variable in Open MPI.

Thank you,
Kihang



Kihang Youn(윤기항) - Application Analyst | Lenovo DCG Professional Services
Mobile: +82-10-9374-9396
E-mail: ky...@lenovo.com


Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Adam Simpson via users
Calls to process_vm_readv() and process_vm_writev() are disabled in the default
Docker seccomp profile. You can add the Docker flag --cap-add=SYS_PTRACE or,
better yet, modify the seccomp profile so that process_vm_readv and
process_vm_writev are whitelisted by adding them to the syscalls.names list.
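
For example (a rough sketch; the image name and the profile path below are just
placeholders):

$ docker run --cap-add=SYS_PTRACE my-ompi-image ...

or, with a locally modified copy of Docker's default seccomp profile in which
"process_vm_readv" and "process_vm_writev" have been added to the allowed
"names" list:

$ docker run --security-opt seccomp=/path/to/modified-profile.json my-ompi-image ...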

You can also disable seccomp, and several other confinement and security
features, if you prefer a heavy-handed approach:

$ docker run --privileged --security-opt label=disable --security-opt 
seccomp=unconfined --security-opt apparmor=unconfined --ipc=host --network=host 
...

If you're still having trouble after fixing the above, you may need to check
yama on the host. You can check with "sysctl kernel.yama.ptrace_scope"; if
it returns a value other than 0 you may need to disable it with "sysctl -w
kernel.yama.ptrace_scope=0".
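
To make that setting persist across reboots, something like this should work
(the file name under /etc/sysctl.d is just an example):

$ echo "kernel.yama.ptrace_scope = 0" | sudo tee /etc/sysctl.d/10-ptrace.conf
$ sudo sysctl --system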

Adam


From: users  on behalf of Matt Thompson via 
users 
Sent: Monday, February 24, 2020 5:15 PM
To: Open MPI Users 
Cc: Matt Thompson 
Subject: Re: [OMPI users] Help with One-Sided Communication: Works in Intel 
MPI, Fails in Open MPI


Nathan,

The reproducer would be that code that's on the Intel website. That is what I 
was running. You could pull my image if you like but...since you are the genius:

[root@adac3ce0cf32 ~]# mpirun --mca btl_vader_single_copy_mechanism none -np 2 
./a.out
Rank 0 running on adac3ce0cf32
Rank 1 running on adac3ce0cf32
Rank 0 sets data in the shared memory: 00 01 02 03
Rank 1 sets data in the shared memory: 10 11 12 13
Rank 0 gets data from the shared memory: 10 11 12 13
Rank 0 has new data in the shared memory: 00 01 02 03
Rank 1 gets data from the shared memory: 00 01 02 03
Rank 1 has new data in the shared memory: 10 11 12 13

And knowing this led to: https://github.com/open-mpi/ompi/issues/4948

So, the good news is that setting export
OMPI_MCA_btl_vader_single_copy_mechanism=none lets a lot of stuff work. The
bad news is we seem to be using MPI_THREAD_MULTIPLE and it does not like it:

Start 2: pFIO_tests_mpi

2: Test command: /opt/openmpi-4.0.2/bin/mpiexec "-n" "18" "-oversubscribe" 
"/root/project/MAPL/build/bin/pfio_ctest_io.x" "-nc" "6" "-nsi" "6" "-nso" "6" 
"-ngo" "1" "-ngi" "1" "-v" "T,U" "-s" "mpi"
2: Test timeout computed to be: 1500
2: --
2: The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this release.
2: Workarounds are to run on a single node, or to use a system with an RDMA
2: capable network such as Infiniband.
2: --
2: [adac3ce0cf32:03619] *** An error occurred in MPI_Win_create
2: [adac3ce0cf32:03619] *** reported by process [270073857,16]
2: [adac3ce0cf32:03619] *** on communicator MPI COMMUNICATOR 4 DUP FROM 3
2: [adac3ce0cf32:03619] *** MPI_ERR_WIN: invalid window
2: [adac3ce0cf32:03619] *** MPI_ERRORS_ARE_FATAL (processes in this 
communicator will now abort,
2: [adac3ce0cf32:03619] ***and potentially your MPI job)
2: [adac3ce0cf32:03587] 17 more processes have sent help message 
help-osc-pt2pt.txt / mpi-thread-multiple-not-supported
2: [adac3ce0cf32:03587] Set MCA parameter "orte_base_help_aggregate" to 0 to 
see all help / error messages
2: [adac3ce0cf32:03587] 17 more processes have sent help message 
help-mpi-errors.txt / mpi_errors_are_fatal
2/5 Test #2: pFIO_tests_mpi ...***Failed    0.18 sec

40% tests passed, 3 tests failed out of 5

Total Test time (real) =   1.08 sec

The following tests FAILED:
  2 - pFIO_tests_mpi (Failed)
  3 - pFIO_tests_simple (Failed)
  4 - pFIO_tests_hybrid (Failed)
Errors while running CTest

The weird thing is, I *am* running on one node (it's all I have, I'm not fancy 
enough at AWS to try more yet) and ompi_info does mention MPI_THREAD_MULTIPLE:

[root@adac3ce0cf32 build]# ompi_info | grep -i mult
  Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes, 
OMPI progress: no, ORTE progress: yes, Event lib: yes)

Any ideas on this one?

On Mon, Feb 24, 2020 at 7:24 PM Nathan Hjelm via users
<users@lists.open-mpi.org> wrote:
The error is from btl/vader. CMA is not functioning as expected. It might work 
if you set btl_vader_single_copy_mechanism=none

Performance will suffer though. It would be worth understanding why
process_vm_readv is failing.

Can you send a simple reproducer?

-Nathan

On Feb 24, 2020, at 2:59 PM, Gabriel, Edgar via users
<users@lists.open-mpi.org> wrote:



I am not an expert on the one-sided code in Open MPI, but I wanted to comment
briefly on the potential MPI-IO related item. As far as I can see, the error
message



“Read -1, expected 48, errno = 1”


does not stem from MPI I/O, at least not from the ompio library. What file
system did you use for these tests?

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Matt Thompson via users
Nathan,

The reproducer would be that code that's on the Intel website. That is what
I was running. You could pull my image if you like but...since you are the
genius:

[root@adac3ce0cf32 ~]# mpirun --mca btl_vader_single_copy_mechanism none
-np 2 ./a.out

Rank 0 running on adac3ce0cf32
Rank 1 running on adac3ce0cf32
Rank 0 sets data in the shared memory: 00 01 02 03
Rank 1 sets data in the shared memory: 10 11 12 13
Rank 0 gets data from the shared memory: 10 11 12 13
Rank 0 has new data in the shared memory: 00 01 02 03
Rank 1 gets data from the shared memory: 00 01 02 03
Rank 1 has new data in the shared memory: 10 11 12 13

And knowing this led to: https://github.com/open-mpi/ompi/issues/4948

So, the good news is that setting export
OMPI_MCA_btl_vader_single_copy_mechanism=none lets a lot of stuff work.
The bad news is we seem to be using MPI_THREAD_MULTIPLE and it does not
like it:

Start 2: pFIO_tests_mpi

2: Test command: /opt/openmpi-4.0.2/bin/mpiexec "-n" "18" "-oversubscribe"
"/root/project/MAPL/build/bin/pfio_ctest_io.x" "-nc" "6" "-nsi" "6" "-nso"
"6" "-ngo" "1" "-ngi" "1" "-v" "T,U" "-s" "mpi"
2: Test timeout computed to be: 1500
2:
--
2: The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this
release.
2: Workarounds are to run on a single node, or to use a system with an RDMA
2: capable network such as Infiniband.
2:
--
2: [adac3ce0cf32:03619] *** An error occurred in MPI_Win_create
2: [adac3ce0cf32:03619] *** reported by process [270073857,16]
2: [adac3ce0cf32:03619] *** on communicator MPI COMMUNICATOR 4 DUP FROM 3
2: [adac3ce0cf32:03619] *** MPI_ERR_WIN: invalid window
2: [adac3ce0cf32:03619] *** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,
2: [adac3ce0cf32:03619] ***and potentially your MPI job)
2: [adac3ce0cf32:03587] 17 more processes have sent help message
help-osc-pt2pt.txt / mpi-thread-multiple-not-supported
2: [adac3ce0cf32:03587] Set MCA parameter "orte_base_help_aggregate" to 0
to see all help / error messages
2: [adac3ce0cf32:03587] 17 more processes have sent help message
help-mpi-errors.txt / mpi_errors_are_fatal
2/5 Test #2: pFIO_tests_mpi ...***Failed    0.18 sec

40% tests passed, 3 tests failed out of 5

Total Test time (real) =   1.08 sec

The following tests FAILED:
  2 - pFIO_tests_mpi (Failed)
  3 - pFIO_tests_simple (Failed)
  4 - pFIO_tests_hybrid (Failed)
Errors while running CTest

The weird thing is, I *am* running on one node (it's all I have, I'm not
fancy enough at AWS to try more yet) and ompi_info does mention
MPI_THREAD_MULTIPLE:

[root@adac3ce0cf32 build]# ompi_info | grep -i mult
  Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support:
yes, OMPI progress: no, ORTE progress: yes, Event lib: yes)
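
I suppose I could also try steering the one-sided code away from the pt2pt OSC
component, e.g. something like this (just a guess on my part, not something I
have verified):

$ mpirun --mca osc ^pt2pt -n 18 -oversubscribe ./pfio_ctest_io.x ...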

Any ideas on this one?

On Mon, Feb 24, 2020 at 7:24 PM Nathan Hjelm via users <
users@lists.open-mpi.org> wrote:

> The error is from btl/vader. CMA is not functioning as expected. It might
> work if you set btl_vader_single_copy_mechanism=none
>
> Performance will suffer though. It would be worth understanding why
> process_vm_readv is failing.
>
> Can you send a simple reproducer?
>
> -Nathan
>
> On Feb 24, 2020, at 2:59 PM, Gabriel, Edgar via users <
> users@lists.open-mpi.org> wrote:
>
> 
>
> I am not an expert on the one-sided code in Open MPI, but I wanted to comment
> briefly on the potential MPI-IO related item. As far as I can see, the
> error message
>
>
>
> “Read -1, expected 48, errno = 1”
>
> does not stem from MPI I/O, at least not from the ompio library. What file
> system did you use for these tests?
>
>
>
> Thanks
>
> Edgar
>
>
>
> *From:* users  *On Behalf Of *Matt
> Thompson via users
> *Sent:* Monday, February 24, 2020 1:20 PM
> *To:* users@lists.open-mpi.org
> *Cc:* Matt Thompson 
> *Subject:* [OMPI users] Help with One-Sided Communication: Works in Intel
> MPI, Fails in Open MPI
>
>
>
> All,
>
>
>
> My guess is this is an "I built Open MPI incorrectly" sort of issue, but
> I'm not sure how to fix it. Namely, I'm currently trying to get an MPI
> project's CI working on CircleCI using Open MPI to run some unit tests (on
> a single node, so I need to oversubscribe). I can build everything just
> fine, but when I try to run, things just...blow up:
>
>
>
> [root@3796b115c961 build]# /opt/openmpi-4.0.2/bin/mpirun -np 18
> -oversubscribe /root/project/MAPL/build/bin/pfio_ctest_io.x -nc 6 -nsi 6
> -nso 6 -ngo 1 -ngi 1 -v T,U -s mpi
>  start app rank:   0
>  start app rank:   1
>  start app rank:   2
>  start app rank:   3
>  start app rank:   4
>  start app rank:   5
> [3796b115c961:03629] Read -1, expected 48, errno = 1
> [3796b115c961:03629] *** An error occurred in MPI_Get
> [3796b115c961:03629] *** reported by process [2144600065,12]
> [3796b115c961:03629] *** on 

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Matt Thompson via users
On Mon, Feb 24, 2020 at 4:57 PM Gabriel, Edgar 
wrote:

> I am not an expert on the one-sided code in Open MPI, but I wanted to comment
> briefly on the potential MPI-IO related item. As far as I can see, the
> error message
>
>
>
> “Read -1, expected 48, errno = 1”
>
> does not stem from MPI I/O, at least not from the ompio library. What file
> system did you use for these tests?
>

I am not sure. It was happening in a Docker image running on an AWS EC2
instance, so I guess it's whatever EBS is? I'm sort of a neophyte at both AWS
and Docker, so combine the two and...

Matt


Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Nathan Hjelm via users
The error is from btl/vader. CMA is not functioning as expected. It might work 
if you set btl_vader_single_copy_mechanism=none
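
For example, either of these should be equivalent (assuming the usual OMPI_MCA_
environment-variable naming for MCA parameters):

$ mpirun --mca btl_vader_single_copy_mechanism none -np 2 ./a.out
$ OMPI_MCA_btl_vader_single_copy_mechanism=none mpirun -np 2 ./a.out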

Performance will suffer though. It would be worth understanding why
process_vm_readv is failing.

Can you send a simple reproducer?

-Nathan

> On Feb 24, 2020, at 2:59 PM, Gabriel, Edgar via users 
>  wrote:
> 
> 
> I am not an expert on the one-sided code in Open MPI, but I wanted to comment
> briefly on the potential MPI-IO related item. As far as I can see, the error
> message
>  
> “Read -1, expected 48, errno = 1” 
> 
> does not stem from MPI I/O, at least not from the ompio library. What file 
> system did you use for these tests?
>  
> Thanks
> Edgar
>  
> From: users  On Behalf Of Matt Thompson via 
> users
> Sent: Monday, February 24, 2020 1:20 PM
> To: users@lists.open-mpi.org
> Cc: Matt Thompson 
> Subject: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, 
> Fails in Open MPI
>  
> All,
>  
> My guess is this is an "I built Open MPI incorrectly" sort of issue, but I'm
> not sure how to fix it. Namely, I'm currently trying to get an MPI project's
> CI working on CircleCI using Open MPI to run some unit tests (on a single
> node, so I need to oversubscribe). I can build everything just fine, but when
> I try to run, things just...blow up:
>  
> [root@3796b115c961 build]# /opt/openmpi-4.0.2/bin/mpirun -np 18 
> -oversubscribe /root/project/MAPL/build/bin/pfio_ctest_io.x -nc 6 -nsi 6 -nso 
> 6 -ngo 1 -ngi 1 -v T,U -s mpi
>  start app rank:   0
>  start app rank:   1
>  start app rank:   2
>  start app rank:   3
>  start app rank:   4
>  start app rank:   5
> [3796b115c961:03629] Read -1, expected 48, errno = 1
> [3796b115c961:03629] *** An error occurred in MPI_Get
> [3796b115c961:03629] *** reported by process [2144600065,12]
> [3796b115c961:03629] *** on win rdma window 5
> [3796b115c961:03629] *** MPI_ERR_OTHER: known error not in list
> [3796b115c961:03629] *** MPI_ERRORS_ARE_FATAL (processes in this win will now 
> abort,
> [3796b115c961:03629] ***and potentially your MPI job)
>  
> I'm currently more concerned about the MPI_Get error, though I'm not sure 
> what that "Read -1, expected 48, errno = 1" bit is about (MPI-IO error?). Now 
> this code is fairly fancy MPI code, so I decided to try a simpler one. 
> Searched the internet and found an example program here:
>  
> https://software.intel.com/en-us/blogs/2014/08/06/one-sided-communication
>  
> and when I build and run with Intel MPI it works:
>  
> (1027)(master) $ mpirun -V
> Intel(R) MPI Library for Linux* OS, Version 2018 Update 4 Build 20180823 (id: 
> 18555)
> Copyright 2003-2018 Intel Corporation.
> (1028)(master) $ mpiicc rma_test.c
> (1029)(master) $ mpirun -np 2 ./a.out
> srun.slurm: cluster configuration lacks support for cpu binding
> Rank 0 running on borgj001
> Rank 1 running on borgj001
> Rank 0 sets data in the shared memory: 00 01 02 03
> Rank 1 sets data in the shared memory: 10 11 12 13
> Rank 0 gets data from the shared memory: 10 11 12 13
> Rank 1 gets data from the shared memory: 00 01 02 03
> Rank 0 has new data in the shared memory:Rank 1 has new data in the shared 
> memory: 10 11 12 13
>  00 01 02 03
>  
> So, I have some confidence it was written correctly. Now on the same system I 
> try with Open MPI (building with gcc, not Intel C):
>  
> (1032)(master) $ mpirun -V
> mpirun (Open MPI) 4.0.1
> 
> Report bugs to http://www.open-mpi.org/community/help/
> (1033)(master) $ mpicc rma_test.c
> (1034)(master) $ mpirun -np 2 ./a.out
> Rank 0 running on borgj001
> Rank 1 running on borgj001
> Rank 0 sets data in the shared memory: 00 01 02 03
> Rank 1 sets data in the shared memory: 10 11 12 13
> [borgj001:22668] *** An error occurred in MPI_Get
> [borgj001:22668] *** reported by process [2514223105,1]
> [borgj001:22668] *** on win rdma window 3
> [borgj001:22668] *** MPI_ERR_RMA_RANGE: invalid RMA address range
> [borgj001:22668] *** MPI_ERRORS_ARE_FATAL (processes in this win will now 
> abort,
> [borgj001:22668] ***and potentially your MPI job)
> [borgj001:22642] 1 more process has sent help message help-mpi-errors.txt / 
> mpi_errors_are_fatal
> [borgj001:22642] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
> help / error messages
>  
> This is a similar failure to above. Any ideas what I might be doing wrong 
> here? I don't doubt I'm missing something, but I'm not sure what. Open MPI 
> was built pretty boringly:
>  
> Configure command line: '--with-slurm' '--enable-shared' 
> '--disable-wrapper-rpath' '--disable-wrapper-runpath' 
> '--enable-mca-no-build=btl-usnic' '--prefix=...'
>  
> And I'm not sure if we need those disable-wrapper bits anymore, but long ago 
> we needed them, and so they've lived on in "how to build" READMEs until 
> something breaks. This btl-usnic is a bit unknown to me (this was built by 
> sysadmins on a cluster), but this is pretty close to how I build on my 
> 

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Gabriel, Edgar via users
I am not an expert on the one-sided code in Open MPI, but I wanted to comment
briefly on the potential MPI-IO related item. As far as I can see, the error
message

“Read -1, expected 48, errno = 1”

does not stem from MPI I/O, at least not from the ompio library. What file 
system did you use for these tests?
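
If you want to double-check which MPI-IO components your build provides, a quick
way (just a sketch; the exact output differs between versions) is:

$ ompi_info | grep "MCA io"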

Thanks
Edgar

From: users  On Behalf Of Matt Thompson via 
users
Sent: Monday, February 24, 2020 1:20 PM
To: users@lists.open-mpi.org
Cc: Matt Thompson 
Subject: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, 
Fails in Open MPI

All,

My guess is this is an "I built Open MPI incorrectly" sort of issue, but I'm not
sure how to fix it. Namely, I'm currently trying to get an MPI project's CI
working on CircleCI using Open MPI to run some unit tests (on a single node, so
I need to oversubscribe). I can build everything just fine, but when I try to
run, things just...blow up:

[root@3796b115c961 build]# /opt/openmpi-4.0.2/bin/mpirun -np 18 -oversubscribe 
/root/project/MAPL/build/bin/pfio_ctest_io.x -nc 6 -nsi 6 -nso 6 -ngo 1 -ngi 1 
-v T,U -s mpi
 start app rank:   0
 start app rank:   1
 start app rank:   2
 start app rank:   3
 start app rank:   4
 start app rank:   5
[3796b115c961:03629] Read -1, expected 48, errno = 1
[3796b115c961:03629] *** An error occurred in MPI_Get
[3796b115c961:03629] *** reported by process [2144600065,12]
[3796b115c961:03629] *** on win rdma window 5
[3796b115c961:03629] *** MPI_ERR_OTHER: known error not in list
[3796b115c961:03629] *** MPI_ERRORS_ARE_FATAL (processes in this win will now 
abort,
[3796b115c961:03629] ***and potentially your MPI job)

I'm currently more concerned about the MPI_Get error, though I'm not sure what 
that "Read -1, expected 48, errno = 1" bit is about (MPI-IO error?). Now this 
code is fairly fancy MPI code, so I decided to try a simpler one. Searched the 
internet and found an example program here:

https://software.intel.com/en-us/blogs/2014/08/06/one-sided-communication

and when I build and run with Intel MPI it works:

(1027)(master) $ mpirun -V
Intel(R) MPI Library for Linux* OS, Version 2018 Update 4 Build 20180823 (id: 
18555)
Copyright 2003-2018 Intel Corporation.
(1028)(master) $ mpiicc rma_test.c
(1029)(master) $ mpirun -np 2 ./a.out
srun.slurm: cluster configuration lacks support for cpu binding
Rank 0 running on borgj001
Rank 1 running on borgj001
Rank 0 sets data in the shared memory: 00 01 02 03
Rank 1 sets data in the shared memory: 10 11 12 13
Rank 0 gets data from the shared memory: 10 11 12 13
Rank 1 gets data from the shared memory: 00 01 02 03
Rank 0 has new data in the shared memory:Rank 1 has new data in the shared 
memory: 10 11 12 13
 00 01 02 03

So, I have some confidence it was written correctly. Now on the same system I 
try with Open MPI (building with gcc, not Intel C):

(1032)(master) $ mpirun -V
mpirun (Open MPI) 4.0.1

Report bugs to http://www.open-mpi.org/community/help/
(1033)(master) $ mpicc rma_test.c
(1034)(master) $ mpirun -np 2 ./a.out
Rank 0 running on borgj001
Rank 1 running on borgj001
Rank 0 sets data in the shared memory: 00 01 02 03
Rank 1 sets data in the shared memory: 10 11 12 13
[borgj001:22668] *** An error occurred in MPI_Get
[borgj001:22668] *** reported by process [2514223105,1]
[borgj001:22668] *** on win rdma window 3
[borgj001:22668] *** MPI_ERR_RMA_RANGE: invalid RMA address range
[borgj001:22668] *** MPI_ERRORS_ARE_FATAL (processes in this win will now abort,
[borgj001:22668] ***and potentially your MPI job)
[borgj001:22642] 1 more process has sent help message help-mpi-errors.txt / 
mpi_errors_are_fatal
[borgj001:22642] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
help / error messages

This is a similar failure to above. Any ideas what I might be doing wrong here? 
I don't doubt I'm missing something, but I'm not sure what. Open MPI was built 
pretty boringly:

Configure command line: '--with-slurm' '--enable-shared' 
'--disable-wrapper-rpath' '--disable-wrapper-runpath' 
'--enable-mca-no-build=btl-usnic' '--prefix=...'

And I'm not sure if we need those disable-wrapper bits anymore, but long ago we 
needed them, and so they've lived on in "how to build" READMEs until something 
breaks. This btl-usnic is a bit unknown to me (this was built by sysadmins on a 
cluster), but this is pretty close to how I build on my desktop and it has the 
same issue.

Any ideas from the experts?

--
Matt Thompson
   “The fact is, this is about us identifying what we do best and
   finding more ways of doing less of it better” -- Director of Better Anna 
Rampton


[OMPI users] vader_single_copy_mechanism

2020-02-24 Thread Bennet Fauber via users
We are getting errors on our system that indicate that we should

export OMPI_MCA_btl_vader_single_copy_mechanism=none

Our user originally reported

> This occurs for both GCC and PGI.  The errors we get if we do not set this
> indicate something is going wrong in our communication which uses RMA,
> specifically a call to MPI_Get().

Kernel version

$ uname -r
3.10.0-957.10.1.el7.x86_64

$ ompi_info | grep vader
 MCA btl: vader (MCA v2.1.0, API v3.1.0, Component v4.0.2)

Our config.log file begins,

It was created by Open MPI configure 4.0.2, which was
generated by GNU Autoconf 2.69.  Invocation command line was

  $ ./configure --prefix=/sw/arcts/centos7/stacks/gcc/8.2.0/openmpi/4.0.2 \
--with-pmix=/opt/pmix/2.1.3 --with-libevent=external --with-hwloc=/usr \
--with-slurm --without-verbs --enable-shared --with-ucx CC=gcc FC=gfortran

and that resulted in this summary at the conclusion of configuration.

Open MPI configuration:
---
Version: 4.0.2
Build MPI C bindings: yes
Build MPI C++ bindings (deprecated): no
Build MPI Fortran bindings: mpif.h, use mpi, use mpi_f08
MPI Build Java bindings (experimental): no
Build Open SHMEM support: yes
Debug build: no
Platform file: (none)

Miscellaneous
---
CUDA support: no
HWLOC support: external
Libevent support: external
PMIx support: External (2x)

Transports
---
Cisco usNIC: no
Cray uGNI (Gemini/Aries): no
Intel Omnipath (PSM2): no
Intel TrueScale (PSM): no
Mellanox MXM: no
Open UCX: yes
OpenFabrics OFI Libfabric: no
OpenFabrics Verbs: no
Portals4: no
Shared memory/copy in+copy out: yes
Shared memory/Linux CMA: yes
Shared memory/Linux KNEM: no
Shared memory/XPMEM: no
TCP: yes

Resource Managers
---
Cray Alps: no
Grid Engine: no
LSF: no
Moab: no
Slurm: yes
ssh/rsh: yes
Torque: no

OMPIO File Systems
---
Generic Unix FS: yes
Lustre: no
PVFS2/OrangeFS: no

It seems that the CMA single-copy mechanism should be able to work (the build
summary above shows Shared memory/Linux CMA: yes), but it does not.

Our system is running Slurm, and we have configured Slurm to use
cgroups.  I do not know whether this problem arises only within a job
or also on a login node.
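
One quick check, both on a login node and inside a job (a sketch; as far as I
understand, CMA needs ptrace permission, and a yama ptrace_scope of 0 means
unrestricted):

$ cat /proc/sys/kernel/yama/ptrace_scope
$ srun -N1 -n1 cat /proc/sys/kernel/yama/ptrace_scope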

Anyone know what else I might need to do to enable it?

Thanks, in advance,

-- bennet