FYI for developers: I just merged a small Python script into the ompi-scripts
repo that watches for CI on a PR to finish. I find this very handy,
especially because Open MPI's CI can take anywhere from 15 minutes to multiple
hours. This script lets me file a PR and then move on to something
I ran some tests on our Omni-Path cluster, and I have a mixed bag of
results with 4.0.0rc1.
1. Good news: the problems with the psm2 mtl that I reported in June/July
seem to be fixed. However, I still get a warning every time I run a job with
4.0.0, e.g.
compute-1-1.local.4351PSM2
Hi Edgar,
I also saw some similar issues, not exactly the same, but they look very similar
(maybe because of a different version of libpsm2). 1 and 2 are related to the
introduction of the OFI BTL and the fact that it opens an OFI EP in its init
function. I see that all btls call the init function d
Matias,
IIRC, the OFI BTL only creates one EP. If you move it to add_proc, you might need to
add some checks to avoid re-creating the EP over and over. Do you think moving EP
creation from component_init to component_open will solve the problem?
Arm
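
For illustration, the kind of check mentioned above could look roughly like
the sketch below. It is only a sketch under assumptions: the names
sketch_module_t, sketch_create_ep, sketch_ensure_ep and sketch_add_procs are
hypothetical and are not the actual OFI BTL symbols, and the real endpoint
creation (which would go through libfabric) is replaced by a simple counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a BTL module; not the real OFI BTL type. */
typedef struct {
    bool ep_created;      /* true once the endpoint exists */
    int  ep_create_calls; /* how many times creation actually ran */
} sketch_module_t;

/* Placeholder for the real endpoint creation; here it only counts calls. */
static int sketch_create_ep(sketch_module_t *module)
{
    module->ep_create_calls++;
    return 0; /* 0 means success in this sketch */
}

/* Create the endpoint lazily, at most once per module. */
static int sketch_ensure_ep(sketch_module_t *module)
{
    assert(module != NULL);
    if (module->ep_created) {
        return 0; /* already created, nothing to do */
    }
    int rc = sketch_create_ep(module);
    if (rc == 0) {
        module->ep_created = true;
    }
    return rc;
}

/* Hypothetical add_proc-style hook; may be called many times. */
static int sketch_add_procs(sketch_module_t *module, size_t nprocs)
{
    (void)nprocs; /* peers are not used in this sketch */
    return sketch_ensure_ep(module);
}
```

However often sketch_add_procs is called, sketch_create_ep runs only once per
module. The sketch is not thread-safe; if the hook can be entered
concurrently, the flag would need a lock or an atomic.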
> On Sep 19, 2018, at 1:08 PM, Cabral, Matias A
> wr
Hi Arm,
> IIRC, the OFI BTL only creates one EP
Correct. But only one is needed to trigger the issues below. The
manifestations differ depending on the combination of MTL (OFI/PSM2), the version
of libpsm2, and support for OFI Scalable EPs.
> Do you think moving EP creation from component_ini