Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-09 Thread Jing Gong via users
Hi Ralph,


Slurm seems to be configured with PMIx.


$ ls /usr/lib64/slurm/ |grep pmi
acct_gather_energy_ipmi.so
mpi_pmi2.so
mpi_pmix.so
mpi_pmix_v1.so


(and libpmix* in /usr/lib64)



Anyway, I recompiled openmpi v3.0.0 with


$ ./configure --with-pmix=/usr --with-slurm ...


but this time I could not even run "mpirun":


$ mpirun -n 4 ./a.out
 [[42812,0],0] ORTE_ERROR_LOG: Not found in file ../../../../../openmpi-3.0.0/orte/mca/ess/hnp/ess_hnp_module.c at line 649
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_pmix_base_select failed
  --> Returned value Not found (-13) instead of ORTE_SUCCESS
--


What is the issue?


Thanks a lot.


/Jing






___
users mailing list
users@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/users

Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-09 Thread Jing Gong via users

Hi,


Sorry, the email was held and resent this morning by the mail server. Please 
ignore it, since the conversation has already moved on.


Thanks. /Jing




Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-09 Thread Ralph Castain via users
Remove the extra configure options from OMPI - you don't need to tell it 
--with-pmix or --with-slurm. It will "do the right thing" without that stuff.




Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-09 Thread Ralph Castain via users
Artem - do you have any suggestions?


On Aug 8, 2019, at 12:06 PM, Jing Gong <gongj...@kth.se> wrote:

Hi Ralph,

> Did you remember to add "--mpi=pmix" to your srun cmd line?

On the cluster,

$ srun  --mpi=list
srun: MPI types are...
srun: none
srun: openmpi
srun: pmi2
srun: pmix
srun: pmix_v1

I have tested srun with --mpi=pmi2, --mpi=pmix and --mpi=pmix_v1, but none of them ran successfully.

Thanks. /Jing
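
A minimal sketch, assuming "srun --mpi=list" prints one "srun: <type>" line per plugin as in the listing above. The helper name first_pmix_plugin is hypothetical; it picks the first PMIx plugin from that text so the same name can be passed to --mpi=.

```shell
# Hypothetical helper: read `srun --mpi=list`-style text on stdin and
# print the first plugin whose name starts with "pmix".
first_pmix_plugin() {
    sed -n 's/^srun: \(pmix[A-Za-z0-9_]*\)$/\1/p' | head -n 1
}

# Example use on a cluster (srun is assumed to write the list to stderr,
# hence the 2>&1):
#   plugin=$(srun --mpi=list 2>&1 | first_pmix_plugin)
#   srun --mpi="$plugin" -n 4 ./a.out
```

Newer Slurm releases may format this list differently, so the pattern may need adjusting.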



From: Ralph Castain <r...@open-mpi.org>
Sent: Thursday, August 8, 2019 21:01
To: Jing Gong
Subject: Re: [OMPI users] OMPI was not built with SLURM's PMI support
 Did you remember to add "--mpi=pmix" to your srun cmd line?



Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-09 Thread Jing Gong via users
Hi Ralph,


> Remove the extra configure options from OMPI - you don't need to tell it 
> --with-pmix or --with-slurm. It will "do the right thing" without that stuff.

Yes, "mpirun" works without these additional flags. But the original issue 
"OMPI was not built with SLURM's PMI support ..." is back when using "srun".

Thanks.

/Jing






Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-09 Thread Artem Polyakov via users
Hello,

I'd suggest running a standalone PMIx application under Slurm to tell an 
OMPI/PMIx issue apart from a Slurm/PMIx issue.

As the PMIx application you can use the PMIx perf tool:
https://github.com/pmix/pmix/tree/master/contrib/perf_tools

The page has build and run instructions.
In "run.sh", replace "mpirun" with the corresponding srun command:

"srun -n $np `pwd`/pmix_intra_perf $@"

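A rough sketch of that substitution, assuming the process count is the first argument and everything else is passed through to the tool. The function name perf_launch_cmd and the argument layout are hypothetical, and the real run.sh may differ. It only prints the command so it can be inspected before being run by hand.

```shell
# Hypothetical helper: print the srun line that would replace the
# mpirun line in run.sh. $1 is the process count; the remaining
# arguments are forwarded to pmix_intra_perf unchanged.
perf_launch_cmd() {
    np=$1
    shift
    echo "srun -n $np $(pwd)/pmix_intra_perf $*"
}

# Example (run from the directory that contains pmix_intra_perf):
#   perf_launch_cmd 4 extra args
```
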

Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-08 Thread Jing Gong via users
Hi Gilles,


> You need to

> configure --with-pmi ...


Originally I specified the "--with-pmi" flag, but after seeing this in the srun output

...
  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.
...

I recompiled with the --with-pmix flag instead.


Thanks. /Jing




Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-08 Thread Ralph Castain via users
Did you configure Slurm to use PMIx? If so, then you simply need to set the 
"--mpi=pmix" or "--mpi=pmix_v2" (depending on which version of PMIx you used) 
flag on your srun cmd line so it knows to use it.

If not (and you can't fix it), then you have to explicitly configure OMPI to 
use Slurm's legacy PMI libraries - we won't do that by default. "./configure 
--help" will show you what needs to be done.

See https://slurm.schedmd.com/mpi_guide.html for assistance on checking your 
Slurm config and setting it up with PMIx support.

Ralph
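
One way to check the Slurm side, as a sketch: "scontrol show config" reports the cluster-wide MpiDefault setting, which srun falls back to when no --mpi= flag is given. The helper name mpi_default is hypothetical, and the "Key = value" line layout is an assumption about the scontrol output.

```shell
# Hypothetical helper: read `scontrol show config`-style text on stdin
# and print the value of the MpiDefault setting.
mpi_default() {
    sed -n 's/^MpiDefault *= *\([^ ]*\).*$/\1/p'
}

# Example use on a cluster:
#   scontrol show config | mpi_default
# If this prints "none", srun needs an explicit --mpi=pmix (or similar).
```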



Re: [OMPI users] OMPI was not built with SLURM's PMI support

2019-08-08 Thread Gilles GOUAILLARDET via users
Hi,

You need to

configure --with-pmi ...


Cheers,

Gilles


[OMPI users] OMPI was not built with SLURM's PMI support

2019-08-08 Thread Jing Gong via users
Hi,


Recently our Slurm system has been upgraded to 19.0.5. I tried to recompile 
openmpi v3.0 due to the bug reported in


https://bugs.schedmd.com/show_bug.cgi?id=6993


The configure flags are:


$ ./configure --enable-shared --enable-static --with-slurm --with-pmix


and the output of ompi_info is as follows:


$ ompi_info -a |grep pmix
  Configure command line: '--enable-shared' '--enable-static' '--with-slurm' '--with-pmix'
                MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component v3.0.0)
                MCA pmix: pmix2x (MCA v2.1.0, API v2.0.0, Component v3.0.0)
           MCA pmix base: ---
           MCA pmix base: parameter "pmix" (current value: "", data source: default, level: 2 user/detail, type: string)
                          Default selection set of components for the pmix framework (<none> means use all components that can be found)
           MCA pmix base: ---
           MCA pmix base: parameter "pmix_base_verbose" (current value: "error", data source: default, level: 8 dev/detail, type: int)
                          Verbosity level for the pmix framework (default: 0)
           MCA pmix base: parameter "pmix_base_async_modex" (current value: "false", data source: default, level: 9 dev/all, type: bool)
           MCA pmix base: parameter "pmix_base_collect_data" (current value: "true", data source: default, level: 9 dev/all, type: bool)
           MCA pmix base: parameter "pmix_base_exchange_timeout" (current value: "-1", data source: default, level: 3 user/all, type: int)
         MCA pmix pmix2x: ---
         MCA pmix pmix2x: parameter "pmix_pmix2x_silence_warning" (current value: "false", data source: default, level: 4 tuner/basic, type: bool)


But when I launch the Open MPI program with srun, I get an error like this:




$ srun -n 4 ./a.out


--
The application appears to have been direct launched using "srun",
but OMPI was not built with SLURM's PMI support and therefore cannot
execute. There are several options for building PMI support under
SLURM, depending upon the SLURM version you are using:

  version 16.05 or later: you can use SLURM's PMIx support. This
  requires that you configure and build SLURM --with-pmix.

  Versions earlier than 16.05: you must use either SLURM's PMI-1 or
  PMI-2 support. SLURM builds PMI-1 by default, or you can manually
  install PMI-2. You must then build Open MPI using --with-pmi pointing
  to the SLURM PMI library location.

Please configure as appropriate and try again.
--
*** An error occurred in MPI_Init
*** on a NULL communicator
*** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
***    and potentially your MPI job)
Local abort before MPI_INIT completed completed successfully, but am not able 
to aggregate error messages, and not able to guarantee that all other processes 
were killed!
===


How can I check whether Open MPI was built with PMI support?
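
A small sketch for answering this from the ompi_info output: list the components of the pmix framework. The helper name list_pmix_components is hypothetical. As far as I know, in the Open MPI 3.0 series the Slurm PMI-1 and PMI-2 components are named s1 and s2, so their absence (the output above shows only isolated and pmix2x) would suggest the build has no legacy Slurm PMI support. Treat those component names as an assumption and confirm them against your own build.

```shell
# Hypothetical helper: read ompi_info-style text on stdin and print the
# name of each component in the pmix framework, one per line.
# Lines for "MCA pmix base:" or "MCA pmix pmix2x:" are parameters, not
# components, and are skipped because the colon must follow "pmix".
list_pmix_components() {
    sed -n 's/^ *MCA pmix: \([A-Za-z0-9_]*\) .*/\1/p'
}

# Example use:
#   ompi_info | list_pmix_components
```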


Thanks a lot. /Jing



