Re: [OMPI users] --host works but --hostfile does not

2017-06-22 Thread r...@open-mpi.org
From “man mpirun” - note that not specifying “slots=N” in a hostfile defaults 
to slots=#cores on that node (as it states in the text):

   Specifying Host Nodes
   Host nodes can be identified on the mpirun command line with the -host
   option or in a hostfile.

   For example,

   mpirun -H aa,aa,bb ./a.out
   launches two processes on node aa and one on bb.

   Or, consider the hostfile

  % cat myhostfile
  aa slots=2
  bb slots=2
  cc slots=2

   Here, we list both the host names (aa, bb, and cc) but also how many
   "slots" there are for each.  Slots indicate how many processes can
   potentially execute on a node.  For best performance, the number of
   slots may be chosen to be the number of cores on the node or the number
   of processor sockets.  If the hostfile does not provide slots
   information, Open MPI will attempt to discover the number of cores (or
   hwthreads, if the use-hwthreads-as-cpus option is set) and set the
   number of slots to that value.  This default behavior also occurs when
   specifying the -host option with a single hostname.  Thus, the command

   mpirun -H aa ./a.out
   launches a number of processes equal to the number of cores on node aa.

   mpirun -hostfile myhostfile ./a.out
   will launch two processes on each of the three nodes.

   mpirun -hostfile myhostfile -host aa ./a.out
   will launch two processes, both on node aa.

   mpirun -hostfile myhostfile -host dd ./a.out
   will find no hosts to run on and abort with an error.  That is, the
   specified host dd is not in the specified hostfile.

   When running under resource managers (e.g., SLURM, Torque, etc.), Open
   MPI will obtain both the hostnames and the number of slots directly from
   the resource manager.

   Specifying Number of Processes
   As we have just seen, the number of processes to run can be set using
   the hostfile.  Other mechanisms exist.

   The number of processes launched can be specified as a multiple of the
   number of nodes or processor sockets available.  For example,

   mpirun -H aa,bb -npersocket 2 ./a.out
   launches processes 0-3 on node aa and processes 4-7 on node bb, where
   aa and bb are both dual-socket nodes.  The -npersocket option also turns
   on the -bind-to-socket option, which is discussed in a later section.

   mpirun -H aa,bb -npernode 2 ./a.out
   launches processes 0-1 on node aa and processes 2-3 on node bb.

   mpirun -H aa,bb -npernode 1 ./a.out
   launches one process per host node.

   mpirun -H aa,bb -pernode ./a.out
   is the same as -npernode 1.

   Another alternative is to specify the number of processes with the -np
   option.  Consider now the hostfile

  % cat myhostfile
  aa slots=4
  bb slots=4
  cc slots=4

   Now,

   mpirun -hostfile myhostfile -np 6 ./a.out
   will launch processes 0-3 on node aa and processes 4-5 on node bb.  The
   remaining slots in the hostfile will not be used since the -np option
   indicated that only 6 processes should be launched.
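
Applied to the hostfile from the question below, the slots default would explain the output: a bare "b1" line gets as many slots as b1 has cores, so -np 3 can fit entirely on b1. A minimal sketch of one thing to try, assuming the b1/b2/b3 host names from the original post; the file name hostfile.oneslot is made up for illustration, and this has not been checked against 1.8.4 specifically:

```shell
# Write a hostfile that caps every node at one slot, so three processes
# cannot all be placed on b1.
cat > hostfile.oneslot <<'EOF'
b1 slots=1
b2 slots=1
b3 slots=1
EOF

# Launch only where mpirun is actually installed; b1..b3 must be reachable.
if command -v mpirun >/dev/null 2>&1; then
    mpirun -np 3 --hostfile hostfile.oneslot hostname \
        || echo "launch failed (are b1, b2 and b3 reachable from here?)"
fi

# Possible alternative that keeps the original hostfile unchanged and asks
# for round-robin placement across nodes instead:
#   mpirun -np 3 --hostfile hostfile --map-by node hostname
```

With one slot per node the three processes should land on b1, b2 and b3, matching what --host b1,b2,b3 produced.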


> On Jun 22, 2017, at 7:49 AM, Info via users wrote:
> 
> I am just learning to use openmpi 1.8.4 that is installed on our cluster. I 
> am running into a baffling issue. If I run:
> 
> mpirun -np 3 --host b1,b2,b3 hostname
> 
> I get the expected output:
> 
> b1
> b2
> b3
> 
> But if I do:
> 
> mpirun -np 3 --hostfile hostfile hostname
> 
> I get:
> 
> b1
> b1
> b1
> 
> Where hostfile contains:
> 
> b1
> b2
> b3
> 
> Any ideas what could be going on?
> ___
> users mailing list
> users@lists.open-mpi.org
> https://rfd.newmexicoconsortium.org/mailman/listinfo/users

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users

Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread r...@open-mpi.org
I gather you are using OMPI 2.x, yes? And you configured it 
--with-pmi=, then moved the executables/libs to your workstation?

I suppose I could state the obvious and say “don’t do that - just rebuild it”, 
and I fear that (after checking the 2.x code) you really have no choice. OMPI 
v3.0 will have a way around the problem, but not the 2.x series.


> On Jun 22, 2017, at 8:04 AM, Michael Di Domenico wrote:
> 
> On Thu, Jun 22, 2017 at 10:43 AM, John Hearns via users wrote:
>> Having had some problems with ssh launching (a few minutes ago) I can
>> confirm that this works:
>> 
>> --mca plm_rsh_agent "ssh -v"
> 
> this doesn't do anything for me
> 
> if i set OMPI_MCA_sec=^munge
> 
> i can clear the mca_sec_munge error
> 
> but the mca_pmix_pmix112 and opal_pmix_base_select errors still
> exist.  the plm_rsh_agent switch/env var doesn't seem to affect that
> error
> 
> down the road, i may still need the rsh_agent flag, but i think we're
> still before that in the sequence of events

Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread Michael Di Domenico
On Thu, Jun 22, 2017 at 10:43 AM, John Hearns via users wrote:
> Having had some problems with ssh launching (a few minutes ago) I can
> confirm that this works:
>
> --mca plm_rsh_agent "ssh -v"

this doesn't do anything for me

if i set OMPI_MCA_sec=^munge

i can clear the mca_sec_munge error

but the mca_pmix_pmix112 and opal_pmix_base_select errors still
exist.  the plm_rsh_agent switch/env var doesn't seem to affect that
error

down the road, i may still need the rsh_agent flag, but i think we're
still before that in the sequence of events


[OMPI users] --host works but --hostfile does not

2017-06-22 Thread Info via users
I am just learning to use openmpi 1.8.4 that is installed on our cluster. I am 
running into a baffling issue. If I run:

mpirun -np 3 --host b1,b2,b3 hostname

I get the expected output:

b1
b2
b3

But if I do:

mpirun -np 3 --hostfile hostfile hostname

I get:

b1
b1
b1

Where hostfile contains:

b1
b2
b3

Any ideas what could be going on?


[OMPI users] Openmpi with btl_openib_ib_service_level

2017-06-22 Thread John Hearns via users
I may have asked this recently (if so, sorry).
If anyone has worked with QoS settings with Open MPI, please ping me off list,
e.g.


mpirun --mca btl_openib_ib_service_level N

Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread John Hearns via users
Having had some problems with ssh launching (a few minutes ago) I can
confirm that this works:

--mca plm_rsh_agent "ssh -v"

Stupidly I thought there was a major problem - when it turned out I could
not ssh into a host.. ahem.



On 22 June 2017 at 16:35, r...@open-mpi.org wrote:

> You can add "OMPI_MCA_plm=rsh OMPI_MCA_sec=^munge" to your environment
>
>
> On Jun 22, 2017, at 7:28 AM, John Hearns via users <
> users@lists.open-mpi.org> wrote:
>
> Michael,  try
>  --mca plm_rsh_agent ssh
>
> I've been fooling with this myself recently, in the context of a PBS
> cluster
>
> On 22 June 2017 at 16:16, Michael Di Domenico wrote:
>
>> is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
>> command line or (better) using environment variables?
>>
>> i'd like to use the installed version of openmpi i have on a
>> workstation, but it's linked with slurm from one of my clusters.
>>
>> mpi/slurm work just fine on the cluster, but when i run it on a
>> workstation i get the below errors
>>
>> mca_base_component_repository_open: unable to open mca_sec_munge:
>> libmunge missing
>> ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
>> opal_pmix_base_select failed
>> returned value not found (-13) instead of orte_success
>>
>> there's probably a magical incantation of mca parameters, but i'm not
>> adept enough at determining what they are

Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread Michael Di Domenico
that took care of one of the errors, but i missed a re-type on the second error

mca_base_component_repository_open: unable to open mca_pmix_pmix112:
libmunge missing

and the opal_pmix_base_select error is still there (which is what's
actually halting my job)



On Thu, Jun 22, 2017 at 10:35 AM, r...@open-mpi.org wrote:
> You can add "OMPI_MCA_plm=rsh OMPI_MCA_sec=^munge" to your environment
>
>
> On Jun 22, 2017, at 7:28 AM, John Hearns via users wrote:
>
> Michael,  try
>  --mca plm_rsh_agent ssh
>
> I've been fooling with this myself recently, in the context of a PBS cluster
>
> On 22 June 2017 at 16:16, Michael Di Domenico wrote:
>>
>> is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
>> command line or (better) using environment variables?
>>
>> i'd like to use the installed version of openmpi i have on a
>> workstation, but it's linked with slurm from one of my clusters.
>>
>> mpi/slurm work just fine on the cluster, but when i run it on a
>> workstation i get the below errors
>>
>> mca_base_component_repository_open: unable to open mca_sec_munge:
>> libmunge missing
>> ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
>> opal_pmix_base_select failed
>> returned value not found (-13) instead of orte_success
>>
>> there's probably a magical incantation of mca parameters, but i'm not
>> adept enough at determining what they are

Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread r...@open-mpi.org
You can add "OMPI_MCA_plm=rsh OMPI_MCA_sec=^munge" to your environment
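
In a Bourne-style shell that could look like the sketch below. The caret is quoted so the shell passes it through literally, and ./a.out is only a placeholder program name. Each OMPI_MCA_<name> variable corresponds to passing "--mca <name> <value>" on the mpirun command line. Per the follow-ups in this thread, this clears the mca_sec_munge error but not the pmix one.

```shell
# Select the rsh/ssh launcher rather than the SLURM one.
export OMPI_MCA_plm=rsh
# Exclude the munge security component; a leading "^" means
# "every component except the ones listed".
export OMPI_MCA_sec='^munge'

# Show the MCA settings that mpirun will inherit from the environment.
env | grep '^OMPI_MCA_' | sort

# One-off form on the command line (./a.out is a placeholder):
#   mpirun --mca plm rsh --mca sec ^munge -np 1 ./a.out
```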


> On Jun 22, 2017, at 7:28 AM, John Hearns via users wrote:
> 
> Michael,  try
>  --mca plm_rsh_agent ssh
> 
> I've been fooling with this myself recently, in the context of a PBS cluster
> 
> On 22 June 2017 at 16:16, Michael Di Domenico wrote:
> is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
> command line or (better) using environment variables?
> 
> i'd like to use the installed version of openmpi i have on a
> workstation, but it's linked with slurm from one of my clusters.
> 
> mpi/slurm work just fine on the cluster, but when i run it on a
> workstation i get the below errors
> 
> mca_base_component_repository_open: unable to open mca_sec_munge:
> libmunge missing
> ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
> opal_pmix_base_select failed
> returned value not found (-13) instead of orte_success
> 
> there's probably a magical incantation of mca parameters, but i'm not
> adept enough at determining what they are

Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread John Hearns via users
Michael,  try
 --mca plm_rsh_agent ssh

I've been fooling with this myself recently, in the context of a PBS cluster

On 22 June 2017 at 16:16, Michael Di Domenico wrote:

> is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
> command line or (better) using environment variables?
>
> i'd like to use the installed version of openmpi i have on a
> workstation, but it's linked with slurm from one of my clusters.
>
> mpi/slurm work just fine on the cluster, but when i run it on a
> workstation i get the below errors
>
> mca_base_component_repository_open: unable to open mca_sec_munge:
> libmunge missing
> ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
> opal_pmix_base_select failed
> returned value not found (-13) instead of orte_success
>
> there's probably a magical incantation of mca parameters, but i'm not
> adept enough at determining what they are

[OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread Michael Di Domenico
is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
command line or (better) using environment variables?

i'd like to use the installed version of openmpi i have on a
workstation, but it's linked with slurm from one of my clusters.

mpi/slurm work just fine on the cluster, but when i run it on a
workstation i get the below errors

mca_base_component_repository_open: unable to open mca_sec_munge:
libmunge missing
ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
opal_pmix_base_select failed
returned value not found (-13) instead of orte_success

there's probably a magical incantation of mca parameters, but i'm not
adept enough at determining what they are


[OMPI users] open program to profiling my openMPI code

2017-06-22 Thread Diego Avesani
Dear all,

I have written a program in Fortran (compiled with gfortran) that uses Open MPI.
I would like to profile it.

Can someone suggest an open-source program to profile it?
I have done some Internet research, but I do not have enough information to
choose the best one.

Thanks in advance to all of you.


Diego