Re: [OMPI users] Unable to run a python code on cluster with mpirun in parallel

2019-09-09 Thread Ralph Castain via users
Take a look at "man orte_hosts" for a full explanation of how to use a 
hostfile. /etc/hosts is not a properly formatted hostfile.

You really just want a file that lists the names of the hosts, one per line, as 
that is the simplest hostfile.
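
As a rough sketch of that one-name-per-line layout, the snippet below writes such a file and prints the matching mpirun command without running it. The file name my_hostfile and both helper functions are made up for illustration. The names ti1 and ti2 are taken from the /etc/hosts listing quoted below, and treating them as the nodes you may run on is only an assumption, so substitute your own.

```python
from pathlib import Path


def write_hostfile(path, hosts):
    """Write one host name per line, skipping empty entries."""
    lines = [h.strip() for h in hosts if h.strip()]
    Path(path).write_text("\n".join(lines) + "\n")
    return lines


def mpirun_command(hostfile, nprocs, script):
    """Build (but do not run) the mpirun invocation as an argument list."""
    return ["mpirun", "-n", str(nprocs), "-hostfile", str(hostfile),
            "python", script]


# Assumed host names; replace them with the nodes you are allowed to use.
hosts = ["ti1", "ti2"]
write_hostfile("my_hostfile", hosts)  # hypothetical file name

# Print the command so it can be checked before running it by hand.
print(" ".join(mpirun_command("my_hostfile", 5, "parallel_simulation.py")))
```

The resulting my_hostfile contains only the two names, each on its own line, with no IP addresses or aliases.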

> On Sep 7, 2019, at 4:23 AM, Sepinoud Azimi via users wrote:
> 
> Hi,
> 
> I have parallelized code that works fine on my local computer with the 
> command 
> 
>   $ mpirun -n 5 python parallel_simulation.py
> 
> but when I run the same code on the cluster I only get one process and it 
> is not running in parallel. I have both mpich and openmpi loaded on the 
> cluster. 
> 
> I tried to use 
> 
>   $ mpirun -n 5 -hostfile /etc/hosts python parallel_simulation.py
> 
> which gives me the error 
> 
> Open RTE detected a parse error in the hostfile:
>     /etc/hosts
> It occured on line number 39 on token 12.
> --
> --
> An internal error has occurred in ORTE:
> 
> [[57450,0],0] FORCE-TERMINATE AT (null):1 - error base/ras_base_allocate.c(302)
> 
> This is something that should be reported to the developers.
> 
> This is what I get when I run:
> 
> $ cat /etc/hosts
> 
> # Ansible managed file, do not edit directly
> 
> 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
> ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
> 
> # Hosts from ansible hosts file
> 10.1.1.2 titan-install.int.utu.fi titan-install
> 10.1.1.1 titan-admin.int.utu.fi titan-admin
> 10.1.1.3 titan-grid.int.utu.fi titan-grid
> 10.1.100.1 ti1.int.utu.fi ti1
> 10.2.100.1 ti1-ib.int.utu.fi ti1-ib
> 10.1.100.2 ti2.int.utu.fi ti2
> 
> 
> I would be very grateful if someone could suggest a solution. I am very new 
> to this and I am not sure how to solve the problem.
> ___
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users




[OMPI users] Unable to run a python code on cluster with mpirun in parallel

2019-09-07 Thread Sepinoud Azimi via users
Hi,

I have parallelized code that works fine on my local computer with the 
command 

   $ mpirun -n 5 python parallel_simulation.py

but when I run the same code on the cluster I only get one process and it is 
not running in parallel. I have both mpich and openmpi loaded on the cluster. 

I tried to use 

   $ mpirun -n 5 -hostfile /etc/hosts python parallel_simulation.py

which gives me the error 

Open RTE detected a parse error in the hostfile:
    /etc/hosts
It occured on line number 39 on token 12.
--
--
An internal error has occurred in ORTE:

[[57450,0],0] FORCE-TERMINATE AT (null):1 - error base/ras_base_allocate.c(302)

This is something that should be reported to the developers.

This is what I get when I run:

$ cat /etc/hosts

# Ansible managed file, do not edit directly

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6

# Hosts from ansible hosts file
10.1.1.2 titan-install.int.utu.fi titan-install
10.1.1.1 titan-admin.int.utu.fi titan-admin
10.1.1.3 titan-grid.int.utu.fi titan-grid
10.1.100.1 ti1.int.utu.fi ti1
10.2.100.1 ti1-ib.int.utu.fi ti1-ib
10.1.100.2 ti2.int.utu.fi ti2


I would be very grateful if someone could suggest a solution. I am very new to 
this and I am not sure how to solve the problem.