Re: [OMPI users] Viewing the output of the program

2016-10-01 Thread Gilles Gouaillardet
Mahmood,

IIRC, a related bug in v2.0.0 was recently fixed.
Can you please update to v2.0.1 and try again?
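
In the meantime, it may help to narrow down whether the launcher's
stdin/stdout forwarding is at fault, independent of siesta. A minimal test
program along the lines of the sketch below (the program name and the input
file are just placeholders) can be launched exactly the way you launch
siesta, once with -np 2 and once with -np 4:

#include <mpi.h>
#include <stdio.h>

/* Tiny stdin/stdout forwarding test.  Run it the same way as siesta, e.g.
 *   mpirun --hostfile hosts.txt -np 4 ./stdin_test < A.fdf
 * Rank 0 echoes whatever it receives on stdin.  If the echo shows up with
 * -np 2 but not with -np 4, the problem is in the launcher's I/O
 * forwarding rather than in the application. */
int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        char line[1024];
        long count = 0;
        while (fgets(line, sizeof(line), stdin) != NULL) {
            count++;
            printf("rank 0 read line %ld: %s", count, line);
            fflush(stdout);          /* make each line visible immediately */
        }
        printf("rank 0 reached end of stdin after %ld lines\n", count);
        fflush(stdout);
    }

    /* keep the other ranks around until rank 0 has drained stdin */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}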

Cheers,

Gilles


[OMPI users] Viewing the output of the program

2016-10-01 Thread Mahmood Naderan
Hi,
I am seeing some bizarre behavior on the system and hope that someone can
clarify whether it is related to OMPI or not.

When I issue the mpirun command with -np 2, I can see the output of the
program live as it runs (I mean on stdout). However, if I issue the command
with -np 4, no progress is shown at all!


Please see the output below. I ran the 'date' command first and then issued
the mpirun command with '-np 4'. After a few seconds, I pressed ^C and ran
'date' again. As you can see, no output was produced. Next, I ran with
'-np 2' and pressed ^C after a while. You can see that the progress of the
program is shown.




mahmood@cluster:A4$ date
Sat Oct  1 11:26:13 2016
mahmood@cluster:A4$ /share/apps/computer/openmpi-2.0.0/bin/mpirun
--hostfile hosts.txt -np 4 /share/apps/chemistry/siesta-4.0/spar/siesta <
A.fdf
Siesta Version: siesta-4.0--500
Architecture  : x86_64-unknown-linux-gnu--unknown
Compiler flags: /share/apps/computer/openmpi-2.0.0/bin/mpifort
PP flags  : -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
PARALLEL version

* Running on 4 nodes in parallel
>> Start of run:   1-OCT-2016  11:26:23

   ***
   *  WELCOME TO SIESTA  *
   ***

reinit: Reading from standard input
** Dump of input data file

^CKilled by signal 2.
mahmood@cluster:A4$ date
Sat Oct  1 11:26:30 2016
mahmood@cluster:A4$ /share/apps/computer/openmpi-2.0.0/bin/mpirun
--hostfile hosts.txt -np 2 /share/apps/chemistry/siesta-4.0/spar/siesta <
A.fdf
Siesta Version: siesta-4.0--500
Architecture  : x86_64-unknown-linux-gnu--unknown
Compiler flags: /share/apps/computer/openmpi-2.0.0/bin/mpifort
PP flags  : -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
PARALLEL version

* Running on 2 nodes in parallel
>> Start of run:   1-OCT-2016  11:26:36

   ***
   *  WELCOME TO SIESTA  *
   ***

reinit: Reading from standard input
** Dump of input data file

SystemLabel  A
NumberOfAtoms54
NumberOfSpecies  2
%block ChemicalSpeciesLabel
...
...
...
^CKilled by signal 2.
mahmood@cluster:A4$ date
Sat Oct  1 11:26:38 2016





Any idea about this? The problem appears only when I change mpirun's
switches (-np 4 instead of -np 2).

Regards,
Mahmood

Re: [OMPI users] How to paralellize the algorithm...

2016-10-01 Thread Constantinos Makassikis
Dear Barış,

It seems to me that you are trying to parallelize your algorithm following
a master/slave paradigm.

Perhaps you may want to take a look at, and test, some existing MPI-based
libraries for achieving that:
- DMSM: https://www.hpcvl.org/faqs/dmsm-library
- TOP-C: http://www.ccs.neu.edu/home/gene/topc.html
- ADLB: https://www.cs.mtsu.edu/~rbutler/adlb/
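
Alternatively, if you prefer to hand-roll it, a minimal master/worker
skeleton in plain MPI (C) could look like the sketch below. Everything in
it is illustrative: 'local_search', the chunking of the inner loop and the
stopping test are placeholders for your actual algorithm. The pattern is
simply that rank 0 re-broadcasts a "keep going" flag at every outer
iteration, all ranks do their share of the inner loop, and the partial
results are reduced back to the master:

#include <mpi.h>
#include <stdio.h>

/* Placeholder for the real inner-loop work on the index range [begin, end). */
static double local_search(int begin, int end)
{
    double result = 0.0;
    for (int i = begin; i < end; i++)
        result += (double)i;
    return result;
}

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int inner_steps = 1000;   /* total (known) size of the inner loop */
    int keep_going = 1;
    int iteration = 0;

    while (1) {
        /* the master decides whether to continue and tells everyone */
        MPI_Bcast(&keep_going, 1, MPI_INT, 0, MPI_COMM_WORLD);
        if (!keep_going)
            break;

        /* every rank takes a contiguous chunk of the inner loop */
        int chunk = (inner_steps + size - 1) / size;
        int begin = rank * chunk;
        int end   = (begin + chunk < inner_steps) ? begin + chunk : inner_steps;

        double local  = local_search(begin, end);
        double global = 0.0;
        MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

        if (rank == 0) {
            iteration++;
            printf("outer iteration %d: result %g\n", iteration, global);
            fflush(stdout);
            if (iteration >= 10)    /* placeholder stopping criterion */
                keep_going = 0;
        }
    }

    MPI_Finalize();
    return 0;
}

In this sketch the master also does a share of the inner loop; if you want a
strict split (rank 0 only coordinating, 24 worker ranks computing), simply
exclude rank 0 from the chunking and launch with 25 processes.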


Kind Regards,
--
Constantinos


On Fri, Sep 30, 2016 at 5:36 PM, Barış Keçeci via users <
users@lists.open-mpi.org> wrote:

> Hello everyone,
> I'm trying to investigate the parallelization of an algorithm with Open MPI
> on a network of distributed computers. The network consists of one master
> PC and 24 computing-node PCs. I'm quite a newbie in this field, but I have
> managed to install Open MPI and to compile and run my first parallel codes
> on this platform.
>
> Now here is my question.
> The algorithm I am concerned with is a simple one: in the outer "for loop"
> the algorithm repeats until a stopping condition is met. The master PC
> should run this outer loop. In the "inner loop", a local search procedure
> is performed in parallel by the 24 computing nodes. In other words, I
> actually want to parallelize the inner loop, since it is the most
> time-consuming part of my algorithm. I have already managed to code this
> part: I know the total number of steps of the "inner loop", so I was able
> to parallelize that inner "for loop" over the distributed PCs. Now here is
> the problem. I want the master PC to repeat the main loop until a stopping
> criterion is met, but at each step it should distribute the inner loop over
> the 24 compute nodes, and I don't have any idea how to do this. It seems to
> me that I should make each compute node wait for a signal from the master
> and then repeat the inner loop over and over...
>
> I hope I have made it clear despite my poor English. I would appreciate it
> if anyone could help me, or at least outline the broad methodology.
>
> Best regards to all.
>
> Doctor Keceee
>