Re: [OMPI users] Viewing the output of the program
Mahmood,

IIRC, a related bug was fixed after v2.0.0. Can you please update to 2.0.1 and try again?

Cheers,
Gilles

On Saturday, October 1, 2016, Mahmood Naderan wrote:
> [quoted original message trimmed; see the post below]

___
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
[OMPI users] Viewing the output of the program
Hi,

Here is some bizarre behavior of the system; I hope someone can clarify whether it is related to Open MPI or not.

When I issue the mpirun command with -np 2, I can see the output of the program (on stdout) as it is running. However, if I issue the command with -np 4, the progress is not shown!

Please see the output below. I ran the 'date' command first and then issued the mpirun command with '-np 4'. After some seconds I pressed ^C and ran 'date' again. As you can see, there is no output. Next, I ran with '-np 2' and pressed ^C after a while. You can see that the progress of the program is shown.

mahmood@cluster:A4$ date
Sat Oct 1 11:26:13 2016
mahmood@cluster:A4$ /share/apps/computer/openmpi-2.0.0/bin/mpirun --hostfile hosts.txt -np 4 /share/apps/chemistry/siesta-4.0/spar/siesta < A.fdf
Siesta Version: siesta-4.0--500
Architecture  : x86_64-unknown-linux-gnu--unknown
Compiler flags: /share/apps/computer/openmpi-2.0.0/bin/mpifort
PP flags      : -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
PARALLEL version

* Running on 4 nodes in parallel
>> Start of run: 1-OCT-2016 11:26:23

***
* WELCOME TO SIESTA *
***

reinit: Reading from standard input
** Dump of input data file

^CKilled by signal 2.
mahmood@cluster:A4$ date
Sat Oct 1 11:26:30 2016
mahmood@cluster:A4$ /share/apps/computer/openmpi-2.0.0/bin/mpirun --hostfile hosts.txt -np 2 /share/apps/chemistry/siesta-4.0/spar/siesta < A.fdf
Siesta Version: siesta-4.0--500
Architecture  : x86_64-unknown-linux-gnu--unknown
Compiler flags: /share/apps/computer/openmpi-2.0.0/bin/mpifort
PP flags      : -DMPI -DFC_HAVE_FLUSH -DFC_HAVE_ABORT
PARALLEL version

* Running on 2 nodes in parallel
>> Start of run: 1-OCT-2016 11:26:36

***
* WELCOME TO SIESTA *
***

reinit: Reading from standard input
** Dump of input data file

SystemLabel A
NumberOfAtoms 54
NumberOfSpecies 2
%block ChemicalSpeciesLabel
...
...
...
^CKilled by signal 2.
mahmood@cluster:A4$ date
Sat Oct 1 11:26:38 2016

Any idea about that? The problem occurs when I change mpirun's switches.

Regards,
Mahmood
Re: [OMPI users] How to paralellize the algorithm...
Dear,

It seems to me you are trying to parallelize following a master/slave paradigm. Perhaps you may want to take a look at, and test, some existing MPI-based libraries for achieving that:

- DMSM: https://www.hpcvl.org/faqs/dmsm-library
- TOP-C: http://www.ccs.neu.edu/home/gene/topc.html
- ADLB: https://www.cs.mtsu.edu/~rbutler/adlb/

Kind Regards,
--
Constantinos

On Fri, Sep 30, 2016 at 5:36 PM, Barış Keçeci via users <users@lists.open-mpi.org> wrote:
> Hello everyone,
>
> I'm trying to investigate the parallelization of an algorithm with Open MPI on a network of distributed computers. The network has one master PC and 24 compute-node PCs. I'm quite a newbie in this field, but I have managed to install Open MPI and to compile and run my first parallel codes on this platform.
>
> Now here is my question. The algorithm is a simple one: in the outer "for" loop, the algorithm repeats until a stopping condition is met. The master PC should run this outer loop. In the inner loop, a local search procedure is performed in parallel by the 24 compute nodes. I want to parallelize the inner loop, since it is the most time-consuming part of the algorithm. I have already managed to code this part: because I know the total number of steps of the inner loop, I was able to distribute it over the PCs. The problem is that I want the master PC to repeat the main loop until a stopping criterion is met, distributing the inner loop over the 24 compute nodes at each step, and I don't have any idea how to do this. It appears I should make each compute node wait for a signal from the master and repeat the inner loop over and over...
>
> I hope I have made it clear despite my poor English. I would appreciate it if anyone could help me, or at least give the broad methodology.
>
> Best regards to all.
> Doctor Keceee