Re: [OMPI users] openmpi.ld.conf file
On Mar 31, 2010, at 5:25 PM, Abhishek Gupta wrote: > I am trying to find out the location of openmpi.ld.conf file for my > openmpi/openmpi-libs. Can someone tell me where that file is placed? There is no openmpi.ld.conf in the official Open MPI distribution. Are you installing Open MPI from a package? Other Open MPI packagers may have created this file and put it in a supplemental RPM (or whatever package you're using)...? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
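If a downstream package does ship an openmpi.ld.conf, it is most likely a dynamic-loader configuration fragment installed under /etc/ld.so.conf.d/ so the runtime linker can find the Open MPI shared libraries. A hypothetical sketch (the filename and library path are assumptions, not from the official distribution):

```
# /etc/ld.so.conf.d/openmpi.ld.conf -- hypothetical packager-provided file
/usr/lib64/openmpi/lib
```

Running ldconfig as root after installing such a file rebuilds the loader cache.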
Re: [OMPI users] ompi-checkpoint --term
On Wed, Mar 31, 2010 at 7:39 PM, Addepalli, Srirangam V wrote: > Hello All. > I am trying to checkpoint an MPI application that has been started using the > following mpirun command > > mpirun -am ft-enable-cr -np 8 pw.x < Ge46.pw.in > Ge46.ph.out > > ompi-checkpoint 31396 (works). However, when I try to terminate the process > > ompi-checkpoint --term 31396 it never finishes. How do I debug this issue? ompi-checkpoint --term is exactly ompi-checkpoint plus sending SIGTERM to your app. If plain ompi-checkpoint finishes, then your app is not dealing with SIGTERM correctly. Make sure you're not ignoring SIGTERM; you need to either handle it or let it kill your app. If it's a multithreaded app, make sure you can "distribute" the SIGTERM to ALL the threads, i.e., when you receive SIGTERM, notify all other threads that they should join or quit. Regards,
[OMPI users] ompi-checkpoint --term
Hello All. I am trying to checkpoint an MPI application that has been started using the following mpirun command mpirun -am ft-enable-cr -np 8 pw.x < Ge46.pw.in > Ge46.ph.out ompi-checkpoint 31396 (works). However, when I try to terminate the process ompi-checkpoint --term 31396 it never finishes. How do I debug this issue? Rangam
Re: [OMPI users] Hide Abort output
Yes, Dick has isolated the issue - novice users often believe Open MPI (not their application) had a problem. Anything along the lines he suggests can only help. David On 04/01/2010 01:12 AM, Richard Treumann wrote: I do not know what the OpenMPI message looks like or why people want to hide it. It should be phrased to avoid any implication of a problem with OpenMPI itself. How about something like this: "The application has called MPI_Abort. The application is terminated by OpenMPI as the application demanded" Dick Treumann - MPI Team IBM Systems & Technology Group Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 From: "Jeff Squyres (jsquyres)" To: , Date: 03/31/2010 06:43 AM Subject: Re: [OMPI users] Hide Abort output Sent by: users-boun...@open-mpi.org At present there is no such feature, but it should not be hard to add. Can you guys be a little more specific about exactly what you are seeing and exactly what you want to see? (And what version you're working with - I'll caveat my discussion that this may be a 1.5-and-forward thing) -jms Sent from my PDA. No type good. - Original Message - From: users-boun...@open-mpi.org To: Open MPI Users Sent: Wed Mar 31 05:38:48 2010 Subject: Re: [OMPI users] Hide Abort output I have to say this is a very common issue for our users. They repeatedly report the long Open MPI MPI_Abort() message in help queries and fail to look for the application error message about the root cause. A short MPI_Abort() message that said "look elsewhere for the real error message" would be useful. Cheers, David On 03/31/2010 07:58 PM, Yves Caniou wrote: Dear all, I am using the MPI_Abort() command in an MPI program. I would like to not see the note explaining that the command caused Open MPI to kill all the jobs and so on. I thought that I could find an --mca parameter, but couldn't grep it. The only ones deal with the delay and printing more information (the stack).
Is there a way to avoid the printing of the note (except the 2>/dev/null trick)? Or to delay this printing? Thank you. .Yves.
[OMPI users] openmpi.ld.conf file
Hi, I am trying to find out the location of openmpi.ld.conf file for my openmpi/openmpi-libs. Can someone tell me where that file is placed? Thanks, Abhi.
Re: [OMPI users] openMPI on Xgrid
Yes, good idea. SGE is a fine scheduler; it's actively supported by Open MPI. On Mar 31, 2010, at 11:21 AM, Cristobal Navarro wrote: > and how about Sun Grid Engine + openMPI, good idea? > > I'm asking because I just checked that Mathematica 7 supports cluster > integration with SGE, which will be a plus apart from our C programs. > > Cristobal > > On Tue, Mar 30, 2010 at 4:06 PM, Gus Correa wrote: > Craig Tierney wrote: > Jody Klymak wrote: > On Mar 30, 2010, at 11:12 AM, Cristobal Navarro wrote: > > I just have some questions, > Torque requires moab, but from what I've read on the site you have to > buy moab, right? > I am pretty sure you can download torque w/o moab. I do not use moab, > which I think is a higher-level scheduling layer on top of pbs. However, > there are folks here who would know far more than I do about > these sorts of things. > > Cheers, Jody > > Moab is a scheduler, which works with Torque and several other > products. Torque comes with a basic scheduler, and Moab is not > required. If you want more features but don't want to pay for Moab, you > can look at Maui. > > Craig > > Hi > > Just adding to what Craig and Jody said. > Moab is not required for Torque. > > A small cluster with a few users can work well with > the basic Torque/PBS scheduler (pbs_sched), > and its first-in-first-out job policy. > An alternative is to replace pbs_sched with the > free Maui scheduler, if you need fine-grained job control. > > You can install both Torque and Maui from source code (available at > http://www.clusterresources.com/), but it takes some work. > > Some Linux distributions have Torque and Maui available as packages > through yum, apt-get, etc. > I would guess for the Mac you can get at least Torque through fink, > or not?
> > Gus Correa > - > Gustavo Correa > Lamont-Doherty Earth Observatory - Columbia University > Palisades, NY, 10964-8000 - USA > - > > -- > Jody Klymak > http://web.uvic.ca/~jklymak/ > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] openMPI on Xgrid
and how about Sun Grid Engine + openMPI, good idea? I'm asking because I just checked that Mathematica 7 supports cluster integration with SGE, which will be a plus apart from our C programs. Cristobal On Tue, Mar 30, 2010 at 4:06 PM, Gus Correa wrote: > Craig Tierney wrote: > >> Jody Klymak wrote: >> >>> On Mar 30, 2010, at 11:12 AM, Cristobal Navarro wrote: >>> >>> I just have some questions, Torque requires moab, but from what I've read on the site you have to buy moab, right? >>> I am pretty sure you can download torque w/o moab. I do not use moab, >>> which I think is a higher-level scheduling layer on top of pbs. However, >>> there are folks here who would know far more than I do about >>> these sorts of things. >>> >>> Cheers, Jody >>> >> Moab is a scheduler, which works with Torque and several other >> products. Torque comes with a basic scheduler, and Moab is not >> required. If you want more features but don't want to pay for Moab, you >> can look at Maui. >> >> Craig >> > Hi > > Just adding to what Craig and Jody said. > Moab is not required for Torque. > > A small cluster with a few users can work well with > the basic Torque/PBS scheduler (pbs_sched), > and its first-in-first-out job policy. > An alternative is to replace pbs_sched with the > free Maui scheduler, if you need fine-grained job control. > > You can install both Torque and Maui from source code (available at > http://www.clusterresources.com/), but it takes some work. > > Some Linux distributions have Torque and Maui available as packages > through yum, apt-get, etc. > I would guess for the Mac you can get at least Torque through fink, > or not?
> > Gus Correa > - > Gustavo Correa > Lamont-Doherty Earth Observatory - Columbia University > Palisades, NY, 10964-8000 - USA > - > > >> -- >>> Jody Klymak >>> http://web.uvic.ca/~jklymak/ >>> >>> ___ >>> users mailing list >>> us...@open-mpi.org >>> http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] Segmentation fault (11)
That is interesting. I cannot think of any reason why this might be causing a problem just in Open MPI. popen() is similar to fork()/system(), so you have to be careful with interconnects that do not play nice with fork(), like openib. But since it looks like you are excluding openib, this should not be the problem. I wonder if this has something to do with the way we use BLCR (maybe we need to pass additional parameters to cr_checkpoint()). When the process fails, are there any messages in the system logs from BLCR indicating an issue that it encountered? It is common for BLCR to post a 'socket open' warning, but that is expected/normal since we leave TCP sockets open in most cases as an optimization. I am wondering if there is a warning about the popen'ed process. Personally, I will not have an opportunity to look into this in more detail until probably mid-April. :/ Let me know what you find, and maybe we can sort out what is happening on the list. -- Josh

On Mar 29, 2010, at 2:28 PM, Jean Potsam wrote:
> Hi Josh/All,
> I just tested a simple C application with BLCR and it worked fine.
>
> ##
> /* headers reconstructed; the archive stripped the <...> names */
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <unistd.h>
>
> char * getprocessid()
> {
>     FILE * read_fp;
>     char buffer[BUFSIZ + 1];
>     int chars_read;
>     char * buffer_data = "12345";
>     memset(buffer, '\0', sizeof(buffer));
>     read_fp = popen("uname -a", "r");
>     /*
>     ...
>     */
>     return buffer_data;
> }
>
> int main(int argc, char ** argv)
> {
>     int rank;
>     int size;
>     char * thedata;
>     int n = 0;
>     thedata = getprocessid();
>     printf(" the data is %s", thedata);
>     while (n < 10)
>     {
>         printf("value is %d\n", n);
>         n++;
>         sleep(1);
>     }
>     printf("bye\n");
> }
>
> jean@sun32:/tmp$ cr_run ./pipetest3 &
> [1] 31807
> jean@sun32:~$ the data is 12345value is 0
> value is 1
> value is 2
> ...
> value is 9
> bye
>
> jean@sun32:/tmp$ cr_checkpoint 31807
> jean@sun32:/tmp$ cr_restart context.31807
> value is 7
> value is 8
> value is 9
> bye
> ##
>
> It looks like it's more to do with Open MPI. Any ideas from your side?
>
> Thank you.
> Kind regards,
> Jean.
>
> --- On Mon, 29/3/10, Josh Hursey wrote:
> From: Josh Hursey
> Subject: Re: [OMPI users] Segmentation fault (11)
> To: "Open MPI Users"
> Date: Monday, 29 March, 2010, 16:08
>
> I wonder if this is a bug with BLCR (since the segv stack is in the BLCR thread). Can you try a non-MPI version of this application that uses popen(), and see if BLCR properly checkpoints/restarts it?
> If so, we can start to see what Open MPI might be doing to confuse things, but I suspect that this might be a bug with BLCR. Either way let us know what you find out.
> Cheers,
> Josh
>
> On Mar 27, 2010, at 6:17 AM, jody wrote:
> > I'm not sure if this is the cause of your problems:
> > You define the constant BUFFER_SIZE, but in the code you use a constant called BUFSIZ...
> > Jody
> >
> > On Fri, Mar 26, 2010 at 10:29 PM, Jean Potsam wrote:
> > Dear All,
> > I am having a problem with Open MPI. I have installed openmpi 1.4 and blcr 0.8.1.
> > I have written a small MPI application as follows below:
> >
> > ###
> > /* headers reconstructed; the archive stripped the <...> names */
> > #include <stdio.h>
> > #include <stdlib.h>
> > #include <string.h>
> > #include <unistd.h>
> > #include <limits.h>
> > #include <mpi.h>
> >
> > #define BUFFER_SIZE PIPE_BUF
> >
> > char * getprocessid()
> > {
> >     FILE * read_fp;
> >     char buffer[BUFSIZ + 1];
> >     int chars_read;
> >     char * buffer_data = "12345";
> >     memset(buffer, '\0', sizeof(buffer));
> >     read_fp = popen("uname -a", "r");
> >     /*
> >     ...
> >     */
> >     return buffer_data;
> > }
> >
> > int main(int argc, char ** argv)
> > {
> >     MPI_Status status;
> >     int rank;
> >     int size;
> >     char * thedata;
> >     MPI_Init(&argc, &argv);
> >     MPI_Comm_size(MPI_COMM_WORLD, &size);
> >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
> >     thedata = getprocessid();
> >     printf(" the data is %s", thedata);
> >     MPI_Finalize();
> > }
> >
> > I get the following result:
> >
> > ###
> > jean@sunn32:~$ mpicc pipetest2.c -o pipetest2
> > jean@sunn32:~$ mpirun -np 1 -am ft-enable-cr -mca btl ^openib pipetest2
> > [sun32:19211] *** Process received signal ***
> > [sun32:19211] Signal: Segmentation fault (11)
> > [sun32:19211] Signal code: Address not mapped (1)
> > [sun32:19211] Failing at address: 0x4
> > [sun32:19211] [ 0] [0xb7f3c40c]
> > [sun32:19211] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb796868b]
> > [sun32:19211] [ 2]
Re: [OMPI users] Hide Abort output
I do not know what the OpenMPI message looks like or why people want to hide it. It should be phrased to avoid any implication of a problem with OpenMPI itself. How about something like this: "The application has called MPI_Abort. The application is terminated by OpenMPI as the application demanded" Dick Treumann - MPI Team IBM Systems & Technology Group Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601 Tele (845) 433-7846 Fax (845) 433-8363 From: "Jeff Squyres (jsquyres)" To: , Date: 03/31/2010 06:43 AM Subject: Re: [OMPI users] Hide Abort output Sent by: users-boun...@open-mpi.org At present there is no such feature, but it should not be hard to add. Can you guys be a little more specific about exactly what you are seeing and exactly what you want to see? (And what version you're working with - I'll caveat my discussion that this may be a 1.5-and-forward thing) -jms Sent from my PDA. No type good. - Original Message - From: users-boun...@open-mpi.org To: Open MPI Users Sent: Wed Mar 31 05:38:48 2010 Subject: Re: [OMPI users] Hide Abort output I have to say this is a very common issue for our users. They repeatedly report the long Open MPI MPI_Abort() message in help queries and fail to look for the application error message about the root cause. A short MPI_Abort() message that said "look elsewhere for the real error message" would be useful. Cheers, David On 03/31/2010 07:58 PM, Yves Caniou wrote: > Dear all, > > I am using the MPI_Abort() command in an MPI program. > I would like to not see the note explaining that the command caused Open MPI > to kill all the jobs and so on. > I thought that I could find an --mca parameter, but couldn't grep it. The only > ones deal with the delay and printing more information (the stack). > > Is there a way to avoid the printing of the note (except the 2>/dev/null > trick)? Or to delay this printing? > > Thank you. > > .Yves.
> ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] kernel 2.6.23 vs 2.6.24 - communication/wait times
I have tried up to kernel 2.6.33.1 on both architectures (Core2 Duo and i5) with the same results. The "slow" results also appear for distribution of processes onto the 4 cores of one single node. We use btl = self,sm,tcp in /etc/openmpi/openmpi-mca-params.conf. Distributing several processes to one core each on several machines is fast and has "normal" communication times, so I guess TCP communication shouldn't be the problem. Also, multiple instances of the program, started on one "master" node, with each instance distributing several processes to one core of "slave" nodes, don't seem to be a problem. In effect 4 instances of the program occupy all 4 cores on each node, which doesn't influence communication and overall calculation time much. But running 4 processes from the same "master" instance on 4 cores on the same node does. Do you have some more ideas about what I can test? I tried to test connectivity_c from the Open MPI examples on 8 nodes/32 processes. It is hard to get reliable/consistent figures from 'top' since the program terminates quite fast and the interesting usage is very short. But these are some shots of 'top' (master and slave nodes show similar images). System and/or wait time are up. sh-3.2$ mpirun -np 4 -host cluster-05 connectivity_c : -np 28 -host cluster-06,cluster-07,cluster-08,cluster-09,cluster-10,cluster-11,cluster-12 connectivity_c Connectivity test on 32 processes PASSED.
Cpu(s): 37.5%us, 46.6%sy, 0.0%ni, 0.0%id, 15.9%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8181236k total, 168200k used, 8013036k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 132092k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
25179 oli 20 0 143m 3436 2196 R 43 0.0 0:00.57 0
25180 oli 20 0 142m 3392 2180 R 100 0.0 0:00.85 3
25182 oli 20 0 142m 3312 2172 R 100 0.0 0:00.93 2
25181 oli 20 0 134m 3052 2172 R 100 0.0 0:00.93 1

Cpu(s): 10.3%us, 8.7%sy, 0.0%ni, 21.4%id, 58.7%wa, 0.8%hi, 0.0%si, 0.0%st
Mem: 8181236k total, 171352k used, 8009884k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 130572k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
29496 oli 20 0 142m 3300 2176 D 33 0.0 0:00.21 2
29497 oli 20 0 142m 3280 2160 R 25 0.0 0:00.17 0
29494 oli 20 0 134m 3044 2180 D 0 0.0 0:00.01 1
29495 oli 20 0 134m 3036 2172 R 16 0.0 0:00.11 3

Cpu(s): 18.3%us, 36.3%sy, 0.0%ni, 38.0%id, 6.3%wa, 1.1%hi, 0.0%si, 0.0%st
Mem: 8181236k total, 141704k used, 8039532k free, 0k buffers
Swap: 0k total, 0k used, 0k free, 99828k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND
29452 oli 20 0 143m 3452 2212 R 52 0.0 0:00.37 1
29455 oli 20 0 143m 3452 2212 S 57 0.0 0:00.41 3
29453 oli 20 0 143m 3440 2200 S 55 0.0 0:00.39 0
29454 oli 20 0 143m 3440 2200 R 55 0.0 0:00.39 2

Thanks for your thoughts, each input is appreciated. Oli On 3/31/2010 8:38 AM, Jeff Squyres wrote: > I have a very dim recollection of some kernel TCP issues back in some older > kernel versions -- such issues affected all TCP communications, not just MPI. > Can you try a newer kernel, perchance? > > On Mar 30, 2010, at 1:26 PM, wrote: > >> Hello List, >> >> I hope you can help us out on that one, as we have been trying to figure it out for weeks. >> >> The situation: We have a program capable of splitting into several >> processes to be shared on nodes within a cluster network using openmpi.
>> We were running that system on "older" cluster hardware (Intel Core2 Duo >> based, 2GB RAM) using an "older" kernel (2.6.18.6). All nodes are >> diskless network booting. Recently we upgraded the hardware (Intel i5, >> 8GB RAM), which also required an upgrade to a recent kernel version >> (2.6.26+). >> >> Here is the problem: We experience overall performance loss on the new >> hardware and think we can break it down to a communication issue >> between the processes. >> >> Also, we found out the issue arises in the transition from kernel >> 2.6.23 to 2.6.24 (tested on the Core2 Duo system). >> >> Here is an output from our program: >> >> 2.6.23.17 (64bit), MPI 1.2.7 >> 5 iterations (Core2 Duo) 6 CPU: >> 93.33 seconds per iteration. >> Node 0 communication/computation time: 6.83 / 647.64 seconds. >> Node 1 communication/computation time: 10.09 / 644.36 seconds. >> Node 2 communication/computation time: 7.27 / 645.03 seconds. >> Node 3 communication/computation time: 165.02 / 485.52 seconds. >> Node 4 communication/computation time: 6.50 / 643.82 seconds. >> Node 5 communication/computation time: 7.80 / 627.63 seconds. >> Computation time: 897.00 seconds. >> >> 2.6.24.7 (64bit) ..
Re: [OMPI users] strange problem with OpenMPI + rankfile + Intelcompiler 11.0.074 + centos/fedora-12
On Mar 24, 2010, at 12:49 AM, Anton Starikov wrote: > Two different OSes: centos 5.4 (2.6.18 kernel) and Fedora-12 (2.6.32 kernel) > Two different CPUs: Opteron 248 and Opteron 8356. > > same binary for OpenMPI. Same binary for user code (vasp compiled for older > arch) Are you sure that the code is binary compatible between the two platforms? Can you repeat the process with native builds of Open MPI and the app for both architectures? > When I supply rankfile, then depending on combo of OS and CPU results are > different > > centos+Opt8356 : works > centos+Opt248 : works > fedora+Opt8356 : works > fedora+Opt248 : fails > > rankfile is (in case of Opt248) > > rank 0=node014 slot=1 > rank 1=node014 slot=0 > > I tried play with formats, leave one slot (and start one process) - it > doesn't change result > Without rankfile it works on all combos. Nifty (meaning: ick!). I wonder if the processor affinity code is causing the problem here...? It could be a problem in a heterogeneous environment if the systems are "close" but not "exact" in terms of binary compatibility...? > Just in case, all this happens inside of cpuset which always wraps all slots > given in rankfile (I just use torque with cpusets and my custom patch for > torque which also creates rankfile for openmpi, in this case MPI tasks are > bound to particular cores and multithreaded codes limited by given cpuset). > > AFAIR, it also works without problem on both hardware setups with 1.3.x/1.4.0 > and 2.6.30 kernel from OpenSuSE 11.1. > > Strangely, but when I run OSU benchmarks (osu_bw etc), it works without any > problems. Can you re-run with a trivial test, like MPI hello world and/or ring? See the examples/ directory. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] kernel 2.6.23 vs 2.6.24 - communication/wait times
I have a very dim recollection of some kernel TCP issues back in some older kernel versions -- such issues affected all TCP communications, not just MPI. Can you try a newer kernel, perchance? On Mar 30, 2010, at 1:26 PM, wrote: > Hello List, > > I hope you can help us out on that one, as we have been trying to figure it out for weeks. > > The situation: We have a program capable of splitting into several > processes to be shared on nodes within a cluster network using openmpi. > We were running that system on "older" cluster hardware (Intel Core2 Duo > based, 2GB RAM) using an "older" kernel (2.6.18.6). All nodes are > diskless network booting. Recently we upgraded the hardware (Intel i5, > 8GB RAM), which also required an upgrade to a recent kernel version > (2.6.26+). > > Here is the problem: We experience overall performance loss on the new > hardware and think we can break it down to a communication issue > between the processes. > > Also, we found out the issue arises in the transition from kernel > 2.6.23 to 2.6.24 (tested on the Core2 Duo system). > > Here is an output from our program: > > 2.6.23.17 (64bit), MPI 1.2.7 > 5 iterations (Core2 Duo) 6 CPU: > 93.33 seconds per iteration. > Node 0 communication/computation time: 6.83 / 647.64 seconds. > Node 1 communication/computation time: 10.09 / 644.36 seconds. > Node 2 communication/computation time: 7.27 / 645.03 seconds. > Node 3 communication/computation time: 165.02 / 485.52 seconds. > Node 4 communication/computation time: 6.50 / 643.82 seconds. > Node 5 communication/computation time: 7.80 / 627.63 seconds. > Computation time: 897.00 seconds. > > 2.6.24.7 (64bit) .. re-evaluated, MPI 1.2.7 > 5 iterations (Core2 Duo) 6 CPU: > 131.33 seconds per iteration. > Node 0 communication/computation time: 364.15 / 645.24 seconds. > Node 1 communication/computation time: 362.83 / 645.26 seconds. > Node 2 communication/computation time: 349.39 / 645.07 seconds. > Node 3 communication/computation time: 508.34 / 485.53 seconds.
> Node 4 communication/computation time: 349.94 / 643.81 seconds. > Node 5 communication/computation time: 349.07 / 627.47 seconds. > Computation time: 1251.00 seconds. > > The program is 32-bit software, but it doesn't make any difference > whether the kernel is 64 or 32 bit. Also OpenMPI version 1.4.1 was > tested; it cut communication times by half (which is still too high), but > the improvement decreased with increasing kernel version number. > > The communication time is meant to be the time the master process > distributes the data portions for calculation and collects the results > from the slave processes. The value also contains the time a slave has to > wait to communicate with the master while he is occupied. This explains the > extended communication time of node #3, as its calculation time is > reduced (based on the nature of the data). > > The command to start the calculation: > mpirun -np 2 -host cluster-17 invert-master -b -s -p inv_grav.inp : -np > 4 -host cluster-18,cluster-19 > > Using top (with 'f' and 'j' showing the P row) we could track which process > runs on which core. We found processes stayed on their initial cores with > kernel 2.6.23, but started to flip around with 2.6.24. Using the > --bind-to-core option in openmpi 1.4.1 kept the processes on their cores > again, but that didn't influence the overall outcome, didn't fix the issue. > > We found top showing ~25% CPU wait time, and processes showing 'D', > also on slave-only nodes. According to our programmer, communications are > only between the master process and its slaves, but not among slaves. On > kernel 2.6.23 and lower CPU usage is 100% user, no wait or system > percentage.
> > Example from top: > > Cpu(s): 75.3%us, 0.6%sy, 0.0%ni, 0.0%id, 23.1%wa, 0.7%hi, 0.3%si, > 0.0%st > Mem: 8181236k total, 131224k used, 8050012k free, 0k buffers > Swap: 0k total, 0k used, 0k free, 49868k cached > > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ P COMMAND > 3386 oli 20 0 90512 20m 3988 R 74 0.3 12:31.80 0 invert- > 3387 oli 20 0 85072 15m 3780 D 67 0.2 11:59.30 1 invert- > 3388 oli 20 0 85064 14m 3588 D 77 0.2 12:56.90 2 invert- > 3389 oli 20 0 84936 14m 3436 R 85 0.2 13:28.30 3 invert- > > Some system information that might be helpful: > > Nodes hardware: > 1. "older": Intel Core2 Duo, (2x1)GB RAM > 2. "newer": Intel(R) Core(TM) i5 CPU, mainboard ASUS RS100-E6, (4x2)GB RAM > > Debian stable (lenny) distribution with > ii libc6 2.7-18lenny2 > ii libopenmpi1 1.2.7~rc2-2 > ii openmpi-bin 1.2.7~rc2-2 > ii openmpi-common 1.2.7~rc2-2 > > Nodes are booting diskless with
Re: [OMPI users] OPEN_MPI macro for mpif.h?
On Mar 29, 2010, at 4:10 PM, Martin Bernreuther wrote: > looking at the Open MPI mpi.h include file there's a preprocessor macro > OPEN_MPI defined, as well as e.g. OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION > and OMPI_RELEASE_VERSION. version.h e.g. also defines OMPI_VERSION > This seems to be missing in mpif.h, and therefore something like > > include 'mpif.h' > [...] > #ifdef OPEN_MPI > write( *, '("MPI library: OpenMPI",I2,".",I2,".",I2)' ) & > & OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION, OMPI_RELEASE_VERSION > #endif > > doesn't work for a Fortran Open MPI program. Correct. The reason we didn't do this is that not all Fortran compilers will submit your code through a preprocessor. For example:

shell% cat bogus.h
#define MY_VALUE 1
shell% cat bogus.f90
program main
#include "bogus.h"
implicit none
integer a
a = MY_VALUE
end program
shell% ln -s bogus.f90 bogus-preproc.F90
shell% gfortran bogus.f90
Warning: bogus.f90:2: Illegal preprocessor directive
bogus.f90:5.14:

a = MY_VALUE
             1
Error: Symbol 'my_value' at (1) has no IMPLICIT type
shell% gfortran bogus-preproc.F90

That's one example. I used gfortran here; I learned during the process that include'd files are not preprocessed by gfortran, but #include'd files are (regardless of the filename of the main source file). The moral of the story here is that it's a losing game for our wrappers to try to keep up with which file extensions and/or compiler switches enable preprocessing, and to try to determine whether mpif.h was include'd or #include'd. :-( That being said, I have a [very] dim recollection of adding some -D's to the wrapper compiler command line so that -DOPEN_MPI would be defined and we wouldn't have to worry about all the .f90 vs. .F90 / include vs. #include muckety muck... I don't remember what happened with that, though... Are you enough of a Fortran person to know whether -D is pretty universally supported among Fortran compilers?
It wouldn't be too hard to add a configure test to see if -D is supported. Would you have any time/interest to create a patch for this, perchance? -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Problem in remote nodes
On Mar 30, 2010, at 4:28 PM, Robert Collyer wrote: > I changed the SELinux config to permissive (log only), and it didn't > change anything. Back to the drawing board. I'm afraid I have no experience with SELinux -- I don't know what it restricts. Generally, you need to be able to run processes on remote nodes without entering a password and also be able to open random TCP and Unix sockets between previously unrelated processes. -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
Re: [OMPI users] Help om Openmpi
Yes, you need to install Open MPI on all nodes, and you need to be able to log in to each node without being prompted for a password. Also, note that v1.2.7 is pretty ancient. If you're just starting with Open MPI, can you upgrade to the latest version? -jms Sent from my PDA. No type good. From: users-boun...@open-mpi.org To: us...@open-mpi.org Sent: Wed Mar 31 03:39:08 2010 Subject: [OMPI users] Help om Openmpi Dear all, I have installed my cluster with the following configuration: - headnode: + Linux CentOS 5.4, 4 CPUs, 3G RAM + Sun Grid Engine sge6.0u12. The headnode is the admin and submit node too. + Openmpi 1.2.9. In the Open MPI installation: ./configure --prefix=/opt/openmpi --with-sge ... Compilation and make were fine. + I have 2 other nodes whose config is: 4 CPUs, 1G RAM, on which sgeexecd runs. Testing SGE on the headnode and nodes with qsub was fine. When testing Open MPI as follows: [guser1@ioitg2 examples]$ /opt/openmpi/bin/mpirun -np 4 --hostfile myhosts hello_cxx Hello, world! I am 0 of 4 Hello, world! I am 1 of 4 Hello, world! I am 3 of 4 Hello, world! I am 2 of 4 [guser1@ioitg2 examples]$ Open MPI runs well. My file myhosts: ioitg2.ioit-grid.ac.vn slots=4 node1.ioit-grid.ac.vn slots=4 node2.ioit-grid.ac.vn slots=4 Now for more processes: [guser1@ioitg2 examples]$ /opt/openmpi/bin/mpirun -np 6 --hostfile myhosts hello_cxx gus...@node1.ioit-grid.ac.vn's password: -- Failed to find the following executable: Host: node1.ioit-grid.ac.vn Executable: hello_cxx Cannot continue. -- mpirun noticed that job rank 0 with PID 19164 on node ioitg2.ioit-grid.ac.vn exited on signal 15 (Terminated). 3 additional processes aborted (not shown) [guser1@ioitg2 examples]$ This is the error message. I can log in to node1 successfully. Please help me. What problems do I have (installation, configuration, ...)? Do I have to install Open MPI on all nodes? Thank you very much; I am waiting for your help.
Re: [OMPI users] Hide Abort output
At present there is no such feature, but it should not be hard to add. Can you guys be a little more specific about exactly what you are seeing and exactly what you want to see? (And what version you're working with - I'll caveat my discussion that this may be a 1.5-and-forward thing) -jms Sent from my PDA. No type good. - Original Message - From: users-boun...@open-mpi.orgTo: Open MPI Users Sent: Wed Mar 31 05:38:48 2010 Subject: Re: [OMPI users] Hide Abort output I have to say this is a very common issue for our users. They repeatedly report the long Open MPI MPI_Abort() message in help queries and fail to look for the application error message about the root cause. A short MPI_Abort() message that said "look elsewhere for the real error message" would be useful. Cheers, David On 03/31/2010 07:58 PM, Yves Caniou wrote: > Dear all, > > I am using the MPI_Abort() command in a MPI program. > I would like to not see the note explaining that the command caused Open MPI > to kill all the jobs and so on. > I thought that I could find a --mca parameter, but couldn't grep it. The only > ones deal with the delay and printing more information (the stack). > > Is there a mean to avoid the printing of the note (except the 2>/dev/null > tips)? Or to delay this printing? > > Thank you. > > .Yves. > ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] Problem in remote nodes
Those are normal ssh messages, I think; an ssh session may try multiple auth methods before one succeeds. You're absolutely sure that there's no firewalling software and SELinux is disabled? OMPI is behaving as if it is trying to communicate and failing (e.g., it's hanging while trying to open some TCP sockets back). Can you open random TCP sockets between your nodes? (E.g., in non-MPI processes)

-jms
Sent from my PDA. No type good.

----- Original Message -----
From: users-boun...@open-mpi.org
To: Open MPI Users
Sent: Wed Mar 31 06:25:43 2010
Subject: Re: [OMPI users] Problem in remote nodes

I've been checking /var/log/messages on the compute node and there is nothing new after executing 'mpirun --host itanium2 -np 2 helloworld.out', but in the /var/log/messages file on the remote node the following messages appear; nothing about unix_chkpwd.

Mar 31 11:56:51 itanium2 sshd(pam_unix)[15349]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=itanium1 user=otro
Mar 31 11:56:53 itanium2 sshd[15349]: Accepted publickey for otro from 192.168.3.1 port 40999 ssh2
Mar 31 11:56:53 itanium2 sshd(pam_unix)[15351]: session opened for user otro by (uid=500)
Mar 31 11:56:53 itanium2 sshd(pam_unix)[15351]: session closed for user otro

It seems that the authentication fails at first, but in the next message it connects with the node...

On Tue, 30 March 2010, 20:02, Robert Collyer wrote:
> I've been having similar problems using Fedora Core 9. I believe the
> issue may be with SELinux, but this is just an educated guess. In my
> setup, shortly after a login via MPI, there is a notation in
> /var/log/messages on the compute node as follows:
>
> Mar 30 12:39:45 kernel: type=1400 audit(1269970785.534:588):
> avc: denied { read } for pid=8047 comm="unix_chkpwd" name="hosts"
> dev=dm-0 ino=24579
> scontext=system_u:system_r:system_chkpwd_t:s0-s0:c0.c1023
> tcontext=unconfined_u:object_r:etc_runtime_t:s0 tclass=file
>
> which says SELinux denied unix_chkpwd read access to hosts.
>
> Are you getting anything like this?
>
> In the meantime, I'll check whether allowing unix_chkpwd read access to
> hosts eliminates the problem on my system, and if it works, I'll post
> the steps involved.
>
> uriz.49...@e.unavarra.es wrote:
>> I've been investigating and there is no firewall that could stop TCP
>> traffic in the cluster. With the option --mca plm_base_verbose 30 I get
>> the following output:
>>
>> [itanium1] /home/otro > mpirun --mca plm_base_verbose 30 --host itanium2 helloworld.out
>> [itanium1:08311] mca: base: components_open: Looking for plm components
>> [itanium1:08311] mca: base: components_open: opening plm components
>> [itanium1:08311] mca: base: components_open: found loaded component rsh
>> [itanium1:08311] mca: base: components_open: component rsh has no register function
>> [itanium1:08311] mca: base: components_open: component rsh open function successful
>> [itanium1:08311] mca: base: components_open: found loaded component slurm
>> [itanium1:08311] mca: base: components_open: component slurm has no register function
>> [itanium1:08311] mca: base: components_open: component slurm open function successful
>> [itanium1:08311] mca:base:select: Auto-selecting plm components
>> [itanium1:08311] mca:base:select:( plm) Querying component [rsh]
>> [itanium1:08311] mca:base:select:( plm) Query of component [rsh] set priority to 10
>> [itanium1:08311] mca:base:select:( plm) Querying component [slurm]
>> [itanium1:08311] mca:base:select:( plm) Skipping component [slurm]. Query failed to return a module
>> [itanium1:08311] mca:base:select:( plm) Selected component [rsh]
>> [itanium1:08311] mca: base: close: component slurm closed
>> [itanium1:08311] mca: base: close: unloading component slurm
>>
>> --Hangs here
>>
>> It seems a slurm problem??
>>
>> Thanks for any ideas
>>
>> On Fri, 19 March 2010, 17:57, Ralph Castain wrote:
>>
>>> Did you configure OMPI with --enable-debug? You should do this so that
>>> more diagnostic output is available.
>>>
>>> You can also add the following to your cmd line to get more info:
>>>
>>> --debug --debug-daemons --leave-session-attached
>>>
>>> Something is likely blocking proper launch of the daemons and
>>> processes, so you aren't getting to the btl's at all.
>>>
>>> On Mar 19, 2010, at 9:42 AM, uriz.49...@e.unavarra.es wrote:
>>>
>>>> The processes are running on the remote nodes but they don't give the
>>>> response to the origin node. I don't know why. With the option --mca
>>>> btl_base_verbose 30, I have the same problems and it doesn't show any
>>>> message.
>>>>
>>>> Thanks
>>>>
>>>>> On Wed, Mar 17, 2010 at 1:41 PM, Jeff Squyres wrote:
>>>>>
>>>>>> On Mar 17, 2010, at 4:39 AM, wrote:
>>>>>>
>>>>>>> Hi everyone I'm a new Open MPI user and I
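Jeff's question in this thread ("Can you open random TCP sockets between your nodes?") can be answered without installing any extra tools, using bash's /dev/tcp pseudo-device. The host and port below are examples, not values from the thread; substitute the compute node's name and a port in the range Open MPI's TCP components would use.

```shell
# Quick TCP reachability probe via bash's built-in /dev/tcp redirection.
# Usage: probe <host> <port>. Prints "open" if a connection succeeds,
# "closed" otherwise. Host/port below are illustrative.
probe() {
  if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
    echo "open"
  else
    echo "closed"
  fi
}

probe 127.0.0.1 22   # e.g., sshd on the local host
# then try it between the nodes:  probe itanium2 <some-port>
```

If the probe reports "closed" for every port between head node and compute node, a firewall or routing problem is blocking the callback sockets mpirun is waiting on.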
Re: [OMPI users] Problem in remote nodes
I've been checking /var/log/messages on the compute node and there is nothing new after executing 'mpirun --host itanium2 -np 2 helloworld.out', but in the /var/log/messages file on the remote node the following messages appear; nothing about unix_chkpwd.

Mar 31 11:56:51 itanium2 sshd(pam_unix)[15349]: authentication failure; logname= uid=0 euid=0 tty=NODEVssh ruser= rhost=itanium1 user=otro
Mar 31 11:56:53 itanium2 sshd[15349]: Accepted publickey for otro from 192.168.3.1 port 40999 ssh2
Mar 31 11:56:53 itanium2 sshd(pam_unix)[15351]: session opened for user otro by (uid=500)
Mar 31 11:56:53 itanium2 sshd(pam_unix)[15351]: session closed for user otro

It seems that the authentication fails at first, but in the next message it connects with the node...

On Tue, 30 March 2010, 20:02, Robert Collyer wrote:
> I've been having similar problems using Fedora Core 9. I believe the
> issue may be with SELinux, but this is just an educated guess. In my
> setup, shortly after a login via MPI, there is a notation in
> /var/log/messages on the compute node as follows:
>
> Mar 30 12:39:45 kernel: type=1400 audit(1269970785.534:588):
> avc: denied { read } for pid=8047 comm="unix_chkpwd" name="hosts"
> dev=dm-0 ino=24579
> scontext=system_u:system_r:system_chkpwd_t:s0-s0:c0.c1023
> tcontext=unconfined_u:object_r:etc_runtime_t:s0 tclass=file
>
> which says SELinux denied unix_chkpwd read access to hosts.
>
> Are you getting anything like this?
>
> In the meantime, I'll check whether allowing unix_chkpwd read access to
> hosts eliminates the problem on my system, and if it works, I'll post
> the steps involved.
>
> uriz.49...@e.unavarra.es wrote:
>> I've been investigating and there is no firewall that could stop TCP
>> traffic in the cluster. With the option --mca plm_base_verbose 30 I get
>> the following output:
>>
>> [itanium1] /home/otro > mpirun --mca plm_base_verbose 30 --host itanium2 helloworld.out
>> [itanium1:08311] mca: base: components_open: Looking for plm components
>> [itanium1:08311] mca: base: components_open: opening plm components
>> [itanium1:08311] mca: base: components_open: found loaded component rsh
>> [itanium1:08311] mca: base: components_open: component rsh has no register function
>> [itanium1:08311] mca: base: components_open: component rsh open function successful
>> [itanium1:08311] mca: base: components_open: found loaded component slurm
>> [itanium1:08311] mca: base: components_open: component slurm has no register function
>> [itanium1:08311] mca: base: components_open: component slurm open function successful
>> [itanium1:08311] mca:base:select: Auto-selecting plm components
>> [itanium1:08311] mca:base:select:( plm) Querying component [rsh]
>> [itanium1:08311] mca:base:select:( plm) Query of component [rsh] set priority to 10
>> [itanium1:08311] mca:base:select:( plm) Querying component [slurm]
>> [itanium1:08311] mca:base:select:( plm) Skipping component [slurm]. Query failed to return a module
>> [itanium1:08311] mca:base:select:( plm) Selected component [rsh]
>> [itanium1:08311] mca: base: close: component slurm closed
>> [itanium1:08311] mca: base: close: unloading component slurm
>>
>> --Hangs here
>>
>> It seems a slurm problem??
>>
>> Thanks for any ideas
>>
>> On Fri, 19 March 2010, 17:57, Ralph Castain wrote:
>>
>>> Did you configure OMPI with --enable-debug? You should do this so that
>>> more diagnostic output is available.
>>>
>>> You can also add the following to your cmd line to get more info:
>>>
>>> --debug --debug-daemons --leave-session-attached
>>>
>>> Something is likely blocking proper launch of the daemons and
>>> processes, so you aren't getting to the btl's at all.
>>>
>>> On Mar 19, 2010, at 9:42 AM, uriz.49...@e.unavarra.es wrote:
>>>
>>>> The processes are running on the remote nodes but they don't give the
>>>> response to the origin node. I don't know why. With the option --mca
>>>> btl_base_verbose 30, I have the same problems and it doesn't show any
>>>> message.
>>>>
>>>> Thanks
>>>>
>>>>> On Wed, Mar 17, 2010 at 1:41 PM, Jeff Squyres wrote:
>>>>>
>>>>>> On Mar 17, 2010, at 4:39 AM, wrote:
>>>>>>
>>>>>>> Hi everyone I'm a new Open MPI user and I have just installed Open
>>>>>>> MPI in a 6-node cluster with Scientific Linux. When I execute it
>>>>>>> locally it works perfectly, but when I try to execute it on the
>>>>>>> remote nodes with the --host option it hangs and gives no message.
>>>>>>> I think that the problem could be with the shared libraries, but
>>>>>>> I'm not sure. In my opinion the problem is not ssh because I can
>>>>>>> access the nodes with no password
>>>>>>
>>>>>> You might want to check that Open MPI processes are actually running
>>>>>> on the remote nodes -- check with ps if you see any "orted" or other
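The quoted advice at the end of this thread (check with ps whether any "orted" daemons are actually running on the remote nodes) can be scripted. The helper below is a sketch I'm adding for illustration; the itanium2 hostname in the comment is the one from the thread, and the ssh loop assumes passwordless login is already working.

```shell
# Count Open MPI daemons (orted) running on the current node. Run this
# locally, then via ssh on each compute node, to see how far the launch got.
count_orted() {
  # grep -c prints 0 when nothing matches (and exits nonzero); `|| true`
  # keeps the function's exit status clean in `set -e` scripts.
  ps -e -o comm= | grep -c '^orted$' || true
}

count_orted
# Remote check (illustrative):
#   for h in itanium2; do ssh "$h" "ps -e -o comm= | grep -c '^orted\$'"; done
```

A count of 0 on the remote node while mpirun hangs means the launch itself failed; a nonzero count means the daemons started but the TCP callback to mpirun is being blocked.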
Re: [OMPI users] Hide Abort output
I have to say this is a very common issue for our users. They repeatedly report the long Open MPI MPI_Abort() message in help queries and fail to look for the application error message about the root cause. A short MPI_Abort() message that said "look elsewhere for the real error message" would be useful.

Cheers,
David

On 03/31/2010 07:58 PM, Yves Caniou wrote:

Dear all,

I am using the MPI_Abort() command in an MPI program. I would like to not see the note explaining that the command caused Open MPI to kill all the jobs and so on. I thought that I could find an --mca parameter, but couldn't grep it. The only ones deal with the delay and printing more information (the stack).

Is there a way to avoid the printing of the note (except the 2>/dev/null trick)? Or to delay this printing?

Thank you.

.Yves.
[OMPI users] Hide Abort output
Dear all,

I am using the MPI_Abort() command in an MPI program. I would like to not see the note explaining that the command caused Open MPI to kill all the jobs and so on. I thought that I could find an --mca parameter, but couldn't grep it. The only ones deal with the delay and printing more information (the stack).

Is there a way to avoid the printing of the note (except the 2>/dev/null trick)? Or to delay this printing?

Thank you.

.Yves.

--
Yves Caniou
Associate Professor at Université Lyon 1,
Member of the team project INRIA GRAAL in the LIP ENS-Lyon,
Délégation CNRS in Japan French Laboratory of Informatics (JFLI),
* in Information Technology Center, The University of Tokyo,
  2-11-16 Yayoi, Bunkyo-ku, Tokyo 113-8658, Japan
  tel: +81-3-5841-0540
* in National Institute of Informatics
  2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan
  tel: +81-3-4212-2412
http://graal.ens-lyon.fr/~ycaniou/
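Until an option to suppress the note exists, one workaround less blunt than 2>/dev/null is to filter mpirun's stderr and drop only the abort help block. This is a sketch I'm adding, not something from the thread: the marker strings below follow the usual wording of Open MPI's note ("MPI_ABORT was invoked ..." between dashed rules), but the exact text varies by version, so check them against your installation.

```shell
# Drop Open MPI's MPI_Abort help note from a stderr stream while passing
# everything else through. Marker patterns are assumptions -- verify them
# against the note your Open MPI version actually prints.
filter_abort_note() {
  grep -v -e '^-\{10,\}$' -e 'MPI_ABORT was invoked'
}

# Intended use (bash process substitution; sketch only):
#   mpirun -np 4 ./app 2> >(filter_abort_note >&2)

# Demonstration with a canned stderr stream:
printf '%s\n' \
  'app: fatal: bad input on rank 2' \
  '--------------------------------------------------------------------------' \
  'MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD' \
  | filter_abort_note
# prints only: app: fatal: bad input on rank 2
```

The application's own error line survives, which addresses David's point: users should see their root-cause message, not the boilerplate.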
[OMPI users] Help om Openmpi
Dear all,

I have installed my cluster with the following configuration:

- Head node: CentOS 5.4 Linux, 4 CPUs, 3 GB RAM; Sun Grid Engine sge6.0u12 (the head node is the admin and submit node too); Open MPI 1.2.9, configured with ./configure --prefix=/opt/openmpi --with-sge ... Compilation and make were fine.
- Two other nodes, each with 4 CPUs and 1 GB RAM, running sgeexecd.

Testing SGE on the head node and the compute nodes via qsub was fine. Testing Open MPI as follows:

[guser1@ioitg2 examples]$ /opt/openmpi/bin/mpirun -np 4 --hostfile myhosts hello_cxx
Hello, world! I am 0 of 4
Hello, world! I am 1 of 4
Hello, world! I am 3 of 4
Hello, world! I am 2 of 4
[guser1@ioitg2 examples]$

So Open MPI runs well. My myhosts file:

ioitg2.ioit-grid.ac.vn slots=4
node1.ioit-grid.ac.vn slots=4
node2.ioit-grid.ac.vn slots=4

Now with more processes:

[guser1@ioitg2 examples]$ /opt/openmpi/bin/mpirun -np 6 --hostfile myhosts hello_cxx
gus...@node1.ioit-grid.ac.vn's password:
--------------------------------------------------------------------------
Failed to find the following executable:

Host: node1.ioit-grid.ac.vn
Executable: hello_cxx

Cannot continue.
--------------------------------------------------------------------------
mpirun noticed that job rank 0 with PID 19164 on node ioitg2.ioit-grid.ac.vn exited on signal 15 (Terminated).
3 additional processes aborted (not shown)
[guser1@ioitg2 examples]$

This is the error message. I can log in to node1 successfully. Please help me. What problems do I have (installation, configuration, ...)? Do I have to install Open MPI on all nodes?

Thank you very much; I am waiting for your help.
Re: [OMPI users] Best way to reduce 3D array
On Tue, 30 Mar 2010, Gus Correa wrote:

> Hello Ricardo Reis! How is Radio Zero doing? :)

Busy, busy, busy. We are preparing to celebrate Yuri's Night, April the 12th!

> Doesn't this serialize the I/O operation across the processors, whereas
> MPI_Gather followed by rank 0 I/O may perhaps move the data faster to
> rank 0, and eventually to disk (particularly when the number of
> processes is large)?

Oh, yes. I remember now why I thought of this. If the problem is large enough you will run out of memory on the master machine (for me MPI-IO is the way to go unless you're tied to NFS). Of course one could always send the data in chunks: let the master write one chunk, then send another...

Cheers!

Ricardo Reis
'Non Serviam'
PhD candidate @ Lasef
Computational Fluid Dynamics, High Performance Computing, Turbulence
http://www.lasef.ist.utl.pt
Cultural Instigator @ Rádio Zero
http://www.radiozero.pt
Keep them Flying! Ajude a/help Aero Fénix!
http://www.aeronauta.com/aero.fenix
http://www.flickr.com/photos/rreis/
< sent with alpine 2.00 >
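Ricardo's memory argument is easy to quantify: gathering a 3D field to rank 0 requires a full-size receive buffer on the master, no matter how many ranks the field was split across. The grid size below is illustrative (the thread does not give one), but the arithmetic shows why a gather-then-write scheme stops scaling before MPI-IO does.

```shell
# Back-of-envelope: memory needed on rank 0 to gather a 512^3 array of
# doubles (grid size is an assumed, illustrative value).
nx=512; ny=512; nz=512
bytes=$(( nx * ny * nz * 8 ))                     # 8 bytes per double
echo "$(( bytes / 1024 / 1024 )) MiB on rank 0"   # prints: 1024 MiB on rank 0
```

On a 1 GB compute node like the ones in the other thread, a single such gather buffer already exhausts RAM, which is exactly the failure mode Ricardo describes; chunked sends or MPI-IO avoid it.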