[OMPI users] Is Iprobe fast when there is no message to receive
I am not sure if this is the right place to ask this question, but here it goes.

Simplified, abstract version of the question: I have 2 MPI processes and I want one to send an occasional signal to the other process. These signals will not happen at predictable times. I want the other process, sitting in some kind of work loop, to be able to make a very fast check to see whether a signal has been sent to it. What is the best way to do this?

Actual problem: I am working on a realistic neural net simulator. The neurons are split into groups, with one group per processor to simulate them. Occasionally a neuron will spike and has to send that message to neurons on a different processor. This is a relatively rare event. The receiving neurons need to be able to make a very fast check to see whether there is a message from neurons on another processor.

The way I am doing it now is to use simple send and receive commands. The receiving cell does an MPI_Iprobe check on every loop through the simulation, for every cell that connects to it, to see if there is a message (spike) from that cell. If the Iprobe says there is a message, it does a receive on that message. This seems convoluted, though. I do not actually need to receive the message, just know that a message is there. And it seems like, depending on how MPI_Iprobe works, there might be a faster method. Is MPI_Iprobe fast if there is no message to receive? Would persistent requests work better? Anyway, any help would be greatly appreciated.
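The polling pattern described above can be sketched as follows. This is only an illustrative fragment, not the poster's code: the function name poll_for_spikes and the tag SPIKE_TAG are made up, and a single MPI_Iprobe with MPI_ANY_SOURCE replaces the one-probe-per-connected-cell loop, so the no-message case costs one queue check per simulation iteration instead of one per cell:

```c
#include <mpi.h>
#include <stdio.h>

#define SPIKE_TAG 42  /* made-up tag reserved for spike messages */

/* Poller side: one wildcard probe per loop iteration.  The matching
   MPI_Recv is still required to clear each pending message from the
   unexpected-message queue. */
void poll_for_spikes(void)
{
    int flag = 0;
    MPI_Status status;

    MPI_Iprobe(MPI_ANY_SOURCE, SPIKE_TAG, MPI_COMM_WORLD, &flag, &status);
    while (flag) {
        int spiking_cell;
        MPI_Recv(&spiking_cell, 1, MPI_INT, status.MPI_SOURCE, SPIKE_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("spike from cell %d on rank %d\n",
               spiking_cell, status.MPI_SOURCE);
        MPI_Iprobe(MPI_ANY_SOURCE, SPIKE_TAG, MPI_COMM_WORLD, &flag, &status);
    }
}
```

Whether MPI_Iprobe is cheap when nothing is pending depends on the implementation (it still has to enter the library's progress engine), but probing once with MPI_ANY_SOURCE is generally much cheaper than probing once per connected cell.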
Re: [OMPI users] Profiling OpenMPI routines
Aniruddha Marathe wrote:

> I am trying to profile (get the call graph/call sequence of) Open MPI
> communication routines using the GNU Profiler (gprof), since the
> communication calls are implemented using macros and it's harder to
> trace them statically. In order to do that I compiled the Open MPI
> source code with the following options supplied to the 'configure' tool:
>
> ./configure CFLAGS=-pg CPPFLAGS=-pg --enable-debug --prefix=/home/amarathe/mpi/svn_openmpi/install
>
> When I recompiled my test MPI application that does MPI_Send and
> MPI_Recv with the new library, it generated the gmon.out file as
> expected (I ran it as 'mpirun -np 2 send_recv'). However, running
> 'gprof' on this file didn't provide any information such as the call
> graphs for MPI_Send or MPI_Recv. The following is the only function
> call that I see in the output:
>
> $ gprof send_recv gmon.out
> ...
>   %   cumulative   self              self     total
>  time   seconds   seconds    calls  Ts/call  Ts/call  name
>  0.00      0.00     0.00       25     0.00     0.00  data_start
> ...
>
> I would like to know if anyone has done something similar with gprof
> or any other open source tool with the Open MPI code. (I found a
> similar, fairly recent post on the mailing list, but it seems to talk
> about profiling the MPI application itself and not the Open MPI
> library routines -
> http://www.open-mpi.org/community/lists/users/2009/04/8999.php)

Open source tool or free download? That is, do you really need to be able to see the tool's source code, or are you just interested in avoiding license fees?

In any case, since that post you mention, a FAQ entry has appeared on performance tools. Check http://www.open-mpi.org/faq/?category=perftools

You make an important distinction between profiling MPI applications versus profiling the library itself; many tools will help only with applications. But I've used Sun Studio for profiling Open MPI. Ideally, you should ./configure with -g among the compilation switches so that you get symbolic information about the library, but that isn't necessary.

The use of macros and dynamically loaded objects makes correlating profiles with source code hard, but it works. When you bring the Analyzer up, I think you also have to unhide the symbols within the MPI library, which as I remember are hidden by default. Anyhow, it works and I've learned a lot doing things this way.
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
On Oct 1, 2009, at 3:27 PM, Åke Sandgren wrote:

> Yes, but perhaps you need to verify that posix_memalign is also intercepted?

Er... right. Of course. :-)

https://svn.open-mpi.org/trac/ompi/changeset/22045

> I commented on memalign being obsolete since there are a couple of uses
> of it in the rest of the openmpi code apart from that particular case.
> They should probably be changed.

Some of those are in ROMIO; we don't really want to change those -- it just makes it harder to import new versions (speaking of which, we're due for a ROMIO refresh sometime in the 1.5 series).

-- Jeff Squyres jsquy...@cisco.com
[OMPI users] Profiling OpenMPI routines
Hi All,

I am trying to profile (get the call graph/call sequence of) Open MPI communication routines using the GNU Profiler (gprof), since the communication calls are implemented using macros and it's harder to trace them statically. In order to do that I compiled the Open MPI source code with the following options supplied to the 'configure' tool:

./configure CFLAGS=-pg CPPFLAGS=-pg --enable-debug --prefix=/home/amarathe/mpi/svn_openmpi/install

When I recompiled my test MPI application that does MPI_Send and MPI_Recv with the new library, it generated the gmon.out file as expected (I ran it as 'mpirun -np 2 send_recv'). However, running 'gprof' on this file didn't provide any information such as the call graphs for MPI_Send or MPI_Recv. The following is the only function call that I see in the output:

$ gprof send_recv gmon.out
...
  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
 0.00      0.00     0.00       25     0.00     0.00  data_start
...

I would like to know if anyone has done something similar with gprof or any other open source tool with the Open MPI code. (I found a similar, fairly recent post on the mailing list, but it seems to talk about profiling the MPI application itself and not the Open MPI library routines - http://www.open-mpi.org/community/lists/users/2009/04/8999.php)

Thanks & Regards,
Aniruddha
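For what it's worth, one common reason for an almost-empty gprof profile like the above is that gprof's sampling does not cover code in shared libraries, and Open MPI builds shared libraries by default. A build recipe along the following lines might be worth trying; the prefix path is an example, not the poster's actual one:

```
# Static, instrumented build (paths illustrative):
./configure CFLAGS="-pg -g" CPPFLAGS=-pg LDFLAGS=-pg \
            --enable-static --disable-shared --enable-debug \
            --prefix=$HOME/ompi-gprof
make all install

# Rebuild and run the test program.  Note that every process writes
# gmon.out into its working directory; with glibc, setting
# GMON_OUT_PREFIX yields one gmon.out.<pid> file per process instead
# of one overwritten file.
mpicc -pg -g send_recv.c -o send_recv
mpirun -np 2 -x GMON_OUT_PREFIX=gmon.out ./send_recv
gprof ./send_recv gmon.out.<pid>
```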
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
On Thu, 2009-10-01 at 15:19 -0400, Jeff Squyres wrote:
> On Oct 1, 2009, at 2:19 PM, Åke Sandgren wrote:
> > No it didn't. And memalign is obsolete according to the manpage.
> > posix_memalign is the one to use.
>
> This particular call is testing the memalign intercept in the ptmalloc
> component during startup; we can't replace it with posix_memalign.
> Hence, the values that are passed are fairly meaningless. It's just
> testing that the intercept works.

Yes, but perhaps you need to verify that posix_memalign is also intercepted?

I commented on memalign being obsolete since there are a couple of uses of it in the rest of the openmpi code apart from that particular case. They should probably be changed.
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
On Oct 1, 2009, at 2:19 PM, Åke Sandgren wrote:

> No it didn't. And memalign is obsolete according to the manpage.
> posix_memalign is the one to use.

This particular call is testing the memalign intercept in the ptmalloc component during startup; we can't replace it with posix_memalign. Hence, the values that are passed are fairly meaningless. It's just testing that the intercept works.

-- Jeff Squyres jsquy...@cisco.com
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
The value of 4 might be invalid (though maybe on a 32-bit machine it would be okay?) but it's enough to allow TotalView to continue on without raising a memory event, so I'm okay with it ;-)

PeterT

Ashley Pittman wrote:
> Simple malloc() returns pointers that are at least eight byte aligned
> anyway, I'm not sure what the reason for calling memalign() with a value
> of four would be anyway.
>
> Ashley,
>
> On Thu, 2009-10-01 at 20:19 +0200, Åke Sandgren wrote:
> > No it didn't. And memalign is obsolete according to the manpage.
> > posix_memalign is the one to use.
> >
> > https://svn.open-mpi.org/trac/ompi/changeset/21744
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
Good point. That particular call to memalign, however, is part of a series of OMPI memory hook tests. The memory allocated by that memalign call is promptly freed (opal/mca/memory/ptmalloc2/opal_ptmalloc2_component.c, line 111). The change is to silence TotalView's memory alignment error when memory debugging is enabled.

-- Samuel K. Gutierrez, Los Alamos National Laboratory

On Oct 1, 2009, at 12:56 PM, Ashley Pittman wrote:

> Simple malloc() returns pointers that are at least eight byte aligned
> anyway, I'm not sure what the reason for calling memalign() with a value
> of four would be anyway.
>
> Ashley,
>
> On Thu, 2009-10-01 at 20:19 +0200, Åke Sandgren wrote:
> > No it didn't. And memalign is obsolete according to the manpage.
> > posix_memalign is the one to use.
> >
> > https://svn.open-mpi.org/trac/ompi/changeset/21744
>
> -- Ashley Pittman, Bath, UK. Padb - A parallel job inspection tool for cluster computing http://padb.pittman.org.uk

___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
On Thu, 2009-10-01 at 19:56 +0100, Ashley Pittman wrote:
> Simple malloc() returns pointers that are at least eight byte aligned
> anyway, I'm not sure what the reason for calling memalign() with a value
> of four would be anyway.

That is not necessarily true on all systems.

-- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
[OMPI users] Are there ways to reduce the memory used by OpenMPI?
Are there tuning parameters that I can use to reduce the amount of memory used by OpenMPI? I would very much like to use OpenMPI instead of MVAPICH, but I'm on a cluster where memory usage is the most important consideration. Here are three results which capture the problem.

With the "leave_pinned" behavior turned on, I get good performance (19.528; lower is better):

mpirun --prefix /usr/mpi/intel/openmpi-1.2.8 --machinefile /var/spool/torque/aux/7972.fwnaeglingio -np 28 --mca btl ^tcp --mca mpi_leave_pinned 1 --mca mpool_base_use_mem_hooks 1 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 /tmp/7972.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/7972.fwnaeglingio/restart.0

Compute rate (processor-microseconds/cell/cycle): 19.528
Total memory usage: 38155.3477 MB (38.1553 GB)

Turning off the leave_pinned behavior, I get considerably slower performance (28.788), but the memory usage is unchanged (still 38 GB):

mpirun --prefix /usr/mpi/intel/openmpi-1.2.8 --machinefile /var/spool/torque/aux/7972.fwnaeglingio -np 28 -x LD_LIBRARY_PATH -x MPI_ENVIRONMENT=1 /tmp/7972.fwnaeglingio/falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro /tmp/7972.fwnaeglingio/restart.0

Compute rate (processor-microseconds/cell/cycle): 28.788
Total memory usage: 38335.7656 MB (38.3358 GB)

Using MVAPICH, the performance is in the middle (23.6), but the memory usage is reduced by 5 to 6 GB out of 38 GB, a significant decrease to me:

/usr/mpi/intel/mvapich-1.1.0/bin/mpirun_rsh -ssh -np 28 -hostfile /var/spool/torque/aux/7972.fwnaeglingio LD_LIBRARY_PATH="/usr/mpi/intel/mvapich-1.1.0/lib/shared:/usr/mpi/intel/openmpi-1.2.8/lib64:/appserv/intel/fce/10.1.008/lib:/appserv/intel/cce/10.1.008/lib" MPI_ENVIRONMENT=1 /tmp/7972.fwnaeglingio/falconv4_ibm_mvapich -cycles 100 -ri restart.0 -ro /tmp/7972.fwnaeglingio/restart.0

Compute rate (processor-microseconds/cell/cycle): 23.608
Total memory usage: 32753.0586 MB (32.7531 GB)

I didn't see anything in the FAQ that discusses memory usage other than the impact of the "leave_pinned" option, which apparently does not affect the memory usage in my case. But I figure there must be a reason why OpenMPI would use 6 GB more than MVAPICH on the same case. Thanks for any insights. Also attached is the output of ompi_info -a.

(Attachment: ompi_info.output)
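One knob worth checking is the size of the InfiniBand BTL's pre-registered buffer pools. The sketch below is only a starting point, not a recommendation: MCA parameter names and defaults vary across Open MPI versions, so verify what your build actually exposes before using any of them, and the value shown is made up for illustration:

```
# See which openib tunables this build actually has:
ompi_info --param btl openib

# Illustrative run capping the growth of the BTL's free lists:
mpirun -np 28 --mca btl ^tcp --mca mpi_leave_pinned 1 \
       --mca btl_openib_free_list_max 1024 \
       ./falconv4_ibm_openmpi -cycles 100 -ri restart.0 -ro restart.0
```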
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
Simple malloc() returns pointers that are at least eight byte aligned anyway; I'm not sure what the reason for calling memalign() with a value of four would be anyway.

Ashley,

On Thu, 2009-10-01 at 20:19 +0200, Åke Sandgren wrote:
> No it didn't. And memalign is obsolete according to the manpage.
> posix_memalign is the one to use.
>
> https://svn.open-mpi.org/trac/ompi/changeset/21744

-- Ashley Pittman, Bath, UK. Padb - A parallel job inspection tool for cluster computing http://padb.pittman.org.uk
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
Took a look at the changes and that looks like it should work. It's certainly not in 1.3.3, but as long as you guys are on top of it, that relieves my concerns ;-) Thanks, PeterT Samuel K. Gutierrez wrote: Ticket created (#2040). I hope it's okay ;-). -- Samuel K. Gutierrez Los Alamos National Laboratory On Oct 1, 2009, at 11:58 AM, Jeff Squyres wrote: Did that make it over to the v1.3 branch? On Oct 1, 2009, at 1:39 PM, Samuel K. Gutierrez wrote: Hi, I think Jeff has already addressed this problem. https://svn.open-mpi.org/trac/ompi/changeset/21744 -- Samuel K. Gutierrez Los Alamos National Laboratory On Oct 1, 2009, at 11:25 AM, Peter Thompson wrote: > We had a question from a user who had turned on memory debugging in > TotalView and experience a memory event error Invalid memory > alignment request. Having a 1.3.3 build of OpenMPI handy, I tested > it and sure enough, saw the error. I traced it down to, surprise, a > call to memalign. I find there are a few places where memalign is > called, but the one I think I was dealing with was from malloc.c in > ompi/mca//io/romio/romio/adio/common in the following lines: > > > #ifdef ROMIO_XFS >new = (void *) memalign(XFS_MEMALIGN, size); > #else >new = (void *) malloc(size); > #endif > > I searched, but couldn't find a value for XFS_MEMALIGN, so maybe it > was from opal_pt_malloc2_component.c instead, where the call is > >p = memalign(1, 1024 * 1024); > > There are only 10 to 12 references to memalign in the code that I > can see, so it shouldn't be too hard to find. What I can tell you > is that the value that TotalView saw for alignment, the first arg, > was 1, and the second, the size, was 0x10, which is probably > right for 1024 squared. > > The man page for memalign says that the first argument is the > alignment that the allocated memory use, and it must be a power of > two. The second is the length you want allocated. 
One could argue > that 1 is a power of two, but it seems a bit specious to me, and > TotalView's memory debugger certainly objects to it. Can anyone tell > me what the intent here is, and whether the memalign alignment > argument is thought to be valid? Or is this a bug (that might not > affect anyone other than TotalView memory debug users?) > > Thanks, > Peter Thompson

-- Jeff Squyres jsquy...@cisco.com
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
On Thu, 2009-10-01 at 13:58 -0400, Jeff Squyres wrote: > Did that make it over to the v1.3 branch? No it didn't. And memalign is obsolete according to the manpage. posix_memalign is the one to use. > > > > I think Jeff has already addressed this problem. > > > > https://svn.open-mpi.org/trac/ompi/changeset/21744
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
Ticket created (#2040). I hope it's okay ;-). -- Samuel K. Gutierrez Los Alamos National Laboratory On Oct 1, 2009, at 11:58 AM, Jeff Squyres wrote: Did that make it over to the v1.3 branch? On Oct 1, 2009, at 1:39 PM, Samuel K. Gutierrez wrote: Hi, I think Jeff has already addressed this problem. https://svn.open-mpi.org/trac/ompi/changeset/21744 -- Samuel K. Gutierrez Los Alamos National Laboratory On Oct 1, 2009, at 11:25 AM, Peter Thompson wrote: > We had a question from a user who had turned on memory debugging in > TotalView and experience a memory event error Invalid memory > alignment request. Having a 1.3.3 build of OpenMPI handy, I tested > it and sure enough, saw the error. I traced it down to, surprise, a > call to memalign. I find there are a few places where memalign is > called, but the one I think I was dealing with was from malloc.c in > ompi/mca//io/romio/romio/adio/common in the following lines: > > > #ifdef ROMIO_XFS >new = (void *) memalign(XFS_MEMALIGN, size); > #else >new = (void *) malloc(size); > #endif > > I searched, but couldn't find a value for XFS_MEMALIGN, so maybe it > was from opal_pt_malloc2_component.c instead, where the call is > >p = memalign(1, 1024 * 1024); > > There are only 10 to 12 references to memalign in the code that I > can see, so it shouldn't be too hard to find. What I can tell you > is that the value that TotalView saw for alignment, the first arg, > was 1, and the second, the size, was 0x10, which is probably > right for 1024 squared. > > The man page for memalign says that the first argument is the > alignment that the allocated memory use, and it must be a power of > two. The second is the length you want allocated. One could argue > that 1 is a power of two, but it seems a bit specious to me, and > TotalView's memory debugger certainly objects to it. Can anyone tell > me what the intent here is, and whether the memalign alignment > argument is thought to be valid? 
Or is this a bug (that might not > affect anyone other than TotalView memory debug users?) > > Thanks, > Peter Thompson

-- Jeff Squyres jsquy...@cisco.com
Re: [OMPI users] openmpi 1.4 and barrier
Hmm, i don't recall seeing that...

On Thu, Oct 1, 2009 at 1:51 PM, Jeff Squyres wrote:
> FWIW, I saw this bug to have race-condition-like behavior. I could run a
> few times and then it would work.
>
> On Oct 1, 2009, at 1:42 PM, Michael Di Domenico wrote:
>> On Thu, Oct 1, 2009 at 1:37 PM, Jeff Squyres wrote:
>> > On Oct 1, 2009, at 1:24 PM, Michael Di Domenico wrote:
>> >> I just upgraded to the devel snapshot of 1.4a1r22031
>> >>
>> >> when i run a simple hello world with a barrier i get
>> >>
>> >> btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received
>> >> unexpected process identifier
>> >
>> > I have seen this failure over the last day or three myself. I'll file a
>> > trac ticket about it.
>> >
>> > (all's fair in love, war, and trunk development snapshots!)
>>
>> Okay, thanks... Unfortunately i need the dev snap for slurm
>> integration... :(
>
> -- Jeff Squyres jsquy...@cisco.com
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
Did that make it over to the v1.3 branch? On Oct 1, 2009, at 1:39 PM, Samuel K. Gutierrez wrote: Hi, I think Jeff has already addressed this problem. https://svn.open-mpi.org/trac/ompi/changeset/21744 -- Samuel K. Gutierrez Los Alamos National Laboratory On Oct 1, 2009, at 11:25 AM, Peter Thompson wrote: > We had a question from a user who had turned on memory debugging in > TotalView and experience a memory event error Invalid memory > alignment request. Having a 1.3.3 build of OpenMPI handy, I tested > it and sure enough, saw the error. I traced it down to, surprise, a > call to memalign. I find there are a few places where memalign is > called, but the one I think I was dealing with was from malloc.c in > ompi/mca//io/romio/romio/adio/common in the following lines: > > > #ifdef ROMIO_XFS >new = (void *) memalign(XFS_MEMALIGN, size); > #else >new = (void *) malloc(size); > #endif > > I searched, but couldn't find a value for XFS_MEMALIGN, so maybe it > was from opal_pt_malloc2_component.c instead, where the call is > >p = memalign(1, 1024 * 1024); > > There are only 10 to 12 references to memalign in the code that I > can see, so it shouldn't be too hard to find. What I can tell you > is that the value that TotalView saw for alignment, the first arg, > was 1, and the second, the size, was 0x10, which is probably > right for 1024 squared. > > The man page for memalign says that the first argument is the > alignment that the allocated memory use, and it must be a power of > two. The second is the length you want allocated. One could argue > that 1 is a power of two, but it seems a bit specious to me, and > TotalView's memory debugger certainly objects to it. Can anyone tell > me what the intent here is, and whether the memalign alignment > argument is thought to be valid? Or is this a bug (that might not > affect anyone other than TotalView memory debug users?) 
> > Thanks, > Peter Thompson

-- Jeff Squyres jsquy...@cisco.com
Re: [OMPI users] openmpi 1.4 and barrier
FWIW, I saw this bug to have race-condition-like behavior. I could run a few times and then it would work.

On Oct 1, 2009, at 1:42 PM, Michael Di Domenico wrote:
> On Thu, Oct 1, 2009 at 1:37 PM, Jeff Squyres wrote:
> > On Oct 1, 2009, at 1:24 PM, Michael Di Domenico wrote:
> >> I just upgraded to the devel snapshot of 1.4a1r22031
> >>
> >> when i run a simple hello world with a barrier i get
> >>
> >> btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received
> >> unexpected process identifier
> >
> > I have seen this failure over the last day or three myself. I'll file a
> > trac ticket about it.
> >
> > (all's fair in love, war, and trunk development snapshots!)
>
> Okay, thanks... Unfortunately i need the dev snap for slurm
> integration... :(

-- Jeff Squyres jsquy...@cisco.com
Re: [OMPI users] openmpi 1.4 and barrier
On Thu, Oct 1, 2009 at 1:37 PM, Jeff Squyres wrote:
> On Oct 1, 2009, at 1:24 PM, Michael Di Domenico wrote:
>> I just upgraded to the devel snapshot of 1.4a1r22031
>>
>> when i run a simple hello world with a barrier i get
>>
>> btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received
>> unexpected process identifier
>
> I have seen this failure over the last day or three myself. I'll file a
> trac ticket about it.
>
> (all's fair in love, war, and trunk development snapshots!)

Okay, thanks... Unfortunately i need the dev snap for slurm integration... :(
Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView
Hi, I think Jeff has already addressed this problem.

https://svn.open-mpi.org/trac/ompi/changeset/21744

-- Samuel K. Gutierrez, Los Alamos National Laboratory

On Oct 1, 2009, at 11:25 AM, Peter Thompson wrote:

> We had a question from a user who had turned on memory debugging in TotalView and experienced a memory event error, "Invalid memory alignment request". Having a 1.3.3 build of OpenMPI handy, I tested it and sure enough, saw the error. I traced it down to, surprise, a call to memalign. I find there are a few places where memalign is called, but the one I think I was dealing with was from malloc.c in ompi/mca/io/romio/romio/adio/common in the following lines:
>
> #ifdef ROMIO_XFS
>     new = (void *) memalign(XFS_MEMALIGN, size);
> #else
>     new = (void *) malloc(size);
> #endif
>
> I searched, but couldn't find a value for XFS_MEMALIGN, so maybe it was from opal_ptmalloc2_component.c instead, where the call is
>
>     p = memalign(1, 1024 * 1024);
>
> There are only 10 to 12 references to memalign in the code that I can see, so it shouldn't be too hard to find. What I can tell you is that the value that TotalView saw for alignment, the first arg, was 1, and the second, the size, was 0x100000, which is probably right for 1024 squared.
>
> The man page for memalign says that the first argument is the alignment that the allocated memory uses, and it must be a power of two. The second is the length you want allocated. One could argue that 1 is a power of two, but it seems a bit specious to me, and TotalView's memory debugger certainly objects to it. Can anyone tell me what the intent here is, and whether the memalign alignment argument is thought to be valid? Or is this a bug (that might not affect anyone other than TotalView memory debug users?)
>
> Thanks, Peter Thompson
Re: [OMPI users] openmpi 1.4 and barrier
On Oct 1, 2009, at 1:24 PM, Michael Di Domenico wrote:

> I just upgraded to the devel snapshot of 1.4a1r22031
>
> when i run a simple hello world with a barrier i get
>
> btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received
> unexpected process identifier

I have seen this failure over the last day or three myself. I'll file a trac ticket about it.

(all's fair in love, war, and trunk development snapshots!)

-- Jeff Squyres jsquy...@cisco.com
[OMPI users] memalign usage in OpenMPI and its consequences for TotalView
We had a question from a user who had turned on memory debugging in TotalView and experienced a memory event error, "Invalid memory alignment request". Having a 1.3.3 build of OpenMPI handy, I tested it and sure enough, saw the error. I traced it down to, surprise, a call to memalign. I find there are a few places where memalign is called, but the one I think I was dealing with was from malloc.c in ompi/mca/io/romio/romio/adio/common in the following lines:

#ifdef ROMIO_XFS
    new = (void *) memalign(XFS_MEMALIGN, size);
#else
    new = (void *) malloc(size);
#endif

I searched, but couldn't find a value for XFS_MEMALIGN, so maybe it was from opal_ptmalloc2_component.c instead, where the call is

    p = memalign(1, 1024 * 1024);

There are only 10 to 12 references to memalign in the code that I can see, so it shouldn't be too hard to find. What I can tell you is that the value that TotalView saw for alignment, the first arg, was 1, and the second, the size, was 0x100000, which is probably right for 1024 squared.

The man page for memalign says that the first argument is the alignment that the allocated memory uses, and it must be a power of two. The second is the length you want allocated. One could argue that 1 is a power of two, but it seems a bit specious to me, and TotalView's memory debugger certainly objects to it. Can anyone tell me what the intent here is, and whether the memalign alignment argument is thought to be valid? Or is this a bug (that might not affect anyone other than TotalView memory debug users?)

Thanks, Peter Thompson
[OMPI users] openmpi 1.4 and barrier
I just upgraded to the devel snapshot of 1.4a1r22031. When I run a simple hello world with a barrier, I get:

btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received unexpected process identifier

If I pull the barrier out, the hello world runs fine. Interestingly enough, I can run IMB, which also uses barrier, and it runs just fine. Any thoughts?
Re: [OMPI users] How to force configure and make to build a 32-bit OMPI on a 64-bit linux?
You probably just want to pass the relevant compiler/linker flags to Open MPI's configure script, such as:

./configure CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32 ...

You need to pass them for all four languages (C, C++, F77, F90). I used -m32 as an example here; use whatever flag(s) is (are) relevant for your compiler.

On Oct 1, 2009, at 10:43 AM, Nader Ahmadi wrote:

> Hello, we have a 64-bit linux box. For a number of reasons I need to build
> a 32-bit OpenMPI. I have searched the FAQ and archived mail, but I couldn't
> find a good answer. There are some references to this question in the
> developer mailing list, with no clear response. What I am looking for is:
> how do I force configure and make to build a 32-bit OMPI on a 64-bit linux?
> Thanks, Nader.

-- Jeff Squyres jsquy...@cisco.com
[OMPI users] How to force configure and make to build a 32-bit OMPI on a 64-bit linux?
Hello, we have a 64-bit linux box. For a number of reasons I need to build a 32-bit OpenMPI. I have searched the FAQ and archived mail, but I couldn't find a good answer. There are some references to this question in the developer mailing list, with no clear response. What I am looking for is: how do I force configure and make to build a 32-bit OMPI on a 64-bit linux?

Thanks, Nader.
Re: [OMPI users] how to SPMD on openmpi
Open MPI is a compliant MPI-2.1 implementation, meaning that your MPI applications are source compatible with other MPI-2.1 implementations. In short: use MPI_Send and all the other MPI_* functions that you're used to.

On Oct 1, 2009, at 6:36 AM, ankur pachauri wrote:

> hi vipin, thanks for the answer. but one thing more: does openmpi have
> different library functions than mpi, or is its usage different (such as,
> will i have to use ompi_** instead of mpi_** functions)? thanks in advance
>
> On Thu, Oct 1, 2009 at 2:53 PM, vipin kumar wrote:
> > Hi Ankur, try this command:
> >
> > $ mpirun -np 2 -host firstHostIp,secondHostIp a.out
> >
> > For details read the manual page for "mpirun":
> >
> > $ man mpirun
> >
> > Regards,
> >
> > On Wed, Sep 30, 2009 at 3:22 PM, ankur pachauri wrote:
> > > Dear all, I have been able to install Open MPI on two independent
> > > machines having FC 10. The simple hello world programs are running
> > > fine on the independent machines. But can anyone please help me by
> > > letting me know how to connect the two machines and run a common
> > > program between the two? How do we do a "lamboot -v lamhosts" in the
> > > case of Open MPI? How do we get Open MPI running on the two computers
> > > simultaneously and execute a common program on the two machines?
> > > Thanks in advance.
> > >
> > > -- Ankur Pachauri. 09927590910 Research Scholar, software engineering.
> > > Department of Mathematics, Dayalbagh Educational Institute, Dayalbagh, AGRA
> >
> > -- Vipin K. Research Engineer, C-DOTB, India

-- Jeff Squyres jsquy...@cisco.com
Re: [OMPI users] MPI_Comm_accept()/connect() errors
The following is the information regarding the error. I am running Open MPI 1.2.5 on Ubuntu 4.2.4, kernel version 2.6.24 I ran the server program as mpirun -np 1 server. This program gave me the output port as 0.1.0:2000. I used this port name value as the command line argument for the client program: mpirun -np 1 client 0.1.1:2000. - The output of the "ompi_info --all" is attached with the email - PATH Variable: /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr /local/maui/bin/: - LD_LIBRARY_PATH variable was empty - The following is the output of ifconfig on hpcc00 from where the error has been generated: eth0 Link encap:Ethernet HWaddr 00:12:3f:4c:2d:78 inet addr:134.225.200.100 Bcast:134.225.200.255 Mask:255.255.255.0 inet6 addr: fe80::212:3fff:fe4c:2d78/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:15912728 errors:0 dropped:0 overruns:0 frame:0 TX packets:15312376 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:2951880321 (2.7 GB) TX bytes:2788249498 (2.5 GB) Interrupt:16 loLink encap:Local Loopback inet addr:127.0.0.1 Mask:255.0.0.0 inet6 addr: ::1/128 Scope:Host UP LOOPBACK RUNNING MTU:16436 Metric:1 RX packets:3507489 errors:0 dropped:0 overruns:0 frame:0 TX packets:3507489 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:1794266658 (1.6 GB) TX bytes:1794266658 (1.6 GB) Regards, Blesson. From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Ralph Castain Sent: 29 September 2009 23:59 To: Open MPI Users Subject: Re: [OMPI users] MPI_Comm_accept()/connect() errors I will ask the obvious - what version of Open MPI are you running? In what environment? What was your command line? :-) On Sep 29, 2009, at 3:50 PM, Blesson Varghese wrote: Hi, I have been trying to execute the server.c and client.c program provided in http://www.mpi-forum.org/docs/mpi21-report/node213.htm#Node213, using accept() and connect() function in MPI. 
However, the following errors are generated:

[hpcc00:16522] *** An error occurred in MPI_Comm_connect
[hpcc00:16522] *** on communicator MPI_COMM_WORLD
[hpcc00:16522] *** MPI_ERR_INTERN: internal error
[hpcc00:16522] *** MPI_ERRORS_ARE_FATAL (goodbye)

Could anybody please help me? Many thanks, Blesson.

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users

Open MPI: 1.2.5
Open MPI SVN revision: r16989
Open RTE: 1.2.5
Open RTE SVN revision: r16989
OPAL: 1.2.5
OPAL SVN revision: r16989
MCA backtrace: execinfo (MCA v1.0, API v1.0, Component v1.2.5)
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.2.5)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.2.5)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: env (MCA v1.0, API v1.0, Component v1.2.5)
MCA installdirs: config (MCA v1.0, API v1.0, Component v1.2.5)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: self (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.2.5)
MCA io: romio (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: rdma (MCA v1.0, API v1.0, Component v1.2.5)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: cm (MCA v1.0, API v1.0, Component v1.2.5)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.2.5)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.2.5)
MCA rcache: vma (MCA v1.0, API v1.0, Component v1.2.5)
MCA btl: self (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: sm (MCA v1.0, API v1.0.1, Component v1.2.5)
MCA btl: tcp (MCA v1.0, API v1.0.1, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.2.5)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.2.5)
MCA errmgr: hnp (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: orted (MCA v1.0, API v1.3, Component v1.2.5)
MCA errmgr: proxy (MCA v1.0, API v1.3, Component v1.2.5)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.2.5)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.2.5)
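For reference, a minimal sketch of the accept/connect handshake that the MPI 2.1 server/client example uses (error handling and message exchange omitted; this is an illustration, not the forum example itself, and assumes a working Open MPI installation - build with mpicc and launch with mpirun):

```c
/* Sketch of the MPI_Comm_accept / MPI_Comm_connect handshake.
 * Build:  mpicc port_demo.c -o port_demo
 * Server: mpirun -np 1 ./port_demo            (prints the port name)
 * Client: mpirun -np 1 ./port_demo '<port>'   (pass the printed name verbatim) */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    char port[MPI_MAX_PORT_NAME];
    MPI_Comm inter;                       /* intercommunicator to the peer */

    if (argc > 1) {                       /* client: port name on command line */
        strncpy(port, argv[1], MPI_MAX_PORT_NAME - 1);
        port[MPI_MAX_PORT_NAME - 1] = '\0';
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    } else {                              /* server: open a port and wait */
        MPI_Open_port(MPI_INFO_NULL, port);
        printf("port name: %s\n", port);  /* copy this string to the client */
        fflush(stdout);
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
        MPI_Close_port(port);
    }

    /* ... communicate over 'inter' here ... */
    MPI_Comm_disconnect(&inter);
    MPI_Finalize();
    return 0;
}
```

Note that the port name must be passed to the client exactly as the server printed it: in the report above the server printed 0.1.0:2000 but the client was invoked with 0.1.1:2000, which by itself would make the connect fail.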
Re: [OMPI users] how to SPMD on openmpi
hi vipin, thanks for the answer, but one thing more: does Open MPI have slightly different library functions than MPI, or is its usage different (for example, will I have to use ompi_** instead of mpi_** functions)? thanks in advance

On Thu, Oct 1, 2009 at 2:53 PM, vipin kumar wrote:
> Hi Ankur,
>
> try this command:
>
> $ mpirun -np 2 -host firstHostIp,secondHostIp a.out
>
> for details, read the manual page for "mpirun":
>
> $ man mpirun
>
> Regards,
>
> On Wed, Sep 30, 2009 at 3:22 PM, ankur pachauri wrote:
>
>> Dear all,
>>
>> I have been able to install Open MPI on two independent machines running FC
>> 10. The simple hello world programs are running fine on the independent
>> machines... But can anyone please help me by letting me know how to connect
>> the two machines and run a common program between the two... How do we do
>> a "lamboot -v lamhosts" in the case of Open MPI?
>> How do we get Open MPI running on the two computers simultaneously and
>> execute a common program on the two machines?
>>
>> Thanks in advance
>>
>> --
>> Ankur Pachauri.
>> 09927590910
>> Research Scholar, software engineering.
>> Department of Mathematics, Dayalbagh Educational Institute, Dayalbagh, AGRA
>
> --
> Vipin K.
> Research Engineer, C-DOTB, India

--
Ankur Pachauri.
09927590910
Research Scholar, software engineering.
Department of Mathematics, Dayalbagh Educational Institute, Dayalbagh, AGRA
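To the ompi_** question above: Open MPI implements the standard MPI API, so programs call the usual MPI_* functions - there are no ompi_* user-level calls. A minimal SPMD program that could be compiled to the a.out used in the mpirun command above (a sketch, assuming a working Open MPI install on both hosts):

```c
/* hello.c - minimal SPMD program; the same binary runs on every host.
 * Build: mpicc hello.c -o a.out
 * Run:   mpirun -np 2 -host firstHostIp,secondHostIp ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);               /* standard MPI_* calls, no ompi_* */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank); /* this process's id */
    MPI_Comm_size(MPI_COMM_WORLD, &size); /* total number of processes */
    MPI_Get_processor_name(name, &len);   /* which host we landed on */
    printf("hello from rank %d of %d on %s\n", rank, size, name);
    MPI_Finalize();
    return 0;
}
```

Each host should print one line, confirming that the two machines are executing the common program.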
[OMPI users] job fails with "Signal: Bus error (7)"
Hi, a Fortran application compiled with ifort 10.1 and Open MPI 1.3.1 on CentOS 5.2 fails after running for 4 days with the following error message:

[compute-0-7:25430] *** Process received signal ***
[compute-0-7:25433] *** Process received signal ***
[compute-0-7:25433] Signal: Bus error (7)
[compute-0-7:25433] Signal code: (2)
[compute-0-7:25433] Failing at address: 0x4217b8
[compute-0-7:25431] *** Process received signal ***
[compute-0-7:25431] Signal: Bus error (7)
[compute-0-7:25431] Signal code: (2)
[compute-0-7:25431] Failing at address: 0x4217b8
[compute-0-7:25432] *** Process received signal ***
[compute-0-7:25432] Signal: Bus error (7)
[compute-0-7:25432] Signal code: (2)
[compute-0-7:25432] Failing at address: 0x4217b8
[compute-0-7:25430] Signal: Bus error (7)
[compute-0-7:25430] Signal code: (2)
[compute-0-7:25430] Failing at address: 0x4217b8
[compute-0-7:25431] *** Process received signal ***
[compute-0-7:25431] Signal: Segmentation fault (11)
[compute-0-7:25431] Signal code: (128)
[compute-0-7:25431] Failing at address: (nil)
[compute-0-7:25430] *** Process received signal ***
[compute-0-7:25433] *** Process received signal ***
[compute-0-7:25433] Signal: Segmentation fault (11)
[compute-0-7:25433] Signal code: (128)
[compute-0-7:25433] Failing at address: (nil)
[compute-0-7:25432] *** Process received signal ***
[compute-0-7:25432] Signal: Segmentation fault (11)
[compute-0-7:25432] Signal code: (128)
[compute-0-7:25432] Failing at address: (nil)
[compute-0-7:25430] Signal: Segmentation fault (11)
[compute-0-7:25430] Signal code: (128)
[compute-0-7:25430] Failing at address: (nil)
--
mpirun noticed that process rank 3 with PID 25433 on node compute-0-7.local exited on signal 11 (Segmentation fault).
--

This job is run with 4 Open MPI processes, on nodes interconnected with Gigabit Ethernet. The same job runs well on nodes with InfiniBand connectivity. What could be the reason for this? Is it due to loose physical connections, since it is giving a bus error?
Re: [OMPI users] how to SPMD on openmpi
Hi Ankur, try this command:

$ mpirun -np 2 -host firstHostIp,secondHostIp a.out

for details, read the manual page for "mpirun":

$ man mpirun

Regards,

On Wed, Sep 30, 2009 at 3:22 PM, ankur pachauri wrote:
> Dear all,
>
> I have been able to install Open MPI on two independent machines running FC
> 10. The simple hello world programs are running fine on the independent
> machines... But can anyone please help me by letting me know how to connect
> the two machines and run a common program between the two... How do we do
> a "lamboot -v lamhosts" in the case of Open MPI?
> How do we get Open MPI running on the two computers simultaneously and
> execute a common program on the two machines?
>
> Thanks in advance
>
> --
> Ankur Pachauri.
> 09927590910
> Research Scholar, software engineering.
> Department of Mathematics, Dayalbagh Educational Institute, Dayalbagh, AGRA

--
Vipin K.
Research Engineer, C-DOTB, India
Re: [OMPI users] error in checkpointing an mpi application
Hi, from what you describe below, it seems as if you did not configure Open MPI correctly. You issued

./configure --with-ft=cr --enable-mpi-threads --with-blcr=/usr/local/bin --with-blcr-libdir=/usr/local/lib

while, according to the installation paths you gave, it should have been more like

./configure --with-ft=cr --enable-mpi-threads --with-blcr=/root/MS --with-blcr-libdir=/root/MS/lib

Apart from that, if you wish to have the BLCR modules loaded at start-up of your machine, a simple way is to add the following lines to rc.local. This file is somewhere in /etc; the exact location can vary from one Linux distribution to another (e.g. /etc/rc.d/rc.local or /etc/rc.local):

/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko

Just in case, if you have multiple MPIs installed, you can check which one you are using with the following command: which mpirun

HTH,
-- Constantinos

Mallikarjuna Shastry wrote:

dear sir, i am sending the details as follows:
1. i am using openmpi-1.3.3 and blcr 0.8.2
2. i have installed blcr 0.8.2 first under /root/MS
3. then i installed openmpi 1.3.3 under /root/MS
4. i have configured and installed open mpi as follows:

# ./configure --with-ft=cr --enable-mpi-threads --with-blcr=/usr/local/bin --with-blcr-libdir=/usr/local/lib
# make
# make install

then i added the following to the .bash_profile under the home directory (i went to the home directory by doing cd ~):

/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko
PATH=$PATH:/usr/local/bin
MANPATH=$MANPATH:/usr/local/man
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

then i compiled and ran the file arr_add.c as follows:

[root@localhost examples]# mpicc -o res arr_add.c
[root@localhost examples]# mpirun -np 2 -am ft-enable-cr ./res
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

--
Error: The process with PID 5790 is not checkpointable.
This could be due to one of the following:
- An application with this PID doesn't currently exist
- The application with this PID isn't checkpointable
- The application with this PID isn't an OPAL application.
We were looking for the named files:
/tmp/opal_cr_prog_write.5790
/tmp/opal_cr_prog_read.5790
--
[localhost.localdomain:05788] local) Error: Unable to initiate the handshake with peer [[7788,1],1]. -1
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 567
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

NOTE: the PID of mpirun is 5788

i gave the following command for taking the checkpoint:

[root@localhost examples]# ompi-checkpoint -s 5788

i got the following output, but it was hanging like this:

[localhost.localdomain:05796] Requested - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Pending - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Running - Global Snapshot Reference: (null)

can anybody resolve this problem? kindly rectify it.

with regards, mallikarjuna shastry
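To summarize the fix suggested above as a command sketch (the /root/MS paths follow the poster's description of where BLCR was installed and are assumptions; adjust them to the actual BLCR prefix):

```shell
# Rebuild Open MPI against the real BLCR install prefix rather than
# /usr/local/bin, which is a binaries directory, not an install prefix.
./configure --with-ft=cr --enable-mpi-threads \
            --with-blcr=/root/MS --with-blcr-libdir=/root/MS/lib
make
make install

# Load the BLCR kernel modules before running (paths from the post above):
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko

# Run the application with checkpoint/restart support enabled:
mpirun -np 2 -am ft-enable-cr ./res

# From a second shell, checkpoint using the PID of mpirun (not of a rank):
ompi-checkpoint -s <mpirun_pid>
```

With a correctly built install, ompi-checkpoint should report a global snapshot reference instead of (null).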
[OMPI users] error in checkpointing an mpi application
dear sir, i am sending the details as follows:
1. i am using openmpi-1.3.3 and blcr 0.8.2
2. i have installed blcr 0.8.2 first under /root/MS
3. then i installed openmpi 1.3.3 under /root/MS
4. i have configured and installed open mpi as follows:

# ./configure --with-ft=cr --enable-mpi-threads --with-blcr=/usr/local/bin --with-blcr-libdir=/usr/local/lib
# make
# make install

then i added the following to the .bash_profile under the home directory (i went to the home directory by doing cd ~):

/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko
PATH=$PATH:/usr/local/bin
MANPATH=$MANPATH:/usr/local/man
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

then i compiled and ran the file arr_add.c as follows:

[root@localhost examples]# mpicc -o res arr_add.c
[root@localhost examples]# mpirun -np 2 -am ft-enable-cr ./res
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

--
Error: The process with PID 5790 is not checkpointable.
This could be due to one of the following:
- An application with this PID doesn't currently exist
- The application with this PID isn't checkpointable
- The application with this PID isn't an OPAL application.
We were looking for the named files:
/tmp/opal_cr_prog_write.5790
/tmp/opal_cr_prog_read.5790
--
[localhost.localdomain:05788] local) Error: Unable to initiate the handshake with peer [[7788,1],1]. -1
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 567
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

NOTE: the PID of mpirun is 5788

i gave the following command for taking the checkpoint:

[root@localhost examples]# ompi-checkpoint -s 5788

i got the following output, but it was hanging like this:

[localhost.localdomain:05796] Requested - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Pending - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Running - Global Snapshot Reference: (null)

can anybody resolve this problem? kindly rectify it.

with regards, mallikarjuna shastry
Re: [OMPI users] Openmpi setup with intel compiler.
Dear Peter, I learned from the net that Open MPI requires F77 bindings for F90 support. That's where I was making a mistake: I didn't configure F77 bindings during the Open MPI setup. I rectified my mistake, and after that Open MPI installed successfully for both the PGI and Intel compilers. It was great help from you. :-)

Thanks and Regards, Vighnesh

> Dear Peter, Your suggestions did work; it didn't show any error during make and
> make install. But it didn't get installed with mpif90 support. I tried to
> compile my MPI code, but it gave the following error:
>
> [vighnesh@test_node SIVA]$ /share/apps/mpi/openmpi/intel/bin/mpif90 code.f -o code.exe
> --
> Unfortunately, this installation of Open MPI was not compiled with Fortran
> 90 support. As such, the mpif90 compiler is non-functional.
> --
>
> My configure script line is:
> [root@test_node vighnesh]# ./configure --prefix=/share/apps/mpi/openmpi/intel FC=ifort --with-tm=/opt/torque
>
> Please help me.
> Thanks and Regards, Vighnesh

> On Wednesday 30 September 2009, vighn...@aero.iitb.ac.in wrote:
> ...
>> during configuring with the Intel 9.0 compiler, the installation gives the following error.
>>
>> [root@test_node openmpi-1.3.3]# make all install
> ...
>> make[3]: Entering directory `/tmp/openmpi-1.3.3/orte'
>> test -z "/share/apps/mpi/openmpi/intel/lib" || /bin/mkdir -p "/share/apps/mpi/openmpi/intel/lib"
>> /bin/sh ../libtool --mode=install /usr/bin/install -c 'libopen-rte.la' '/share/apps/mpi/openmpi/intel/lib/libopen-rte.la'
>> libtool: install: error: cannot install `libopen-rte.la' to a directory not ending in /share/apps/mpi/openmpi/pgi/lib
>
> The line above indicates that you've somehow attempted this from a dirty tree
> and/or environment (dirty from the previous pgi installation...).
>
> Try a clean environment, clean build tree. Source the icc/ifort-vars.sh files
> from your intel install dir, set CC, CXX, FC, F77 and do:
> "./configure --prefix=... && make && make install"
>
> /Peter
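For anyone hitting the same problem, here is a sketch of a configure invocation along the lines Peter suggested, with both the F77 and F90 compilers specified so that mpif90 support gets built (paths are the poster's; the Intel environment-script locations vary by Intel compiler version and are assumptions):

```shell
# Start from a clean build tree and a clean environment, then source the
# Intel compiler environment scripts (exact paths depend on the Intel
# install, e.g. iccvars.sh / ifortvars.sh under the Intel install dir).

# Point configure at the Intel compilers for C, C++, F77, and F90:
export CC=icc CXX=icpc F77=ifort FC=ifort
./configure --prefix=/share/apps/mpi/openmpi/intel --with-tm=/opt/torque
make
make install

# Verify that the Fortran 90 wrapper now works:
/share/apps/mpi/openmpi/intel/bin/mpif90 code.f -o code.exe
```

Setting F77 alongside FC matters here because, as noted above, this era of Open MPI builds the F90 bindings on top of the F77 bindings.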
Re: [OMPI users] Openmpi setup with intel compiler.
Dear Peter, Your suggestions did work; it didn't show any error during make and make install. But it didn't get installed with mpif90 support. I tried to compile my MPI code, but it gave the following error:

[vighnesh@test_node SIVA]$ /share/apps/mpi/openmpi/intel/bin/mpif90 code.f -o code.exe
--
Unfortunately, this installation of Open MPI was not compiled with Fortran 90 support. As such, the mpif90 compiler is non-functional.
--

My configure script line is:

[root@test_node vighnesh]# ./configure --prefix=/share/apps/mpi/openmpi/intel FC=ifort --with-tm=/opt/torque

Please help me. Thanks and Regards, Vighnesh

> On Wednesday 30 September 2009, vighn...@aero.iitb.ac.in wrote:
> ...
>> during configuring with the Intel 9.0 compiler, the installation gives the following error.
>>
>> [root@test_node openmpi-1.3.3]# make all install
> ...
>> make[3]: Entering directory `/tmp/openmpi-1.3.3/orte'
>> test -z "/share/apps/mpi/openmpi/intel/lib" || /bin/mkdir -p "/share/apps/mpi/openmpi/intel/lib"
>> /bin/sh ../libtool --mode=install /usr/bin/install -c 'libopen-rte.la' '/share/apps/mpi/openmpi/intel/lib/libopen-rte.la'
>> libtool: install: error: cannot install `libopen-rte.la' to a directory
>> not ending in /share/apps/mpi/openmpi/pgi/lib
>
> The line above indicates that you've somehow attempted this from a dirty
> tree and/or environment (dirty from the previous pgi installation...).
>
> Try a clean environment, clean build tree. Source the icc/ifort-vars.sh
> files from your intel install dir, set CC, CXX, FC, F77 and do:
> "./configure --prefix=... && make && make install"
>
> /Peter