Re: [OMPI users] which gcc to compile openmpi with?
Maybe, maybe not. I don't know anyone who has tried it with a compiler that old.

FYI, you would be better off updating your code for newer compilers (go for the 4.x series). They are much improved, and most systems (like ours) can't run 2.95 at all because they are 64-bit only.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985

On Sep 24, 2008, at 10:49 PM, Shafagh Jafer wrote:

> The problem is that I am building my entire MPI-based simulator with openmpi wrappers, and my simulator code only compiles with gcc-2.95.3. Any thoughts? Does openmpi NOT work with 2.95.3?
>
> --- On Wed, 9/24/08, Terry Frankcombe wrote:
>
> From: Terry Frankcombe
> Subject: Re: [OMPI users] which gcc to compile openmpi with?
> To: "Open MPI Users"
> Date: Wednesday, September 24, 2008, 7:19 PM
>
>> Both of those are really ancient. Fortran in particular will not work happily with those.
>>
>> Why don't you install something from the current epoch? I run happily with gcc 4.3.2.
>>
>> On Wed, 2008-09-24 at 08:36 -0700, Shafagh Jafer wrote:
>>> Which gcc is preferred to compile openmpi with? gcc-2.95.3 or gcc-3.2.3?

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] which gcc to compile openmpi with?
The problem is that I am building my entire MPI-based simulator with openmpi wrappers, and my simulator code only compiles with gcc-2.95.3. Any thoughts? Does openmpi NOT work with 2.95.3?

--- On Wed, 9/24/08, Terry Frankcombe wrote:

From: Terry Frankcombe
Subject: Re: [OMPI users] which gcc to compile openmpi with?
To: "Open MPI Users"
Date: Wednesday, September 24, 2008, 7:19 PM

> Both of those are really ancient. Fortran in particular will not work happily with those.
>
> Why don't you install something from the current epoch? I run happily with gcc 4.3.2.
>
> On Wed, 2008-09-24 at 08:36 -0700, Shafagh Jafer wrote:
>> Which gcc is preferred to compile openmpi with? gcc-2.95.3 or gcc-3.2.3?
Re: [OMPI users] which gcc to compile openmpi with?
Both of those are really ancient. Fortran in particular will not work happily with those.

Why don't you install something from the current epoch? I run happily with gcc 4.3.2.

On Wed, 2008-09-24 at 08:36 -0700, Shafagh Jafer wrote:
> Which gcc is preferred to compile openmpi with? gcc-2.95.3 or gcc-3.2.3?
[OMPI users] Crash in code using OMPI 1.2.7 - Debugging assistance sought
Hello.

I have a user running a Fortran code that can be built and run on both 32-bit and 64-bit architectures.

When this code is built for the x86-64 machines in our cluster, running on OMPI 1.2.7, it runs fine. However, if we build and run it on 32-bit x86 machines, running the same GNU/Linux distribution and also OMPI 1.2.7, it crashes with errors like:

[node4][0,1,4][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
[node3][0,1,3][btl_tcp_frag.c:202:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed with errno=110
mca_btl_tcp_frag_recv: readv failed with errno=104

We have tried different Fortran compilers (both PathScale and gfortran) and keep getting these crashes, which occur after varying numbers of iterations. Running on a single node using MPI seems to work OK.

Are there any suggestions on how to figure out whether the problem is in the code or in the OMPI installation/software on the system? We have tried "--debug-daemons", which revealed no new or interesting information. Is there a way to trap segfault messages, get more detailed MPI transaction information, or anything else that could help diagnose this?

Thanks.

--
V. Ram
v_r_...@fastmail.fm
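As a first diagnostic step, it may help to translate the errno numbers in those messages into names. The sketch below assumes the Linux errno numbering (104 is ECONNRESET, 110 is ETIMEDOUT); the helper names errno_name and decode_readv_errors are made up for this illustration and are not part of Open MPI.

```shell
#!/bin/sh
# Map the errno values seen in the btl_tcp_frag.c messages to symbolic names.
# Assumption: Linux numbering. Other operating systems use different numbers.
errno_name() {
  case "$1" in
    104) echo "ECONNRESET (connection reset by peer)" ;;
    110) echo "ETIMEDOUT (connection timed out)" ;;
    *)   echo "unknown errno $1 (not in this sketch's table)" ;;
  esac
}

# Read a log on stdin, pull out every distinct "errno=N", and decode it.
decode_readv_errors() {
  grep -o 'errno=[0-9]*' | sort -u | while IFS='=' read -r _ num; do
    printf 'errno=%s -> %s\n' "$num" "$(errno_name "$num")"
  done
}
```

Piping the job's stderr through decode_readv_errors would list each distinct code once. Both codes describe a TCP connection that went away or stalled, which would be consistent with a peer process dying or a network problem between nodes. That reading is an inference, not something established in this thread.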
Re: [OMPI users] mca:base:select:( ess) No component selected!
Thank you. I was able to make everything work by using orte_launch_agent and bash's $@ to pass the necessary parameters to orted within my shell script. I needed to add additional paths to my LD_LIBRARY_PATH/PATH variables for other necessary libraries, which is why I was pushing on the orte_launch_agent solution.

Is there a document that covers the design of openmpi a bit? It looks pretty interesting, and there's quite a few acronyms that I had trouble finding on the internet (e.g. "ess").

On Wed, Sep 24, 2008 at 3:40 PM, Ralph Castain wrote:
> Yes - you don't want to use orte_launch_agent at all for that purpose. What you need to set is an info_key in your comm_spawn command for "ompi_prefix", with the value set to the install path. The ssh launcher will assemble the launch cmd using that info.
>
> Ralph
>
> On Sep 24, 2008, at 1:28 PM, Will Portnoy wrote:
>> Yes, your first sentence is correct. I intend to use the unmodified orted, but I need to set up the unix environment after the ssh has completed but before orted is executed.
>>
>> In particular, one of the more important tasks for me to do after ssh connects is to set LD_LIBRARY_PATH and PATH to include the paths of the openmpi's install lib and bin directories, respectively. Otherwise, orted will not be on the PATH, and its dependent libraries will not be in LD_LIBRARY_PATH.
>>
>> Is there a recommended method to set LD_LIBRARY_PATH and PATH when ssh is used to connect to other hosts when running an mpi job?
>>
>> thank you,
>>
>> Will
>>
>> On Wed, Sep 24, 2008 at 2:36 PM, Ralph Castain wrote:
>>> So this is a singleton comm_spawn scenario, that requires you specify a launch_agent to execute? Just trying to ensure I understand.
>>>
>>> First, let me ensure we have a common understanding of what orte_launch_agent does. Basically, that param stipulates the command to be used in place of "orted" - it doesn't substitute for "ssh".
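The wrapper approach described at the top of this message (orte_launch_agent plus "$@") might look roughly like the sketch below. It is not the script Will actually used: the install prefix /opt/openmpi and the helper names prepend_path and setup_env are assumptions made for illustration. The idea is only that the script fixes up PATH and LD_LIBRARY_PATH and then hands every argument it received on to orted unchanged.

```shell
#!/bin/sh
# Hypothetical launch-agent wrapper. OMPI_PREFIX is an assumed placeholder
# for the Open MPI install path on the remote node.
OMPI_PREFIX="${OMPI_PREFIX:-/opt/openmpi}"

# Prepend a directory to a colon-separated list without leaving a stray
# ":" when the list starts out empty.
prepend_path() {
  if [ -n "$2" ]; then
    printf '%s:%s' "$1" "$2"
  else
    printf '%s' "$1"
  fi
}

# Put the Open MPI bin and lib directories first in the search paths.
setup_env() {
  PATH="$(prepend_path "$OMPI_PREFIX/bin" "${PATH:-}")"
  LD_LIBRARY_PATH="$(prepend_path "$OMPI_PREFIX/lib" "${LD_LIBRARY_PATH:-}")"
  export PATH LD_LIBRARY_PATH
}

setup_env

# orted needs the parameters the launcher passes on its command line, so
# only start it when there are some, and forward them untouched.
if [ "$#" -gt 0 ]; then
  exec orted "$@"
fi
```

Quoting "$@" keeps each forwarded argument intact even if it contains spaces. Per Ralph's reply quoted above, the "ompi_prefix" info key on the spawn call is the suggested route when only the install path is needed; a wrapper like this is mainly useful when other directories also have to be added.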
Re: [OMPI users] mca:base:select:( ess) No component selected!
Yes - you don't want to use orte_launch_agent at all for that purpose. What you need to set is an info_key in your comm_spawn command for "ompi_prefix", with the value set to the install path. The ssh launcher will assemble the launch cmd using that info.

Ralph

On Sep 24, 2008, at 1:28 PM, Will Portnoy wrote:

> Yes, your first sentence is correct. I intend to use the unmodified orted, but I need to set up the unix environment after the ssh has completed but before orted is executed.
>
> In particular, one of the more important tasks for me to do after ssh connects is to set LD_LIBRARY_PATH and PATH to include the paths of the openmpi's install lib and bin directories, respectively. Otherwise, orted will not be on the PATH, and its dependent libraries will not be in LD_LIBRARY_PATH.
>
> Is there a recommended method to set LD_LIBRARY_PATH and PATH when ssh is used to connect to other hosts when running an mpi job?
>
> thank you,
>
> Will
>
> On Wed, Sep 24, 2008 at 2:36 PM, Ralph Castain wrote:
>> So this is a singleton comm_spawn scenario, that requires you specify a launch_agent to execute? Just trying to ensure I understand.
>>
>> First, let me ensure we have a common understanding of what orte_launch_agent does. Basically, that param stipulates the command to be used in place of "orted" - it doesn't substitute for "ssh". So if you set -mca orte_launch_agent foo, what will happen is: "ssh nodename foo" instead of "ssh nodename orted".
>>
>> The intent was to provide a way to do things like run valgrind on the orted itself. So you could do -mca orte_launch_agent "valgrind orted", and we would dutifully run "ssh nodename valgrind orted".
>>
>> Or if you wanted to write your own orted (e.g., bar-orted), you could substitute it for our "orted".
>>
>> Or if you wanted to set mca params solely to be seen on the backend nodes/procs, you could set -mca orte_launch_agent "orted -mca foo bar", and we would launch "ssh nodename orted -mca foo bar".
Re: [OMPI users] mca:base:select:( ess) No component selected!
Yes, your first sentence is correct. I intend to use the unmodified orted, but I need to set up the unix environment after the ssh has completed but before orted is executed.

In particular, one of the more important tasks for me to do after ssh connects is to set LD_LIBRARY_PATH and PATH to include the paths of the openmpi's install lib and bin directories, respectively. Otherwise, orted will not be on the PATH, and its dependent libraries will not be in LD_LIBRARY_PATH.

Is there a recommended method to set LD_LIBRARY_PATH and PATH when ssh is used to connect to other hosts when running an mpi job?

thank you,

Will

On Wed, Sep 24, 2008 at 2:36 PM, Ralph Castain wrote:
> So this is a singleton comm_spawn scenario, that requires you specify a launch_agent to execute? Just trying to ensure I understand.
>
> First, let me ensure we have a common understanding of what orte_launch_agent does. Basically, that param stipulates the command to be used in place of "orted" - it doesn't substitute for "ssh". So if you set -mca orte_launch_agent foo, what will happen is: "ssh nodename foo" instead of "ssh nodename orted".
>
> The intent was to provide a way to do things like run valgrind on the orted itself. So you could do -mca orte_launch_agent "valgrind orted", and we would dutifully run "ssh nodename valgrind orted".
>
> Or if you wanted to write your own orted (e.g., bar-orted), you could substitute it for our "orted".
>
> Or if you wanted to set mca params solely to be seen on the backend nodes/procs, you could set -mca orte_launch_agent "orted -mca foo bar", and we would launch "ssh nodename orted -mca foo bar". This allows us to set mca params without having mpirun see them - helps us to look at debug output, for example, from only the backend procs.
>
> If what you need to do is set something in the environment for the orted, there are certain cmd line options that will do that for you - orte_launch_agent may or may not be a good method.
Re: [OMPI users] mca:base:select:( ess) No component selected!
So this is a singleton comm_spawn scenario, that requires you specify a launch_agent to execute? Just trying to ensure I understand.

First, let me ensure we have a common understanding of what orte_launch_agent does. Basically, that param stipulates the command to be used in place of "orted" - it doesn't substitute for "ssh". So if you set -mca orte_launch_agent foo, what will happen is: "ssh nodename foo" instead of "ssh nodename orted".

The intent was to provide a way to do things like run valgrind on the orted itself. So you could do -mca orte_launch_agent "valgrind orted", and we would dutifully run "ssh nodename valgrind orted".

Or if you wanted to write your own orted (e.g., bar-orted), you could substitute it for our "orted".

Or if you wanted to set mca params solely to be seen on the backend nodes/procs, you could set -mca orte_launch_agent "orted -mca foo bar", and we would launch "ssh nodename orted -mca foo bar". This allows us to set mca params without having mpirun see them - helps us to look at debug output, for example, from only the backend procs.

If what you need to do is set something in the environment for the orted, there are certain cmd line options that will do that for you - orte_launch_agent may or may not be a good method.

Perhaps it would help if you could tell me exactly what you wanted to have orte_launch_agent actually do?

Thanks
Ralph

On Sep 24, 2008, at 12:22 PM, Will Portnoy wrote:

> Sorry for the miscommunication: The processes are started by my program with MPI_Comm_spawn, so there was no mpirun involved.
>
> If you can suggest a test program I can use with mpirun to validate my openmpi environment and install, that would probably produce the output you would like to see.
>
> But I'm not sure that will make it clear how the file pointed to by "orte_launch_agent" in "mca-params.conf" should be written to set up an environment and start orted.
>
> Will
>
> On Wed, Sep 24, 2008 at 2:17 PM, Ralph Castain wrote:
>> Afraid I am confused.
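As a reading aid for the orte_launch_agent description in this message: the substitution Ralph describes can be modelled in a few lines of shell. This is only a toy model of the behaviour as stated in the thread. build_launch_cmd is a made-up helper, not Open MPI code, and the command the ssh launcher really assembles carries additional arguments that are not shown here.

```shell
#!/bin/sh
# Toy model: the launch agent replaces "orted" in the remote command,
# while "ssh" and the node name stay where they are.
build_launch_cmd() {
  node="$1"
  agent="${2:-orted}"   # falls back to plain orted when no agent is given
  printf 'ssh %s %s\n' "$node" "$agent"
}

# The three cases from the message:
#   build_launch_cmd nodename                       # default agent
#   build_launch_cmd nodename "valgrind orted"      # run orted under valgrind
#   build_launch_cmd nodename "orted -mca foo bar"  # backend-only mca params
```

Seen this way, a wrapper script named as the agent lands in the same slot as "valgrind orted" would, so it has to end by starting orted itself and passing along whatever arguments it was given.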
Re: [OMPI users] mca:base:select:( ess) No component selected!
Sorry for the miscommunication: The processes are started by my program with MPI_Comm_spawn, so there was no mpirun involved.

If you can suggest a test program I can use with mpirun to validate my openmpi environment and install, that would probably produce the output you would like to see.

But I'm not sure that will make it clear how the file pointed to by "orte_launch_agent" in "mca-params.conf" should be written to setup an environment and start orted.

Will

On Wed, Sep 24, 2008 at 2:17 PM, Ralph Castain wrote:
> Afraid I am confused. This was the entire output from the job?? If so, then that means mpirun itself wasn't able to find a launch environment it could use, so you never got to the point of actually launching an orted.
>
> Do you have ssh in your path? My best immediate guess is that you don't, and that mpirun therefore doesn't see anything it can use to launch a job. We have discussed internally that we need to improve that error message - could be this is another case emphasizing that point.
>
> 1.3 is fine to use - still patching some bugs, but nothing that should impact this issue.
>
> Ralph
>
> On Sep 24, 2008, at 12:11 PM, Will Portnoy wrote:
>> That was the output with plm_base_verbose set to 99 - it's the same output with 1.
>>
>> Yes, I'd like to use ssh.
>>
>> orted wasn't starting properly with orte_launch_agent (which was needed because my environment on the target machine wasn't set up), so that's why I thought I would try it directly on the command line on localhost. I thought this was a simpler case: to verify that orted could find all of its necessary components without the complexity of everything else I'm doing.
>>
>> If I needed to use orte_launch_agent, how should I pass the necessary parameters to start orted after I set up my environment?
>>
>> Am I better off using trunk over 1.3?
Re: [OMPI users] mca:base:select:( ess) No component selected!
Afraid I am confused. This was the entire output from the job?? If so, then that means mpirun itself wasn't able to find a launch environment it could use, so you never got to the point of actually launching an orted. Do you have ssh in your path? My best immediate guess is that you don't, and that mpirun therefore doesn't see anything it can use to launch a job. We have discussed internally that we need to improve that error message - could be this is another case emphasizing that point. 1.3 is fine to use - still patching some bugs, but nothing that should impact this issue. Ralph On Sep 24, 2008, at 12:11 PM, Will Portnoy wrote: That was the output with plm_base_verbose set to 99 - it's the same output with 1. Yes, I'd like to use ssh. orted wasn't starting properly with orte_launch_agent (which was needed because my environment on the target machine wasn't set up), so that's why I thought I would try it directly on the command line on localhost. I thought this was a simpler case: to verify that orted could find all of its necessary components without the complexity of everything else I'm doing. If I needed to use orte_launch_agent, how should I pass the necessary parameters to start orted after I set up my environment? Am I better off using trunk over 1.3? thank you, Will On Wed, Sep 24, 2008 at 2:01 PM, Ralph Castain wrote: Could you rerun that with -mca plm_base_verbose 1? What environment are you in - I assume rsh/ssh? I would like to see the cmd line being used to launch the orted. What this indicates is that we are not getting the cmd line correct. Could just be that some patch in the trunk didn't get completely applied to the 1.3 branch. BTW: you probably can't run orted directly off of the cmd line. It likely needs some cmd line params to get critical info. Ralph On Sep 24, 2008, at 9:47 AM, Will Portnoy wrote: I'm trying to use MPI_Comm_Spawn with MPI_Info's host key to spawn processes from a process not started with mpirun. 
This works with the host key set to the localhost's hostname, but it does not work when I use other hosts. I'm using version 1.3a1r19602. I need to use orte_launch_agent to set up my environment a bit before orted is started, but it fails with errors listed below. When I try to run orted directly on the command line with some of the verbosity flags turned to "11", I receive the same messages. Does anybody have any suggestions? thank you, Will [fqdn:24761] mca: base: components_open: Looking for ess components [fqdn:24761] mca: base: components_open: opening ess components [fqdn:24761] mca: base: components_open: found loaded component env [fqdn:24761] mca: base: components_open: component env has no register function [fqdn:24761] mca: base: components_open: component env open function successful [fqdn:24761] mca: base: components_open: found loaded component hnp [fqdn:24761] mca: base: components_open: component hnp has no register function [fqdn:24761] mca: base: components_open: component hnp open function successful [fqdn:24761] mca: base: components_open: found loaded component singleton [fqdn:24761] mca: base: components_open: component singleton has no register function [fqdn:24761] mca: base: components_open: component singleton open function successful [fqdn:24761] mca: base: components_open: found loaded component slurm [fqdn:24761] mca: base: components_open: component slurm has no register function [fqdn:24761] mca: base: components_open: component slurm open function successful [fqdn:24761] mca: base: components_open: found loaded component tool [fqdn:24761] mca: base: components_open: component tool has no register function [fqdn:24761] mca: base: components_open: component tool open function successful [fqdn:24761] mca:base:select: Auto-selecting ess components [fqdn:24761] mca:base:select:( ess) Querying component [env] [fqdn:24761] mca:base:select:( ess) Skipping component [env]. 
Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [hnp] [fqdn:24761] mca:base:select:( ess) Skipping component [hnp]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [singleton] [fqdn:24761] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [slurm] [fqdn:24761] mca:base:select:( ess) Skipping component [slurm]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [tool] [fqdn:24761] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) No component selected! [fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 125 -- It looks like orte_init failed for some reason; your parallel process is likely to a
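Ralph's question about whether ssh is on the path can be checked before involving mpirun at all. The following is only a minimal sketch: `find_launch_agent` is a hypothetical helper name, not part of Open MPI, and it assumes that ssh or rsh is the intended launch mechanism.

```shell
#!/bin/sh
# Sketch: report which remote-launch agent (if any) is visible on PATH.
# find_launch_agent is a hypothetical helper, not an Open MPI command.
find_launch_agent() {
    # Print the path of the first candidate found on PATH;
    # return 1 if none of the candidates is found.
    for agent in "$@"; do
        if command -v "$agent" >/dev/null 2>&1; then
            command -v "$agent"
            return 0
        fi
    done
    return 1
}

if agent_path=$(find_launch_agent ssh rsh); then
    echo "launch agent found: $agent_path"
else
    echo "neither ssh nor rsh is on PATH" >&2
fi
```

Running the same check on the remote host, in the non-interactive environment that the launcher would see, may give a different answer than an interactive login.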
Re: [OMPI users] mca:base:select:( ess) No component selected!
That was the output with plm_base_verbose set to 99 - it's the same output with 1. Yes, I'd like to use ssh. orted wasn't starting properly with orte_launch_agent (which was needed because my environment on the target machine wasn't set up), so that's why I thought I would try it directly on the command line on localhost. I thought this was a simpler case: to verify that orted could find all of its necessary components without the complexity of everything else I'm doing. If I needed to use orte_launch_agent, how should I pass the necessary parameters to start orted after I set up my environment? Am I better off using trunk over 1.3? thank you, Will On Wed, Sep 24, 2008 at 2:01 PM, Ralph Castain wrote: > Could you rerun that with -mca plm_base_verbose 1? What environment are you > in - I assume rsh/ssh? > > I would like to see the cmd line being used to launch the orted. What this > indicates is that we are not getting the cmd line correct. Could just be > that some patch in the trunk didn't get completely applied to the 1.3 > branch. > > BTW: you probably can't run orted directly off of the cmd line. It likely > needs some cmd line params to get critical info. > > Ralph > > On Sep 24, 2008, at 9:47 AM, Will Portnoy wrote: > >> I'm trying to use MPI_Comm_Spawn with MPI_Info's host key to spawn >> processes from a process not started with mpirun. This works with the >> host key set to the localhost's hostname, but it does not work when I >> use other hosts. >> >> I'm using version 1.3a1r19602. I need to use orte_launch_agent to set >> up my environment a bit before orted is started, but it fails with >> errors listed below. >> >> When I try to run orted directly on the command line with some of the >> verbosity flags turned to "11", I receive the same messages. >> >> Does anybody have any suggestions? 
>> >> thank you, >> >> Will >> >> >> [fqdn:24761] mca: base: components_open: Looking for ess components >> [fqdn:24761] mca: base: components_open: opening ess components >> [fqdn:24761] mca: base: components_open: found loaded component env >> [fqdn:24761] mca: base: components_open: component env has no register >> function >> [fqdn:24761] mca: base: components_open: component env open function >> successful >> [fqdn:24761] mca: base: components_open: found loaded component hnp >> [fqdn:24761] mca: base: components_open: component hnp has no register >> function >> [fqdn:24761] mca: base: components_open: component hnp open function >> successful >> [fqdn:24761] mca: base: components_open: found loaded component singleton >> [fqdn:24761] mca: base: components_open: component singleton has no >> register function >> [fqdn:24761] mca: base: components_open: component singleton open >> function successful >> [fqdn:24761] mca: base: components_open: found loaded component slurm >> [fqdn:24761] mca: base: components_open: component slurm has no >> register function >> [fqdn:24761] mca: base: components_open: component slurm open function >> successful >> [fqdn:24761] mca: base: components_open: found loaded component tool >> [fqdn:24761] mca: base: components_open: component tool has no register >> function >> [fqdn:24761] mca: base: components_open: component tool open function >> successful >> [fqdn:24761] mca:base:select: Auto-selecting ess components >> [fqdn:24761] mca:base:select:( ess) Querying component [env] >> [fqdn:24761] mca:base:select:( ess) Skipping component [env]. Query >> failed to return a module >> [fqdn:24761] mca:base:select:( ess) Querying component [hnp] >> [fqdn:24761] mca:base:select:( ess) Skipping component [hnp]. Query >> failed to return a module >> [fqdn:24761] mca:base:select:( ess) Querying component [singleton] >> [fqdn:24761] mca:base:select:( ess) Skipping component [singleton]. 
>> Query failed to return a module >> [fqdn:24761] mca:base:select:( ess) Querying component [slurm] >> [fqdn:24761] mca:base:select:( ess) Skipping component [slurm]. Query >> failed to return a module >> [fqdn:24761] mca:base:select:( ess) Querying component [tool] >> [fqdn:24761] mca:base:select:( ess) Skipping component [tool]. Query >> failed to return a module >> [fqdn:24761] mca:base:select:( ess) No component selected! >> [fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file >> runtime/orte_init.c at line 125 >> -- >> It looks like orte_init failed for some reason; your parallel process is >> likely to abort. There are many reasons that a parallel process can >> fail during orte_init; some of which are due to configuration or >> environment problems. This failure appears to be an internal failure; >> here's some additional information (which may only be relevant to an >> Open MPI developer): >> >> orte_ess_base_select failed >> --> Returned value Not found (-13) instead of ORTE_SUCCESS >> -- >> [fqdn:24761] [[INVALI
Re: [OMPI users] mca:base:select:( ess) No component selected!
Could you rerun that with -mca plm_base_verbose 1? What environment are you in - I assume rsh/ssh? I would like to see the cmd line being used to launch the orted. What this indicates is that we are not getting the cmd line correct. Could just be that some patch in the trunk didn't get completely applied to the 1.3 branch. BTW: you probably can't run orted directly off of the cmd line. It likely needs some cmd line params to get critical info. Ralph On Sep 24, 2008, at 9:47 AM, Will Portnoy wrote: I'm trying to use MPI_Comm_Spawn with MPI_Info's host key to spawn processes from a process not started with mpirun. This works with the host key set to the localhost's hostname, but it does not work when I use other hosts. I'm using version 1.3a1r19602. I need to use orte_launch_agent to set up my environment a bit before orted is started, but it fails with errors listed below. When I try to run orted directly on the command line with some of the verbosity flags turned to "11", I receive the same messages. Does anybody have any suggestions? 
thank you, Will [fqdn:24761] mca: base: components_open: Looking for ess components [fqdn:24761] mca: base: components_open: opening ess components [fqdn:24761] mca: base: components_open: found loaded component env [fqdn:24761] mca: base: components_open: component env has no register function [fqdn:24761] mca: base: components_open: component env open function successful [fqdn:24761] mca: base: components_open: found loaded component hnp [fqdn:24761] mca: base: components_open: component hnp has no register function [fqdn:24761] mca: base: components_open: component hnp open function successful [fqdn:24761] mca: base: components_open: found loaded component singleton [fqdn:24761] mca: base: components_open: component singleton has no register function [fqdn:24761] mca: base: components_open: component singleton open function successful [fqdn:24761] mca: base: components_open: found loaded component slurm [fqdn:24761] mca: base: components_open: component slurm has no register function [fqdn:24761] mca: base: components_open: component slurm open function successful [fqdn:24761] mca: base: components_open: found loaded component tool [fqdn:24761] mca: base: components_open: component tool has no register function [fqdn:24761] mca: base: components_open: component tool open function successful [fqdn:24761] mca:base:select: Auto-selecting ess components [fqdn:24761] mca:base:select:( ess) Querying component [env] [fqdn:24761] mca:base:select:( ess) Skipping component [env]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [hnp] [fqdn:24761] mca:base:select:( ess) Skipping component [hnp]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [singleton] [fqdn:24761] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [slurm] [fqdn:24761] mca:base:select:( ess) Skipping component [slurm]. 
Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [tool] [fqdn:24761] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) No component selected! [fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 125 -- It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer): orte_ess_base_select failed --> Returned value Not found (-13) instead of ORTE_SUCCESS -- [fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orted/orted_main.c at line 315 ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] Mpirun don't execute
I use SVN checkout. I have invoked : shell$ svn co http://svn.open-mpi.org/svn/ompi/trunk ompi I have reinstalled openmpi today. 2008/9/22 Jeff Squyres > Exactly what version of Open MPI are you using? You mentioned "1.3" -- did > you download a nightly tarball at some point, or do you have an SVN > checkout? Since you have a development copy of Open MPI, it is possible > that your copy is simply broken (sorry; we *do* break the development head > every once in a while...). Can you update? > > Note that Josh just made some FT fixes on the trunk today that aren't on > the v1.3 branch yet; they'll likely take a day or three to get there. > > > > > On Sep 22, 2008, at 5:03 PM, Santolo Felaco wrote: > > Hi, this is my openmpi-default-hostfile: >> 127.0.0.1 slots=2 >> >> If I invoke comand CTRL+C the application is not killed. >> With mpirun -np 1 uptime the comand is ever blocked. >> >> The comand is blocked with any comand, also comands not existent. >> >> Thanks. >> >> >> 2008/9/22 Jeff Squyres >> On Sep 19, 2008, at 6:50 PM, Santolo Felaco wrote: >> >> Hi, I try to be clearer: >> osa@libertas:~$ echo $LD_LIBRARY_PATH >> /usr/local/lib:/home/osa/blcr/lib >> osa@libertas:~$ echo $PATH >> >> /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/home/osa/blcr/bin >> >> I compile the file with mpicc, then: >> osa@libertas:~/prove/openmpi$ mpirun -np 2 es1 >> >> The comand is blocked. Don't run. CTRL+X does not end the program. >> >> Try ctrl-c -- that's usually the way to kill applications that appear to >> have been hung. >> >> >> This is ps output: >> >> osa@libertas:~/prove/openmpi$ mpirun -np 2 es1 & >> [1] 6151 >> osa@libertas:~/prove/openmpi$ ps >> PID TTY TIME CMD >> 6135 pts/200:00:00 bash >> 6151 pts/200:00:00 mpirun >> 6153 pts/200:00:00 ssh >> 6161 pts/200:00:00 ps >> >> >> What is your program doing? Can you tell if it's getting past MPI_INIT, >> or even launching at all? 
Can you mpirun non-MPI applications, such as >> "hostname" and "uptime"? >> >> Are you launching this es1 application locally or remotely? From your >> command line and previous description, I *assume* that it's local, but I see >> an "ssh" in your ps output, possibly meaning that mpirun has launched the >> application remotely (e.g., if you specified a default hostfile or >> somesuch). >> >> >> -- >> Jeff Squyres >> Cisco Systems >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users >> >> ___ >> users mailing list >> us...@open-mpi.org >> http://www.open-mpi.org/mailman/listinfo.cgi/users >> > > > -- > Jeff Squyres > Cisco Systems > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users >
Re: [OMPI users] which gcc to compile openmpi with?
I compiled it with 2.95.3 and tested the hello_c and ring_c examples, and they seemed fine. So I guess it works with 2.95.3, am I right?

--- On Wed, 9/24/08, Jeff Squyres wrote:
From: Jeff Squyres
Subject: Re: [OMPI users] which gcc to compile openmpi with?
To: "Open MPI Users"
List-Post: users@lists.open-mpi.org
Date: Wednesday, September 24, 2008, 8:52 AM

I don't think we've tested with 2.95 (that's ancient). I'm pretty sure we've tested with various versions of the 3 series. Unless you have a really good reason, you should probably prefer the newer compiler (IMHO).

On Sep 24, 2008, at 11:46 AM, Brock Palen wrote:
> Those are really old compilers. I think 3.2 might work; I know for sure that 3.4 works, as well as 4.x+.
>
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> bro...@umich.edu
> (734)936-1985
>
> On Sep 24, 2008, at 11:36 AM, Shafagh Jafer wrote:
>> Which gcc is preferred for compiling openmpi: gcc-2.95.3 or gcc-3.2.3?

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] Problem with MPI_Send and MPI_Recv
No, I do not have any ethernet device aliases.

Thank you,
Sofia

----- Original Message -----
From: "Jeff Squyres"
To: "Open MPI Users"
Sent: Wednesday, September 24, 2008 2:33 PM
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv

> You don't happen to have ethernet device aliases on either of these machines, do you? (We have a problem with this on the trunk/v1.3 series right now; we were under the impression that it was working fine in the v1.2 series -- but I figured I'd ask...)
>
> On Sep 24, 2008, at 3:22 AM, Sofia Aparicio Secanellas wrote:
>> Hello Terry,
>> I obtain the hostnames of both computers:
>> pichurra
>> hpl1-linux
>> Thank you.
>> Sofia
>>
>> ----- Original Message -----
>> From: "Terry Dontje"
>> Sent: Tuesday, September 23, 2008 6:24 PM
>> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
>>
>>> Hello Sofia,
>>> Very puzzling indeed. Can you try to run hostname or uptime with mpirun? That is, something like:
>>> mpirun -np 2 --host 10.1.10.208,10.1.10.240 --mca mpi_preconnect_all 1 --prefix /usr/local -mca btl self,tcp -mca btl_tcp_if_include eth1 hostname
>>> --td
>>>
>>>> Date: Tue, 23 Sep 2008 17:05:22 +0200
>>>> From: "Sofia Aparicio Secanellas"
>>>> Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
>>>> To: "Open MPI Users"
>>>> Message-ID: <34D2F769A7C946BF915A828A9CD7F3CC@aparicio1>
>>>>
>>>> Hello Terry,
>>>> Here you can find the files. Thank you very much.
>>>> Sofia
>
> --
> Jeff Squyres
> Cisco Systems
Re: [OMPI users] which gcc to compile openmpi with?
I don't think we've tested with 2.95 (that's ancient). I'm pretty sure we've tested with various versions of the 3 series. Unless you have a really good reason, you should probably prefer the newer compiler (IMHO).

On Sep 24, 2008, at 11:46 AM, Brock Palen wrote:
> Those are really old compilers. I think 3.2 might work; I know for sure that 3.4 works, as well as 4.x+.
>
> Brock Palen
> www.umich.edu/~brockp
> Center for Advanced Computing
> bro...@umich.edu
> (734)936-1985
>
> On Sep 24, 2008, at 11:36 AM, Shafagh Jafer wrote:
>> Which gcc is preferred for compiling openmpi: gcc-2.95.3 or gcc-3.2.3?

--
Jeff Squyres
Cisco Systems
[OMPI users] mca:base:select:( ess) No component selected!
I'm trying to use MPI_Comm_Spawn with MPI_Info's host key to spawn processes from a process not started with mpirun. This works with the host key set to the localhost's hostname, but it does not work when I use other hosts. I'm using version 1.3a1r19602. I need to use orte_launch_agent to set up my environment a bit before orted is started, but it fails with errors listed below. When I try to run orted directly on the command line with some of the verbosity flags turned to "11", I receive the same messages. Does anybody have any suggestions? thank you, Will [fqdn:24761] mca: base: components_open: Looking for ess components [fqdn:24761] mca: base: components_open: opening ess components [fqdn:24761] mca: base: components_open: found loaded component env [fqdn:24761] mca: base: components_open: component env has no register function [fqdn:24761] mca: base: components_open: component env open function successful [fqdn:24761] mca: base: components_open: found loaded component hnp [fqdn:24761] mca: base: components_open: component hnp has no register function [fqdn:24761] mca: base: components_open: component hnp open function successful [fqdn:24761] mca: base: components_open: found loaded component singleton [fqdn:24761] mca: base: components_open: component singleton has no register function [fqdn:24761] mca: base: components_open: component singleton open function successful [fqdn:24761] mca: base: components_open: found loaded component slurm [fqdn:24761] mca: base: components_open: component slurm has no register function [fqdn:24761] mca: base: components_open: component slurm open function successful [fqdn:24761] mca: base: components_open: found loaded component tool [fqdn:24761] mca: base: components_open: component tool has no register function [fqdn:24761] mca: base: components_open: component tool open function successful [fqdn:24761] mca:base:select: Auto-selecting ess components [fqdn:24761] mca:base:select:( ess) Querying component [env] [fqdn:24761] 
mca:base:select:( ess) Skipping component [env]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [hnp] [fqdn:24761] mca:base:select:( ess) Skipping component [hnp]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [singleton] [fqdn:24761] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [slurm] [fqdn:24761] mca:base:select:( ess) Skipping component [slurm]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) Querying component [tool] [fqdn:24761] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module [fqdn:24761] mca:base:select:( ess) No component selected! [fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file runtime/orte_init.c at line 125 -- It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer): orte_ess_base_select failed --> Returned value Not found (-13) instead of ORTE_SUCCESS -- [fqdn:24761] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file orted/orted_main.c at line 315
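The verbose output above is easier to read once it is reduced to the list of components that were skipped. A small sketch, assuming the log has one message per line as orted originally printed it; `skipped_components` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: read verbose MCA output on stdin and print the name of every
# component that was skipped, one per line.
# skipped_components is a hypothetical helper, not an Open MPI tool.
skipped_components() {
    sed -n 's/.*Skipping component \[\([^]]*\)].*/\1/p'
}

# Example input copied from the log in this message.
skipped_components <<'EOF'
[fqdn:24761] mca:base:select:( ess) Querying component [env]
[fqdn:24761] mca:base:select:( ess) Skipping component [env]. Query failed to return a module
[fqdn:24761] mca:base:select:( ess) Querying component [hnp]
[fqdn:24761] mca:base:select:( ess) Skipping component [hnp]. Query failed to return a module
[fqdn:24761] mca:base:select:( ess) No component selected!
EOF
```

If every component that was queried also shows up in this list, the selection ends with "No component selected!", which is the situation reported here.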
Re: [OMPI users] which gcc to compile openmpi with?
Those are really old compilers. I think 3.2 might work; I know for sure that 3.4 works, as well as 4.x+.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985

On Sep 24, 2008, at 11:36 AM, Shafagh Jafer wrote:
> Which gcc is preferred for compiling openmpi: gcc-2.95.3 or gcc-3.2.3?
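For anyone who wants to check an installed compiler against what has been said in this thread, here is a rough sketch. `gcc_advice` is a hypothetical helper; its answers only restate the replies in this thread and are not an official support matrix.

```shell
#!/bin/sh
# Sketch: map a gcc version string to the advice given in this thread.
# gcc_advice is a hypothetical helper; the categories are assumptions
# taken from the replies here, not from Open MPI documentation.
gcc_advice() {
    case "$1" in
        2.95|2.95.*)        echo "untested" ;;
        3.2|3.2.*)          echo "might work" ;;
        3.4|3.4.*|4|4.*)    echo "known to work" ;;
        *)                  echo "not discussed" ;;
    esac
}

# Classify the local compiler, if one is installed.
if command -v gcc >/dev/null 2>&1; then
    gcc_advice "$(gcc -dumpversion)"
fi
```

Versions that nobody mentioned fall through to "not discussed" instead of being guessed at.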
Re: [OMPI users] where is mpif.h ?
On Sep 24, 2008, at 10:47 AM, Shafagh Jafer wrote:
> Yes, I am using the wrapper compilers. But in my simulator's Makefile.common I am including the files from gcc and g++. Please see my attached makefile. I am also attaching my previous Makefile.common, in which I was using MPICH instead of openmpi. Please look at both of them and see the differences; you will see that in the new makefile I am only commenting out the MPICH-related stuff and replacing gcc and g++ with mpicc and mpic++. Is there anything else I am doing wrong, or anything I am not supposed to have in my new makefile?

You did a few other things, too. ;-) (Do a diff -u between the two files and you'll see the differences.)

The OMPI Makefile.common looks ok (you don't need the -L for OMPI's libs, but it's not harmful). I don't know exactly how it's used, but from the context in that file, I guess it's ok.

FWIW, I'd guess that you should be able to use MPICH's wrapper compilers in the same way that you use OMPI's wrapper compilers. I don't know this for sure, but I do know that MPICH has wrapper compilers and I was under the impression that they worked pretty much like ours.

As for why you're getting those STL errors, are you able to compile any C++ STL codes on your machine at all? I.e., do you know that the C++ compiler and STL are installed and functioning properly? The OMPI v1.2 C++ bindings use the STL in a few places; it looks like that is failing to compile with some nebulous errors on your machine.

(FWIW, in the upcoming Open MPI v1.3, we have removed all uses of the STL from our C++ bindings, at least partly due to the fact that we have seen multiple users that have functional C++ compilers but have broken STL installations.)

--
Jeff Squyres
Cisco Systems
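One way to answer the "can you compile any C++ STL code at all?" question is a tiny smoke test that does not involve MPI. This is only a sketch: `stl_smoke_test` is a hypothetical helper, and the default compiler name g++ is an assumption (pass mpic++ or a full path instead if that is what you build with).

```shell
#!/bin/sh
# Sketch: compile and run a minimal STL program with the given C++ compiler.
# stl_smoke_test is a hypothetical helper; returns 0 only if the program
# both compiles and runs successfully.
stl_smoke_test() {
    cxx=${1:-g++}
    dir=$(mktemp -d) || return 2
    cat > "$dir/stl_check.cc" <<'EOF'
#include <map>
#include <string>
int main() {
    std::map<std::string, int> counts;
    counts["ranks"] = 2;
    return counts.size() == 1 ? 0 : 1;
}
EOF
    "$cxx" "$dir/stl_check.cc" -o "$dir/stl_check" && "$dir/stl_check"
    status=$?
    rm -rf "$dir"
    return "$status"
}

# Typical use:  stl_smoke_test g++ && echo "STL looks usable"
```

If this fails with the same compiler that produces the errors in the simulator build, the problem is probably in the compiler or STL installation and not in Open MPI.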
[OMPI users] which gcc to compile openmpi with?
Which gcc is preferred for compiling openmpi: gcc-2.95.3 or gcc-3.2.3?
Re: [OMPI users] where is mpif.h ?
Yes, I am using the wrapper compilers. But in my simulator Makefile.common I am including the files from gcc and g++. Please see my attached makefile. I am also attaching my previouse Makefile.common which I was MPICH instead of openmpi. Please see both of them and see the defferences, you will see that in the new makefile, I am only commenting the MPICH related stuff and replacing gcc and g++ with mpicc and mpic++. Is there anything else I am doing wrong or I am not supposed not have in my new Make file?? --- On Wed, 9/24/08, Jeff Squyres wrote: From: Jeff Squyres Subject: Re: [OMPI users] where is mpif.h ? To: "Open MPI Users" List-Post: users@lists.open-mpi.org Date: Wednesday, September 24, 2008, 5:14 AM On Sep 24, 2008, at 12:15 AM, Shafagh Jafer wrote: > Ok now after i have made sure that my code acutally goes and > includes the mpi.h from openmpi and not mpich, now I get really > wierd errors. Below I will paste my mpic++ configurations from -- > showme and the errors i gert from running my code. > > [sjafer@DeepThought latest_cd++_timewarp]$ /opt/openmpi/1.2.7/bin/ > mpic++ --showme:compile > -I/opt/openmpi/1.2.7/include -pthread > > [sjafer@DeepThought latest_cd++_timewarp]$ /opt/openmpi/1.2.7/bin/ > mpic++ --showme:link > -pthread -L/opt/openmpi/1.2.7/lib -lmpi_cxx -lmpi -lopen-rte -lopen- > pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl > The above looks about right. > =ERRORS=== > In file included from /usr/local/include/g++-3/stl_tree.h:57, > from /usr/local/include/g++-3/map:31, > from /opt/openmpi/1.2.7/include/openmpi/ompi/mpi/ > cxx/mpicxx.h:35, > from /opt/openmpi/1.2.7/include/mpi.h:1795, > from CommPhyMPI.cc:36: > /usr/local/include/g++-3/stl_alloc.h: At top level: > /usr/local/include/g++-3/stl_alloc.h:142: template with C linkage Are you compiling your application with the same C++ compiler that you compiled Open MPI with? 
If you use the --showme:compile|link flags to put OMPI's required flags into a building process (i.e., if you're not using OMPI's wrapper compilers), it is still strongly recommended that you use the same compilers that you used to compile and build Open MPI. Is there a reason you stopped using the wrapper compilers? Although some users have reported successes with mixing-n-matching compilers, it is untested by the Open MPI team and unsupported. -- Jeff Squyres Cisco Systems ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users OMPIMakefile.common Description: Binary data MPICHMakefile.common Description: Binary data
Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)
I works find with konsole. Thank you for the advise. Thomas. Samuel Sarholz wrote: Hi, I think the problem is that xterm (probably) has the userid bit set and thus deletes the LD_LIBRARY_PATH. Try setting the path again before you start gdb, e.g: mpirun -n 2 -x DISPLAY=:0.0 xterm -e LD_LIBRARY_PATH= or use the -Wl,-rpath= to compiler the search path into the executable. best regards, Samuel P.S.: This xterm behavior causes us a lot of problems as well. Other terminals like konsole don't have that problem. Thomas Ropars wrote: Hi, I'm trying to use gdb and xterm with open mpi on my computer (Ubuntu 8.04). When I run an application without gdb on my computer in works find but if I try to use gdb in xterm I get the following error: mpirun -n 2 -x DISPLAY=:0.0 xterm -e gdb ./ring.out (gdb) run Starting program: /media/sda5/tempo/openmpi/tests/ring.out /media/sda5/tempo/openmpi/tests/ring.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory Program exited with code 0177. When I try to use a shell script to launch gdb as mentioned bellow, I get the same error. Thomas Jeff Squyres wrote: On Feb 7, 2008, at 10:07 AM, jody wrote: I wrote a little command called envliblist which consists of this line: printenv | grep PATH | gawk -F "_PATH=" '{ print $2 }' | gawk -F ":" '{ print $1 }' | xargs ls -al When i do mpirun -np 5 -hostfile testhosts -x DISPLAY xterm -hold -e ./ envliblist all xterms (local & remote) display the contents of the openmpi/lib directory. Ok, good. 
Another strange result: I have a shell script for launching the debugger in an xterm: [jody]:/mnt/data1/neander:$cat run_gdb.sh #!/bin/sh # # save the program name export PROG="$1" # shift away program name (leaves program params) shift # create a command file for gdb, to start it automatically echo run $* > gdb.cmd # do the term xterm -e gdb -x gdb.cmd $PROG exit 0 When i run mpirun -np 5 --hostfile testhosts -x DISPLAY ./run_gdb.sh ./MPITest it works! Just to compare mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e ./MPITest does not work. It seems that if you launch shell scripts, things work. But if you run xterm without a shell script, it does not work. I do not think it is a difference of -hold vs. no -hold. Indeed, I can run both of these commands just fine on my system: % mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm - hold -e gdb ~/mpi/hello % mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -e gdb ~/mpi/hello Note that my setup is a little different than yours; I'm using a Mac laptop and ssh'ing to a server where I'm invoking mpirun. The hostfile "h" contains a 2nd server where xterm/gdb/hello are running. I notice the only difference between the to above commands is that in the run_gdb script xterm has no "-hold" parameter! Indeed, mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -e ./MPITest does work. To actually see that it works (MPITest is simple Hello MPI app) i had to do mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e "./MPITest >> output.txt" and check output.txt. Does anybody have an explanation for this weird happening? Jody ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] Problem with MPI_Send and MPI_Recv
You don't happen to have ethernet device aliases on either of these machines, do you? (we have a problem with this on the trunk/v1.3 series right now; we were under the impression that it was working fine in the v1.2 series -- but I figured I'd ask...) On Sep 24, 2008, at 3:22 AM, Sofia Aparicio Secanellas wrote: Hello Terry, I obtain the hostnames of both computers: pichurra hpl1-linux Thank you. Sofia - Original Message - From: "Terry Dontje" > To: Sent: Tuesday, September 23, 2008 6:24 PM Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv Hello Sofia, Very puzzling indeed. Can your try to run hostname or uptime with mpirun? That is something like: mpirun -np 2 --host 10.1.10.208,10.1.10.240 --mca mpi_preconnect_all 1 --prefix /usr/local -mca btl self,tcp -mca btl_tcp_if_include eth1 hostname --td Date: Tue, 23 Sep 2008 17:05:22 +0200 From: "Sofia Aparicio Secanellas" Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv To: "Open MPI Users" Message-ID: <34D2F769A7C946BF915A828A9CD7F3CC@aparicio1> Content-Type: text/plain; charset="iso-8859-1"; Format="flowed" Hello Terry, Here you can find the files. Thank you very much. Sofia ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users No virus found in this incoming message Checked by PC Tools AntiVirus (4.0.0.26 - 10.100.007). http://www.pctools.com/free-antivirus/ No virus found in this outgoing message Checked by PC Tools AntiVirus (4.0.0.26 - 10.100.007). http://www.pctools.com/free-antivirus/ ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres Cisco Systems
Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)
Hi,

I think the problem is that xterm (probably) has the setuid (or setgid) bit set and thus deletes LD_LIBRARY_PATH. Try setting the path again before you start gdb, e.g.:

    mpirun -n 2 -x DISPLAY=:0.0 xterm -e LD_LIBRARY_PATH=

or use -Wl,-rpath= to compile the search path into the executable.

best regards,
Samuel

P.S.: This xterm behavior causes us a lot of problems as well. Other terminals like konsole don't have that problem.

Thomas Ropars wrote:

Hi,

I'm trying to use gdb and xterm with Open MPI on my computer (Ubuntu 8.04). When I run an application without gdb on my computer it works fine, but if I try to use gdb in xterm I get the following error:

    mpirun -n 2 -x DISPLAY=:0.0 xterm -e gdb ./ring.out

    (gdb) run
    Starting program: /media/sda5/tempo/openmpi/tests/ring.out
    /media/sda5/tempo/openmpi/tests/ring.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

    Program exited with code 0177.

When I try to use a shell script to launch gdb as mentioned below, I get the same error.

Thomas

Jeff Squyres wrote:

On Feb 7, 2008, at 10:07 AM, jody wrote:

I wrote a little command called envliblist which consists of this line:

    printenv | grep PATH | gawk -F "_PATH=" '{ print $2 }' | gawk -F ":" '{ print $1 }' | xargs ls -al

When I do

    mpirun -np 5 -hostfile testhosts -x DISPLAY xterm -hold -e ./envliblist

all xterms (local & remote) display the contents of the openmpi/lib directory.

Ok, good.

Another strange result: I have a shell script for launching the debugger in an xterm:

    [jody]:/mnt/data1/neander:$cat run_gdb.sh
    #!/bin/sh
    #
    # save the program name
    export PROG="$1"
    # shift away program name (leaves program params)
    shift
    # create a command file for gdb, to start it automatically
    echo run $* > gdb.cmd
    # do the term
    xterm -e gdb -x gdb.cmd $PROG
    exit 0

When I run

    mpirun -np 5 --hostfile testhosts -x DISPLAY ./run_gdb.sh ./MPITest

it works! Just to compare,

    mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e ./MPITest

does not work.

It seems that if you launch shell scripts, things work. But if you run xterm without a shell script, it does not work.

I do not think it is a difference of -hold vs. no -hold. Indeed, I can run both of these commands just fine on my system:

    % mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -hold -e gdb ~/mpi/hello
    % mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -e gdb ~/mpi/hello

Note that my setup is a little different than yours; I'm using a Mac laptop and ssh'ing to a server where I'm invoking mpirun. The hostfile "h" contains a 2nd server where xterm/gdb/hello are running.

I notice the only difference between the two above commands is that in the run_gdb script xterm has no "-hold" parameter! Indeed,

    mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -e ./MPITest

does work. To actually see that it works (MPITest is a simple Hello MPI app) I had to do

    mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e "./MPITest >> output.txt"

and check output.txt. Does anybody have an explanation for this weird happening?

Jody
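The suggestion at the top of this message (set LD_LIBRARY_PATH again inside the xterm, before gdb starts) could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name xterm_env_cmd is hypothetical and not from the thread, and it relies on the standard env utility to put the variable back into the environment of the command that xterm runs.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): build the command that xterm
# should execute so that LD_LIBRARY_PATH is set again inside the terminal.
# $1 is the library path to restore; the remaining arguments are the command.
xterm_env_cmd() {
    lib_path="$1"
    shift
    # env(1) sets the variable and then runs the given command.
    printf 'env LD_LIBRARY_PATH=%s %s\n' "$lib_path" "$*"
}
```

A possible use, assuming none of the paths contain spaces (the command substitution is deliberately unquoted so that it splits into separate words):

    mpirun -n 2 -x DISPLAY=:0.0 xterm -e $(xterm_env_cmd "$LD_LIBRARY_PATH" gdb ./ring.out)

Note that "$LD_LIBRARY_PATH" is expanded on the machine where mpirun is invoked, so this form only makes sense when the library lives at the same path on every host.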
Re: [OMPI users] where is mpif.h ?
On Sep 24, 2008, at 12:15 AM, Shafagh Jafer wrote:

> Ok, now that I have made sure that my code actually includes the mpi.h from openmpi and not mpich, I get really weird errors. Below I will paste my mpic++ configuration from --showme and the errors I get from running my code.
>
> [sjafer@DeepThought latest_cd++_timewarp]$ /opt/openmpi/1.2.7/bin/mpic++ --showme:compile
> -I/opt/openmpi/1.2.7/include -pthread
> [sjafer@DeepThought latest_cd++_timewarp]$ /opt/openmpi/1.2.7/bin/mpic++ --showme:link
> -pthread -L/opt/openmpi/1.2.7/lib -lmpi_cxx -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

The above looks about right.

> =ERRORS===
> In file included from /usr/local/include/g++-3/stl_tree.h:57,
>                  from /usr/local/include/g++-3/map:31,
>                  from /opt/openmpi/1.2.7/include/openmpi/ompi/mpi/cxx/mpicxx.h:35,
>                  from /opt/openmpi/1.2.7/include/mpi.h:1795,
>                  from CommPhyMPI.cc:36:
> /usr/local/include/g++-3/stl_alloc.h: At top level:
> /usr/local/include/g++-3/stl_alloc.h:142: template with C linkage

Are you compiling your application with the same C++ compiler that you compiled Open MPI with?

If you use the --showme:compile|link flags to put OMPI's required flags into a build process (i.e., if you're not using OMPI's wrapper compilers), it is still strongly recommended that you use the same compilers that you used to compile and build Open MPI. Is there a reason you stopped using the wrapper compilers?

Although some users have reported successes with mixing-n-matching compilers, it is untested by the Open MPI team and unsupported.

--
Jeff Squyres
Cisco Systems
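One way to follow up on the compiler question above is to compare the C++ compiler that Open MPI recorded at build time with the one currently on the PATH. The sketch below is a minimal, hypothetical helper (the name ompi_cxx_path is not from the thread). It assumes that ompi_info prints a line of the form "C++ compiler absolute: /path/to/g++"; please check that against the output of your own installation before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: read ompi_info-style text on stdin and print the
# absolute path of the C++ compiler that Open MPI was built with.
# Assumption: the input contains a line like
#   "C++ compiler absolute: /usr/bin/g++"
ompi_cxx_path() {
    # Split each line on ": " and print the value part of the matching line.
    awk -F': *' '/C\+\+ compiler absolute/ { print $2 }'
}
```

It might be used like this, printing the two paths side by side for a manual comparison:

    echo "Open MPI was built with: $(ompi_info | ompi_cxx_path)"
    echo "g++ on PATH:             $(command -v g++)"

If the two differ (for example, one is a 2.95.x compiler and the other a 3.x or 4.x one), that matches the mixing-n-matching situation described above.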
Re: [OMPI users] mpirun, paths and xterm again (xserver problem solved; library problem still there)
Hi,

I'm trying to use gdb and xterm with Open MPI on my computer (Ubuntu 8.04). When I run an application without gdb on my computer it works fine, but if I try to use gdb in xterm I get the following error:

    mpirun -n 2 -x DISPLAY=:0.0 xterm -e gdb ./ring.out

    (gdb) run
    Starting program: /media/sda5/tempo/openmpi/tests/ring.out
    /media/sda5/tempo/openmpi/tests/ring.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

    Program exited with code 0177.

When I try to use a shell script to launch gdb as mentioned below, I get the same error.

Thomas

Jeff Squyres wrote:

On Feb 7, 2008, at 10:07 AM, jody wrote:

I wrote a little command called envliblist which consists of this line:

    printenv | grep PATH | gawk -F "_PATH=" '{ print $2 }' | gawk -F ":" '{ print $1 }' | xargs ls -al

When I do

    mpirun -np 5 -hostfile testhosts -x DISPLAY xterm -hold -e ./envliblist

all xterms (local & remote) display the contents of the openmpi/lib directory.

Ok, good.

Another strange result: I have a shell script for launching the debugger in an xterm:

    [jody]:/mnt/data1/neander:$cat run_gdb.sh
    #!/bin/sh
    #
    # save the program name
    export PROG="$1"
    # shift away program name (leaves program params)
    shift
    # create a command file for gdb, to start it automatically
    echo run $* > gdb.cmd
    # do the term
    xterm -e gdb -x gdb.cmd $PROG
    exit 0

When I run

    mpirun -np 5 --hostfile testhosts -x DISPLAY ./run_gdb.sh ./MPITest

it works! Just to compare,

    mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e ./MPITest

does not work.

It seems that if you launch shell scripts, things work. But if you run xterm without a shell script, it does not work.

I do not think it is a difference of -hold vs. no -hold. Indeed, I can run both of these commands just fine on my system:

    % mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -hold -e gdb ~/mpi/hello
    % mpirun -np 1 --hostfile h -x DISPLAY=.cisco.com:0 xterm -e gdb ~/mpi/hello

Note that my setup is a little different than yours; I'm using a Mac laptop and ssh'ing to a server where I'm invoking mpirun. The hostfile "h" contains a 2nd server where xterm/gdb/hello are running.

I notice the only difference between the two above commands is that in the run_gdb script xterm has no "-hold" parameter! Indeed,

    mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -e ./MPITest

does work. To actually see that it works (MPITest is a simple Hello MPI app) I had to do

    mpirun -np 5 --hostfile testhosts -x DISPLAY xterm -hold -e "./MPITest >> output.txt"

and check output.txt. Does anybody have an explanation for this weird happening?

Jody
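The envliblist pipeline quoted in this message can be read as: take every environment variable whose name ends in _PATH, keep the first directory of its value, and list that directory. A minimal sketch of the same idea as a single awk filter is shown here, assuming input in the NAME=value form that printenv produces. The function name first_path_dirs is hypothetical, and unlike the original it skips lines that contain no "_PATH=" instead of passing empty lines on.

```shell
#!/bin/sh
# Hypothetical helper: read NAME=value lines on stdin and print the first
# directory of every variable whose name ends in _PATH.
first_path_dirs() {
    # Splitting on "_PATH=" leaves the value in $2; lines without that
    # separator have only one field and are skipped.
    awk -F'_PATH=' 'NF > 1 { split($2, dirs, ":"); print dirs[1] }'
}
```

To reproduce the listing from the thread, the output could be piped on as before:

    printenv | first_path_dirs | xargs ls -al

Plain PATH=... is not matched, because its name does not end in "_PATH"; the same is true of the original gawk pipeline.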
Re: [OMPI users] Checkpointing a restarted app fails
Hi Josh!

> I believe this is now fixed in the trunk. I was able to reproduce with the current trunk and committed a fix a few minutes ago in r19601. So the fix should be in tonight's tarball (or you can grab it from SVN). I've made a request to have the patch applied to v1.3, but that may take a day or so to complete.

I updated to 19607 and this really worked out. I'm now able to checkpoint restarted applications without any problems. Yippee!

> Thanks for the bug report :)

Thanks for fixing it :-)

Best,
Matthias
Re: [OMPI users] Problem with MPI_Send and MPI_Recv
Hello Terry,

I obtain the hostnames of both computers:

    pichurra
    hpl1-linux

Thank you.

Sofia

----- Original Message -----
From: "Terry Dontje"
To:
Sent: Tuesday, September 23, 2008 6:24 PM
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv

Hello Sofia,

Very puzzling indeed. Can you try to run hostname or uptime with mpirun? That is, something like:

    mpirun -np 2 --host 10.1.10.208,10.1.10.240 --mca mpi_preconnect_all 1 --prefix /usr/local -mca btl self,tcp -mca btl_tcp_if_include eth1 hostname

--td

Date: Tue, 23 Sep 2008 17:05:22 +0200
From: "Sofia Aparicio Secanellas"
Subject: Re: [OMPI users] Problem with MPI_Send and MPI_Recv
To: "Open MPI Users"
Message-ID: <34D2F769A7C946BF915A828A9CD7F3CC@aparicio1>
Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"

Hello Terry,

Here you can find the files. Thank you very much.

Sofia
Re: [OMPI users] where is mpif.h ?
Ok, now that I have made sure that my code actually includes the mpi.h from openmpi and not mpich, I get really weird errors. Below I will paste my mpic++ configuration from --showme and the errors I get from running my code.

    [sjafer@DeepThought latest_cd++_timewarp]$ /opt/openmpi/1.2.7/bin/mpic++ --showme:compile
    -I/opt/openmpi/1.2.7/include -pthread
    [sjafer@DeepThought latest_cd++_timewarp]$ /opt/openmpi/1.2.7/bin/mpic++ --showme:link
    -pthread -L/opt/openmpi/1.2.7/lib -lmpi_cxx -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

=ERRORS===
In file included from /usr/local/include/g++-3/stl_tree.h:57,
                 from /usr/local/include/g++-3/map:31,
                 from /opt/openmpi/1.2.7/include/openmpi/ompi/mpi/cxx/mpicxx.h:35,
                 from /opt/openmpi/1.2.7/include/mpi.h:1795,
                 from CommPhyMPI.cc:36:
/usr/local/include/g++-3/stl_alloc.h: At top level:
/usr/local/include/g++-3/stl_alloc.h:142: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:224: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:243: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:320: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:729: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:740: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:746: template with C linkage
/usr/local/include/g++-3/stl_alloc.h: In method `allocator<_Tp>::allocator(const allocator<_Tp1> &)':
/usr/local/include/g++-3/stl_alloc.h:746: template with C linkage
/usr/local/include/g++-3/stl_alloc.h: At top level:
/usr/local/include/g++-3/stl_alloc.h:778: template with C linkage
/usr/local/include/g++-3/stl_alloc.h: In function `bool operator ==(const allocator<_Tp1> &, const allocator<_T2> &)':
/usr/local/include/g++-3/stl_alloc.h:786: template with C linkage
/usr/local/include/g++-3/stl_alloc.h: In function `bool operator !=(const allocator<_Tp1> &, const allocator<_T2> &)':
/usr/local/include/g++-3/stl_alloc.h:792: template with C linkage
/usr/local/include/g++-3/stl_alloc.h: At top level:
/usr/local/include/g++-3/stl_alloc.h:804: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:815: template with C linkage
/usr/local/include/g++-3/stl_alloc.h:824: template with C linkage
/usr/local/include/g++-3/stl_alloc.h: In method `__allocator<_Tp,_Alloc>::__allocator(const __allocator<_Tp1,_Alloc> &)':
/usr/local/include/g++-3/stl_alloc.h:824: template with C linkage
/usr/local/include/g++-3/stl_alloc.h: At top level:
...
===

--- On Tue, 9/23/08, Jeff Squyres wrote:

From: Jeff Squyres
Subject: Re: [OMPI users] where is mpif.h ?
To: "Open MPI Users"
List-Post: users@lists.open-mpi.org
Date: Tuesday, September 23, 2008, 2:13 PM

See that FAQ entry I pointed to. ${includedir} is the default "include" directory that came in from running OMPI's configure (defaults to $prefix/include). Likewise for ${libdir}; it's the "library" directory that came in from running OMPI's configure (defaults to $prefix/lib).

On Sep 23, 2008, at 4:41 PM, Shafagh Jafer wrote:

> In mpic++_wrapper-data.txt what do the following statements mean and where exactly do they point to?
>
> --
> includedir=${includedir}
> libdir=${libdir}
> --
>
> --- On Tue, 9/23/08, Jeff Squyres wrote:
> From: Jeff Squyres
> Subject: Re: [OMPI users] where is mpif.h ?
> To: "Open MPI Users"
> Date: Tuesday, September 23, 2008, 5:11 AM
>
> It actually is expected behavior. Open MPI's wrappers do not automatically add -I for /usr/include or -L for /usr/lib because these directories are typically in the compiler's/linker's default search path, and having the wrapper compilers manually add them tends to screw up search ordering.
>
> You can change the default behavior of the wrapper compilers, though -- see this FAQ entry for details:
>
> http://www.open-mpi.org/faq/?category=mpi-apps#override-wrappers-after-v1.0
>
> On Sep 23, 2008, at 6:40 AM, Jed Brown wrote:
>
> > On Tue 2008-09-23 08:50, Simon Hammond wrote:
> >> Yes, it should be there.
> >
> > Shouldn't the path be automatically included by the mpif77 wrapper? I ran into this problem when building BLACS (my default OpenMPI 1.2.7 lives in /usr, MPICH2 is at /opt/mpich2). The build tries
> >
> > $ /usr/bin/mpif90 -c -I. -fPIC -Wno-unused-variable -g bi_f77_mpi_attr_get.f
> > Error: Can't open included file 'mpif.h'
> >
> > but this succeeds
> >
> > $ /usr/bin/mpif90 -c -I. -I/usr/include -fPIC -Wno-unused-variable -g bi_f77_mpi_attr_get.f
> >
> > and this works fine as well
> >
> > $ /opt/mpich2/mpif90 -c -I. -fPIC -Wno-unused-variable -g bi_f77_mpi_attr_get.f
> >
> > Is this the expected behavior?
> >
> > Jed
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-m