Re: [OMPI users] Using OpenMPI on a network

2012-06-19 Thread VimalMathew
Just finished doing that.

Still getting the same error. How do I make sure there are no old
builds/files left?

I uninstalled everything to do with MPI, Cygwin, cleared environment
variables, did the whole Windows build again and then did the
supercomputing tutorial.

 

--

Vimal

 

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Damien
Sent: Tuesday, June 19, 2012 1:20 PM
To: Open MPI Users
Subject: Re: [OMPI users] Using OpenMPI on a network

 

There's something else wrong, if that's the Supercomputing Blog tutorial
1 you're running.  It works happily without a hostfile.  I think you
have some borked paths there.

I don't know why a Windows version is looking for an etc directory for a
hostfile, unless some of your previous Cygwin builds are still lying
around.  The etc directory is a *nix thing.  Please make sure you've
completely deleted all your old failed OpenMPI builds, code, binaries,
everything.  Uninstall any other MPI versions you have tried, OpenMPI,
MPICH, whatever.  You need to make absolutely sure you only have one
version.  Check your paths in your environment after doing all that and
remove any remaining path references to other MPI versions.  You
shouldn't be getting that network error either; if you're running
locally it won't matter if you have a network cable or not.  That has to
be fixed before you can do anything on a cluster.

Damien




Re: [OMPI users] 2012/06/18 14:35:07 Auto-saved draft

2012-06-19 Thread Ralph Castain
That's a little bit strong - OMPI still supports checkpoint/restart as a fault 
tolerance mechanism. There really isn't anything the sys admin has to do, 
though - what is required is that users periodically order their programs to 
checkpoint so they can be restarted after a failure.

Checkpointing is typically done either by the app itself (say, when it reaches 
some point it feels is a good one to save), or using a script that just orders 
a checkpoint every so many seconds.

What we have said is that we don't believe the FT "run thru failure" position 
pushed by UTK is particularly required at this time. Partly a question of 
impact vs benefit, mostly due to competing approaches offering equivalent fault 
recovery capability with less impact. But that's a separate discussion.
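
As an illustration of the second option above, a script that orders a checkpoint every so many seconds might look like the sketch below. It assumes the job was launched with checkpoint/restart support enabled and that the ompi-checkpoint tool shipped with the Open MPI 1.x series is on the PATH and is given the PID of mpirun. The function names, the interval, and the injectable run/sleep hooks are made up for this example and are not part of Open MPI.

```python
import subprocess
import time


def checkpoint_command(mpirun_pid):
    """Build the command that asks a running job to checkpoint.

    Assumption: ompi-checkpoint is on PATH and takes the PID of mpirun.
    """
    return ["ompi-checkpoint", str(mpirun_pid)]


def checkpoint_periodically(mpirun_pid, interval_s, count,
                            run=subprocess.call, sleep=time.sleep):
    """Order up to `count` checkpoints, waiting `interval_s` before each.

    `run` and `sleep` are injectable so the loop can be dry-run without a
    live MPI job. The loop stops early if a checkpoint request fails.
    Returns the number of successful checkpoint requests.
    """
    done = 0
    for _ in range(count):
        sleep(interval_s)
        if run(checkpoint_command(mpirun_pid)) != 0:
            break  # the job has probably exited or the request failed
        done += 1
    return done
```

For example, checkpoint_periodically(12345, 600, 6) would request a checkpoint every ten minutes for an hour from the job whose mpirun has the (hypothetical) PID 12345.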


On Jun 19, 2012, at 11:16 AM, George Bosilca wrote:

> It has been clearly stated that the official position pushed forward by a 
> majority of the Open MPI developer community is that fault tolerance is not 
> needed so we (read this as the official version of Open MPI) do not support 
> it.
> 
> However, a group of researchers have been working toward a version of Open 
> MPI that supports the last fault tolerance proposal submitted for 
> consideration to the MPI Forum. You can access it at 
> https://bitbucket.org/jjhursey/ompi-ulfm-rts.
> 
>   george. 
> 
> On Jun 19, 2012, at 09:58 , 陈松 wrote:
> 
>> Hi all,
>> 
>> Can anyone explain to me the fault-tolerance features in OpenMPI? I've read
>> the FAQs and some of the papers on this topic listed on open-mpi.org, but I
>> still can't figure out what the fault-tolerance mechanism in OpenMPI does
>> when one node of my supercomputer system fails during a computation, and
>> what we system administrators should do after the failure (or before it).
>>
>> Can anyone help me? My boss wants me to deploy OpenMPI on our system because
>> he wants the fault-tolerance feature.
>>
>> Thanks very much.
>>
>> ---
>> CHEN Song
>> R&D Department
>> National Supercomputer Center in Tianjin
>> Binhai New Area, Tianjin, China
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 



Re: [OMPI users] Using OpenMPI on a network

2012-06-19 Thread Damien
There's something else wrong, if that's the Supercomputing Blog tutorial 
1 you're running.  It works happily without a hostfile.  I think you 
have some borked paths there.


I don't know why a Windows version is looking for an etc directory for a
hostfile, unless some of your previous Cygwin builds are still lying
around.  The etc directory is a *nix thing.  Please make sure you've
completely deleted all your old failed OpenMPI builds, code, binaries,
everything.  Uninstall any other MPI versions you have tried, OpenMPI,
MPICH, whatever.  You need to make absolutely sure you only have one
version.  Check your paths in your environment after doing all that and
remove any remaining path references to other MPI versions.  You
shouldn't be getting that network error either; if you're running
locally it won't matter if you have a network cable or not.  That has to
be fixed before you can do anything on a cluster.


Damien
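
A rough way to carry out the path check suggested above is to list every directory on PATH that still contains an MPI launcher. The sketch below is not from the thread: the function name and the list of launcher file names are assumptions chosen for illustration, and it only looks at PATH, not at the registry or other environment variables.

```python
import os

# Launcher file names to look for (an assumption, extend as needed).
LAUNCHER_NAMES = ("mpiexec", "mpiexec.exe", "mpirun", "mpirun.exe")


def find_mpi_launchers(path_value, sep=os.pathsep):
    """Return every PATH directory that contains an MPI launcher.

    More than one hit suggests that several MPI installs are still
    visible to the shell.
    """
    hits = []
    for directory in path_value.split(sep):
        directory = directory.strip('"')
        if not directory or not os.path.isdir(directory):
            continue
        names = {name.lower() for name in os.listdir(directory)}
        if any(launcher in names for launcher in LAUNCHER_NAMES):
            if directory not in hits:
                hits.append(directory)
    return hits
```

Calling find_mpi_launchers(os.environ.get("PATH", "")) and printing the result shows which installs the shell can still see; ideally only one directory comes back.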


On 19/06/2012 10:53 AM, vimalmat...@eaton.com wrote:


Damien, Shiqing, Jeff?

--

Vimal

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org]
On Behalf Of vimalmat...@eaton.com
Sent: Monday, June 18, 2012 3:32 PM
To: us...@open-mpi.org
Subject: [OMPI users] Using OpenMPI on a network

So I configured and compiled a simple MPI program.

Now the issue is when I try to do the same thing on my computer on a 
corporate network, I get this error:


C:\OpenMPI\openmpi-1.6\installed\bin>mpiexec MPI_Tutorial_1.exe

--

Open RTE was unable to open the hostfile:

C:\OpenMPI\openmpi-1.6\installed\bin/../etc/openmpi-default-hostfile

Check to make sure the path and filename are correct.

--

[SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\ras\base\ras_base_allocate.c at line 200

[SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\base\plm_base_launch_support.c at line 99

[SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\process\plm_process_module.c at line 996

What network settings should I be using? I'm sure this is because of 
the network because when I unplug the network cable, I get the error 
message I got below.


Thanks,

Vimal


Re: [OMPI users] 2012/06/18 14:35:07 Auto-saved draft

2012-06-19 Thread George Bosilca
It has been clearly stated that the official position pushed forward by a 
majority of the Open MPI developer community is that fault tolerance is not 
needed so we (read this as the official version of Open MPI) do not support it.

However, a group of researchers have been working toward a version of Open MPI 
that supports the last fault tolerance proposal submitted for consideration to 
the MPI Forum. You can access it at 
https://bitbucket.org/jjhursey/ompi-ulfm-rts.

  george. 

On Jun 19, 2012, at 09:58 , 陈松 wrote:

> Hi all,
> 
> Can anyone explain to me the fault-tolerance features in OpenMPI? I've read
> the FAQs and some of the papers on this topic listed on open-mpi.org, but I
> still can't figure out what the fault-tolerance mechanism in OpenMPI does
> when one node of my supercomputer system fails during a computation, and
> what we system administrators should do after the failure (or before it).
>
> Can anyone help me? My boss wants me to deploy OpenMPI on our system because
> he wants the fault-tolerance feature.
>
> Thanks very much.
>
> ---
> CHEN Song
> R&D Department
> National Supercomputer Center in Tianjin
> Binhai New Area, Tianjin, China



Re: [OMPI users] Using OpenMPI on a network

2012-06-19 Thread VimalMathew
Is hostname the name of the system I'm running it on?

 

Just tried that. Got the same error message

 

--

Vimal

 

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Ralph Castain
Sent: Tuesday, June 19, 2012 1:03 PM
To: Open MPI Users
Subject: Re: [OMPI users] Using OpenMPI on a network

 

You're getting that error because you failed to specify any hosts on
your cmd line - so OMPI has no idea where to launch the procs. It looked
for a default hostfile, but didn't find that either.

 

Just add a -host <hostname> option to your command line and tell it
where you want the procs to run.

 

 


 


Re: [OMPI users] Using OpenMPI on a network

2012-06-19 Thread Ralph Castain
You're getting that error because you failed to specify any hosts on your cmd 
line - so OMPI has no idea where to launch the procs. It looked for a default 
hostfile, but didn't find that either.

Just add a -host <hostname> option to your command line and tell it where you
want the procs to run.
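
To make the two lookups described above concrete, the sketch below builds an mpiexec command line that names its hosts explicitly and computes where the default hostfile would be expected, following the bin/../etc/openmpi-default-hostfile path from the error message quoted in this thread. The helper names are hypothetical; only the -host and -n options and the hostfile name come from the thread, and whether naming a host is sufficient on this Windows build is not established here.

```python
import ntpath


def default_hostfile(bin_dir):
    """Path of the default hostfile relative to mpiexec's bin directory.

    Mirrors the 'bin/../etc/openmpi-default-hostfile' path seen in the
    error message; ntpath is used so Windows paths normalise on any OS.
    """
    return ntpath.normpath(
        ntpath.join(bin_dir, "..", "etc", "openmpi-default-hostfile"))


def mpiexec_command(executable, hosts, nprocs):
    """Build an mpiexec command line that names its hosts explicitly."""
    if not hosts:
        raise ValueError("at least one host is required")
    return ["mpiexec", "-host", ",".join(hosts), "-n", str(nprocs), executable]
```

Joining the result of mpiexec_command("MPI_Tutorial_1.exe", ["localhost"], 2) with spaces gives a command that can be pasted into the console, with "localhost" standing in for whatever machine name applies.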


On Jun 19, 2012, at 10:53 AM, vimalmat...@eaton.com wrote:

> Damien, Shiqing, Jeff?
>  
> --
> Vimal
>  
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
> Behalf Of vimalmat...@eaton.com
> Sent: Monday, June 18, 2012 3:32 PM
> To: us...@open-mpi.org
> Subject: [OMPI users] Using OpenMPI on a network
>  
> So I configured and compiled a simple MPI program.
> Now the issue is when I try to do the same thing on my computer on a 
> corporate network, I get this error:
>  
> C:\OpenMPI\openmpi-1.6\installed\bin>mpiexec MPI_Tutorial_1.exe
> --
> Open RTE was unable to open the hostfile:
> C:\OpenMPI\openmpi-1.6\installed\bin/../etc/openmpi-default-hostfile
> Check to make sure the path and filename are correct.
> --
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\ras\base\ras_base_allocate.c at line 200
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\base\plm_base_launch_support.c at line 99
> [SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\process\plm_process_module.c at line 996
>  
> What network settings should I be using? I’m sure this is because of the 
> network because when I unplug the network cable, I get the error message I 
> got below.
>  
> Thanks,
> Vimal
>  

Re: [OMPI users] Using OpenMPI on a network

2012-06-19 Thread VimalMathew
Damien, Shiqing, Jeff?

 

--

Vimal

 

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of vimalmat...@eaton.com
Sent: Monday, June 18, 2012 3:32 PM
To: us...@open-mpi.org
Subject: [OMPI users] Using OpenMPI on a network

 

So I configured and compiled a simple MPI program.

Now the issue is when I try to do the same thing on my computer on a corporate 
network, I get this error:

 

C:\OpenMPI\openmpi-1.6\installed\bin>mpiexec MPI_Tutorial_1.exe

--

Open RTE was unable to open the hostfile:

C:\OpenMPI\openmpi-1.6\installed\bin/../etc/openmpi-default-hostfile

Check to make sure the path and filename are correct.

--

[SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\ras\base\ras_base_allocate.c at line 200

[SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\base\plm_base_launch_support.c at line 99

[SOUMIWHP5003567:01884] [[37936,0],0] ORTE_ERROR_LOG: Not found in file C:\OpenMPI\openmpi-1.6\orte\mca\plm\process\plm_process_module.c at line 996

 

What network settings should I be using? I’m sure this is because of the 
network because when I unplug the network cable, I get the error message I got 
below.

 

Thanks,

Vimal

 

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Damien
Sent: Friday, June 15, 2012 3:15 PM
To: Open MPI Users
Subject: Re: [OMPI users] Building MPI on Windows

 

OK, that's what orte_rml_base_select failed means, no TCP connection.  But you 
should be able to make OpenMPI & mpiexec work without a network if you're just 
running in local memory.  There's probably a runtime parameter to set but I 
don't know what it is.  Maybe Jeff or Shiqing can weigh in with what that is.

Damien

On 15/06/2012 1:10 PM, vimalmat...@eaton.com wrote: 

Just figured it out.

The only thing different from when it ran yesterday to today was I was 
connected to a network. So I connected my laptop to a network and it worked 
again.

 

Thanks for all your help, Damien!

I’m sure I’m gonna get stuck more along the way so hoping you can help.

 

--

Vimal

 

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Damien
Sent: Friday, June 15, 2012 2:57 PM
To: Open MPI Users
Subject: Re: [OMPI users] Building MPI on Windows

 

Hmmm.  Two things.  Can you run helloworldMPI.exe on its own?  It should 
output "Number of threads = 1, My rank = 0"

Also, can you post the output of ompi_info ?  I think you might still have some 
path mixups.  A successful OpenMPI build with this simple program should just 
work.

If you still have the other OpenMPIs installed from the binaries, you might 
want to try uninstalling all of them and rebooting.  Also if you rebuilt 
OpenMPI and helloworldMPI with VS 2010, make sure that helloworldMPI is 
actually linked to those VS2010 OpenMPI libs by setting the right lib path in 
the Linker options.  Linking to VS2008 libs and trying to run with VS2010 
dlls/exes could cause problems too.

Damien   

On 15/06/2012 11:44 AM, vimalmat...@eaton.com wrote: 

Hi Damien,

 

I installed MS Visual Studio 2010 and tried the whole procedure again and it 
worked!

That’s the great news.

Now the bad news is that I’m trying to run the program again using mpiexec and 
it won’t!

 

I get these error messages: 

orte_rml_base_select failed

orte_ess_set_name failed, with a bunch of text saying it could be due to 
configuration or environment problems and will make sense only to an OpenMPI 
developer.

 

Help!

 

--

Vimal

 

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Damien
Sent: Thursday, June 14, 2012 4:55 PM
To: Open MPI Users
Subject: Re: [OMPI users] Building MPI on Windows

 

You did build the project, right?  The helloworldMPI.exe is in the Debug 
directory?

On 14/06/2012 1:49 PM, vimalmat...@eaton.com wrote: 

No luck.

Output:

 

Microsoft Windows [Version 6.1.7601]

Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

 

C:\Users\...>cd "C:\Users\C9995799\Downloads\helloworldMPI\Debug"

 

C:\Users\...\Downloads\helloworldMPI\Debug>mpiexec -n 2 helloworldMPI.exe

--

mpiexec was unable to launch the specified application as it could not find an executable:

 

Executable: helloworldMPI.exe

Node: SOUMIWHP5003567

 

while attempting to start process rank 0.

--

2 total processes failed to start

 

C:\Users\...\Downloads\helloworldMPI\Debug>

 

--

Vimal

 

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf 
Of Damien
Sent: Thursday, June 14, 2012 3:38 PM
To: Open MPI Users
Subject: Re: 

Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__" does not take arguments

2012-06-19 Thread Dmitry N. Mikushin
Dear Rolf,

I compiled openmpi-trunk with

$ ../configure --prefix=/opt/openmpi-trunk --disable-mpi-interface-warning --with-cuda=/opt/cuda

and that error is now gone!

Thanks a lot for your assistance,
- D.

2012/6/19 Rolf vandeVaart 

> Dmitry:
>
> It turns out that by default in Open MPI 1.7, configure enables warnings
> for deprecated MPI functionality.  In Open MPI 1.6, these warnings were
> disabled by default.
>
> That explains why you would not see this issue in the earlier versions of
> Open MPI.
>
> I assume that gcc must have added support for
> __attribute__((__deprecated__)) and then later on
> __attribute__((__deprecated__(msg))) and your version of gcc supports both
> of these.  (My version of gcc, 4.5.1, does not support the msg in the
> attribute.)
>
> The version of nvcc you have does not support the "msg" argument so
> everything blows up.
>
> I suggest you configure with --disable-mpi-interface-warning which will
> prevent any of the deprecated attributes from being used and then things
> should work fine.
>
> Let me know if this fixes your problem.
>
> Rolf
>
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Rolf vandeVaart
> Sent: Monday, June 18, 2012 11:00 AM
> To: Open MPI Users
> Cc: Олег Рябков
> Subject: Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__"
> does not take arguments
>
> Hi Dmitry:
>
> Let me look into this.
>
> Rolf
>
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Dmitry N. Mikushin
> Sent: Monday, June 18, 2012 10:56 AM
> To: Open MPI Users
> Cc: Олег Рябков
> Subject: Re: [OMPI users] NVCC mpi.h: error: attribute "__deprecated__"
> does not take arguments
>
> Yeah, definitely. Thank you, Jeff.
>
> - D.
>
> 2012/6/18 Jeff Squyres 
>
> On Jun 18, 2012, at 10:41 AM, Dmitry N. Mikushin wrote:
>
> > No, I'm configuring with gcc, and for openmpi-1.6 it works with nvcc
> > without a problem.
>
> Then I think Rolf (from Nvidia) should figure this out; I don't have
> access to nvcc.  :-)
>
> > Actually, nvcc was always meant to be more or less compatible with gcc, as
> > far as I know. I'm guessing that in the case of trunk, nvcc is the source
> > of the issue.
> >
> > And with ./configure CC=nvcc etc. it won't build:
> >
> > /home/dmikushin/forge/openmpi-trunk/opal/mca/event/libevent2019/libevent/include/event2/util.h:126:2:
> > error: #error "No way to define ev_uint64_t"
>
> You should complain to Nvidia about that.
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>


[OMPI users] 2012/06/18 14:35:07 Auto-saved draft

2012-06-19 Thread 陈松
Hi all,

Can anyone explain to me the fault-tolerance features in OpenMPI? I've read
the FAQs and some of the papers on this topic listed on open-mpi.org, but I
still can't figure out what the fault-tolerance mechanism in OpenMPI does
when one node of my supercomputer system fails during a computation, and
what we system administrators should do after the failure (or before it).

Can anyone help me? My boss wants me to deploy OpenMPI on our system because
he wants the fault-tolerance feature.

Thanks very much.

---
CHEN Song
R&D Department
National Supercomputer Center in Tianjin
Binhai New Area, Tianjin, China

[OMPI users] checkpointing of NPB

2012-06-19 Thread Ifeanyi
Dear all,

Please help.

I configured Open MPI and it can checkpoint HPL.

However, whenever I try to checkpoint the NAS Parallel Benchmarks (NPB), it
kills the application without an informative message.

How do I configure Open MPI 1.6 to checkpoint NPB? I really need help; I
have been stuck on this issue for the past few days without a solution.

Regards,
Ifeanyi