Re: [OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Ashley Pittman

No, without a list of hostnames it's not any use; if you can get that, then 
I have something to work with.  From looking around on Google, -n might help 
here.  Once I have this info you'll need to verify that you can ssh to these 
nodes without a password and that pdsh is installed, and give me the name of 
an environment variable that pbs sets for ranks within a job.
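
For illustration only, here is a minimal Python sketch of how a list of 
hostnames could be pulled out of the node string that qstat -n prints under a 
job (the "n20/0" in the quoted output below).  The function name is 
hypothetical, and the "+"-joined form for multi-node jobs is an assumption 
about the pbs output format, not something shown in this thread; this is not 
how padb itself does it.

```python
def hosts_from_exec_host(exec_host):
    """Return unique hostnames, in order, from a node string.

    Handles the single entry seen in the thread ('n20/0') and, as an
    assumption, several entries joined with '+' ('n20/0+n21/0').
    """
    hosts = []
    for entry in exec_host.strip().split("+"):
        if not entry:
            continue
        # Drop the '/<index>' suffix after the hostname.
        host = entry.split("/", 1)[0]
        if host and host not in hosts:
            hosts.append(host)
    return hosts


print(hosts_from_exec_host("n20/0"))  # ['n20']
```

The resulting names are what would then need to be reachable over 
password-less ssh.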

I'm sure we can get something working but it might be better to take this 
off-list or to the padb-users list to avoid spamming the Open-MPI users list.

Ashley.

On 29 Oct 2010, at 18:44, Jack Bryan wrote:

> Hi, 
> 
> this is what I got :
> 
> -bash-3.2$ qstat -n -u myName
> 
> clsuter:
>  
>                                                               Req'd  Req'd   Elap
> Job ID          Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
> --------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
> 48933.cluster.e myName   devel    myJob      107835     1  --     -- 00:02 C 00:00
>    n20/0
> 
> Any help is appreciated. 
> 
> thanks
> 
> > From: ash...@pittman.co.uk
> > Date: Fri, 29 Oct 2010 18:38:25 +0100
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] open MPI please recommend a debugger for open MPI
> > 
> > 
> > Can you try the following and send me the output.
> > 
> > qstat -n -u `whoami` @clusterName
> > 
> > The output sent before implies that your cluster is called "clusterName" 
> > rather than "cluster" which is a little surprising but let's see what it 
> > gives us if we query on that basis.
> > 
> > Ashley.
> > 
> > On 29 Oct 2010, at 18:29, Jack Bryan wrote:
> > 
> > > thanks
> > > 
> > > I have run padb (the new one with your patch) on my system and got :
> > > 
> > > -bash-3.2$ padb -Ormgr=pbs -Q 48516.cluster
> > > $VAR1 = {};
> > > Job 48516.cluster is not active
> > > 
> > > Actually, the job is running. 
> > > 
> > > How to check whether my system has pbs_pro ?
> > > 
> > > Any help is appreciated. 
> > > 
> > > thanks
> > > Jinxu Ding
> > > 
> > > Oct. 29 2010
> > > 
> > > 
> > > > From: ash...@pittman.co.uk
> > > > Date: Fri, 29 Oct 2010 18:21:46 +0100
> > > > To: us...@open-mpi.org
> > > > Subject: Re: [OMPI users] open MPI please recommend a debugger for open 
> > > > MPI
> > > > 
> > > > 
> > > > On 29 Oct 2010, at 12:06, Jeremy Roberts wrote:
> > > > 
> > > > > I'd suggest looking into TotalView (http://www.totalviewtech.com) 
> > > > > and/or DDT (http://www.allinea.com/). I've used TotalView pretty 
> > > > > extensively and found it to be pretty easy to use. They are both 
> > > > > commercial, however, and not cheap. 
> > > > > 
> > > > > As far as I know, there isn't a whole lot of open source support for 
> > > > > parallel debugging. The Parallel Tools Platform of Eclipse claims to 
> > > > > provide a parallel debugger, though I have yet to try it 
> > > > > (http://www.eclipse.org/ptp/).
> > > > 
> > > > Jeremy has covered the graphical parallel debuggers that I'm aware of, 
> > > > for a different approach there is padb which isn't a "parallel 
> > > > debugger" in the traditional model but is able to show you the same 
> > > > type of information, it won't allow you to point-and-click through the 
> > > > source or single step through the code but it is lightweight and will 
> > > > show you the information which you need to know. 
> > > > 
> > > > Padb needs to integrate with the resource manager, I know it works with 
> > > > pbs_pro but it seems there are a few issues on your system which is pbs 
> > > > (without the pro). I can help you with this and work through the 
> > > > problems but only if you work with me and provide details of the 
> > > > integration, in particular I've sent you a version which has a small 
> > > > patch and some debug printfs added, if you could send me the output 
> > > > from this I'd be able to tell you if it was likely to work and how to 
> > > > go about making it do so.
> > > > 
> > > > Ashley.
> > > > 
> > > > -- 
> > > > 
> > > > Ashley Pittman, Bath, UK.
> > > > 
> > > > Padb - A parallel job inspection tool for cluster computing
> > > > http://padb.pittman.org.uk
> > > > 
> > > > 
> > > > ___
> > > > users mailing list
> > > > us...@open-mpi.org
> > > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> > 
> > -- 
> > 
> > Ashley Pittman, Bath, UK.
> > 
> > Padb - A parallel job inspection tool for cluster computing
> > http://padb.pittman.org.uk
> > 
> > 

--

Re: [OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Jack Bryan

Hi, 

this is what I got :

-bash-3.2$ qstat -n -u myName

clsuter:
                                                              Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
48933.cluster.e myName   devel    myJob      107835     1  --     -- 00:02 C 00:00
   n20/0

Any help is appreciated. 

thanks

> From: ash...@pittman.co.uk
> Date: Fri, 29 Oct 2010 18:38:25 +0100
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] open MPI please recommend a debugger for open MPI
> 
> 
> Can you try the following and send me the output.
> 
> qstat -n -u `whoami` @clusterName
> 
> The output sent before implies that your cluster is called "clusterName" 
> rather than "cluster" which is a little surprising but let's see what it 
> gives us if we query on that basis.
> 
> Ashley.
> 
> On 29 Oct 2010, at 18:29, Jack Bryan wrote:
> 
> > thanks
> > 
> > I have run padb (the new one with your patch) on my system and got :
> > 
> > -bash-3.2$ padb -Ormgr=pbs -Q 48516.cluster
> > $VAR1 = {};
> > Job 48516.cluster  is not active
> > 
> > Actually, the job is running. 
> > 
> > How to check whether my system has pbs_pro ?
> > 
> > Any help is appreciated. 
> > 
> > thanks
> > Jinxu Ding
> > 
> > Oct. 29 2010
> > 
> > 
> > > From: ash...@pittman.co.uk
> > > Date: Fri, 29 Oct 2010 18:21:46 +0100
> > > To: us...@open-mpi.org
> > > Subject: Re: [OMPI users] open MPI please recommend a debugger for open 
> > > MPI
> > > 
> > > 
> > > On 29 Oct 2010, at 12:06, Jeremy Roberts wrote:
> > > 
> > > > I'd suggest looking into TotalView (http://www.totalviewtech.com) 
> > > > and/or DDT (http://www.allinea.com/). I've used TotalView pretty 
> > > > extensively and found it to be pretty easy to use. They are both 
> > > > commercial, however, and not cheap. 
> > > > 
> > > > As far as I know, there isn't a whole lot of open source support for 
> > > > parallel debugging. The Parallel Tools Platform of Eclipse claims to 
> > > > provide a parallel debugger, though I have yet to try it 
> > > > (http://www.eclipse.org/ptp/).
> > > 
> > > Jeremy has covered the graphical parallel debuggers that I'm aware of, 
> > > for a different approach there is padb which isn't a "parallel debugger" 
> > > in the traditional model but is able to show you the same type of 
> > > information, it won't allow you to point-and-click through the source or 
> > > single step through the code but it is lightweight and will show you the 
> > > information which you need to know. 
> > > 
> > > Padb needs to integrate with the resource manager, I know it works with 
> > > pbs_pro but it seems there are a few issues on your system which is pbs 
> > > (without the pro). I can help you with this and work through the problems 
> > > but only if you work with me and provide details of the integration, in 
> > > particular I've sent you a version which has a small patch and some debug 
> > > printfs added, if you could send me the output from this I'd be able to 
> > > tell you if it was likely to work and how to go about making it do so.
> > > 
> > > Ashley.
> > > 
> > > -- 
> > > 
> > > Ashley Pittman, Bath, UK.
> > > 
> > > Padb - A parallel job inspection tool for cluster computing
> > > http://padb.pittman.org.uk
> > > 
> > > 
> 
> -- 
> 
> Ashley Pittman, Bath, UK.
> 
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk
> 
> 
  

Re: [OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Ashley Pittman

Can you try the following and send me the output.

qstat -n -u `whoami` @clusterName

The output sent before implies that your cluster is called "clusterName" rather 
than "cluster" which is a little surprising but let's see what it gives us if 
we query on that basis.

Ashley.

On 29 Oct 2010, at 18:29, Jack Bryan wrote:

> thanks
> 
> I have run padb (the new one with your patch) on my system and got :
> 
> -bash-3.2$ padb -Ormgr=pbs -Q 48516.cluster
> $VAR1 = {};
> Job 48516.cluster  is not active
> 
> Actually, the job is running. 
> 
> How to check whether my system has pbs_pro ?
> 
> Any help is appreciated. 
> 
> thanks
> Jinxu Ding
> 
> Oct. 29 2010
> 
> 
> > From: ash...@pittman.co.uk
> > Date: Fri, 29 Oct 2010 18:21:46 +0100
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] open MPI please recommend a debugger for open MPI
> > 
> > 
> > On 29 Oct 2010, at 12:06, Jeremy Roberts wrote:
> > 
> > > I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or 
> > > DDT (http://www.allinea.com/). I've used TotalView pretty extensively and 
> > > found it to be pretty easy to use. They are both commercial, however, and 
> > > not cheap. 
> > > 
> > > As far as I know, there isn't a whole lot of open source support for 
> > > parallel debugging. The Parallel Tools Platform of Eclipse claims to 
> > > provide a parallel debugger, though I have yet to try it 
> > > (http://www.eclipse.org/ptp/).
> > 
> > Jeremy has covered the graphical parallel debuggers that I'm aware of, for 
> > a different approach there is padb which isn't a "parallel debugger" in the 
> > traditional model but is able to show you the same type of information, it 
> > won't allow you to point-and-click through the source or single step 
> > through the code but it is lightweight and will show you the information 
> > which you need to know. 
> > 
> > Padb needs to integrate with the resource manager, I know it works with 
> > pbs_pro but it seems there are a few issues on your system which is pbs 
> > (without the pro). I can help you with this and work through the problems 
> > but only if you work with me and provide details of the integration, in 
> > particular I've sent you a version which has a small patch and some debug 
> > printfs added, if you could send me the output from this I'd be able to 
> > tell you if it was likely to work and how to go about making it do so.
> > 
> > Ashley.
> > 
> > -- 
> > 
> > Ashley Pittman, Bath, UK.
> > 
> > Padb - A parallel job inspection tool for cluster computing
> > http://padb.pittman.org.uk
> > 
> > 

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk




Re: [OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Jack Bryan

thanks

I have run padb (the new one with your patch) on my system and got :

-bash-3.2$ padb -Ormgr=pbs -Q 48516.cluster
$VAR1 = {};
Job 48516.cluster  is not active

Actually, the job is running. 

How to check whether my system has pbs_pro ?

Any help is appreciated. 

thanks
Jinxu Ding

Oct. 29 2010

> From: ash...@pittman.co.uk
> Date: Fri, 29 Oct 2010 18:21:46 +0100
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] open MPI please recommend a debugger for open MPI
> 
> 
> On 29 Oct 2010, at 12:06, Jeremy Roberts wrote:
> 
> > I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or 
> > DDT (http://www.allinea.com/).  I've used TotalView pretty extensively and 
> > found it to be pretty easy to use.  They are both commercial, however, and 
> > not cheap.  
> > 
> > As far as I know, there isn't a whole lot of open source support for 
> > parallel debugging. The Parallel Tools Platform of Eclipse claims to 
> > provide a parallel debugger, though I have yet to try it 
> > (http://www.eclipse.org/ptp/).
> 
> Jeremy has covered the graphical parallel debuggers that I'm aware of, for a 
> different approach there is padb which isn't a "parallel debugger" in the 
> traditional model but is able to show you the same type of information, it 
> won't allow you to point-and-click through the source or single step through 
> the code but it is lightweight and will show you the information which you 
> need to know. 
> 
> Padb needs to integrate with the resource manager, I know it works with 
> pbs_pro but it seems there are a few issues on your system which is pbs 
> (without the pro).  I can help you with this and work through the problems 
> but only if you work with me and provide details of the integration, in 
> particular I've sent you a version which has a small patch and some debug 
> printfs added, if you could send me the output from this I'd be able to tell 
> you if it was likely to work and how to go about making it do so.
> 
> Ashley.
> 
> -- 
> 
> Ashley Pittman, Bath, UK.
> 
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk
> 
> 
  

Re: [OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Ashley Pittman

On 29 Oct 2010, at 12:06, Jeremy Roberts wrote:

> I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or DDT 
> (http://www.allinea.com/).  I've used TotalView pretty extensively and found 
> it to be pretty easy to use.  They are both commercial, however, and not 
> cheap.  
> 
> As far as I know, there isn't a whole lot of open source support for parallel 
> debugging. The Parallel Tools Platform of Eclipse claims to provide a 
> parallel debugger, though I have yet to try it (http://www.eclipse.org/ptp/).

Jeremy has covered the graphical parallel debuggers that I'm aware of.  For a 
different approach there is padb, which isn't a "parallel debugger" in the 
traditional model but is able to show you the same type of information.  It 
won't allow you to point-and-click through the source or single-step through 
the code, but it is lightweight and will show you the information which you 
need to know. 

Padb needs to integrate with the resource manager.  I know it works with 
pbs_pro but it seems there are a few issues on your system, which is pbs 
(without the pro).  I can help you with this and work through the problems, 
but only if you work with me and provide details of the integration.  In 
particular, I've sent you a version which has a small patch and some debug 
printfs added; if you could send me the output from this I'd be able to tell 
you if it was likely to work and how to go about making it do so.

Ashley.

-- 

Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk




Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)

2010-10-29 Thread Reuti
On 29.10.2010 at 18:47, Jeff Squyres wrote:

> On Oct 29, 2010, at 12:40 PM, Reuti wrote:
> 
>>> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally 
>>> *NOT* assume that different versions like this are compatible.
>> 
>> I'm getting confused, as these versions are exactly fitting "x.(y+1).*" 
>> which you mention below. So they should work together by design.
> 
> It depends on what you mean by "work together".
> 
> 1. OMPI provides an ABI guarantee for x.y.* and x.(y+1).*, where y is odd.  
> So if you compile your MPI app with Open MPI v1.4.1, it'll work just fine 
> with 1.4.3.  (the only disclaimer is that this guarantee started with 
> v1.3.2).  Note that y must be odd -- so if you compile your MPI app with 
> v1.4.1, it does *not* necessarily work with v1.5.  Indeed, we broke ABI 
> between the v1.3/v1.4 series and the v1.5 series (our ABI guarantee allows us 
> to do this).

Yep, I read it this way in your first reply.


> 2. OMPI does *not* provide multi-version *interoperability* guarantees.  Say 
> you compile your MPI app against OMPI v1.4.1.  Then you run it across a bunch 
> of nodes, but some nodes have OMPI v1.4.1 on them and others have OMPI v1.4.3 
> (i.e., your app gets libmpi.so from v1.4.1 on some nodes and libmpi.so from 
> v1.4.3 on other nodes).  This is absolutely not guaranteed to work -- we 
> don't even try to maintain this kind of compatibility.

Aha, now I see. When all nodes have the same version, it's certainly no 
problem, but with different versions on different nodes you of course get a 
mixture of libraries within one and the same execution. So when, e.g., the 
protocol for a message which is sent to another node has changed, it will break.


NB: If I were to upgrade my cluster in two steps, I would for a short time 
adjust the queuing system so that each parallel job only gets nodes which all 
have the same version.

-- Reuti


> 
> Does that make sense?
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 




Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)

2010-10-29 Thread guillaume ranquet

I guess we will play it safe and upgrade every cluster at once so that
we won't get bad surprises.

thank you Jeff.
On 10/29/2010 06:40 PM, Reuti wrote:
> Hi,
> 
> On 29.10.2010 at 18:27, Jeff Squyres wrote:
> 
>> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally *NOT* 
>> assume that different versions like this are compatible.
> 
> I'm getting confused, as these versions are exactly fitting "x.(y+1).*" which 
> you mention below. So they should work together by design.
> 
> -- Reuti
> 
> 
>> Open MPI makes an ABI promise (that started with version 1.3.2) that all the 
>> releases in a given feature series and its corresponding super-stable series 
>> (i.e., x.y.* and x.(y+1).*, where y is odd) are ABI compatible.  But we make 
>> no guarantees about wire protocols being compatible, or other things like 
>> that.  
>>
>> So in general, it's "pleasantly surprising" if the different releases work 
>> together, but I wouldn't rely on it *at all*.  :-)
>>

If I understand correctly, ABI compatible means something compiled with x.y.*
will run on x.(y+1).* without the need for you to recompile.
Mixing x.y and x.(y+1) on the same machinefile (and that's what we are
talking about) can only work by accident, not by design.

>>
>> On Oct 29, 2010, at 12:12 PM, guillaume ranquet wrote:
>>
> Hi list,
> I'm sorry to bother you with a stupid question.
> 
> we intend to have for a short period of time, some nodes with 1.4.3 and
> others with 1.4.1 (before upgrading everyone to 1.4.3).
> 
> I made various tests and found both versions to be running together quite
> well with a mixed set of nodes.
> 
> my tests were quite simple, I compiled and ran mpi hello_worlds with
> both versions.
> It wouldn't be serious for me to assume both versions fully compatible
> after these tests -and I must admit I lack the time and technical
> knowledge to run further testing.
> 
> has anyone any insight on what has changed that would break compatibility?
> I guess nothing, since they are the same major.minor :)
> 
> 
> regards,
> Guillaume Ranquet.
>>>
>>
>>
>> -- 
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>




Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)

2010-10-29 Thread Jeff Squyres
On Oct 29, 2010, at 12:40 PM, Reuti wrote:

>> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally *NOT* 
>> assume that different versions like this are compatible.
> 
> I'm getting confused, as these versions are exactly fitting "x.(y+1).*" which 
> you mention below. So they should work together by design.

It depends on what you mean by "work together".

1. OMPI provides an ABI guarantee for x.y.* and x.(y+1).*, where y is odd.  So 
if you compile your MPI app with Open MPI v1.4.1, it'll work just fine with 
1.4.3.  (the only disclaimer is that this guarantee started with v1.3.2).  Note 
that y must be odd -- so if you compile your MPI app with v1.4.1, it does *not* 
necessarily work with v1.5.  Indeed, we broke ABI between the v1.3/v1.4 series 
and the v1.5 series (our ABI guarantee allows us to do this).

2. OMPI does *not* provide multi-version *interoperability* guarantees.  Say 
you compile your MPI app against OMPI v1.4.1.  Then you run it across a bunch 
of nodes, but some nodes have OMPI v1.4.1 on them and others have OMPI v1.4.3 
(i.e., your app gets libmpi.so from v1.4.1 on some nodes and libmpi.so from 
v1.4.3 on other nodes).  This is absolutely not guaranteed to work -- we don't 
even try to maintain this kind of compatibility.
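
As a rough illustration of point 1 only, the version rule can be written down 
as a small Python sketch.  The function names are made up for this example 
and it models nothing beyond the x.y.* / x.(y+1).* scheme (y odd, starting at 
v1.3.2) described above; it is not an official Open MPI tool.

```python
def parse_version(text):
    """Split 'v1.4.3' or '1.4.3' into a tuple of ints, e.g. (1, 4, 3)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def abi_series(version):
    """Return (x, odd_y) naming the ABI series a version belongs to.

    x.y.* (y odd) and x.(y+1).* share one series, so 1.3.* and 1.4.*
    both map to (1, 3), while 1.5.* maps to (1, 5).
    """
    x, y = version[0], version[1]
    return (x, y if y % 2 == 1 else y - 1)


def same_abi_series(a, b):
    """True if both versions fall under the same ABI guarantee."""
    va, vb = parse_version(a), parse_version(b)
    # The guarantee only started with v1.3.2.
    if va < (1, 3, 2) or vb < (1, 3, 2):
        return False
    return abi_series(va) == abi_series(vb)


print(same_abi_series("1.4.1", "1.4.3"))  # True
print(same_abi_series("1.3.2", "1.4.3"))  # True
print(same_abi_series("1.4.1", "1.5"))    # False
```

A True result here only speaks to running one application binary against a 
different installed version.  It says nothing about point 2, i.e. mixing 
versions across nodes within a single job.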

Does that make sense?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)

2010-10-29 Thread Reuti
Hi,

On 29.10.2010 at 18:27, Jeff Squyres wrote:

> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally *NOT* 
> assume that different versions like this are compatible.

I'm getting confused, as these versions are exactly fitting "x.(y+1).*" which 
you mention below. So they should work together by design.

-- Reuti


> Open MPI makes an ABI promise (that started with version 1.3.2) that all the 
> releases in a given feature series and its corresponding super-stable series 
> (i.e., x.y.* and x.(y+1).*, where y is odd) are ABI compatible.  But we make 
> no guarantees about wire protocols being compatible, or other things like 
> that.  
> 
> So in general, it's "pleasantly surprising" if the different releases work 
> together, but I wouldn't rely on it *at all*.  :-)
> 
> 
> On Oct 29, 2010, at 12:12 PM, guillaume ranquet wrote:
> 
>> 
>> Hi list,
>> I'm sorry to bother you with a stupid question.
>> 
>> we intend to have for a short period of time, some nodes with 1.4.3 and
>> others with 1.4.1 (before upgrading everyone to 1.4.3).
>> 
>> I made various tests and found both versions to be running together quite
>> well with a mixed set of nodes.
>> 
>> my tests were quite simple, I compiled and ran mpi hello_worlds with
>> both versions.
>> It wouldn't be serious for me to assume both versions fully compatible
>> after these tests -and I must admit I lack the time and technical
>> knowledge to run further testing.
>> 
>> has anyone any insight on what has changed that would break compatibility?
>> I guess nothing, since they are the same major.minor :)
>> 
>> 
>> regards,
>> Guillaume Ranquet.
>> 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 




Re: [OMPI users] Need Help for understand heat equation 2D mpi solving version

2010-10-29 Thread Eugene Loh




christophe petit wrote:

  
> I am still trying to understand the parallelized version of the heat
> equation 2D solving that we saw at school.
> I am confused between the shift of the values near to the bounds done
> by the "updateBound" routine and the main loop (at line 161 in main
> code) which calls the routine "Explicit".
  
  

Each process "owns" a subdomain of cells, for which it will compute
updated values.  The process has storage not only for these cells,
which it owns, but also for a perimeter of cells, whose values need to
be fetched from nearby processes.  So, there are two steps.  In
"updateBound", processes communicate so that each supplies boundary
values to neighbors and gets boundary values from neighbors.  In
"Explicit", the computation (stencil operation) is performed.
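
To make the two steps concrete, here is a small Python sketch.  It is not the 
course code: it is 1D rather than 2D, it runs both "processes" inside one 
Python program, and a plain copy stands in for the MPI_Sendrecv() exchange.  
The function names only mirror the routine names from the question, and the 
data and coefficient are made up.

```python
def explicit(u, alpha):
    """One explicit stencil step on the owned cells u[1:-1].

    u[0] and u[-1] are ghost (or fixed boundary) cells and are copied
    through unchanged.
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def update_bound(left, right):
    """Exchange boundary values between two neighbouring subdomains.

    Stands in for the MPI_Sendrecv() calls: each side's outermost owned
    cell is copied into the neighbour's ghost cell.
    """
    left[-1] = right[1]   # right neighbour's first owned cell
    right[0] = left[-2]   # left neighbour's last owned cell


alpha = 0.25
full = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 0.0]  # end values stay fixed

# Split the 6 interior cells between two "processes".
left = full[0:4] + [0.0]    # owns full[1..3]; index 4 is a ghost cell
right = [0.0] + full[4:8]   # owns full[4..6]; index 0 is a ghost cell

for _ in range(3):
    update_bound(left, right)       # communication step
    left = explicit(left, alpha)    # computation on owned cells
    right = explicit(right, alpha)
    full = explicit(full, alpha)    # serial reference on the whole domain

distributed = left[1:4] + right[1:4]
print(distributed == full[1:7])  # True
```

Every subdomain has to run the stencil step, and the ghost cells must be 
refreshed before each step; otherwise the split result would drift away from 
the serial one.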

  
> For a given process (say number 1) (I use 4 here for execution), I send to
> the east process (3) the penultimate left column, to the north process (0)
> the penultimate top row, and to the others (mpi_proc_null=-2) the
> penultimate right column and the bottom row. But how are the 4 processes
> synchronized?
  

When "updateBound" is called, neighboring processes are implicitly
synchronized via the MPI_Sendrecv() calls.

  
> I don't understand either why all the processes go through the solving
> piece of code calling the "Explicit" routine.
  
  

The computational domain is distributed among all processes.  Each cell
must be updated with the stencil operation.  So, each process calls
that computation for the cells that it owns.

You should be able to get better interactivity at your school than on
this mailing list.  Further, your questions at school would help the
instructor get feedback from the students.




Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)

2010-10-29 Thread Jeff Squyres
I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally *NOT* 
assume that different versions like this are compatible.

Open MPI makes an ABI promise (that started with version 1.3.2) that all the 
releases in a given feature series and its corresponding super-stable series 
(i.e., x.y.* and x.(y+1).*, where y is odd) are ABI compatible.  But we make no 
guarantees about wire protocols being compatible, or other things like that.  

So in general, it's "pleasantly surprising" if the different releases work 
together, but I wouldn't rely on it *at all*.  :-)


On Oct 29, 2010, at 12:12 PM, guillaume ranquet wrote:

> 
> Hi list,
> I'm sorry to bother you with a stupid question.
> 
> we intend to have for a short period of time, some nodes with 1.4.3 and
> others with 1.4.1 (before upgrading everyone to 1.4.3).
> 
> I made various tests and found both versions to be running together quite
> well with a mixed set of nodes.
> 
> my tests were quite simple, I compiled and ran mpi hello_worlds with
> both versions.
> It wouldn't be serious for me to assume both versions fully compatible
> after these tests -and I must admit I lack the time and technical
> knowledge to run further testing.
> 
> has anyone any insight on what has changed that would break compatibility?
> I guess nothing, since they are the same major.minor :)
> 
> 
> regards,
> Guillaume Ranquet.
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)

2010-10-29 Thread guillaume ranquet

Hi list,
I'm sorry to bother you with a stupid question.

We intend to have, for a short period of time, some nodes with 1.4.3 and
others with 1.4.1 (before upgrading everyone to 1.4.3).

I ran various tests and found both versions to be running together quite
well with a mixed set of nodes.

My tests were quite simple: I compiled and ran MPI hello_worlds with
both versions.
It wouldn't be serious of me to assume both versions are fully compatible
after these tests, and I must admit I lack the time and technical
knowledge to run further testing.

Does anyone have any insight into what has changed that would break
compatibility?
I guess nothing, since they are the same major.minor :)


regards,
Guillaume Ranquet.


Re: [OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Brian Austin
I find that using mpirun to launch multiple instances of a serial
debugger is fairly usable (but not perfect) for jobs with fewer than
about four processes.
A description of how to do this is here:
http://www.open-mpi.org/faq/?category=debugging

The biggest drawbacks to this approach are that
a) setting breakpoints and stepping between lines must be controlled
separately for each process
b) restarting the job requires ending all of your debugger sessions.

-Brian

On Fri, Oct 29, 2010 at 4:06 AM, Jeremy Roberts
 wrote:
> I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or DDT
> (http://www.allinea.com/).  I've used TotalView pretty extensively and found
> it to be pretty easy to use.  They are both commercial, however, and not
> cheap.
>
> As far as I know, there isn't a whole lot of open source support for
> parallel debugging. The Parallel Tools Platform of Eclipse claims to provide
> a parallel debugger, though I have yet to try it
> (http://www.eclipse.org/ptp/).
>
> Jeremy
>
> On Fri, Oct 29, 2010 at 12:55 AM, Jack Bryan  wrote:
>>
>> Hi,
>> Would you please recommend a debugger, which can do debugging for parallel
>> processes
>> on Open MPI systems ?
>> I hope that it can be installed without root right because I am not a root
>> user for our
>> MPI cluster.
>> Any help is appreciated.
>> Thanks
>> Jack
>> Oct. 28 2010
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] failed to install openmpi trunk

2010-10-29 Thread Ralph Castain
Couple of things stand out:

1. you definitely don't want to use a copy of the trunk beyond r23924. The 
developer's trunk is undergoing some major change and orcm no longer is in-sync 
with it. I probably won't update orcm to match until later this year (will 
freeze integration at r23924).

2. the configure options don't look right to me - they should simply be:
./configure --prefix= --with-platform=contrib/platform/cisco/linux

I believe the errors are caused by confusion due to the various configure 
options.
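
One quick way to look for conflicting settings is to check whether any
option was passed more than once. The sketch below is only an illustration:
the helper name is made up, and the short flag list is a subset copied from
the configure output quoted below. A repeated option is a hint worth
checking, not proof that it caused the compile errors.

```python
from collections import defaultdict


def repeated_options(args):
    """Return {option: [values...]} for every --option given more than once."""
    seen = defaultdict(list)
    for arg in args:
        if not arg.startswith("--"):
            continue  # skip VAR=value settings such as CC=...
        name, _, value = arg.partition("=")
        seen[name].append(value)
    return {name: values for name, values in seen.items() if len(values) > 1}


# Subset of the flags from the quoted configure line.
flags = [
    "--prefix=/usr",
    "--sysconfdir=/etc",
    "--localstatedir=/var/lib",
    "--libdir=/usr/lib64",
    "--sysconfdir=/etc/openmpi",
    "--with-ft=orcm",
]
print(repeated_options(flags))
```

On that subset it reports --sysconfdir, which appears with both /etc and
/etc/openmpi.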

HTH
Ralph

On Oct 29, 2010, at 3:32 AM, Vasiliy G Tolstov wrote:

> Hello. I'm trying to build orcm; as a dependency it needs the openmpi
> trunk with some options enabled.
> 
> Install fails with message:
> Creating orte-migrate.1 man page...
> x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../../opal/include
> -I../../../orte/include
> -I../../../opal/mca/paffinity/hwloc/hwloc/include/private
> -I../../../opal/mca/paffinity/hwloc/hwloc/include/hwloc   -I../../..
> -march=native -pipe -O2 -g -Wall -Wundef -Wno-long-long -Wsign-compare
> -Wmissing-prototypes -Wstrict-prototypes -Wcomment -pedantic
> -Werror-implicit-function-declaration -finline-functions
> -fno-strict-aliasing -pthread
> -I/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/opal/mca/paffinity/hwloc/hwloc/include
>  -fvisibility=hidden -c -o orte-migrate.o orte-migrate.c
> make[2]: Leaving directory
> `/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte/tools/orte-migrate'
> make[1]: Leaving directory
> `/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte'
> orte-migrate.c:101:39: error: 'ORTE_ERRMGR_MIGRATE_STATE_NONE'
> undeclared here (not in a function)
> orte-migrate.c: In function 'main':
> orte-migrate.c:221:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH'
> undeclared (first use in this function)
> orte-migrate.c:221:12: note: each undeclared identifier is reported only
> once for each function it appears in
> orte-migrate.c:222:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERROR'
> undeclared (first use in this function)
> orte-migrate.c:223:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS'
> undeclared (first use in this function)
> orte-migrate.c: In function 'hnp_receiver':
> orte-migrate.c:531:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared
> (first use in this function)
> orte-migrate.c:531:33: error: expected ';' before 'command'
> orte-migrate.c:532:5: warning: ISO C90 forbids mixed declarations and
> code
> orte-migrate.c:542:56: error: 'command' undeclared (first use in this
> function)
> orte-migrate.c:542:73: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared
> (first use in this function)
> orte-migrate.c:548:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_UPDATE_CMD'
> undeclared (first use in this function)
> orte-migrate.c:555:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_INIT_CMD'
> undeclared (first use in this function)
> orte-migrate.c: In function 'process_ckpt_update_cmd':
> orte-migrate.c:597:9: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS'
> undeclared (first use in this function)
> orte-migrate.c:609:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH'
> undeclared (first use in this function)
> orte-migrate.c: In function 'notify_hnp':
> orte-migrate.c:622:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared
> (first use in this function)
> orte-migrate.c:622:33: error: expected ';' before 'command'
> orte-migrate.c:643:55: error: 'command' undeclared (first use in this
> function)
> orte-migrate.c:643:67: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared
> (first use in this function)
> orte-migrate.c: In function 'pretty_print_status':
> orte-migrate.c:710:5: error: implicit declaration of function
> 'orte_errmgr_base_migrate_state_str'
> make[2]: *** [orte-migrate.o] Error 1
> make[1]: *** [all-recursive] Error 1
> make: *** [all-recursive] Error 1
> 
> 
> Configured with flags:
> configure: OMPI configuring in opal/libltdl
> configure: running /bin/sh './configure'  '--prefix=/usr'
> '--host=x86_64-pc-linux-gnu' '--build=x86_64-pc-linux-gnu'
> '--mandir=/usr/share/man' '--infodir=/usr/share/info'
> '--datadir=/usr/share' '--docdir=/usr/share/doc/openmpi-scm'
> '--sysconfdir=/etc' '--localstatedir=/var/lib'
> '--disable-dependency-tracking' '--disable-silent-rules'
> '--enable-fast-install' '--libdir=/usr/lib64'
> '--sysconfdir=/etc/openmpi' '--enable-pretty-print-stacktrace'
> '--enable-orterun-prefix-by-default'
> '--with-platform=contrib/platform/cisco/ebuild/native'
> '--enable-multicast' '--with-ft=orcm' '--enable-sensors'
> '--enable-heartbeat' '--enable-mpi-threads' '--enable-progress-threads'
> '--disable-mpi-f90' '--disable-mpi-f77' '--enable-contrib-no-build=vt'
> '--disable-io-romio' '--enable-heterogeneous' '--enable-ipv6'
> 'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu'
> 'CC=x86_64-pc-linux-gnu-gcc' 'CFLAGS=-march=native -pipe -O2' 'CPP=cpp'
> --enable-ltdl-convenience --disable-ltdl-install --enable-shared
> --disable-static --cache-file=/d

Re: [OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Jeremy Roberts
I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or DDT
(http://www.allinea.com/).  I've used TotalView pretty extensively and found
it to be pretty easy to use.  They are both commercial, however, and not
cheap.

As far as I know, there isn't a whole lot of open source support for
parallel debugging. The Parallel Tools Platform of Eclipse claims to provide
a parallel debugger, though I have yet to try it (
http://www.eclipse.org/ptp/).

Jeremy

On Fri, Oct 29, 2010 at 12:55 AM, Jack Bryan  wrote:

>  Hi,
>
> Would you please recommend a debugger, which can do debugging for parallel
> processes
> on Open MPI systems ?
>
> I hope that it can be installed without root right because I am not a root
> user for our
> MPI cluster.
>
> Any help is appreciated.
>
> Thanks
>
> Jack
>
> Oct. 28 2010


Re: [OMPI users] cannot install Open MPI 1.5 on Solaris x86_64 with Oracle/Sun C 5.11

2010-10-29 Thread Terry Dontje
 Sorry, but can you give us the configure line, the config.log and the
full output of make, preferably with make V=1?


--td
On 10/29/2010 04:30 AM, Siegmar Gross wrote:

Hi,

I tried to build Open MPI 1.5 on Solaris X86 and x86_64 with Oracle
Studio 12.2. I can compile Open MPI with thread support, but I can
only partly install it because "libtool" will not find "f95" although
it is available. "make check" shows no failures.

tyr openmpi-1.5-SunOS.x86_64.32_cc 188 ssh sunpc4 cc -V
cc: Sun C 5.11 SunOS_i386 145355-01 2010/10/11
usage: cc [ options ] files.  Use 'cc -flags' for details

No suspicious warnings or errors in log.configure.SunOS.x86_64.32_cc.

tyr openmpi-1.5-SunOS.x86_64.32_cc 182 grep -i warning:
   log.make.SunOS.x86_64.32_cc | more

".../opal/mca/crs/none/crs_none_module.c", line 136:
   warning: statement not reached

".../orte/mca/errmgr/errmgr.h", line 135: warning: attribute
   "noreturn" may not be applied to variable, ignored
(a lot of these warnings)

".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 982: warning:
   assignment type mismatch:
".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 1023: warning:
   assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 877: warning:
   assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 918: warning:
   assignment type mismatch:

".../orte/tools/orte-ps/orte-ps.c", line 288: warning:
   initializer does not fit or is out of range: 0xfffe
".../orte/tools/orte-ps/orte-ps.c", line 289: warning:
   initializer does not fit or is out of range: 0xfffe

grep -i error: log.make.SunOS.x86_64.32_cc | more

tyr openmpi-1.5-SunOS.x86_64.32_cc 185 grep -i FAIL
   log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 186 grep -i SKIP
   log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 187 grep -i PASS
   log.make-check.SunOS.x86_64.32_cc
PASS: predefined_gap_test
File opened with dladvise_local, all passed
PASS: dlopen_test
All 2 tests passed
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_barrier
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_barrier_noinline
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_spinlock
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_spinlock_noinline
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_math
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_math_noinline
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_cmpset
 - 1 threads: Passed
 - 2 threads: Passed
 - 4 threads: Passed
 - 5 threads: Passed
 - 8 threads: Passed
PASS: atomic_cmpset_noinline
All 8 tests passed
All 0 tests passed
All 0 tests passed
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [PASSED]
PASS: ddt_raw
All 5 tests passed
SUPPORT: OMPI Test Passed: opal_path_nfs(): (0 tests)
PASS: opal_path_nfs
1 test passed


tyr openmpi-1.5-SunOS.x86_64.32_cc 190 grep -i warning:
   log.make-install.SunOS.x86_64.32_cc | more
libtool: install: warning: relinking `libmpi_cxx.la'
libtool: install: warning: relinking `libmpi_f77.la'
libtool: install: warning: relinking `libmpi_f90.la'

tyr openmpi-1.5-SunOS.x86_64.32_cc 191 grep -i error:
   log.make-install.SunOS.x86_64.32_cc | more
libtool: install: error: relink `libmpi_f90.la' with the above
   command before installing it

tyr openmpi-1.5-SunOS.x86_64.32_cc 194 tail -20
   log.make-install.SunOS.x86_64.32_cc
make[4]: Leaving directory `.../ompi/mpi/f90/scripts'
make[4]: Entering directory `.../ompi/mpi/f90'
make[5]: Entering directory `.../ompi/mpi/f90'
test -z "/usr/local/openmpi-1.5_32_cc/lib" ||
   /usr/local/bin/mkdir -p "/usr/local/openmpi-1.5_32_cc/lib"
  /bin/bash ../../../libtool   --mode=install /usr/local/bin/install -c
libmpi_f90.la '/usr/local/openmpi-1.5_32_cc/lib'
libtool: install: warning: relinking `libmpi_f90.la'
libtool: install: (cd
/export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/ompi/mpi/f90; /bin/bash
/export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/libtool  --silent --tag 
FC
--mode=relink f95 -I../../../ompi/include 
-I../../../../openmpi-1.5/ompi/include -I.
-I../../../../openmpi-1.5/ompi/mpi/f90 -I../../../ompi/mpi/f90 -m32 
-version-info 1:0:0
-export-dynamic -m32 -o libmpi_f90.la -rpath /usr/local/openmpi-1.5_32_cc/lib 
mpi.lo
mpi_sizeof.lo mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo 
mpi_testsome_f90.

[OMPI users] failed to install openmpi trunk

2010-10-29 Thread Vasiliy G Tolstov
Hello. I'm trying to build orcm; as a dependency it needs the openmpi
trunk with some options enabled.

Install fails with message:
Creating orte-migrate.1 man page...
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../../opal/include
-I../../../orte/include
-I../../../opal/mca/paffinity/hwloc/hwloc/include/private
-I../../../opal/mca/paffinity/hwloc/hwloc/include/hwloc   -I../../..
-march=native -pipe -O2 -g -Wall -Wundef -Wno-long-long -Wsign-compare
-Wmissing-prototypes -Wstrict-prototypes -Wcomment -pedantic
-Werror-implicit-function-declaration -finline-functions
-fno-strict-aliasing -pthread
-I/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/opal/mca/paffinity/hwloc/hwloc/include
 -fvisibility=hidden -c -o orte-migrate.o orte-migrate.c
make[2]: Leaving directory
`/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte/tools/orte-migrate'
make[1]: Leaving directory
`/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte'
orte-migrate.c:101:39: error: 'ORTE_ERRMGR_MIGRATE_STATE_NONE'
undeclared here (not in a function)
orte-migrate.c: In function 'main':
orte-migrate.c:221:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH'
undeclared (first use in this function)
orte-migrate.c:221:12: note: each undeclared identifier is reported only
once for each function it appears in
orte-migrate.c:222:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERROR'
undeclared (first use in this function)
orte-migrate.c:223:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS'
undeclared (first use in this function)
orte-migrate.c: In function 'hnp_receiver':
orte-migrate.c:531:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared
(first use in this function)
orte-migrate.c:531:33: error: expected ';' before 'command'
orte-migrate.c:532:5: warning: ISO C90 forbids mixed declarations and
code
orte-migrate.c:542:56: error: 'command' undeclared (first use in this
function)
orte-migrate.c:542:73: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared
(first use in this function)
orte-migrate.c:548:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_UPDATE_CMD'
undeclared (first use in this function)
orte-migrate.c:555:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_INIT_CMD'
undeclared (first use in this function)
orte-migrate.c: In function 'process_ckpt_update_cmd':
orte-migrate.c:597:9: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS'
undeclared (first use in this function)
orte-migrate.c:609:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH'
undeclared (first use in this function)
orte-migrate.c: In function 'notify_hnp':
orte-migrate.c:622:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared
(first use in this function)
orte-migrate.c:622:33: error: expected ';' before 'command'
orte-migrate.c:643:55: error: 'command' undeclared (first use in this
function)
orte-migrate.c:643:67: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared
(first use in this function)
orte-migrate.c: In function 'pretty_print_status':
orte-migrate.c:710:5: error: implicit declaration of function
'orte_errmgr_base_migrate_state_str'
make[2]: *** [orte-migrate.o] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1


Configured with flags:
configure: OMPI configuring in opal/libltdl
configure: running /bin/sh './configure'  '--prefix=/usr'
'--host=x86_64-pc-linux-gnu' '--build=x86_64-pc-linux-gnu'
'--mandir=/usr/share/man' '--infodir=/usr/share/info'
'--datadir=/usr/share' '--docdir=/usr/share/doc/openmpi-scm'
'--sysconfdir=/etc' '--localstatedir=/var/lib'
'--disable-dependency-tracking' '--disable-silent-rules'
'--enable-fast-install' '--libdir=/usr/lib64'
'--sysconfdir=/etc/openmpi' '--enable-pretty-print-stacktrace'
'--enable-orterun-prefix-by-default'
'--with-platform=contrib/platform/cisco/ebuild/native'
'--enable-multicast' '--with-ft=orcm' '--enable-sensors'
'--enable-heartbeat' '--enable-mpi-threads' '--enable-progress-threads'
'--disable-mpi-f90' '--disable-mpi-f77' '--enable-contrib-no-build=vt'
'--disable-io-romio' '--enable-heterogeneous' '--enable-ipv6'
'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu'
'CC=x86_64-pc-linux-gnu-gcc' 'CFLAGS=-march=native -pipe -O2' 'CPP=cpp'
--enable-ltdl-convenience --disable-ltdl-install --enable-shared
--disable-static --cache-file=/dev/null --srcdir=.
--disable-option-checking


-- 
Vasiliy G Tolstov 
Selfip.Ru



[OMPI users] cannot install Open MPI 1.5 on Solaris x86_64 with Oracle/Sun C 5.11

2010-10-29 Thread Siegmar Gross
Hi,

I tried to build Open MPI 1.5 on Solaris X86 and x86_64 with Oracle
Studio 12.2. I can compile Open MPI with thread support, but I can
only partly install it because "libtool" will not find "f95" although
it is available. "make check" shows no failures.

tyr openmpi-1.5-SunOS.x86_64.32_cc 188 ssh sunpc4 cc -V
cc: Sun C 5.11 SunOS_i386 145355-01 2010/10/11
usage: cc [ options ] files.  Use 'cc -flags' for details

No suspicious warnings or errors in log.configure.SunOS.x86_64.32_cc.

tyr openmpi-1.5-SunOS.x86_64.32_cc 182 grep -i warning:
  log.make.SunOS.x86_64.32_cc | more

".../opal/mca/crs/none/crs_none_module.c", line 136:
  warning: statement not reached

".../orte/mca/errmgr/errmgr.h", line 135: warning: attribute
  "noreturn" may not be applied to variable, ignored
(a lot of these warnings)

".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 982: warning:
  assignment type mismatch:
".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 1023: warning:
  assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 877: warning:
  assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 918: warning:
  assignment type mismatch:

".../orte/tools/orte-ps/orte-ps.c", line 288: warning:
  initializer does not fit or is out of range: 0xfffe
".../orte/tools/orte-ps/orte-ps.c", line 289: warning:
  initializer does not fit or is out of range: 0xfffe

grep -i error: log.make.SunOS.x86_64.32_cc | more

tyr openmpi-1.5-SunOS.x86_64.32_cc 185 grep -i FAIL
  log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 186 grep -i SKIP
  log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 187 grep -i PASS
  log.make-check.SunOS.x86_64.32_cc
PASS: predefined_gap_test
File opened with dladvise_local, all passed
PASS: dlopen_test
All 2 tests passed
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_barrier
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_barrier_noinline
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_spinlock
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_spinlock_noinline
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_math
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_math_noinline
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_cmpset
- 1 threads: Passed
- 2 threads: Passed
- 4 threads: Passed
- 5 threads: Passed
- 8 threads: Passed
PASS: atomic_cmpset_noinline
All 8 tests passed
All 0 tests passed
All 0 tests passed
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [PASSED]
PASS: ddt_raw
All 5 tests passed
SUPPORT: OMPI Test Passed: opal_path_nfs(): (0 tests)
PASS: opal_path_nfs
1 test passed


tyr openmpi-1.5-SunOS.x86_64.32_cc 190 grep -i warning:
  log.make-install.SunOS.x86_64.32_cc | more
libtool: install: warning: relinking `libmpi_cxx.la'
libtool: install: warning: relinking `libmpi_f77.la'
libtool: install: warning: relinking `libmpi_f90.la'

tyr openmpi-1.5-SunOS.x86_64.32_cc 191 grep -i error:
  log.make-install.SunOS.x86_64.32_cc | more
libtool: install: error: relink `libmpi_f90.la' with the above
  command before installing it

tyr openmpi-1.5-SunOS.x86_64.32_cc 194 tail -20
  log.make-install.SunOS.x86_64.32_cc
make[4]: Leaving directory `.../ompi/mpi/f90/scripts'
make[4]: Entering directory `.../ompi/mpi/f90'
make[5]: Entering directory `.../ompi/mpi/f90'
test -z "/usr/local/openmpi-1.5_32_cc/lib" ||
  /usr/local/bin/mkdir -p "/usr/local/openmpi-1.5_32_cc/lib"
 /bin/bash ../../../libtool   --mode=install /usr/local/bin/install -c
   libmpi_f90.la '/usr/local/openmpi-1.5_32_cc/lib'
libtool: install: warning: relinking `libmpi_f90.la'
libtool: install: (cd 
/export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/ompi/mpi/f90; /bin/bash 
/export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/libtool  --silent --tag 
FC 
--mode=relink f95 -I../../../ompi/include 
-I../../../../openmpi-1.5/ompi/include -I. 
-I../../../../openmpi-1.5/ompi/mpi/f90 -I../../../ompi/mpi/f90 -m32 
-version-info 1:0:0 
-export-dynamic -m32 -o libmpi_f90.la -rpath /usr/local/openmpi-1.5_32_cc/lib 
mpi.lo 
mpi_sizeof.lo mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo 
mpi_testsome_f90.lo 
mpi_waitall_f90.lo mpi_waitsome_f90.lo mpi_wtick_f90.lo mpi_wtime_f90.lo 
../../../ompi/mpi/f77/libmpi_f77.la -lsocket -lnsl -lrt -lm )
/export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/libtool:
  line 7846: f95:
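
A small check that may help narrow this down: list where (or whether) the
compilers resolve on the PATH of the shell that runs "make install". This is
only a sketch with a made-up helper name. It assumes the relink step looks
f95 up through PATH, which the log above suggests but does not prove.

```python
import shutil


def find_tool(name, path=None):
    """Return the full path of `name` on `path` (default: $PATH), or None."""
    return shutil.which(name, path=path)


# Run this from the same shell and user that runs "make install".
for tool in ("cc", "f95"):
    print(tool, "->", find_tool(tool))
```

If f95 prints as None here but resolves in the shell used for "make", the two
environments have different PATH settings.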

[OMPI users] Need Help for understand heat equation 2D mpi solving version

2010-10-29 Thread christophe petit
Hello,

>
> I am still trying to understand the parallelized version of the 2D heat
> equation solver that we saw at school. To explain my problem, I need to
> list the main code:
>
>   program heat
> !**********************************************************************
> !
> !   This program solves the heat equation on the unit square [0,1]x[0,1]
> !        | du/dt - Delta(u) = 0
> !        |  u/gamma = const
> !   by implementing an explicit scheme.
> !   The discretization is done using a 5 point finite difference scheme
> !   and the domain is decomposed into sub-domains.
> !   The PDE is discretized using a 5 point finite difference scheme
> !   over a (x_dim+2)*(x_dim+2) grid including the end points, which
> !   correspond to the boundary points that are stored.
> !
> !   The data on the whole domain are stored in
> !   the following way :
> !
> !    y
> !           ------------------------------------
> !    d      |                                  |
> !    i      |                                  |
> !    r      |                                  |
> !    e      |                                  |
> !    c      |                                  |
> !    t      |                                  |
> !    i      | x20                              |
> !    o /\   |                                  |
> !    n  |   | x10                              |
> !       |   |                                  |
> !       |   | x00  x01 x02 ...                 |
> !       |   ------------------------------------
> !        -------> x direction  x(*,j)
> !
> !   The boundary conditions are stored in the following submatrices
> !
> !
> !    x(1:x_dim, 0)          ---> left   temperature
> !    x(1:x_dim, x_dim+1)    ---> right  temperature
> !    x(0, 1:x_dim)          ---> top    temperature
> !    x(x_dim+1, 1:x_dim)    ---> bottom temperature
> !
> !**********************************************************************
>   implicit none
>   include 'mpif.h'
> ! size of the discretization
>   integer :: x_dim, nb_iter
>   double precision, allocatable :: x(:,:),b(:,:),x0(:,:)
>   double precision  :: dt, h, epsilon
>   double precision  :: resLoc, result, t, tstart, tend
> !
>   integer :: i,j
>   integer :: step, maxStep
>   integer :: size_x, size_y, me, x_domains,y_domains
>   integer :: iconf(5), size_x_glo
>   double precision conf(2)
> !
> ! MPI variables
>   integer :: nproc, infompi, comm, comm2d, lda, ndims
>   INTEGER, DIMENSION(2)  :: dims
>   LOGICAL, DIMENSION(2)  :: periods
>   LOGICAL, PARAMETER :: reorganisation = .false.
>   integer :: row_type
>   integer, parameter :: nbvi=4
>   integer, parameter :: S=1, E=2, N=3, W=4
>   integer, dimension(4) :: neighBor
>
> !
>   intrinsic abs
> !
> !
>   call MPI_INIT(infompi)
>   comm = MPI_COMM_WORLD
>   call MPI_COMM_SIZE(comm,nproc,infompi)
>   call MPI_COMM_RANK(comm,me,infompi)
> !
> !
>   if (me.eq.0) then
>       call readparam(iconf, conf)
>   endif
>   call MPI_BCAST(iconf,5,MPI_INTEGER,0,comm,infompi)
>   call MPI_BCAST(conf,2,MPI_DOUBLE_PRECISION,0,comm,infompi)
> !
>   size_x    = iconf(1)
>   size_y    = iconf(1)
>   x_domains = iconf(3)
>   y_domains = iconf(4)
>   maxStep   = iconf(5)
>   dt        = conf(1)
>   epsilon   = conf(2)
> !
>   size_x_glo = x_domains*size_x+2
>   h          = 1.0d0/dble(size_x_glo)
>   dt         = 0.25*h*h
> !
> !
>   lda = size_y+2
>   allocate(x(0:size_y+1,0:size_x+1))
>   allocate(x0(0:size_y+1,0:size_x+1))
>   allocate(b(0:size_y+1,0:size_x+1))
> !
> ! Create 2D cartesian grid
>   periods(:) = .false.
>
>   ndims = 2
>   dims(1)=x_domains
>   dims(2)=y_domains
>   CALL MPI_CART_CREATE(MPI_COMM_WORLD, ndims, dims, periods, &
>     reorganisation,comm2d,infompi)
> !
> ! Identify neighbors
> !
>   NeighBor(:) = MPI_PROC_NULL
> ! Left/West and right/East neighbors
>   CALL MPI_CART_SHIFT(comm2d,0,1,NeighBor(W),NeighBor(E),infompi)
>
>   print *,'mpi_proc_null=', MPI_PROC_NULL
>   print *,'rang=', me
>   print *, 'ici premier mpi_cart_shift : neighbor(w)=',NeighBor(W)
>   print *, 'ici premier mpi_cart_shift : neighbor(e)=',NeighBor(E)
>
> ! Bottom/South and Upper/North neighbors
>   CALL MPI_CART_SHIFT(comm2d,1,1,NeighBor(S),NeighBor(N),infompi)
>
>
>   print *, '
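
To make the neighbor lookup above easier to follow, here is a small
plain-Python emulation of what the two MPI_CART_SHIFT calls return on a
non-periodic 2D grid. It is a sketch, not MPI code: the function names are
made up, the 2x2 grid stands in for (x_domains, y_domains), and None stands
in for MPI_PROC_NULL, whose numeric value depends on the MPI implementation.
It relies on the row-major rank numbering that MPI uses for Cartesian
topologies.

```python
def cart_coords(rank, dims):
    """Row-major coordinates of `rank` in a 2D grid of shape `dims`."""
    return (rank // dims[1], rank % dims[1])


def cart_rank(coords, dims):
    """Rank at `coords`, or None (standing in for MPI_PROC_NULL) if outside."""
    if all(0 <= c < d for c, d in zip(coords, dims)):
        return coords[0] * dims[1] + coords[1]
    return None


def cart_shift(rank, dims, direction, disp=1):
    """Emulate MPI_CART_SHIFT without periodicity: return (source, dest)."""
    src = list(cart_coords(rank, dims))
    dst = list(src)
    src[direction] -= disp  # process we would receive from
    dst[direction] += disp  # process we would send to
    return cart_rank(src, dims), cart_rank(dst, dims)


dims = (2, 2)  # assumed values for (x_domains, y_domains)
for me in range(dims[0] * dims[1]):
    west, east = cart_shift(me, dims, direction=0)    # like NeighBor(W), NeighBor(E)
    south, north = cart_shift(me, dims, direction=1)  # like NeighBor(S), NeighBor(N)
    print(f"rank {me}: W={west} E={east} S={south} N={north}")
```

Because the listing creates comm2d with reorganisation = .false., a process
keeps the rank it had in MPI_COMM_WORLD, so the value of me printed by the
program matches the rank used here. Processes on the edge of the grid get
MPI_PROC_NULL for the missing neighbor.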

[OMPI users] open MPI please recommend a debugger for open MPI

2010-10-29 Thread Jack Bryan

Hi,
Would you please recommend a debugger that can debug parallel processes
on Open MPI systems?
I hope that it can be installed without root privileges, because I am not a
root user on our MPI cluster.
Any help is appreciated. 
Thanks
Jack
Oct. 28 2010