Re: [OMPI users] open MPI please recommend a debugger for open MPI
Not without a list of hostnames, no; if you can get that, then I have something to work with. From looking around on Google, "-n" might help here. Once I have this information you'll need to verify that you are able to ssh to these nodes without a password and that pdsh is installed, and give me the name of an environment variable that PBS sets for ranks within a job. I'm sure we can get something working, but it might be better to take this off-list or to the padb-users list to avoid spamming the Open MPI users list.

Ashley.

On 29 Oct 2010, at 18:44, Jack Bryan wrote:

> Hi,
>
> this is what I got :
>
> -bash-3.2$ qstat -n -u myName
>
> clsuter:
>                                                        Req'd  Req'd   Elap
> Job ID          Username Queue Jobname SessID NDS TSK Memory  Time  S Time
> --------------- -------- ----- ------- ------ --- --- ------ ----- - -----
> 48933.cluster.e myName   devel myJob   107835   1  --     --  00:02 C 00:00
>    n20/0
>
> Any help is appreciated.
>
> thanks
Re: [OMPI users] open MPI please recommend a debugger for open MPI
Hi,

this is what I got :

-bash-3.2$ qstat -n -u myName

clsuter:
                                                       Req'd  Req'd   Elap
Job ID          Username Queue Jobname SessID NDS TSK Memory  Time  S Time
--------------- -------- ----- ------- ------ --- --- ------ ----- - -----
48933.cluster.e myName   devel myJob   107835   1  --     --  00:02 C 00:00
   n20/0

Any help is appreciated.

thanks

> From: ash...@pittman.co.uk
> Date: Fri, 29 Oct 2010 18:38:25 +0100
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] open MPI please recommend a debugger for open MPI
>
> Can you try the following and send me the output.
>
> qstat -n -u `whoami` @clusterName
>
> The output sent before implies that your cluster is called "clusterName"
> rather than "cluster", which is a little surprising, but let's see what it
> gives us if we query on that basis.
>
> Ashley.
Re: [OMPI users] open MPI please recommend a debugger for open MPI
Can you try the following and send me the output.

qstat -n -u `whoami` @clusterName

The output sent before implies that your cluster is called "clusterName" rather than "cluster", which is a little surprising, but let's see what it gives us if we query on that basis.

Ashley.

On 29 Oct 2010, at 18:29, Jack Bryan wrote:

> thanks
>
> I have run padb (the new one with your patch) on my system and got :
>
> -bash-3.2$ padb -Ormgr=pbs -Q 48516.cluster
> $VAR1 = {};
> Job 48516.cluster is not active
>
> Actually, the job is running.
>
> How to check whether my system has pbs_pro ?
>
> Any help is appreciated.
>
> thanks
> Jinxu Ding
>
> Oct. 29 2010

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Re: [OMPI users] open MPI please recommend a debugger for open MPI
thanks

I have run padb (the new one with your patch) on my system and got :

-bash-3.2$ padb -Ormgr=pbs -Q 48516.cluster
$VAR1 = {};
Job 48516.cluster is not active

Actually, the job is running.

How to check whether my system has pbs_pro ?

Any help is appreciated.

thanks
Jinxu Ding

Oct. 29 2010
Re: [OMPI users] open MPI please recommend a debugger for open MPI
On 29 Oct 2010, at 12:06, Jeremy Roberts wrote:

> I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or DDT
> (http://www.allinea.com/). I've used TotalView pretty extensively and found
> it to be pretty easy to use. They are both commercial, however, and not
> cheap.
>
> As far as I know, there isn't a whole lot of open source support for
> parallel debugging. The Parallel Tools Platform of Eclipse claims to
> provide a parallel debugger, though I have yet to try it
> (http://www.eclipse.org/ptp/).

Jeremy has covered the graphical parallel debuggers that I'm aware of. For a different approach there is padb, which isn't a "parallel debugger" in the traditional sense but is able to show you the same type of information. It won't allow you to point-and-click through the source or single-step through the code, but it is lightweight and will show you the information you need.

Padb needs to integrate with the resource manager. I know it works with pbs_pro, but it seems there are a few issues on your system, which runs pbs (without the pro). I can help you with this and work through the problems, but only if you work with me and provide details of the integration. In particular, I've sent you a version which has a small patch and some debug printfs added; if you could send me the output from this I'd be able to tell you whether it is likely to work and how to go about making it do so.

Ashley.

--
Ashley Pittman, Bath, UK.

Padb - A parallel job inspection tool for cluster computing
http://padb.pittman.org.uk
Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)
Am 29.10.2010 um 18:47 schrieb Jeff Squyres:

> On Oct 29, 2010, at 12:40 PM, Reuti wrote:
>
>>> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally
>>> *NOT* assume that different versions like this are compatible.
>>
>> I'm getting confused, as these versions are exactly fitting "x.(y+1).*"
>> which you mention below. So they should work together by design.
>
> It depends on what you mean by "work together".
>
> 1. OMPI provides an ABI guarantee for x.y.* and x.(y+1).*, where y is odd.
> So if you compile your MPI app with Open MPI v1.4.1, it'll work just fine
> with 1.4.3. (The only disclaimer is that this guarantee started with
> v1.3.2.) Note that y must be odd -- so if you compile your MPI app with
> v1.4.1, it does *not* necessarily work with v1.5. Indeed, we broke ABI
> between the v1.3/v1.4 series and the v1.5 series (our ABI guarantee
> allows us to do this).

Yep, I read it this way in your first reply.

> 2. OMPI does *not* provide multi-version *interoperability* guarantees.
> Say you compile your MPI app against OMPI v1.4.1. Then you run it across
> a bunch of nodes, but some nodes have OMPI v1.4.1 on them and others have
> OMPI v1.4.3 (i.e., your app gets libmpi.so from v1.4.1 on some nodes and
> libmpi.so from v1.4.3 on other nodes). This is absolutely not guaranteed
> to work -- we don't even try to maintain this kind of compatibility.

Aha, now I see. When all nodes run the same version, it's for sure no problem, but with different versions on different nodes you get a mixture of libraries within one and the same execution. So when, e.g., the protocol for a message which is sent to another node changed, it will break.

NB: If I were to upgrade my cluster in two steps, I would for a short time adjust the queuing system so that each parallel job gets nodes which all have the same version.

-- Reuti

> Does that make sense?
Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)
I guess we will play it safe and upgrade every cluster at once so that we won't get bad surprises.

thank you Jeff.

On 10/29/2010 06:40 PM, Reuti wrote:
> Hi,
>
> Am 29.10.2010 um 18:27 schrieb Jeff Squyres:
>
>> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally
>> *NOT* assume that different versions like this are compatible.
>
> I'm getting confused, as these versions are exactly fitting "x.(y+1).*"
> which you mention below. So they should work together by design.
>
> -- Reuti
>
>> Open MPI makes an ABI promise (that started with version 1.3.2) that all
>> the releases in a given feature series and its corresponding super-stable
>> series (i.e., x.y.* and x.(y+1).*, where y is odd) are ABI compatible.
>> But we make no guarantees about wire protocols being compatible, or
>> other things like that.
>>
>> So in general, it's "pleasantly surprising" if the different releases
>> work together, but I wouldn't rely on it *at all*. :-)

If I get it well, ABI compatible means something compiled with x.y.* will run on x.(y+1).* without the need for you to recompile. Mixing x.y and x.y+1 in the same machinefile (and that's what we are talking about) can only work by accident, not by design.
Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)
On Oct 29, 2010, at 12:40 PM, Reuti wrote:

>> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally
>> *NOT* assume that different versions like this are compatible.
>
> I'm getting confused, as these versions are exactly fitting "x.(y+1).*"
> which you mention below. So they should work together by design.

It depends on what you mean by "work together".

1. OMPI provides an ABI guarantee for x.y.* and x.(y+1).*, where y is odd. So if you compile your MPI app with Open MPI v1.4.1, it'll work just fine with 1.4.3. (The only disclaimer is that this guarantee started with v1.3.2.) Note that y must be odd -- so if you compile your MPI app with v1.4.1, it does *not* necessarily work with v1.5. Indeed, we broke ABI between the v1.3/v1.4 series and the v1.5 series (our ABI guarantee allows us to do this).

2. OMPI does *not* provide multi-version *interoperability* guarantees. Say you compile your MPI app against OMPI v1.4.1. Then you run it across a bunch of nodes, but some nodes have OMPI v1.4.1 on them and others have OMPI v1.4.3 (i.e., your app gets libmpi.so from v1.4.1 on some nodes and libmpi.so from v1.4.3 on other nodes). This is absolutely not guaranteed to work -- we don't even try to maintain this kind of compatibility.

Does that make sense?

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
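[Editor's illustration] The ABI rule in point 1 above can be stated mechanically. The sketch below is my own illustration of that rule, not an official Open MPI tool; the function name and the version triples are invented for the example, and it ignores the caveat that the guarantee only began at v1.3.2:

```python
# Sketch of the rule: versions x.y.* and x.(y+1).*, with y odd, share one ABI.

def same_abi_series(a, b):
    """Return True when version triples a and b fall under the
    x.y.* / x.(y+1).* (y odd) ABI guarantee described above."""
    (ax, ay, _az), (bx, by, _bz) = a, b
    if ax != bx:                 # different major series: no guarantee at all
        return False
    lo, hi = sorted((ay, by))
    # Same minor series, or adjacent series whose lower minor is odd
    # (e.g. 1.3.* with 1.4.*, but not 1.4.* with 1.5.*).
    return lo == hi or (hi == lo + 1 and lo % 2 == 1)

print(same_abi_series((1, 4, 1), (1, 4, 3)))  # True: compile on 1.4.1, run with 1.4.3
print(same_abi_series((1, 3, 2), (1, 4, 0)))  # True: feature + super-stable series
print(same_abi_series((1, 4, 1), (1, 5, 0)))  # False: ABI was broken before 1.5
```

Note this encodes only point 1; per point 2, even ABI-compatible versions mixed across nodes within one job are not guaranteed to interoperate on the wire.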
Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)
Hi,

Am 29.10.2010 um 18:27 schrieb Jeff Squyres:

> I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally
> *NOT* assume that different versions like this are compatible.

I'm getting confused, as these versions are exactly fitting "x.(y+1).*" which you mention below. So they should work together by design.

-- Reuti

> Open MPI makes an ABI promise (that started with version 1.3.2) that all
> the releases in a given feature series and its corresponding super-stable
> series (i.e., x.y.* and x.(y+1).*, where y is odd) are ABI compatible.
> But we make no guarantees about wire protocols being compatible, or other
> things like that.
>
> So in general, it's "pleasantly surprising" if the different releases work
> together, but I wouldn't rely on it *at all*. :-)
Re: [OMPI users] Need Help for understand heat equation 2D mpi solving version
christophe petit wrote:

> i am still trying to understand the parallelized version of the heat
> equation 2D solving that we saw at school. I am confused between the shift
> of the values near the bounds done by the "updateBound" routine and the
> main loop (at line 161 in the main code) which calls the routine
> "Explicit".

Each process "owns" a subdomain of cells, for which it will compute updated values. The process has storage not only for these cells, which it owns, but also for a perimeter of cells whose values need to be fetched from nearby processes. So, there are two steps. In "updateBound", processes communicate so that each supplies boundary values to its neighbors and gets boundary values from its neighbors. In "Explicit", the computation (a stencil operation) is performed.

> For a given process (say number 1; I use 4 processes here for execution),
> I send to the east process (3) the penultimate left column, to the north
> process (0) the penultimate top row, and to the others (mpi_proc_null=-2)
> the penultimate right column and the bottom row. But how are the 4
> processes kept synchronous?

When updateBound is called, neighboring processes are implicitly synchronized via the MPI_Sendrecv() calls.

> I don't understand either why all the processes go through the solving
> piece of code calling the "Explicit" routine.

The computational domain is distributed among all processes. Each cell must be updated with the stencil operation. So, each process calls that computation for the cells that it owns.

You should be able to get better interactivity at your school than on this mailing list. Further, your questions at school would help the instructor get feedback from the students.
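[Editor's illustration] The two-step pattern described above (exchange halos, then apply the stencil to owned cells) can be sketched without MPI at all. This toy Python version is my own sketch, not the course's code: two subdomains are plain row-blocks with one ghost row per neighbor, update_bound copies edge rows the way MPI_Sendrecv would between neighbor ranks, and explicit applies the heat stencil. The result matches a serial step over the whole grid:

```python
# Toy model of the updateBound/Explicit split: two "processes" each own a
# block of rows plus one ghost row per neighbor.

def explicit(u, r=0.25):
    """One explicit heat step on the interior cells of u (ghost/boundary
    cells are left untouched)."""
    rows, cols = len(u), len(u[0])
    new = [row[:] for row in u]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            new[i][j] = u[i][j] + r * (u[i + 1][j] + u[i - 1][j]
                                       + u[i][j + 1] + u[i][j - 1]
                                       - 4 * u[i][j])
    return new

def update_bound(top, bottom):
    """Halo exchange between two vertically stacked subdomains; in the real
    code this copy is an MPI_Sendrecv between neighbor ranks, which also
    implicitly synchronizes them."""
    top[-1] = bottom[1][:]      # top's ghost row  <- bottom's first owned row
    bottom[0] = top[-2][:]      # bottom's ghost row <- top's last owned row

# 6x6 grid: rows/cols 0 and 5 are the fixed boundary, interior is 1..4.
n = 6
grid = [[0.0] * n for _ in range(n)]
grid[2][2] = 100.0                      # a single hot cell

serial = explicit(grid)                 # reference: one step on the whole grid

# Decompose by rows: "top" owns global rows 1-2, "bottom" owns rows 3-4.
top = [row[:] for row in grid[0:4]]     # rows 0-3 (local row 3 is a ghost)
bottom = [row[:] for row in grid[2:6]]  # rows 2-5 (local row 0 is a ghost)

update_bound(top, bottom)               # step 1: exchange halo rows
top_new = explicit(top)                 # step 2: stencil on owned cells
bottom_new = explicit(bottom)

# Owned rows agree with the serial result.
assert top_new[1:3] == serial[1:3]
assert bottom_new[1:3] == serial[3:5]
```

This is why every process runs the Explicit code: each applies the same stencil, but only to the subdomain it owns, after the halo exchange has brought neighbor values into its ghost cells.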
Re: [OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)
I'd have to go check 1.4.3 and 1.4.1 to be sure, but I would generally *NOT* assume that different versions like this are compatible.

Open MPI makes an ABI promise (that started with version 1.3.2) that all the releases in a given feature series and its corresponding super-stable series (i.e., x.y.* and x.(y+1).*, where y is odd) are ABI compatible. But we make no guarantees about wire protocols being compatible, or other things like that.

So in general, it's "pleasantly surprising" if the different releases work together, but I wouldn't rely on it *at all*. :-)

On Oct 29, 2010, at 12:12 PM, guillaume ranquet wrote:

> Hi list,
> I'm sorry to bother you with a stupid question.
>
> we intend to have for a short period of time, some nodes with 1.4.3 and
> others with 1.4.1 (before upgrading everyone to 1.4.3).
>
> I made various tests and found both versions to be running together quite
> well with a mixed set of nodes.
>
> my tests were quite simple, I compiled and ran mpi hello_worlds with
> both versions.
> It wouldn't be serious for me to assume both versions fully compatible
> after these tests - and I must admit I lack the time and technical
> knowledge to run further testing.
>
> has anyone any insight on what has changed that would break compatibility?
> I guess nothing, since they are the same major.minor :)
>
> regards,
> Guillaume Ranquet.

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
[OMPI users] mixed versions of openmpi ? (1.4.1 and 1.4.3)
Hi list,
I'm sorry to bother you with a stupid question.

We intend to have, for a short period of time, some nodes with 1.4.3 and others with 1.4.1 (before upgrading everyone to 1.4.3).

I made various tests and found both versions to be running together quite well with a mixed set of nodes.

My tests were quite simple: I compiled and ran MPI hello_worlds with both versions. It wouldn't be serious for me to assume both versions fully compatible after these tests, and I must admit I lack the time and technical knowledge to run further testing.

Has anyone any insight on what has changed that would break compatibility? I guess nothing, since they are the same major.minor :)

regards,
Guillaume Ranquet.
Re: [OMPI users] open MPI please recommend a debugger for open MPI
I find that using mpirun to launch multiple instances of a serial debugger is fairly usable (but not perfect) for jobs with fewer than about four processes. A description of how to do this is here:

http://www.open-mpi.org/faq/?category=debugging

The biggest drawbacks to this approach are that a) setting breakpoints and stepping between lines must be controlled separately for each process, and b) restarting the job requires ending all of your debugger sessions.

-Brian

On Fri, Oct 29, 2010 at 4:06 AM, Jeremy Roberts wrote:

> I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or
> DDT (http://www.allinea.com/). I've used TotalView pretty extensively and
> found it to be pretty easy to use. They are both commercial, however, and
> not cheap.
>
> As far as I know, there isn't a whole lot of open source support for
> parallel debugging. The Parallel Tools Platform of Eclipse claims to
> provide a parallel debugger, though I have yet to try it
> (http://www.eclipse.org/ptp/).
>
> Jeremy
>
> On Fri, Oct 29, 2010 at 12:55 AM, Jack Bryan wrote:
>
>> Hi,
>> Would you please recommend a debugger which can do debugging for parallel
>> processes on Open MPI systems ?
>> I hope that it can be installed without root rights because I am not a
>> root user for our MPI cluster.
>> Any help is appreciated.
>> Thanks
>> Jack
>> Oct. 28 2010
Re: [OMPI users] failed to install openmpi trunk
A couple of things stand out:

1. You definitely don't want to use a copy of the trunk beyond r23924. The developer's trunk is undergoing some major change and orcm is no longer in sync with it. I probably won't update orcm to match until later this year (I will freeze integration at r23924).

2. The configure options don't look right to me - they should simply be:

./configure --prefix= --with-platform=contrib/platform/cisco/linux

I believe the errors are caused by confusion due to the various configure options.

HTH
Ralph

On Oct 29, 2010, at 3:32 AM, Vasiliy G Tolstov wrote:

> Hello. I'm trying to build orcm; as a dependency it needs the openmpi
> trunk with some options enabled.
>
> The install fails with this message:
>
> Creating orte-migrate.1 man page...
> x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../../opal/include
> -I../../../orte/include
> -I../../../opal/mca/paffinity/hwloc/hwloc/include/private
> -I../../../opal/mca/paffinity/hwloc/hwloc/include/hwloc -I../../..
> -march=native -pipe -O2 -g -Wall -Wundef -Wno-long-long -Wsign-compare
> -Wmissing-prototypes -Wstrict-prototypes -Wcomment -pedantic
> -Werror-implicit-function-declaration -finline-functions
> -fno-strict-aliasing -pthread
> -I/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/opal/mca/paffinity/hwloc/hwloc/include
> -fvisibility=hidden -c -o orte-migrate.o orte-migrate.c
> make[2]: Leaving directory
> `/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte/tools/orte-migrate'
> make[1]: Leaving directory
> `/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte'
> orte-migrate.c:101:39: error: 'ORTE_ERRMGR_MIGRATE_STATE_NONE'
> undeclared here (not in a function)
> orte-migrate.c: In function 'main':
> orte-migrate.c:221:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH'
> undeclared (first use in this function)
> orte-migrate.c:221:12: note: each undeclared identifier is reported only
> once for each function it appears in
> orte-migrate.c:222:12: error:
'ORTE_ERRMGR_MIGRATE_STATE_ERROR' > undeclared (first use in this function) > orte-migrate.c:223:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS' > undeclared (first use in this function) > orte-migrate.c: In function 'hnp_receiver': > orte-migrate.c:531:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared > (first use in this function) > orte-migrate.c:531:33: error: expected ';' before 'command' > orte-migrate.c:532:5: warning: ISO C90 forbids mixed declarations and > code > orte-migrate.c:542:56: error: 'command' undeclared (first use in this > function) > orte-migrate.c:542:73: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared > (first use in this function) > orte-migrate.c:548:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_UPDATE_CMD' > undeclared (first use in this function) > orte-migrate.c:555:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_INIT_CMD' > undeclared (first use in this function) > orte-migrate.c: In function 'process_ckpt_update_cmd': > orte-migrate.c:597:9: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS' > undeclared (first use in this function) > orte-migrate.c:609:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH' > undeclared (first use in this function) > orte-migrate.c: In function 'notify_hnp': > orte-migrate.c:622:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared > (first use in this function) > orte-migrate.c:622:33: error: expected ';' before 'command' > orte-migrate.c:643:55: error: 'command' undeclared (first use in this > function) > orte-migrate.c:643:67: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared > (first use in this function) > orte-migrate.c: In function 'pretty_print_status': > orte-migrate.c:710:5: error: implicit declaration of function > 'orte_errmgr_base_migrate_state_str' > make[2]: *** [orte-migrate.o] Error 1 > make[1]: *** [all-recursive] Error 1 > make: *** [all-recursive] Error 1 > > > Configured with flags: > configure: OMPI configuring in opal/libltdl > configure: running /bin/sh './configure' '--prefix=/usr' > 
'--host=x86_64-pc-linux-gnu' '--build=x86_64-pc-linux-gnu' > '--mandir=/usr/share/man' '--infodir=/usr/share/info' > '--datadir=/usr/share' '--docdir=/usr/share/doc/openmpi-scm' > '--sysconfdir=/etc' '--localstatedir=/var/lib' > '--disable-dependency-tracking' '--disable-silent-rules' > '--enable-fast-install' '--libdir=/usr/lib64' > '--sysconfdir=/etc/openmpi' '--enable-pretty-print-stacktrace' > '--enable-orterun-prefix-by-default' > '--with-platform=contrib/platform/cisco/ebuild/native' > '--enable-multicast' '--with-ft=orcm' '--enable-sensors' > '--enable-heartbeat' '--enable-mpi-threads' '--enable-progress-threads' > '--disable-mpi-f90' '--disable-mpi-f77' '--enable-contrib-no-build=vt' > '--disable-io-romio' '--enable-heterogeneous' '--enable-ipv6' > 'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu' > 'CC=x86_64-pc-linux-gnu-gcc' 'CFLAGS=-march=native -pipe -O2' 'CPP=cpp' > --enable-ltdl-convenience --disable-ltdl-install --enable-shared > --disable-static --cache-file=/d
Re: [OMPI users] open MPI please recommend a debugger for open MPI
I'd suggest looking into TotalView (http://www.totalviewtech.com) and/or DDT (http://www.allinea.com/). I've used TotalView pretty extensively and found it to be pretty easy to use. They are both commercial, however, and not cheap.

As far as I know, there isn't a whole lot of open-source support for parallel debugging. The Parallel Tools Platform of Eclipse claims to provide a parallel debugger, though I have yet to try it (http://www.eclipse.org/ptp/).

Jeremy

On Fri, Oct 29, 2010 at 12:55 AM, Jack Bryan wrote:
> Hi,
>
> Would you please recommend a debugger which can do debugging for parallel
> processes on Open MPI systems?
>
> I hope that it can be installed without root rights, because I am not a root
> user for our MPI cluster.
>
> Any help is appreciated.
>
> Thanks
>
> Jack
>
> Oct. 28 2010
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] cannot install Open MPI 1.5 on Solaris x86_64 with Oracle/Sun C 5.11
Sorry, but can you give us the configure line, the config.log, and the full output of make, preferably with make V=1?

--td

On 10/29/2010 04:30 AM, Siegmar Gross wrote:

Hi,

I tried to build Open MPI 1.5 on Solaris x86 and x86_64 with Oracle Studio 12.2. I can compile Open MPI with thread support, but I can only partly install it because "libtool" will not find "f95" although it is available. "make check" shows no failures.

tyr openmpi-1.5-SunOS.x86_64.32_cc 188 ssh sunpc4 cc -V
cc: Sun C 5.11 SunOS_i386 145355-01 2010/10/11
usage: cc [ options ] files.  Use 'cc -flags' for details

No suspicious warnings or errors in log.configure.SunOS.x86_64.32_cc.

tyr openmpi-1.5-SunOS.x86_64.32_cc 182 grep -i warning: log.make.SunOS.x86_64.32_cc | more
".../opal/mca/crs/none/crs_none_module.c", line 136: warning: statement not reached
".../orte/mca/errmgr/errmgr.h", line 135: warning: attribute "noreturn" may not be applied to variable, ignored
(a lot of these warnings)
".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 982: warning: assignment type mismatch:
".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 1023: warning: assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 877: warning: assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 918: warning: assignment type mismatch:
".../orte/tools/orte-ps/orte-ps.c", line 288: warning: initializer does not fit or is out of range: 0xfffe
".../orte/tools/orte-ps/orte-ps.c", line 289: warning: initializer does not fit or is out of range: 0xfffe

grep -i error: log.make.SunOS.x86_64.32_cc | more

tyr openmpi-1.5-SunOS.x86_64.32_cc 185 grep -i FAIL log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 186 grep -i SKIP log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 187 grep -i PASS log.make-check.SunOS.x86_64.32_cc
PASS: predefined_gap_test
File opened with dladvise_local, all passed
PASS: dlopen_test
All 2 tests passed
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_barrier
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_barrier_noinline
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_spinlock
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_spinlock_noinline
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_math
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_math_noinline
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_cmpset
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_cmpset_noinline
All 8 tests passed
All 0 tests passed
All 0 tests passed
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [PASSED]
PASS: ddt_raw
All 5 tests passed
SUPPORT: OMPI Test Passed: opal_path_nfs(): (0 tests)
PASS: opal_path_nfs
1 test passed

tyr openmpi-1.5-SunOS.x86_64.32_cc 190 grep -i warning: log.make-install.SunOS.x86_64.32_cc | more
libtool: install: warning: relinking `libmpi_cxx.la'
libtool: install: warning: relinking `libmpi_f77.la'
libtool: install: warning: relinking `libmpi_f90.la'
tyr openmpi-1.5-SunOS.x86_64.32_cc 191 grep -i error: log.make-install.SunOS.x86_64.32_cc | more
libtool: install: error: relink `libmpi_f90.la' with the above command before installing it
tyr openmpi-1.5-SunOS.x86_64.32_cc 194 tail -20 log.make-install.SunOS.x86_64.32_cc
make[4]: Leaving directory `.../ompi/mpi/f90/scripts'
make[4]: Entering directory `.../ompi/mpi/f90'
make[5]: Entering directory `.../ompi/mpi/f90'
test -z "/usr/local/openmpi-1.5_32_cc/lib" || /usr/local/bin/mkdir -p "/usr/local/openmpi-1.5_32_cc/lib"
/bin/bash ../../../libtool --mode=install /usr/local/bin/install -c libmpi_f90.la '/usr/local/openmpi-1.5_32_cc/lib'
libtool: install: warning: relinking `libmpi_f90.la'
libtool: install: (cd /export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/ompi/mpi/f90; /bin/bash /export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/libtool --silent --tag FC --mode=relink f95 -I../../../ompi/include -I../../../../openmpi-1.5/ompi/include -I. -I../../../../openmpi-1.5/ompi/mpi/f90 -I../../../ompi/mpi/f90 -m32 -version-info 1:0:0 -export-dynamic -m32 -o libmpi_f90.la -rpath /usr/local/openmpi-1.5_32_cc/lib mpi.lo mpi_sizeof.lo mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo mpi_testsome_f90.
[OMPI users] failed to install openmpi trunk
Hello. I'm trying to build orcm; as a dependency it needs openmpi trunk with some options enabled.

The install fails with this message:

Creating orte-migrate.1 man page...
x86_64-pc-linux-gnu-gcc -DHAVE_CONFIG_H -I. -I../../../opal/include -I../../../orte/include -I../../../opal/mca/paffinity/hwloc/hwloc/include/private -I../../../opal/mca/paffinity/hwloc/hwloc/include/hwloc -I../../.. -march=native -pipe -O2 -g -Wall -Wundef -Wno-long-long -Wsign-compare -Wmissing-prototypes -Wstrict-prototypes -Wcomment -pedantic -Werror-implicit-function-declaration -finline-functions -fno-strict-aliasing -pthread -I/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/opal/mca/paffinity/hwloc/hwloc/include -fvisibility=hidden -c -o orte-migrate.o orte-migrate.c
make[2]: Leaving directory `/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte/tools/orte-migrate'
make[1]: Leaving directory `/var/tmp/paludis/build/sys-cluster-openmpi-scm/work/openmpi-scm/orte'
orte-migrate.c:101:39: error: 'ORTE_ERRMGR_MIGRATE_STATE_NONE' undeclared here (not in a function)
orte-migrate.c: In function 'main':
orte-migrate.c:221:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH' undeclared (first use in this function)
orte-migrate.c:221:12: note: each undeclared identifier is reported only once for each function it appears in
orte-migrate.c:222:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERROR' undeclared (first use in this function)
orte-migrate.c:223:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS' undeclared (first use in this function)
orte-migrate.c: In function 'hnp_receiver':
orte-migrate.c:531:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared (first use in this function)
orte-migrate.c:531:33: error: expected ';' before 'command'
orte-migrate.c:532:5: warning: ISO C90 forbids mixed declarations and code
orte-migrate.c:542:56: error: 'command' undeclared (first use in this function)
orte-migrate.c:542:73: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared (first use in this function)
orte-migrate.c:548:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_UPDATE_CMD' undeclared (first use in this function)
orte-migrate.c:555:14: error: 'ORTE_ERRMGR_MIGRATE_TOOL_INIT_CMD' undeclared (first use in this function)
orte-migrate.c: In function 'process_ckpt_update_cmd':
orte-migrate.c:597:9: error: 'ORTE_ERRMGR_MIGRATE_STATE_ERR_INPROGRESS' undeclared (first use in this function)
orte-migrate.c:609:12: error: 'ORTE_ERRMGR_MIGRATE_STATE_FINISH' undeclared (first use in this function)
orte-migrate.c: In function 'notify_hnp':
orte-migrate.c:622:5: error: 'orte_errmgr_tool_cmd_flag_t' undeclared (first use in this function)
orte-migrate.c:622:33: error: expected ';' before 'command'
orte-migrate.c:643:55: error: 'command' undeclared (first use in this function)
orte-migrate.c:643:67: error: 'ORTE_ERRMGR_MIGRATE_TOOL_CMD' undeclared (first use in this function)
orte-migrate.c: In function 'pretty_print_status':
orte-migrate.c:710:5: error: implicit declaration of function 'orte_errmgr_base_migrate_state_str'
make[2]: *** [orte-migrate.o] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all-recursive] Error 1

Configured with flags:
configure: OMPI configuring in opal/libltdl
configure: running /bin/sh './configure' '--prefix=/usr' '--host=x86_64-pc-linux-gnu' '--build=x86_64-pc-linux-gnu' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--datadir=/usr/share' '--docdir=/usr/share/doc/openmpi-scm' '--sysconfdir=/etc' '--localstatedir=/var/lib' '--disable-dependency-tracking' '--disable-silent-rules' '--enable-fast-install' '--libdir=/usr/lib64' '--sysconfdir=/etc/openmpi' '--enable-pretty-print-stacktrace' '--enable-orterun-prefix-by-default' '--with-platform=contrib/platform/cisco/ebuild/native' '--enable-multicast' '--with-ft=orcm' '--enable-sensors' '--enable-heartbeat' '--enable-mpi-threads' '--enable-progress-threads' '--disable-mpi-f90' '--disable-mpi-f77' '--enable-contrib-no-build=vt' '--disable-io-romio' '--enable-heterogeneous' '--enable-ipv6' 'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu' 'CC=x86_64-pc-linux-gnu-gcc' 'CFLAGS=-march=native -pipe -O2' 'CPP=cpp' --enable-ltdl-convenience --disable-ltdl-install --enable-shared --disable-static --cache-file=/dev/null --srcdir=. --disable-option-checking

--
Vasiliy G Tolstov
Selfip.Ru
[OMPI users] cannot install Open MPI 1.5 on Solaris x86_64 with Oracle/Sun C 5.11
Hi,

I tried to build Open MPI 1.5 on Solaris x86 and x86_64 with Oracle Studio 12.2. I can compile Open MPI with thread support, but I can only partly install it because "libtool" will not find "f95" although it is available. "make check" shows no failures.

tyr openmpi-1.5-SunOS.x86_64.32_cc 188 ssh sunpc4 cc -V
cc: Sun C 5.11 SunOS_i386 145355-01 2010/10/11
usage: cc [ options ] files.  Use 'cc -flags' for details

No suspicious warnings or errors in log.configure.SunOS.x86_64.32_cc.

tyr openmpi-1.5-SunOS.x86_64.32_cc 182 grep -i warning: log.make.SunOS.x86_64.32_cc | more
".../opal/mca/crs/none/crs_none_module.c", line 136: warning: statement not reached
".../orte/mca/errmgr/errmgr.h", line 135: warning: attribute "noreturn" may not be applied to variable, ignored
(a lot of these warnings)
".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 982: warning: assignment type mismatch:
".../orte/mca/rmcast/tcp/rmcast_tcp.c", line 1023: warning: assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 877: warning: assignment type mismatch:
".../orte/mca/rmcast/udp/rmcast_udp.c", line 918: warning: assignment type mismatch:
".../orte/tools/orte-ps/orte-ps.c", line 288: warning: initializer does not fit or is out of range: 0xfffe
".../orte/tools/orte-ps/orte-ps.c", line 289: warning: initializer does not fit or is out of range: 0xfffe

grep -i error: log.make.SunOS.x86_64.32_cc | more

tyr openmpi-1.5-SunOS.x86_64.32_cc 185 grep -i FAIL log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 186 grep -i SKIP log.make-check.SunOS.x86_64.32_cc
tyr openmpi-1.5-SunOS.x86_64.32_cc 187 grep -i PASS log.make-check.SunOS.x86_64.32_cc
PASS: predefined_gap_test
File opened with dladvise_local, all passed
PASS: dlopen_test
All 2 tests passed
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_barrier
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_barrier_noinline
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_spinlock
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_spinlock_noinline
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_math
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_math_noinline
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_cmpset
- 1 threads: Passed - 2 threads: Passed - 4 threads: Passed - 5 threads: Passed - 8 threads: Passed
PASS: atomic_cmpset_noinline
All 8 tests passed
All 0 tests passed
All 0 tests passed
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [PASSED]
PASS: ddt_raw
All 5 tests passed
SUPPORT: OMPI Test Passed: opal_path_nfs(): (0 tests)
PASS: opal_path_nfs
1 test passed

tyr openmpi-1.5-SunOS.x86_64.32_cc 190 grep -i warning: log.make-install.SunOS.x86_64.32_cc | more
libtool: install: warning: relinking `libmpi_cxx.la'
libtool: install: warning: relinking `libmpi_f77.la'
libtool: install: warning: relinking `libmpi_f90.la'
tyr openmpi-1.5-SunOS.x86_64.32_cc 191 grep -i error: log.make-install.SunOS.x86_64.32_cc | more
libtool: install: error: relink `libmpi_f90.la' with the above command before installing it
tyr openmpi-1.5-SunOS.x86_64.32_cc 194 tail -20 log.make-install.SunOS.x86_64.32_cc
make[4]: Leaving directory `.../ompi/mpi/f90/scripts'
make[4]: Entering directory `.../ompi/mpi/f90'
make[5]: Entering directory `.../ompi/mpi/f90'
test -z "/usr/local/openmpi-1.5_32_cc/lib" || /usr/local/bin/mkdir -p "/usr/local/openmpi-1.5_32_cc/lib"
/bin/bash ../../../libtool --mode=install /usr/local/bin/install -c libmpi_f90.la '/usr/local/openmpi-1.5_32_cc/lib'
libtool: install: warning: relinking `libmpi_f90.la'
libtool: install: (cd /export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/ompi/mpi/f90; /bin/bash /export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/libtool --silent --tag FC --mode=relink f95 -I../../../ompi/include -I../../../../openmpi-1.5/ompi/include -I. -I../../../../openmpi-1.5/ompi/mpi/f90 -I../../../ompi/mpi/f90 -m32 -version-info 1:0:0 -export-dynamic -m32 -o libmpi_f90.la -rpath /usr/local/openmpi-1.5_32_cc/lib mpi.lo mpi_sizeof.lo mpi_comm_spawn_multiple_f90.lo mpi_testall_f90.lo mpi_testsome_f90.lo mpi_waitall_f90.lo mpi_waitsome_f90.lo mpi_wtick_f90.lo mpi_wtime_f90.lo ../../../ompi/mpi/f77/libmpi_f77.la -lsocket -lnsl -lrt -lm )
/export2/src/openmpi-1.5/openmpi-1.5-SunOS.x86_64.32_cc/libtool: line 7846: f95:
[OMPI users] Need Help for understand heat equation 2D mpi solving version
Hello,
>
> I am still trying to understand the parallelized version of the 2D heat
> equation solver that we saw at school. In order to explain my problem, I
> need to list the main code:
>
> 9  program heat
> 10 !**
> 11 !
> 12 ! This program solves the heat equation on the unit square [0,1]x[0,1]
> 13 !   | du/dt - Delta(u) = 0
> 14 !   | u/gamma = cste
> 15 ! by implementing an explicit scheme.
> 16 ! The discretization is done using a 5 point finite difference scheme
> 17 ! and the domain is decomposed into sub-domains.
> 18 ! The PDE is discretized using a 5 point finite difference scheme
> 19 ! over a (x_dim+2)*(x_dim+2) grid including the end points
> 20 ! correspond to the boundary points that are stored.
> 21 !
> 22 ! The data on the whole domain are stored in
> 23 ! the following way :
> 24 !
> 25 ! y
> 26 !
> 27 ! d |                    |
> 28 ! i |                    |
> 29 ! r |                    |
> 30 ! e |                    |
> 31 ! c |                    |
> 32 ! t |                    |
> 33 ! i |  x20               |
> 34 ! o /\ |                 |
> 35 ! n |  | x10             |
> 36 !   |  |                 |
> 37 !   |  | x00 x01 x02 ... |
> 38 !   |
> 39 !   ---> x direction  x(*,j)
> 40 !
> 41 ! The boundary conditions are stored in the following submatrices
> 42 !
> 43 !
> 44 ! x(1:x_dim, 0)        ---> left temperature
> 45 ! x(1:x_dim, x_dim+1)  ---> right temperature
> 46 ! x(0, 1:x_dim)        ---> top temperature
> 47 ! x(x_dim+1, 1:x_dim)  ---> bottom temperature
> 48 !
> 49 !**
> 50 implicit none
> 51 include 'mpif.h'
> 52 ! size of the discretization
> 53 integer :: x_dim, nb_iter
> 54 double precision, allocatable :: x(:,:),b(:,:),x0(:,:)
> 55 double precision :: dt, h, epsilon
> 56 double precision :: resLoc, result, t, tstart, tend
> 57 !
> 58 integer :: i,j
> 59 integer :: step, maxStep
> 60 integer :: size_x, size_y, me, x_domains,y_domains
> 61 integer :: iconf(5), size_x_glo
> 62 double precision conf(2)
> 63 !
> 64 ! MPI variables
> 65 integer :: nproc, infompi, comm, comm2d, lda, ndims
> 66 INTEGER, DIMENSION(2) :: dims
> 67 LOGICAL, DIMENSION(2) :: periods
> 68 LOGICAL, PARAMETER :: reorganisation = .false.
> 69 integer :: row_type
> 70 integer, parameter :: nbvi=4
> 71 integer, parameter :: S=1, E=2, N=3, W=4
> 72 integer, dimension(4) :: neighBor
> 73
> 74 !
> 75 intrinsic abs
> 76 !
> 77 !
> 78 call MPI_INIT(infompi)
> 79 comm = MPI_COMM_WORLD
> 80 call MPI_COMM_SIZE(comm,nproc,infompi)
> 81 call MPI_COMM_RANK(comm,me,infompi)
> 82 !
> 83 !
> 84 if (me.eq.0) then
> 85 call readparam(iconf, conf)
> 86 endif
> 87 call MPI_BCAST(iconf,5,MPI_INTEGER,0,comm,infompi)
> 88 call MPI_BCAST(conf,2,MPI_DOUBLE_PRECISION,0,comm,infompi)
> 89 !
> 90 size_x    = iconf(1)
> 91 size_y    = iconf(1)
> 92 x_domains = iconf(3)
> 93 y_domains = iconf(4)
> 94 maxStep   = iconf(5)
> 95 dt        = conf(1)
> 96 epsilon   = conf(2)
> 97 !
> 98 size_x_glo = x_domains*size_x+2
> 99 h  = 1.0d0/dble(size_x_glo)
> 100 dt = 0.25*h*h
> 101 !
> 102 !
> 103 lda = size_y+2
> 104 allocate(x(0:size_y+1,0:size_x+1))
> 105 allocate(x0(0:size_y+1,0:size_x+1))
> 106 allocate(b(0:size_y+1,0:size_x+1))
> 107 !
> 108 ! Create 2D cartesian grid
> 109 periods(:) = .false.
> 110
> 111 ndims = 2
> 112 dims(1) = x_domains
> 113 dims(2) = y_domains
> 114 CALL MPI_CART_CREATE(MPI_COMM_WORLD, ndims, dims, periods, &
> 115      reorganisation,comm2d,infompi)
> 116 !
> 117 ! Identify neighbors
> 118 !
> 119 NeighBor(:) = MPI_PROC_NULL
> 120 ! Left/West and right/East neighbors
> 121 CALL MPI_CART_SHIFT(comm2d,0,1,NeighBor(W),NeighBor(E),infompi)
> 122
> 123 print *,'mpi_proc_null=', MPI_PROC_NULL
> 124 print *,'rang=', me
> 125 print *, 'ici premier mpi_cart_shift : neighbor(w)=',NeighBor(W)
> 126 print *, 'ici premier mpi_cart_shift : neighbor(e)=',NeighBor(E)
> 127
> 128 ! Bottom/South and Upper/North neighbors
> 129 CALL MPI_CART_SHIFT(comm2d,1,1,NeighBor(S),NeighBor(N),infompi)
> 130
> 131
> 132 print *, '
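[Editor's note: to make the MPI_CART_SHIFT behavior in the listing above easier to reason about, here is a pure-Python sketch of what a non-periodic 2D Cartesian topology computes. The helper names `cart_coords`, `cart_rank`, and `cart_shift` are illustrative, not MPI API; `PROC_NULL = -2` matches Open MPI's header value, but the numeric value of MPI_PROC_NULL is implementation-defined. The default rank layout assumed here is row-major, as MPI_CART_CREATE uses.]

```python
# Model of MPI_Cart_shift on a non-periodic grid, outside MPI.
PROC_NULL = -2  # stand-in for MPI_PROC_NULL (value is implementation-defined)

def cart_coords(rank, dims):
    """Row-major rank -> coordinates, as MPI_CART_CREATE assigns them."""
    coords = []
    for d in reversed(dims):
        coords.append(rank % d)
        rank //= d
    return list(reversed(coords))

def cart_rank(coords, dims):
    """Row-major coordinates -> rank (inverse of cart_coords)."""
    rank = 0
    for c, d in zip(coords, dims):
        rank = rank * d + c
    return rank

def cart_shift(rank, dims, direction, disp=1):
    """Return (source, dest) as MPI_Cart_shift would for a non-periodic grid."""
    coords = cart_coords(rank, dims)
    def neighbor(offset):
        c = list(coords)
        c[direction] += offset
        if 0 <= c[direction] < dims[direction]:
            return cart_rank(c, dims)
        return PROC_NULL  # shifted off the edge: no neighbor exists
    return neighbor(-disp), neighbor(+disp)

# On a 1 x 4 decomposition (x_domains=1, y_domains=4), shifting along
# dimension 0 always yields PROC_NULL because that dimension has extent 1,
# while dimension 1 gives real neighbors for interior ranks:
print(cart_shift(2, [1, 4], direction=0))  # (-2, -2)
print(cart_shift(2, [1, 4], direction=1))  # (1, 3)
```

This is why, in the prints above, a rank can legitimately report neighbor(w) and neighbor(e) equal to mpi_proc_null: with `periods(:) = .false.`, any shift that crosses the grid boundary returns MPI_PROC_NULL, and sends/receives addressed to it become no-ops.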
[OMPI users] open MPI please recommend a debugger for open MPI
Hi,

Would you please recommend a debugger which can do debugging for parallel processes on Open MPI systems?

I hope that it can be installed without root rights, because I am not a root user for our MPI cluster.

Any help is appreciated.

Thanks

Jack

Oct. 28 2010