Re: [OMPI users] is there an equiv of Iprobe for bcast?

2011-05-03 Thread Jeff Squyres
I don't quite understand your architecture enough to answer your question.  
E.g., someone pointed out to me off-list that if you only have 1 listener, a 
send is effectively the same thing as a broadcast (for which you could 
test/wait on a non-blocking receive, for example).

MPI broadcasts only work on fixed communicators -- meaning that you effectively 
have to know the root and the receivers ahead of time.  If the receivers don't 
know who the root will be beforehand, that's unfortunately not a good match for 
the MPI_Bcast operation.



On May 3, 2011, at 4:07 AM, Randolph Pullen wrote:

> 
> From: Randolph Pullen 
> Subject: Re: Re: [OMPI users] is there an equiv of Iprobe for bcast?
> To: us...@open-mpi.or
> Received: Monday, 2 May, 2011, 12:53 PM
> 
> Non-blocking Bcasts or tests would do it.
> I currently have the clearing-house solution working, but it is unsatisfying 
> because of its serial node: as it scales, it will overload this node.
> 
> The problem rephrased:
> Instead of n*2 processes, I am having to use n*2+1 with the extra process 
> serially receiving listener messages on behalf of the workers before 
> transmitting these messages to workers in its comm_group.
> 
> Is there a way to Bcast directly from each listener to the worker pool?  
> (listeners must monitor their ports most of the time and can't participate in 
> global bcasts)
> Not knowing which listener is going to transmit prevents the correct 
> comm_group being used with Bcast calls.
> 
> --- On Sat, 30/4/11, Jeff Squyres  wrote:
> 
> From: Jeff Squyres 
> Subject: Re: [OMPI users] is there an equiv of Iprobe for bcast?
> To: randolph_pul...@yahoo.com.au, "Open MPI Users" 
> Received: Saturday, 30 April, 2011, 7:17 AM
> 
> On Apr 29, 2011, at 1:21 AM, Randolph Pullen wrote:
> 
> > I am having a design issue:
> > My server application has 2 processes per node, 1 listener and 1 worker.
> > Each listener monitors a specified port for incoming TCP connections with 
> > the goal that on receipt of a request it will distribute it over the 
> > workers in a SIMD fashion.
> > 
> > My problem is this: how can I get the workers to accept work from any of 
> > the listeners?
> > Making a separate communicator does not help as the sender is unknown.  
> > Other than making a serial 'clearing house' process I can't think of a way;
> > an Iprobe for Bcast would be useful.
> 
> I'm not quite sure I understand your question.
> 
> There currently is no probe for collectives, but MPI-3 has non-blocking 
> collectives which you could MPI_Test for.  There's a 3rd party library 
> implementation called libNBC (non-blocking collectives) that you could use 
> until such things become natively available.
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] Open MPI 1.4.3 - Mac OS X 10.6.7

2011-05-03 Thread Jeff Squyres
On May 3, 2011, at 5:29 PM, Paul Cizmas wrote:

> I have installed Gfortran GCC 4.4.4 and Absoft11.0.
> 
> It appears that I have i686-apple-darwin10-gcc-4.2.1.
> 
> When I run 
> 
> ./configure --prefix=/opt/openmpi1.4.3 F77=/Applications/Absoft11.0/bin/f77
> 
> and 
> 
> ./configure --prefix=/opt/openmpi1.4.3GF F77=/sw/bin/gfortran
> 
> in both cases I get the message:
> 
> ==
> It appears that your Fortran 77 compiler is unable to link against
> object files created by your C compiler.  This typically indicates
> one of a few possibilities:
> 
>  - A conflict between CFLAGS and FFLAGS
>  - A problem with your compiler installation(s)
>  - Different default build options between compilers (e.g., C
>    building for 32 bit and Fortran building for 64 bit)
>  - Incompatible compilers

The problem is exactly what Open MPI is telling you -- the C compiler is not 
compatible with the Fortran compilers that you have specified.

If you have /sw/bin/gfortran, I'm guessing you have it installed via fink...?  
If so, you might also have a fink-installed gcc that is compatible with that 
gfortran.

-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] Building openmpi with PGI 11.4: won't find torque??

2011-05-03 Thread Jeff Squyres
Here's the issue:

configure:110046: checking for tm_finalize
configure:110102: pgcc -o conftest -O -DNDEBUG -D_REENTRANT 
-I/opt/torque/include -L/opt/torque/lib64 -Wl,--rpath -Wl,/opt/torque/lib64 
conftest.c -lnsl -lutil -ltorque >&5
/usr/bin/ld: skipping incompatible /opt/torque/lib64/libtorque.so when 
searching for -ltorque
/usr/bin/ld: skipping incompatible /opt/torque/lib64/libtorque.a when searching 
for -ltorque
/usr/bin/ld: cannot find -ltorque

Somehow the compiler/linker doesn't think that /opt/torque/lib64/libtorque.so 
is compatible.  Is pgcc making 32 bit executables by default?  I.e., do you 
need to specify some flag to pgcc to force it to make 64 bit executables?  If 
so, specify it in CFLAGS --- something like this:

./configure CC=pgcc CXX=pgCC FC=pgfortran F77=pgfortran CFLAGS=-m64 \
CXXFLAGS=-m64 FCFLAGS=-m64 FFLAGS=-m64 ...

(I don't know that it's -m64; I just made that up)
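
One way to check the 32-bit versus 64-bit guess before rerunning configure is to compare the word size of an object that pgcc produces with that of libtorque.so. The sketch below is only an illustration: elf_class is a hypothetical helper name (it is not part of Open MPI, PGI, or Torque), and it assumes a Linux system with a POSIX od. It reads the EI_CLASS byte of the ELF header, so it reports not-elf for static archives such as libtorque.a.

```shell
#!/bin/sh
# Hypothetical helper: print 32 or 64 for an ELF file, based on the EI_CLASS
# byte (offset 4 of the ELF header: 1 = 32-bit, 2 = 64-bit).
elf_class() {
    # The first four bytes of any ELF file are 0x7f 'E' 'L' 'F'.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Skip 4 bytes, read 1: this is the EI_CLASS field.
    class=$(od -An -tx1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        01) echo 32 ;;
        02) echo 64 ;;
        *)  echo unknown; return 1 ;;
    esac
}

# Example use (library path taken from the thread; conftest.c is any small
# C file you create yourself):
#   pgcc -c conftest.c -o conftest.o
#   elf_class conftest.o
#   elf_class /opt/torque/lib64/libtorque.so
```

If the two commands print different numbers, the linker message about skipping an incompatible libtorque is explained, and the remaining step is to find the compiler flag that selects the matching word size.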



On May 3, 2011, at 6:21 PM, Jim Kusznir wrote:

> My gzipp'ed config.log is attached.  Thanks!
> --Jim
> 
> On Tue, May 3, 2011 at 4:52 AM, Jeff Squyres  wrote:
>> It should search both tmdir/lib and tmdir/lib64 by default, IIRC.
>> 
>> Please send your config.log (please compress); it'll contain the specific 
>> reason why configure didn't find libtorque.


-- 
Jeff Squyres
jsquy...@cisco.com




Re: [OMPI users] Building openmpi with PGI 11.4: won't find torque??

2011-05-03 Thread Jim Kusznir
My gzipp'ed config.log is attached.  Thanks!
--Jim

On Tue, May 3, 2011 at 4:52 AM, Jeff Squyres  wrote:
> It should search both tmdir/lib and tmdir/lib64 by default, IIRC.
>
> Please send your config.log (please compress); it'll contain the specific 
> reason why configure didn't find libtorque.
>
>
> On May 2, 2011, at 10:21 PM, Ralph Castain wrote:
>
>> It's probably looking for the torque lib in lib instead of lib64. There 
>> should be a configure option to tell it --with-tm-libdir or something like 
>> that - check "configure -h"
>>
>>
>> On May 2, 2011, at 6:22 PM, Jim Kusznir wrote:
>>
>>> Hi all:
>>>
>>> I'm trying to build openmpi 1.4.3 against PGI 11.4 on my Rocks 5.1
>>> system.  My "tried and true" build command for OpenMPI is:
>>>
>>> CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 rpmbuild -bb --define
>>> 'install_in_opt 1' --define 'install_modulefile 1' --define
>>> 'modules_rpm_name environment-modules' --define 'build_all_in_one_rpm
>>> 0'  --define 'configure_options --with-tm=/opt/torque' --define '_name
>>> openmpi-pgi2011' --define 'use_default_rpm_opt_flags 0'
>>> openmpi-1.4.3.spec
>>>
>>> This is what I've used to build openmpi 1.4.3 for gcc and against PGI
>>> 8.x (our last version of PGI installed).  This time, it's not working,
>>> though, and with what I consider to be a very strange failure point:
>>>
>>> --- MCA component plm:tm (m4 configuration macro)
>>> checking for MCA component plm:tm compile mode... dso
>>> checking --with-tm value... sanity check ok (/opt/torque)
>>> checking for pbs-config... /opt/torque/bin/pbs-config
>>> checking tm.h usability... yes
>>> checking tm.h presence... yes
>>> checking for tm.h... yes
>>> checking for tm_finalize... no
>>> checking tm.h usability... yes
>>> checking tm.h presence... yes
>>> checking for tm.h... yes
>>> looking for library in lib
>>> checking for tm_init in -lpbs... no
>>> looking for library in lib64
>>> checking for tm_init in -lpbs... no
>>> looking for library in lib
>>> checking for tm_init in -ltorque... no
>>> looking for library in lib64
>>> checking for tm_init in -ltorque... no
>>> configure: error: TM support requested but not found.  Aborting
>>> error: Bad exit status from /var/tmp/rpm-tmp.7564 (%build)
>>>
>>>
>>> However, /opt/torque/ is present.  /opt/torque/bin/pbs-config returns:
>>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --prefix
>>> /opt/torque
>>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --package
>>> pbs
>>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --version
>>> 2.3.0
>>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --libs
>>> -L/opt/torque/lib64 -ltorque -Wl,--rpath -Wl,/opt/torque/lib64
>>>
>>> and /opt/torque/lib64 does have:
>>> [root@aeolus modulefiles]# ls /opt/torque/lib64
>>> libtorque.a  libtorque.la  libtorque.so  libtorque.so.2  libtorque.so.2.0.0
>>>
>>> so I'm a bit dumbfounded as to why configure doesn't "find" torque
>>> support...Any suggestions?
>>>
>>> --Jim
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com


config.log.gz
Description: GNU Zip compressed data


[OMPI users] Open MPI 1.4.3 - Mac OS X 10.6.7

2011-05-03 Thread Paul Cizmas
Hello:

I am trying to install Open MPI 1.4.3 on Mac OS X 10.6.7.

I have installed Gfortran GCC 4.4.4 and Absoft11.0.

It appears that I have i686-apple-darwin10-gcc-4.2.1.

When I run 

./configure --prefix=/opt/openmpi1.4.3 F77=/Applications/Absoft11.0/bin/f77

and 

./configure --prefix=/opt/openmpi1.4.3GF F77=/sw/bin/gfortran

in both cases I get the message:

==
It appears that your Fortran 77 compiler is unable to link against
object files created by your C compiler.  This typically indicates
one of a few possibilities:

  - A conflict between CFLAGS and FFLAGS
  - A problem with your compiler installation(s)
  - Different default build options between compilers (e.g., C
    building for 32 bit and Fortran building for 64 bit)
  - Incompatible compilers

Such problems can usually be solved by picking compatible compilers
and/or CFLAGS and FFLAGS.  More information (including exactly what
command was given to the compilers and what error resulted when the
commands were executed) is available in the config.log file in this
directory.
**
configure: error: C and Fortran 77 compilers are not link compatible.  Can not 
continue.
==

I read the FAQ but did not find suggestions about this problem.

What should be my next step?

Thank you,

Paul








Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

2011-05-03 Thread Steph Bredenhann
Done, thanks a lot, the result is attached again for final scrutiny.

And I have now moved to my Linux box so as NOT to make a mistake with
Windows again!

Regards


Steph

On Tue, 2011-05-03 at 13:59 -0600, Damien wrote:

> That last error is because you don't have permission to install
> to /opt as a regular user.  You need to run that command as  "sudo
> make install".
> 
> Damien
> 
> On 03/05/2011 1:55 PM, Steph Bredenhann wrote: 
> 
> > I think you are a genius!
> > 
> > The new result is attached, it was only the last step make install that
> > looked suspect.
> > 
> > I'll appreciate if you can look at these results?
> > 
> > While I am at it, thank you a million times for making this available to the
> > public! Without openmpi I would not have been able to complete my PhD!!!
> > 
> > Thanks
> > 
> > -----Original Message-----
> > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> > Behalf Of Jeff Squyres
> > Sent: Tuesday, May 03, 2011 21:27
> > To: Open MPI Users
> > Subject: Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1
> > 
> > Ah, I see why your output is munged -- there's a bunch of ^M's in there.
> > 
> > It looks like OMPI's configure script got mucked up somehow.  Did you expand
> > the tarball on a windows machine and copy it over to a Linux box, perchance?
> > If so, try expanding it directly on your Linux machine.

Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

2011-05-03 Thread Damien
That last error is because you don't have permission to install to /opt 
as a regular user.  You need to run that command as  "sudo make install".
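
A rough way to tell in advance whether sudo is needed is to test whether the nearest existing parent of the install prefix is writable. The following is a minimal sketch; can_install_to is a hypothetical helper name, and the check only looks at directory write permission for the current user.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the current user could create files under
# the given install prefix, by checking the nearest directory that exists.
can_install_to() {
    dir=$1
    # Walk up until an existing directory is found ("/" or "." at worst).
    while [ ! -d "$dir" ]; do
        dir=$(dirname "$dir")
    done
    [ -w "$dir" ]
}

# Example use with the prefix from the thread:
#   if can_install_to /opt/openmpi-1.4.3; then
#       make install
#   else
#       sudo make install
#   fi
```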


Damien

On 03/05/2011 1:55 PM, Steph Bredenhann wrote:

> I think you are a genius!
>
> The new result is attached, it was only the last step make install that
> looked suspect.
>
> I'll appreciate if you can look at these results?
>
> While I am at it, thank you a million times for making this available to the
> public! Without openmpi I would not have been able to complete my PhD!!!
>
> Thanks
>
> -----Original Message-----
> From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
> Behalf Of Jeff Squyres
> Sent: Tuesday, May 03, 2011 21:27
> To: Open MPI Users
> Subject: Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1
>
> Ah, I see why your output is munged -- there's a bunch of ^M's in there.
>
> It looks like OMPI's configure script got mucked up somehow.  Did you expand
> the tarball on a windows machine and copy it over to a Linux box, perchance?
> If so, try expanding it directly on your Linux machine.





Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

2011-05-03 Thread Steph Bredenhann
I think you are a genius!

The new result is attached; only the last step, make install, looked
suspect.

I'd appreciate it if you could look at these results.

While I am at it, thank you a million times for making this available to the
public! Without openmpi I would not have been able to complete my PhD!!!

Thanks

-----Original Message-----
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On
Behalf Of Jeff Squyres
Sent: Tuesday, May 03, 2011 21:27
To: Open MPI Users
Subject: Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

Ah, I see why your output is munged -- there's a bunch of ^M's in there.

It looks like OMPI's configure script got mucked up somehow.  Did you expand
the tarball on a windows machine and copy it over to a Linux box, perchance?
If so, try expanding it directly on your Linux machine.




Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

2011-05-03 Thread Jeff Squyres
Ah, I see why your output is munged -- there's a bunch of ^M's in there.

It looks like OMPI's configure script got mucked up somehow.  Did you expand 
the tarball on a windows machine and copy it over to a Linux box, perchance?  
If so, try expanding it directly on your Linux machine.
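
A quick way to confirm the ^M theory is to look for carriage returns in one of the scripts that failed, such as config/missing. The following is a minimal sketch with two hypothetical helpers (has_cr and strip_cr are illustration names, not Open MPI tools). Re-extracting the tarball on Linux remains the cleaner fix, since it restores every file at once.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file contains at least one carriage
# return (shown as ^M by many editors and pagers).
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Hypothetical helper: remove every carriage return from a file. The result
# is written back with cat so the file keeps its permissions (for example
# the executable bit on a script).
strip_cr() {
    tmp="$1.nocr.$$"
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1" && rm -f "$tmp"
}

# Example use from the top of the source tree:
#   has_cr config/missing && echo "config/missing has DOS line endings"
#   strip_cr config/missing
```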



On May 3, 2011, at 2:15 PM, Steph Bredenhann wrote:

> Thanks for the speedy reply. The required file with information is attached.
> 
> I first thought I must send the file to openmpi again, sorry if that was 
> wrong.
> 
> Thanks
> 
> 
> -- 
> Steph Bredenhann Pr.Eng Pr.CPM
> 
> 
> Quoting Jeff Squyres :
> 
>> Your output appears jumbled.  Can you send all the data listed here:
>> 
>>    http://www.open-mpi.org/community/help/
>> 
>> On May 3, 2011, at 1:36 PM, Steph Bredenhann wrote:
>> 
>>> Dear Sir/Madam
>>> 
>>> I want to build openmpi for use with INTEL compilers (version 11.1) on my
>>> Ubuntu 10.10 x64 system. I am using the guidelines from
>>> 
>>> http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
>>> 
>>> and specifically the following instructions:
>>> 
>>> 
>>> ./configure --prefix=/usr/local CC=icc CXX=icpc F77=ifort FC=ifort
>>> ... output of configure ...
>>> make all install
>>> ... output of build and installation ...
>>> 
>>> The result is shown below. As can be seen it was unsuccessful. I'll
>>> appreciate some guidance here as I am nearing the deadline for a project
>>> that is part of my research for my PhD.
>>> 
>>> Thanks in advance.
>>> 
>>> steph@sjb-linux:/src/openmpi-1.4.3$ ./configure --prefix=/opt/openmpi-1.4.3
>>> CC=icc CXX=icpc F77=ifort FC=ifort
>>> checking for a BSD-compatible install... /usr/bin/install -c
>>> checking whether build environment is sane... yes
>>> : command not foundconfig/missing: line 3:
>>> : command not foundconfig/missing: line 5:
>>> : command not foundconfig/missing: line 9:
>>> : command not foundconfig/missing: line 14:
>>> : command not foundconfig/missing: line 19:
>>> : command not foundconfig/missing: line 24:
>>> : command not foundconfig/missing: line 29:
>>> /src/openmpi-1.4.3/config/missing: line 49: syntax error near unexpected token
>>> `'n
>>> 'src/openmpi-1.4.3/config/missing: line 49: `case $1 in
>>> configure: WARNING: `missing' script is too old or missing
>>> checking for a thread-safe mkdir -p... /bin/mkdir -p
>>> checking for gawk... gawk
>>> checking whether make sets $(MAKE)... yes
>>> checking how to create a ustar tar archive... gnutar
>>> 
>>> 
>> 
>>> == Configuring Open MPI
>>> 
>> 
>>> 
>>> *** Checking versions
>>> : integer expression expected 3
>>> : integer expression expected 0
>>> .4ecking Open MPI version... 1
>>> checking Open MPI release date... Oct 05, 2010
>>> checking Open MPI Subversion repository version... r23834
>>> : integer expression expected 3
>>> : integer expression expected 0
>>> .4ecking Open Run-Time Environment version... 1
>>> checking Open Run-Time Environment release date... Oct 05, 2010
>>> checking Open Run-Time Environment Subversion repository version... r23834
>>> : integer expression expected 3
>>> : integer expression expected 0
>>> .4ecking Open Portable Access Layer version... 1
>>> checking Open Portable Access Layer release date... Oct 05, 2010
>>> checking Open Portable Access Layer Subversion repository version... r23834
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> : command not found
>>> 
>>> *** Initialization, setup
>>> configure: builddir: /src/openmpi-1.4.3
>>> configure: srcdir: /src/openmpi-1.4.3
>>> configure: error: cannot run /bin/sh config/config.sub
>>> steph@sjb-linux:/src/openmpi-1.4.3$ make all install
>>> make: *** No rule to make target `all'.  Stop.
>>> steph@sjb-linux:/src/openmpi-1.4.3$ make install
>>> make: *** No rule to make target `install'.  Stop.
>>> steph@sjb-linux:/src/openmpi-1.4.3$
>>> 
>>> 
>>> Regards
>>> 
>>> Steph Bredenhann
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> This message was sent by Adept Internet's webmail.
>>> http://www.adept.co.za/
>>> 
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>> 
>> 
>> 

Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

2011-05-03 Thread Steph Bredenhann
Thanks for the speedy reply. The required file with information is attached.

I first thought I must send the file to openmpi again, sorry if that was wrong.

Thanks


-- 
Steph Bredenhann Pr.Eng Pr.CPM


Quoting Jeff Squyres :

> Your output appears jumbled.  Can you send all the data listed here:
>
> http://www.open-mpi.org/community/help/
>
> On May 3, 2011, at 1:36 PM, Steph Bredenhann wrote:
>
> > Dear Sir/Madam
> >
> > I want to build openmpi for use with INTEL compilers (version 11.1) on my Ubuntu
> > 10.10 x64 system. I am using the guidelines from
> > http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> > and specifically the following instructions:
> >
> >
> > ./configure --prefix=/usr/local CC=icc CXX=icpc F77=ifort FC=ifort
> > ... output of configure ...
> > make all install
> > ... output of build and installation ...
> >
> > The result is shown below. As can be seen it was unsuccessful. I'll appreciate
> > some guidance here as I am nearing the deadline for a project that is part of
> > my research for my PhD.
> >
> > Thanks in advance.
> >
> > steph@sjb-linux:/src/openmpi-1.4.3$ ./configure --prefix=/opt/openmpi-1.4.3
> > CC=icc CXX=icpc F77=ifort FC=ifort
> > checking for a BSD-compatible install... /usr/bin/install -c
> > checking whether build environment is sane... yes
> > : command not foundconfig/missing: line 3:
> > : command not foundconfig/missing: line 5:
> > : command not foundconfig/missing: line 9:
> > : command not foundconfig/missing: line 14:
> > : command not foundconfig/missing: line 19:
> > : command not foundconfig/missing: line 24:
> > : command not foundconfig/missing: line 29:
> > /src/openmpi-1.4.3/config/missing: line 49: syntax error near unexpected token
> > `'n
> > 'src/openmpi-1.4.3/config/missing: line 49: `case $1 in
> > configure: WARNING: `missing' script is too old or missing
> > checking for a thread-safe mkdir -p... /bin/mkdir -p
> > checking for gawk... gawk
> > checking whether make sets $(MAKE)... yes
> > checking how to create a ustar tar archive... gnutar
> >
> >
> > == Configuring Open MPI
> >
> >
> > *** Checking versions
> > : integer expression expected 3
> > : integer expression expected 0
> > .4ecking Open MPI version... 1
> > checking Open MPI release date... Oct 05, 2010
> > checking Open MPI Subversion repository version... r23834
> > : integer expression expected 3
> > : integer expression expected 0
> > .4ecking Open Run-Time Environment version... 1
> > checking Open Run-Time Environment release date... Oct 05, 2010
> > checking Open Run-Time Environment Subversion repository version... r23834
> > : integer expression expected 3
> > : integer expression expected 0
> > .4ecking Open Portable Access Layer version... 1
> > checking Open Portable Access Layer release date... Oct 05, 2010
> > checking Open Portable Access Layer Subversion repository version... r23834
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> > : command not found
> >
> > *** Initialization, setup
> > configure: builddir: /src/openmpi-1.4.3
> > configure: srcdir: /src/openmpi-1.4.3
> > configure: error: cannot run /bin/sh config/config.sub
> > steph@sjb-linux:/src/openmpi-1.4.3$ make all install
> > make: *** No rule to make target `all'.  Stop.
> > steph@sjb-linux:/src/openmpi-1.4.3$ make install
> > make: *** No rule to make target `install'.  Stop.
> > steph@sjb-linux:/src/openmpi-1.4.3$
> >
> >
> > Regards
> >
> > Steph Bredenhann
> >
> >
> >
> >
> >
> >
> > --
> > This message was sent by Adept Internet's webmail.
> > http://www.adept.co.za/
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

-- 
This message was sent by Adept Internet's webmail. 
http://www.adept.co.za/


ompi-output.tar.bz2
Description: application/bzip


Re: [OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

2011-05-03 Thread Jeff Squyres
Your output appears jumbled.  Can you send all the data listed here:

http://www.open-mpi.org/community/help/
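
One possible reading of the jumbled output, offered as an assumption and not
something confirmed in this thread: lines such as ": command not found" that
overwrite the start of the message are what a shell typically prints when a
script contains carriage returns (CRLF line endings). The sketch below checks
a file for carriage returns and writes a cleaned copy. The helper names
has_cr and strip_cr are made up for illustration; config/missing is the path
that appears in the configure log.

```shell
# Assumption: scripts in the unpacked source tree may carry CRLF line endings.
# A literal carriage return, stored once for use as a grep pattern.
cr=$(printf '\r')

# Succeed if the file contains at least one carriage return.
has_cr() {
    grep -q "$cr" "$1"
}

# Write a copy of the file with every carriage return removed.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Example usage inside the unpacked source tree:
#   has_cr config/missing && echo "config/missing contains CR characters"
#   strip_cr config/missing config/missing.fixed
```

If carriage returns do turn up throughout the tree, unpacking the tarball
again directly on the Linux machine is probably simpler than cleaning files
one at a time.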

On May 3, 2011, at 1:36 PM, Steph Bredenhann wrote:

> Dear Sir/Madam
> 
> I want to build openmpi for use with INTEL compilers (version 11.1) on my Ubuntu
> 10.10 x64 system. I am using the guidelines from
> http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
> and specifically the following instructions:
> 
> 
> ./configure --prefix=/usr/local CC=icc CXX=icpc F77=ifort FC=ifort
> ... output of configure ...
> make all install
> ... output of build and installation ...
> 
> The result is shown below. As can be seen it was unsuccessful. I'll appreciate
> some guidance here as I am nearing the deadline for a project that is part of
> my research for my PhD.
> 
> Thanks in advance.
> 
> steph@sjb-linux:/src/openmpi-1.4.3$ ./configure --prefix=/opt/openmpi-1.4.3
> CC=icc CXX=icpc F77=ifort FC=ifort
> checking for a BSD-compatible install... /usr/bin/install -c
> checking whether build environment is sane... yes
> : command not foundconfig/missing: line 3:
> : command not foundconfig/missing: line 5:
> : command not foundconfig/missing: line 9:
> : command not foundconfig/missing: line 14:
> : command not foundconfig/missing: line 19:
> : command not foundconfig/missing: line 24:
> : command not foundconfig/missing: line 29:
> /src/openmpi-1.4.3/config/missing: line 49: syntax error near unexpected token
> `'n
> 'src/openmpi-1.4.3/config/missing: line 49: `case $1 in
> configure: WARNING: `missing' script is too old or missing
> checking for a thread-safe mkdir -p... /bin/mkdir -p
> checking for gawk... gawk
> checking whether make sets $(MAKE)... yes
> checking how to create a ustar tar archive... gnutar
> 
> 
> == Configuring Open MPI
> 
> 
> *** Checking versions
> : integer expression expected 3
> : integer expression expected 0
> .4ecking Open MPI version... 1
> checking Open MPI release date... Oct 05, 2010
> checking Open MPI Subversion repository version... r23834
> : integer expression expected 3
> : integer expression expected 0
> .4ecking Open Run-Time Environment version... 1
> checking Open Run-Time Environment release date... Oct 05, 2010
> checking Open Run-Time Environment Subversion repository version... r23834
> : integer expression expected 3
> : integer expression expected 0
> .4ecking Open Portable Access Layer version... 1
> checking Open Portable Access Layer release date... Oct 05, 2010
> checking Open Portable Access Layer Subversion repository version... r23834
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> : command not found
> 
> *** Initialization, setup
> configure: builddir: /src/openmpi-1.4.3
> configure: srcdir: /src/openmpi-1.4.3
> configure: error: cannot run /bin/sh config/config.sub
> steph@sjb-linux:/src/openmpi-1.4.3$ make all install
> make: *** No rule to make target `all'.  Stop.
> steph@sjb-linux:/src/openmpi-1.4.3$ make install
> make: *** No rule to make target `install'.  Stop.
> steph@sjb-linux:/src/openmpi-1.4.3$
> 
> 
> Regards
> 
> Steph Bredenhann
> 
> 
> 
> 
> 
> 
> -- 
> This message was sent by Adept Internet's webmail. 
> http://www.adept.co.za/
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] BUILDING OPENMPI ON UBUNTU WITH INTEL 11.1

2011-05-03 Thread Steph Bredenhann
Dear Sir/Madam

I want to build openmpi for use with INTEL compilers (version 11.1) on my Ubuntu
10.10 x64 system. I am using the guidelines from
http://software.intel.com/en-us/articles/performance-tools-for-software-developers-building-open-mpi-with-the-intel-compilers/
and specifically the following instructions:


./configure --prefix=/usr/local CC=icc CXX=icpc F77=ifort FC=ifort
... output of configure ...
make all install
... output of build and installation ...

The result is shown below. As can be seen it was unsuccessful. I'll appreciate
some guidance here as I am nearing the deadline for a project that is part of
my research for my PhD.

Thanks in advance.

steph@sjb-linux:/src/openmpi-1.4.3$ ./configure --prefix=/opt/openmpi-1.4.3
CC=icc CXX=icpc F77=ifort FC=ifort
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
: command not foundconfig/missing: line 3:
: command not foundconfig/missing: line 5:
: command not foundconfig/missing: line 9:
: command not foundconfig/missing: line 14:
: command not foundconfig/missing: line 19:
: command not foundconfig/missing: line 24:
: command not foundconfig/missing: line 29:
/src/openmpi-1.4.3/config/missing: line 49: syntax error near unexpected token
`'n
'src/openmpi-1.4.3/config/missing: line 49: `case $1 in
configure: WARNING: `missing' script is too old or missing
checking for a thread-safe mkdir -p... /bin/mkdir -p
checking for gawk... gawk
checking whether make sets $(MAKE)... yes
checking how to create a ustar tar archive... gnutar


== Configuring Open MPI


*** Checking versions
: integer expression expected 3
: integer expression expected 0
.4ecking Open MPI version... 1
checking Open MPI release date... Oct 05, 2010
checking Open MPI Subversion repository version... r23834
: integer expression expected 3
: integer expression expected 0
.4ecking Open Run-Time Environment version... 1
checking Open Run-Time Environment release date... Oct 05, 2010
checking Open Run-Time Environment Subversion repository version... r23834
: integer expression expected 3
: integer expression expected 0
.4ecking Open Portable Access Layer version... 1
checking Open Portable Access Layer release date... Oct 05, 2010
checking Open Portable Access Layer Subversion repository version... r23834
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found
: command not found

*** Initialization, setup
configure: builddir: /src/openmpi-1.4.3
configure: srcdir: /src/openmpi-1.4.3
configure: error: cannot run /bin/sh config/config.sub
steph@sjb-linux:/src/openmpi-1.4.3$ make all install
make: *** No rule to make target `all'.  Stop.
steph@sjb-linux:/src/openmpi-1.4.3$ make install
make: *** No rule to make target `install'.  Stop.
steph@sjb-linux:/src/openmpi-1.4.3$


Regards

Steph Bredenhann






-- 
This message was sent by Adept Internet's webmail. 
http://www.adept.co.za/



Re: [OMPI users] btl_openib_cpc_include rdmacm questions

2011-05-03 Thread Dave Love
Brock Palen  writes:

> We managed to have another user hit the bug that causes collectives (this 
> time MPI_Bcast() ) to hang on IB that was fixed by setting:
>
> btl_openib_cpc_include rdmacm

Could someone explain this?  We also have problems with collective hangs
with openib/mlx4 (specifically in IMB), but not with psm, and I couldn't
see any relevant issues filed.  However, rdmacm isn't an available value
for that parameter with our 1.4.3 or 1.5.3 installations, only oob (not
that I understand what these things are...).



Re: [OMPI users] OpenMPI LS-DYNA Connection refused

2011-05-03 Thread Terry Dontje
Looking at your output more closely, the "connect to address" message below 
doesn't match any message I see in the source code.  Also, "trying normal 
rsh (/usr/bin/rsh)" looks odd to me.


You may want to set the MCA parameter mpi_abort_delay, attach a 
debugger to the aborting process, and dump out a stack trace.  That 
should give a better idea of where the failure is being triggered.  See 
question 4 at http://www.open-mpi.org/faq/?category=debugging for 
more info on the parameter.
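
A minimal sketch of that suggestion, under stated assumptions: the delay of
300 seconds and the program name ./my_solver are placeholders, and gdb is
just one debugger that can attach to a running process. MCA parameters can be
given on the mpirun command line or through an OMPI_MCA_-prefixed
environment variable, which is the form used here.

```shell
# Ask Open MPI to pause inside MPI_Abort so a debugger can be attached.
# 300 is an arbitrary example value (seconds).
export OMPI_MCA_mpi_abort_delay=300

# Equivalent command-line form (placeholder program name):
#   mpirun --mca mpi_abort_delay 300 -np 56 ./my_solver
#
# When the abort message reports a host and PID, on that host:
#   gdb -p <PID>
#   (gdb) thread apply all bt
```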


--td

On 05/02/2011 03:40 PM, Robert Walters wrote:


I've attached the typical error message I've been getting. This is 
from a run I initiated this morning. The first few lines or so are 
related to the LS-DYNA program and are just there to let you know it's 
running successfully for an hour and a half.


What's interesting is this doesn't happen on every job I run, and will 
recur for the same simulation. For instance, Simulation A will run for 
40 hours, and complete successfully. Simulation B will run for 6 
hours, and die from an error. Any further attempts to run simulation B 
will always end from an error. This makes me think there is some kind 
of bad calculation happening that OpenMPI doesn't know how to handle, 
or LS-DYNA doesn't know how to pass to OpenMPI. On the other hand, 
this particular simulation is one of those "benchmarks" and everyone 
runs it. I should not be getting errors from the FE code itself. 
Odd... I think I'll try this as an SMP job as well as an MPP job over 
a single node and see if the issue continues. That way I can figure 
out if it's OpenMPI related or FE code related, but as I mentioned, I 
don't think it is FE code related since others have successfully run 
this particular benchmarking simulation.


*_Error Message:_*

 Parallel execution with 56 MPP proc
 NLQ used/max   152/   152
 Start time   05/02/2011 10:02:20
 End time     05/02/2011 11:24:46
 Elapsed time 4946 seconds (  1 hours 22 min. 26 sec.) for 9293 cycles

 E r r o r   t e r m i n a t i o n
--
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1525207032.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--
connect to address xx.xxx.xx.xxx port 544: Connection refused
connect to address xx.xxx.xx.xxx port 544: Connection refused
trying normal rsh (/usr/bin/rsh)
--
mpirun has exited due to process rank 0 with PID 24488 on
node allision exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--

Regards,

Robert Walters



*From:*users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] 
*On Behalf Of *Terry Dontje

*Sent:* Monday, May 02, 2011 2:50 PM
*To:* us...@open-mpi.org
*Subject:* Re: [OMPI users] OpenMPI LS-DYNA Connection refused

On 05/02/2011 02:04 PM, Robert Walters wrote:

Terry,

I was under the impression that all connections are made because of 
the nature of the program that OpenMPI is invoking. LS-DYNA is a 
finite element solver and for any given simulation I run, the cores on 
each node must constantly communicate with one another to check for 
various occurrences (contact with various pieces/parts, updating nodal 
coordinates, etc...).


You might be right, the connections might have been established but 
the error message you state (connection refused) seems out of place if 
the connection was already established.


Were there any more error messages from OMPI other than "connection 
refused"?  If so, could you provide that output to us? It may 
give us a hint about where in the library things are going wrong.


I've run the program using --mca mpi_preconnect_mpi 1 and the 
simulation has started itself up successfully which I think means that 
the mpi_preconnect passed since all of the child processes have 
started up on each individual node. Thanks for the suggestion though, 
it's a good place to start.


Yeah, it possibly could be telling if things do work with this setting.

I've been worried (though I have no basis for it) that messages may be 
getting queued up and hitting some kind of ceiling or timeout. As a 
finite element code, I think the communication occurs on a large 
scale. Lots of very small packets going back and forth quickly. A few 
studies have been done by the High Performance Computing Advisory 
Council 
(http://www.hpcadvisorycouncil.com/pdf/LS-DYNA%20_analysis.pdf) and 
they've suggested that LS-DYNA communicates at very, very high rates 
(Not 

Re: [OMPI users] OpenMPI LS-DYNA Connection refused

2011-05-03 Thread Terry Dontje

A little more clarification:

1.  Simulations that fail always seem to fail?
2.  Does the same simulation always fail between the same processes (how 
    about nodes)? I thought you said no previously.
3.  Did the mpi_preconnect_mpi help any?
4.  Are there any informational messages in the /var/log/messages file 
    around or before the abort?
5.  Have you tried netstat -s 1 while the program is running on one of 
    the nodes that fail, to see whether any of the failure-type events 
    are spiking?

The error code coming back from MPI_Abort seems really odd.  I am 
curious whether the "connection refused" is a result of the abort or of 
something else.


--td
On 05/02/2011 03:40 PM, Robert Walters wrote:


I've attached the typical error message I've been getting. This is 
from a run I initiated this morning. The first few lines or so are 
related to the LS-DYNA program and are just there to let you know it's 
running successfully for an hour and a half.


What's interesting is this doesn't happen on every job I run, and will 
recur for the same simulation. For instance, Simulation A will run for 
40 hours, and complete successfully. Simulation B will run for 6 
hours, and die from an error. Any further attempts to run simulation B 
will always end from an error. This makes me think there is some kind 
of bad calculation happening that OpenMPI doesn't know how to handle, 
or LS-DYNA doesn't know how to pass to OpenMPI. On the other hand, 
this particular simulation is one of those "benchmarks" and everyone 
runs it. I should not be getting errors from the FE code itself. 
Odd... I think I'll try this as an SMP job as well as an MPP job over 
a single node and see if the issue continues. That way I can figure 
out if it's OpenMPI related or FE code related, but as I mentioned, I 
don't think it is FE code related since others have successfully run 
this particular benchmarking simulation.


*_Error Message:_*

 Parallel execution with 56 MPP proc
 NLQ used/max   152/   152
 Start time   05/02/2011 10:02:20
 End time     05/02/2011 11:24:46
 Elapsed time 4946 seconds (  1 hours 22 min. 26 sec.) for 9293 cycles

 E r r o r   t e r m i n a t i o n
--
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode -1525207032.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--
connect to address xx.xxx.xx.xxx port 544: Connection refused
connect to address xx.xxx.xx.xxx port 544: Connection refused
trying normal rsh (/usr/bin/rsh)
--
mpirun has exited due to process rank 0 with PID 24488 on
node allision exiting without calling "finalize". This may
have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--

Regards,

Robert Walters



*From:*users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] 
*On Behalf Of *Terry Dontje

*Sent:* Monday, May 02, 2011 2:50 PM
*To:* us...@open-mpi.org
*Subject:* Re: [OMPI users] OpenMPI LS-DYNA Connection refused

On 05/02/2011 02:04 PM, Robert Walters wrote:

Terry,

I was under the impression that all connections are made because of 
the nature of the program that OpenMPI is invoking. LS-DYNA is a 
finite element solver and for any given simulation I run, the cores on 
each node must constantly communicate with one another to check for 
various occurrences (contact with various pieces/parts, updating nodal 
coordinates, etc...).


You might be right, the connections might have been established but 
the error message you state (connection refused) seems out of place if 
the connection was already established.


Were there any more error messages from OMPI other than "connection 
refused"?  If so, could you provide that output to us? It may 
give us a hint about where in the library things are going wrong.


I've run the program using --mca mpi_preconnect_mpi 1 and the 
simulation has started itself up successfully which I think means that 
the mpi_preconnect passed since all of the child processes have 
started up on each individual node. Thanks for the suggestion though, 
it's a good place to start.


Yeah, it possibly could be telling if things do work with this setting.

I've been worried (though I have no basis for it) that messages may be 
getting queued up and hitting some kind of ceiling or timeout. As a 
finite element code, I think the communication occurs on a large 
scale. Lots of very small packets going back and forth quickly. A few 
studies have been done by the High Performance Computing 

Re: [OMPI users] Building openmpi with PGI 11.4: won't find torque??

2011-05-03 Thread Jeff Squyres
It should search both tmdir/lib and tmdir/lib64 by default, IIRC.

Please send your config.log (please compress); it'll contain the specific 
reason why configure didn't find libtorque.
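
As a rough sketch of how to look at that log before sending it: config.log
records the command and compiler output for each check, so searching for the
tm_init and tm_finalize probes points at the relevant entries. The helper
name tm_checks is made up for illustration, and gzip is only one way to
compress the file.

```shell
# Print, with line numbers, the config.log lines that mention the TM symbols
# configure probes for (tm_init, tm_finalize).
tm_checks() {
    grep -n -E 'tm_init|tm_finalize' "$1"
}

# Example usage in the build directory:
#   tm_checks config.log
#   gzip -c config.log > config.log.gz    # compressed copy to attach
```

The lines just after each match in config.log normally show the exact link
command and the linker's error message.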


On May 2, 2011, at 10:21 PM, Ralph Castain wrote:

> It's probably looking for the torque lib in lib instead of lib64. There 
> should be a configure option to tell it --with-tm-libdir or something like 
> that - check "configure -h"
> 
> 
> On May 2, 2011, at 6:22 PM, Jim Kusznir wrote:
> 
>> Hi all:
>> 
>> I'm trying to build openmpi 1.4.3 against PGI 11.4 on my Rocks 5.1
>> system.  My "tried and true" build command for OpenMPI is:
>> 
>> CC=pgcc CXX=pgCC F77=pgf77 FC=pgf90 rpmbuild -bb --define
>> 'install_in_opt 1' --define 'install_modulefile 1' --define
>> 'modules_rpm_name environment-modules' --define 'build_all_in_one_rpm
>> 0'  --define 'configure_options --with-tm=/opt/torque' --define '_name
>> openmpi-pgi2011' --define 'use_default_rpm_opt_flags 0'
>> openmpi-1.4.3.spec
>> 
>> This is what I've used to build openmpi 1.4.3 for gcc and against PGI
>> 8.x (our last version of PGI installed).  This time, it's not working,
>> though, and with what I consider to be a very strange failure point:
>> 
>> --- MCA component plm:tm (m4 configuration macro)
>> checking for MCA component plm:tm compile mode... dso
>> checking --with-tm value... sanity check ok (/opt/torque)
>> checking for pbs-config... /opt/torque/bin/pbs-config
>> checking tm.h usability... yes
>> checking tm.h presence... yes
>> checking for tm.h... yes
>> checking for tm_finalize... no
>> checking tm.h usability... yes
>> checking tm.h presence... yes
>> checking for tm.h... yes
>> looking for library in lib
>> checking for tm_init in -lpbs... no
>> looking for library in lib64
>> checking for tm_init in -lpbs... no
>> looking for library in lib
>> checking for tm_init in -ltorque... no
>> looking for library in lib64
>> checking for tm_init in -ltorque... no
>> configure: error: TM support requested but not found.  Aborting
>> error: Bad exit status from /var/tmp/rpm-tmp.7564 (%build)
>> 
>> 
>> However, /opt/torque/ is present.  /opt/torque/bin/pbs-config returns:
>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --prefix
>> /opt/torque
>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --package
>> pbs
>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --version
>> 2.3.0
>> [root@aeolus modulefiles]# /opt/torque/bin/pbs-config --libs
>> -L/opt/torque/lib64 -ltorque -Wl,--rpath -Wl,/opt/torque/lib64
>> 
>> and /opt/torque/lib64 does have:
>> [root@aeolus modulefiles]# ls /opt/torque/lib64
>> libtorque.a  libtorque.la  libtorque.so  libtorque.so.2  libtorque.so.2.0.0
>> 
>> so I'm a bit dumbfounded as to why configure doesn't "find" torque
>> support...Any suggestions?
>> 
>> --Jim
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] WRF Problem running in Parallel on multiple nodes (cluster)

2011-05-03 Thread Ralph Castain
The error message is telling you the problem. You don't have your remote path 
set so it can find the OMPI installation on the remote hosts. Look at the OMPI 
FAQ section for more info if you are unsure how to set paths on remote hosts.
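
A minimal sketch of one way to do that, assuming Open MPI is installed under
/opt/openmpi on every node (that prefix is a placeholder, not taken from this
thread). The two export lines would go in a shell startup file that is also
read for non-interactive logins on the remote hosts, so that orted and its
libraries can be found there.

```shell
# Assumed install prefix; replace with the real Open MPI location.
OMPI_PREFIX=/opt/openmpi

# Make orted and mpirun findable, and the Open MPI libraries loadable.
export PATH="$OMPI_PREFIX/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI_PREFIX/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Alternative that leaves the startup files alone: pass the prefix to mpirun.
#   mpirun --prefix "$OMPI_PREFIX" -np 10 -hostfile /home/pmdtest/hostlist ./real.exe
```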


On May 3, 2011, at 2:04 AM, Ahsan Ali wrote:

> Hello,
> 
> I am able to run WRFV3.2.1 using mpirun on multiple cores of a single machine, 
> but when I try to run it across multiple nodes in the cluster using a hostlist, 
> I get an error. The compute nodes are mounted with the master node during 
> boot using NFS. I get the following error. Please help.
> 
> [root@pmd02 em_real]# mpirun -np 10 -hostfile /home/pmdtest/hostlist 
> ./real.exe
> bash: orted: command not found
> bash: orted: command not found
> --
> A daemon (pid 22006) died unexpectedly with status 127 while attempting
> to launch so we are aborting.
> 
> There may be more information reported by the environment (see above).
> 
> This may be because the daemon was unable to find all the needed shared
> libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
> location of the shared libraries on the remote nodes and this will
> automatically be forwarded to the remote nodes.
> --
> --
> mpirun noticed that the job aborted, but has no info as to the process
> that caused that situation.
> --
> mpirun: clean termination accomplished
> 
> 
> -- 
> Syed Ahsan Ali Bokhari 
> Electronic Engineer (EE)
> 
> Research & Development Division
> Pakistan Meteorological Department H-8/4, Islamabad.
> Phone # off  +92518358714
> Cell # +923155145014
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] problems with the -xterm option

2011-05-03 Thread jody
Launching xterm with mpirun on a remote platform without a command
simply opens an xterm window, which sits there until you type exit into it
or close it with the frame's close button
(of course, only if the display is forwarded to the local machine).



On Mon, May 2, 2011 at 4:30 PM, Ralph Castain  wrote:
>
> On May 2, 2011, at 8:21 AM, jody wrote:
>
>> Hi
>> Well, the difference is that one time i call the application
>> 'HelloMPI' with the '--xterm' option,
>> whereas in my previous mail i am calling the application 'xterm'
>> (without the '--xterm' option)
>
> Ah, well that might explain it. I don't know how xterm would react to just 
> being launched by mpirun onto a remote platform without any command to run. I 
> can't explain what the plm verbosity has to do with anything, though.
>
>> Jody
>>
>> On Mon, May 2, 2011 at 4:08 PM, Ralph Castain  wrote:
>>>
>>> On May 2, 2011, at 7:56 AM, jody wrote:
>>>
>>>> Hi Ralph
>>>>
>>>> Thank You for doing the fix.
>>>>
>>>> Do you perhaps also have an idea what is going on when i try to start
>>>> xterm (or probably another X application) on a remote host?
>>>> In this case it is not enough to specify the '--leave-session-attached' 
>>>> option.
>>>>
>>>> These calls won't open any xterms
>>>>  mpirun -np 4 -host squid_0 -mca plm_rsh_agent "ssh -Y"  -mca
>>>> plm_base_verbose 1 xterm
>>>>  mpirun -np 4 -host squid_0 -mca plm_rsh_agent "ssh -Y"
>>>> --leave-session-attached xterm
>>>>  mpirun -np 4 -host squid_0 -mca plm_rsh_agent "ssh -Y"  -mca
>>>> odls_base_verbose 5 xterm
>>>>  mpirun -np 4 -host squid_0 -mca plm_rsh_agent "ssh -Y"  -mca
>>>> odls_base_verbose 5 --leave-session-attached xterm
>>>>
>>>> But this will open the xterms:
>>>>  mpirun -np 4 -host squid_0 -mca plm_rsh_agent "ssh -Y"  -mca
>>>> plm_base_verbose 1  --leave-session-attached xterm
>>>>
>>>> Any verbosity level > 0 will open xterms, but with ' -mca
>>>> plm_base_verbose 0' there are again no xterms.
>>>>
>>>
>>> No earthly idea...this seems to contradict what you had below. You said you 
>>> were seeing the xterms with this cmd line:
>>>
>> I just found that everything works as expected if i use the
>> '--leave-session-attached' option (without the debug options):
>>  jody@chefli ~/share/neander $ mpirun -np 4 -host squid_0 -mca
>> plm_rsh_agent "ssh -Y"  --leave-session-attached  --xterm 0,1,2,3!
>> ./HelloMPI
>> The xterms are also opened if i do not use the '!' hold option.
>
>>>
>>> Did I miss something?
>>>
>>>
>>>> Thank You
>>>>  Jody
>>>>
>>>> On Mon, May 2, 2011 at 2:29 PM, Ralph Castain  wrote:
>
> On May 2, 2011, at 2:34 AM, jody wrote:
>
>> Hi Ralph
>>
>> I rebuilt open MPI 1.4.2 with the debug option on both chefli and 
>> squid_0.
>> The results are interesting!
>>
>> I wrote a small HelloMPI app which basically calls usleep for a pause
>> of 5 seconds.
>>
>> Now calling it as i did before, no MPI errors appear anymore, only the
>> display problems:
>>  jody@chefli ~/share/neander $ mpirun -np 1 -host squid_0 -mca
>> plm_rsh_agent "ssh -Y" --xterm 0 ./HelloMPI
>>  /usr/bin/xterm Xt error: Can't open display: localhost:10.0
>>
>> When i do the same call *with* the debug option, the xterm appears and
>> shows the output of HelloMPI!
>> I attach the output in ompidbg_1.txt (It also works if i call with
>> '-np 4' and '--xterm 0,1,2,3'
>
> Good!
>
>>
>> Calling hostname the same way does not open an xterm (cf. ompidbg_2.txt).
>>
>> If i use the hold-option, the xterm appears with the output of
>> 'hostname' (cf. ompidbg_3.txt)
>> The xterm opens after the line "launch complete for job..." has been
>> written (line 59)
>
> Okay, that's also expected. Like I said, without the "hold", the output 
> is generated so quickly that the window just flashes at best. I've had 
> similar experiences - hence the "hold" option.
>
>>
>> I just found that everything works as expected if i use the
>> '--leave-session-attached' option (without the debug options):
>>  jody@chefli ~/share/neander $ mpirun -np 4 -host squid_0 -mca
>> plm_rsh_agent "ssh -Y"  --leave-session-attached  --xterm 0,1,2,3!
>> ./HelloMPI
>> The xterms are also opened if i do not use the '!' hold option.
>
> Okay, I can understand why. The --leave-session-attached option just 
> tells mpirun to not daemonize the backend daemons - thus leaving the ssh 
> session alive. The debug options do the same thing, but turn on all the 
> debug output.
>
> The problem is that if you don't leave the ssh session alive, then the 
> xterm has no way back to your screen. By daemonizing, we sever that 
> connection.
>
> What I should do (and maybe used to do, but it got removed) is 
> automatically turn "on" the 

Re: [OMPI users] is there an equiv of iprove for bcast?

2011-05-03 Thread Randolph Pullen

From: Randolph Pullen 
Subject: Re: Re: [OMPI users] is there an equiv of iprove for bcast?
To: us...@open-mpi.or
Received: Monday, 2 May, 2011, 12:53 PM

Non blocking Bcasts or tests would do it.
I currently have the clearing-house solution working but it is unsatisfying 
because of its serial node. - As it scales it will overload this node.

The problem rephrased:
Instead of n*2 processes, I am having to use n*2+1 with the extra process 
serially receiving listener messages on behalf of the workers before 
transmitting these messages to workers in its comm_group.

Is there a way to Bcast directly from each listener to the worker pool?  
(listeners must monitor their ports most of the time and can't participate in 
global bcasts)
Not knowing which listener is going to transmit prevents the correct 
comm_group being used with Bcast calls.

--- On Sat, 30/4/11, Jeff Squyres  wrote:

From: Jeff Squyres 
Subject: Re: [OMPI users] is there an equiv of iprove for bcast?
To: randolph_pul...@yahoo.com.au, "Open MPI Users" 
Received: Saturday, 30 April, 2011, 7:17 AM

On Apr 29, 2011, at 1:21 AM, Randolph Pullen wrote:

> I am having a design issue:
> My server application has 2 processes per node, 1 listener and 1 worker.
> Each listener monitors a specified port for incoming TCP connections with the 
> goal that on receipt of a request it will distribute it over the workers in a 
> SIMD fashion.
> 
> My problem is this: how can I get the workers to accept work from any of the 
> listeners?
> Making a separate communicator does not help as the sender is unknown.  Other 
> than making a serial 'clearing house' process I can't think of a way - 
> Iprobe for Bcast would be useful.

I'm not quite sure I understand your question.

There currently is no probe for collectives, but MPI-3 has non-blocking 
collectives which you could MPI_Test for.  There's a 3rd party library 
implementation called libNBC (non-blocking collectives) that you could use 
until such things become natively available.

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/



[OMPI users] WRF Problem running in Parallel on multiple nodes (cluster)

2011-05-03 Thread Ahsan Ali
Hello,

I am able to run WRFV3.2.1 using mpirun on multiple cores of a single machine,
but when I try to run it across multiple nodes in the cluster using a hostlist,
I get an error. The compute nodes mount the master node over NFS during
boot. I get the following error. Please help.

[root@pmd02 em_real]# mpirun -np 10 -hostfile /home/pmdtest/hostlist
./real.exe
bash: orted: command not found
bash: orted: command not found
--
A daemon (pid 22006) died unexpectedly with status 127 while attempting
to launch so we are aborting.

There may be more information reported by the environment (see above).

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--
mpirun: clean termination accomplished
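
The "bash: orted: command not found" lines and exit status 127 above suggest
that the non-interactive ssh shell on the compute nodes does not have Open
MPI's bin directory on its PATH. A minimal sketch of how that could be checked
follows. The function names are hypothetical, and it assumes passwordless ssh
to each node, bash as the remote shell, and a hostfile with one host per line
(optionally followed by "slots=N"); none of this is confirmed by the post.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of Open MPI.

# Print the host names from a hostfile, skipping comments and blank lines.
hosts_in() {
    sed -e 's/#.*//' -e '/^[[:space:]]*$/d' "$1" | awk '{print $1}'
}

# Succeed if command $1 is found when PATH is set to exactly $2.
# Runs in a subshell so the caller's PATH is left untouched.
found_on_path() {
    ( PATH="$2"; command -v "$1" >/dev/null 2>&1 )
}

# Ask each node where orted lives, as seen by a non-interactive ssh shell.
# Needs working ssh access, so call it by hand: check_nodes <hostfile>
check_nodes() {
    for h in $(hosts_in "$1"); do
        printf '%s: ' "$h"
        ssh "$h" 'command -v orted || echo "orted NOT on PATH"' </dev/null
    done
}
```

Running "check_nodes /home/pmdtest/hostlist" would show which nodes cannot see
orted. If some cannot, two things that may help are exporting PATH and
LD_LIBRARY_PATH in the startup file that non-interactive logins read on the
compute nodes, or passing the install location to mpirun with its --prefix
option, e.g. "mpirun --prefix /opt/openmpi -np 10 -hostfile
/home/pmdtest/hostlist ./real.exe", where /opt/openmpi is only a placeholder
for the real install directory.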


-- 
Syed Ahsan Ali Bokhari
Electronic Engineer (EE)

Research & Development Division
Pakistan Meteorological Department H-8/4, Islamabad.
Phone # off  +92518358714
Cell # +923155145014