Re: [OMPI users] Compiling OpenMPI for i386 on a x86_64

2007-10-19 Thread Gurhan
On 10/19/07, Jim Kusznir  wrote:
> I caught some of that above... I suspect rpm's build environment for
> cross-platform building "leaves much to be desired"... At this point,
> I was thinking my best option would be to set up an i386 box and build
> the .i386 libs on that.
>

Multilib installations are not trivial. However, this particular issue
is just a small oversight. I haven't checked the CentOS spec file, but
the RHEL one doesn't pass any FCFLAGS argument to configure. So edit
your spec file, add the line

FCFLAGS="$RPM_OPT_FLAGS $XFLAGS"

to the configure invocation, and retry.
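
For illustration only, the configure line in the spec's %build section
might then look roughly like this (a sketch; the exact arguments,
macros, and the XFLAGS variable in the CentOS/RHEL openmpi spec file
may differ):

./configure --prefix=%{_prefix} --libdir=%{_libdir} \
        CFLAGS="$RPM_OPT_FLAGS $XFLAGS" \
        CXXFLAGS="$RPM_OPT_FLAGS $XFLAGS" \
        FFLAGS="$RPM_OPT_FLAGS $XFLAGS" \
        FCFLAGS="$RPM_OPT_FLAGS $XFLAGS"

When building with --target i386, $RPM_OPT_FLAGS already carries the
same -m32 -march=i386 flags shown in the quoted configure output below,
so passing it through FCFLAGS makes gfortran produce 32-bit objects
that match the 32-bit conftest.o produced by gcc.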

You wouldn't need an i386 box to do this, though if you can use one,
that's just fine. If hardware, time, or other constraints rule that
out, you should be able to do it just fine on your x86_64 machines;
you might just have to install some i386 dependencies.
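
If you go the x86_64 route, the build could look roughly like this (a
sketch, not tested here; the package names and the source RPM name are
only examples, and your package manager's syntax may differ):

# 32-bit build dependencies (example names)
yum install glibc-devel.i386 libgfortran.i386 libstdc++-devel.i386

# rebuild the 32-bit packages from the source RPM
setarch i386 rpmbuild --rebuild --target i386 openmpi-*.src.rpm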

thanks,
gurhan

> I must say, I'm pretty disappointed in rpm, as the x86_64 platform
> seems to "require" both 64- and 32-bit versions of its libs and devel
> files.  Yet, on an x86_64 platform, it doesn't appear that the 32-bit
> versions can be generated reliably.
>
> Unfortunately for me, I need the binaries in rpm form, as this will be
> "mass-deployed" to a ROCKs cluster, and the installer installs rpms.
>
> --Jim
>
> On 10/18/07, Gurhan  wrote:
> > Hello,
> >
> > configure:33918: gcc -DNDEBUG -O2 -g -pipe -m32 -march=i386
> > -mtune=pentium4 -fno-strict-aliasing -I. -c conftest.c
> > configure:33925: $? = 0
> > configure:33935: gfortran   conftestf.f90 conftest.o -o conftest
> > /usr/bin/ld: warning: i386 architecture of input file `conftest.o' is
> > incompatible with i386:x86-64 output
> > configure:33942: $? = 0
> > configure:33990: ./conftest
> > configure:33997: $? = 139
> > configure:34006: error: Could not determine size of LOGICAL
> >
> > Is this correct? We are feeding a 32-bit object file to be linked
> > into a 64-bit output executable. When the target is i386, shouldn't
> > -m32 -march=i386 be passed to gfortran as well in the instance
> > above, unless it's for negative testing?
> >
> > Thanks,
> > gurhan
> >
> >
> > On 10/18/07, Jim Kusznir  wrote:
> > > Attached is the requested info.  There's not much here, though...it
> > > dies pretty early in.
> > >
> > > --Jim
> > >
> > > On 10/17/07, Jeff Squyres  wrote:
> > > > On Oct 17, 2007, at 12:35 PM, Jim Kusznir wrote:
> > > >
> > > > > checking if Fortran 90 compiler supports LOGICAL... yes
> > > > > checking size of Fortran 90 LOGICAL... ./configure: line 34070:  7262
> > > > > Segmentation fault  ./conftest 1>&5 2>&1
> > > > > configure: error: Could not determine size of LOGICAL
> > > >
> > > > Awesome!  It looks like gfortran itself is seg faulting.
> > > >
> > > > Can you send all the information listed on the getting help page?
> > > >
> > > >  http://www.open-mpi.org/community/help/
> > > >
> > > > That will help confirm/deny whether it's gfortran itself that is seg
> > > > faulting.  If it's gfortran that's seg faulting, there's not much
> > > > that Open MPI can do...
> > > >
> > > > --
> > > > Jeff Squyres
> > > > Cisco Systems
> > > >


Re: [OMPI users] Compiling OpenMPI for i386 on a x86_64

2007-10-19 Thread Jim Kusznir
I caught some of that above... I suspect rpm's build environment for
cross-platform building "leaves much to be desired"... At this point,
I was thinking my best option would be to set up an i386 box and build
the .i386 libs on that.

I must say, I'm pretty disappointed in rpm, as the x86_64 platform
seems to "require" both 64- and 32-bit versions of its libs and devel
files.  Yet, on an x86_64 platform, it doesn't appear that the 32-bit
versions can be generated reliably.

Unfortunately for me, I need the binaries in rpm form, as this will be
"mass-deployed" to a ROCKs cluster, and the installer installs rpms.

--Jim

On 10/18/07, Gurhan  wrote:
> Hello,
>
> configure:33918: gcc -DNDEBUG -O2 -g -pipe -m32 -march=i386
> -mtune=pentium4 -fno-strict-aliasing -I. -c conftest.c
> configure:33925: $? = 0
> configure:33935: gfortran   conftestf.f90 conftest.o -o conftest
> /usr/bin/ld: warning: i386 architecture of input file `conftest.o' is
> incompatible with i386:x86-64 output
> configure:33942: $? = 0
> configure:33990: ./conftest
> configure:33997: $? = 139
> configure:34006: error: Could not determine size of LOGICAL
>
> Is this correct? We are feeding a 32-bit object file to be linked
> into a 64-bit output executable. When the target is i386, shouldn't
> -m32 -march=i386 be passed to gfortran as well in the instance above,
> unless it's for negative testing?
>
> Thanks,
> gurhan
>
>
> On 10/18/07, Jim Kusznir  wrote:
> > Attached is the requested info.  There's not much here, though...it
> > dies pretty early in.
> >
> > --Jim
> >
> > On 10/17/07, Jeff Squyres  wrote:
> > > On Oct 17, 2007, at 12:35 PM, Jim Kusznir wrote:
> > >
> > > > checking if Fortran 90 compiler supports LOGICAL... yes
> > > > checking size of Fortran 90 LOGICAL... ./configure: line 34070:  7262
> > > > Segmentation fault  ./conftest 1>&5 2>&1
> > > > configure: error: Could not determine size of LOGICAL
> > >
> > > Awesome!  It looks like gfortran itself is seg faulting.
> > >
> > > Can you send all the information listed on the getting help page?
> > >
> > >  http://www.open-mpi.org/community/help/
> > >
> > > That will help confirm/deny whether it's gfortran itself that is seg
> > > faulting.  If it's gfortran that's seg faulting, there's not much
> > > that Open MPI can do...
> > >
> > > --
> > > Jeff Squyres
> > > Cisco Systems
> > >


Re: [OMPI users] which alternative to OpenMPI should I choose?

2007-10-19 Thread Michael

On Oct 19, 2007, at 9:29 AM, Marcin Skoczylas wrote:


Jeff Squyres wrote:

On Oct 18, 2007, at 9:24 AM, Marcin Skoczylas wrote:



I assume this could be because of:

$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.125.17.0    *               255.255.255.0   U     0      0        0 eth1
192.168.12.0    *               255.255.255.0   U     0      0        0 eth1
161.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         192.125.17.1    0.0.0.0         UG    0      0        0 eth1



Actually the configuration here is quite strange; this is not a
private address. The head node sits on a public address from the
192.125.17.0 net (routable from outside); the workers are on
192.168.12.0.


I have a very similar configuration that works just fine with
OpenMPI. In my case the head node has three interfaces and the worker
nodes each have two interfaces; the configuration is roughly:

master: eth0: 192.168.x.x, eth1 & eth2 bonded to 10.0.0.1
node2: eth0 & eth1 bonded to 10.0.0.2
nodeN: eth0 & eth1 bonded to 10.0.0.N

So our "outside" communication with the head node is on the 192.168  
network and the internal communication is on the 10.0.0.x network.


In your case the "outside" communication is on the the 192.125  
network and the internal communication is on the 192.168 network.


The primary difference seems to be that you have all communication  
going over a single interface.


I'm a little surprised there is any problem at all with OpenMPI &  
your configuration as my configuration is more complicated.
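
For what it's worth, you can also steer which interfaces Open MPI's
TCP BTL uses with the btl_tcp_if_include / btl_tcp_if_exclude MCA
parameters. A rough sketch (the interface names and program name are
placeholders for your own setup):

# only use the internal interface(s) for MPI traffic
mpirun --mca btl_tcp_if_include bond0 -np 4 ./my_mpi_program

# or exclude the loopback and public interfaces instead
mpirun --mca btl_tcp_if_exclude lo,eth0 -np 4 ./my_mpi_program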


Michael



Re: [OMPI users] which alternative to OpenMPI should I choose?

2007-10-19 Thread Marcin Skoczylas

Jeff Squyres wrote:

On Oct 18, 2007, at 9:24 AM, Marcin Skoczylas wrote:

  

  PML add procs failed
  --> Returned "Unreachable" (-12) instead of "Success" (0)
-- 


*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (goodbye)



Yoinks -- OMPI is determining that it can't use the TCP BTL to reach  
other hosts.


  

I assume this could be because of:

$ /sbin/route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.125.17.0    *               255.255.255.0   U     0      0        0 eth1
192.168.12.0    *               255.255.255.0   U     0      0        0 eth1
161.254.0.0     *               255.255.0.0     U     0      0        0 eth1
default         192.125.17.1    0.0.0.0         UG    0      0        0 eth1



192.125 -- is that supposed to be a private address?  If so, that's  
not really the Right way to do things...
  
Actually the configuration here is quite strange; this is not a
private address. The head node sits on a public address from the
192.125.17.0 net (routable from outside); the workers are on
192.168.12.0.


So "narrowly scoped netmasks" which (as it's written in the FAQ)  
are not

supported in the OpenMPI. I asked for a workaround on this newsgroup
some time ago - but no answer uptill now. So my question is: what
alternative should I choose that will work in such configuration?



We haven't put in a workaround because (to be blunt) we either forgot  
about it and/or not enough people have asked for it.  Sorry.  :-(


It probably wouldn't be too hard to put in an MCA parameter to say  
"don't do netmask comparisons; just assume that every IP address is  
reachable by every other IP address."
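
For background, such an MCA parameter would be set the same way as any
other: with --mca on the mpirun command line, in a per-user file, or
via the environment. A quick sketch of the mechanism only
(btl_tcp_if_include is shown as an example of a TCP BTL parameter that
already exists; the "assume everything is reachable" parameter
described above did not exist at the time of this thread):

# persistently, in $HOME/.openmpi/mca-params.conf
btl_tcp_if_include = eth1

# or per-session via the environment, using the OMPI_MCA_ prefix
export OMPI_MCA_btl_tcp_if_include=eth1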
  

That would be really great! I hope it's not too complicated to add.


George -- did you mention that you were working on this at one point?

  

Do you
have some experience in other MPI implementations, for example LamMPI?



LAM/MPI should be able to work just fine in this environment; it  
doesn't do any kind of reachability computations like Open MPI does  
-- it blindly assumes that every MPI process is reachable by every  
other MPI process.
  
First I'm going to have some discussion with the administrators here
to do some more checks... and then I'll try LAM/MPI. The thing is that
I'm not familiar with it at all; I have always used OpenMPI instead. I
hope the configuration is as easy as in OpenMPI and that it will work
without a root account.


Thank you for your help!

regards, Marcin



[OMPI users] Recursive use of "orterun"

2007-10-19 Thread idesbald van den bosch
Hi,

I've run into the same problem as discussed in the thread from Lev
Gelb: "Re: [OMPI users] Recursive use of "orterun" (Ralph H Castain)".

I am running a parallel Python code; from Python I launch a parallel
C++ program using the Python os.system command, and then I come back
to Python and keep going.

With LAM/MPI there is no problem with this.

But Open MPI systematically crashes, because the Python os.system
command launches the C++ program with the same OMPI_* environment
variables as the Python program. As discussed in that thread, I have
tried filtering out the OMPI_* variables and launching the C++ program
with an os.execve command instead, but then control never returns to
Python; the process simply terminates when the C++ program ends.

There is a workaround (
http://thread.gmane.org/gmane.comp.clustering.open-mpi.user/986): create a
*.sh file with the following lines:


# unset every OMPI_MCA_* variable inherited from the parent mpirun
for i in $(env | grep OMPI_MCA | sed 's/=/ /' | awk '{print $1}')
do
   unset $i
done

# now the C++ call
mpirun -np 2  ./MoM/communicateMeshArrays
--

and then call that *.sh script through the Python os.system command.

What I would like to know is whether this "problem" will get fixed in
Open MPI. Is there another way to solve this issue elegantly?
Meanwhile, I will stick to the ugly *.sh hack listed above.
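
One possible alternative, just a sketch (untested, and assuming GNU
coreutils env, which supports -u), is to strip the variables inline
instead of keeping a separate script around:

env $(env | grep '^OMPI_MCA_' | cut -d= -f1 | sed 's/^/-u /') \
    mpirun -np 2 ./MoM/communicateMeshArrays

This builds a "-u NAME" option for every OMPI_MCA_* variable and runs
mpirun in the cleaned environment, so nothing has to be written to
disk.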

Cheers

Ides