[OMPI users] ompi 1.3 make distclean broken ?

2008-11-05 Thread Mehdi Bozzo-Rey
Hi, It looks like make distclean removes the file orte_hosts.7, which then breaks a subsequent configure, make, and make install. I tried the pre-release versions (openmpi-1.3b1, b2) and one nightly tarball (openmpi-1.3b2r19907): For openmpi-1.3b2: Step 1 : ./configure --prefix=/tmp/

Re: [OMPI users] Beowulf cluster and openmpi

2008-11-05 Thread Ralph Castain
Sorry for delayed response - had to dig into this a little since it has been so long since I wrote the bproc support code. The problem here is with how you named your nodes. On bproc clusters, the backend nodes are normally named with just a number. Our system therefore expects to see node

Re: [OMPI users] Beowulf cluster and openmpi

2008-11-05 Thread Kelley, Sean
I would suggest making sure that the /etc/beowulf/config file has a "libraries" line for every directory where required shared libraries (application and mpi) are located. Also, make sure that the filesystems containing the executables and shared libraries are accessible from the compute nodes.
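
The check Sean describes can be sketched as a small script. This is a minimal sketch, not a bproc tool: it assumes that each relevant entry in /etc/beowulf/config is a line beginning with the keyword libraries followed by whitespace-separated directories, and that # starts a comment. The function name missing_library_dirs and the sample paths are hypothetical.

```python
def missing_library_dirs(config_text, required_dirs):
    """Return the required directories not named on any 'libraries' line.

    Assumed format (not verified against a real bproc config):
        libraries /lib /usr/lib
    with '#' starting a comment.
    """
    listed = set()
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and padding
        fields = line.split()
        if fields and fields[0] == "libraries":
            # Normalise trailing slashes so /opt/lib/ matches /opt/lib
            listed.update(path.rstrip("/") for path in fields[1:])
    return [d for d in required_dirs if d.rstrip("/") not in listed]


if __name__ == "__main__":
    # Hypothetical config contents and directories, for illustration only.
    sample = (
        "# library directories\n"
        "libraries /lib /usr/lib\n"
        "libraries /opt/openmpi/lib/\n"
    )
    needed = ["/usr/lib", "/opt/openmpi/lib", "/home/user/app/lib"]
    print(missing_library_dirs(sample, needed))
```

On a real cluster the text would come from reading /etc/beowulf/config, and the required list would hold the application and MPI library directories. The script only inspects the config file; whether the filesystems are actually visible from the compute nodes still has to be checked separately.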

Re: [OMPI users] Beowulf cluster and openmpi

2008-11-05 Thread Rima Chaudhuri
Thanks for all your help, Ralph and Sean!! I changed the machinefile to contain just the node numbers. I added the env variable NODES in my .bash_profile and .bashrc. As per Sean's suggestion I added the $LD_LIBRARY_PATH (the shared lib path, which is the openmpi lib directory path) and the $AMBERHOME/li

Re: [OMPI users] Beowulf cluster and openmpi

2008-11-05 Thread Daniel Gruner
Can your nodes see the openmpi libraries and executables? I have the /usr/local and /opt from the master node mounted on the compute nodes, in addition to having the LD_LIBRARY_PATH defined correctly. In your case the nodes must be able to see /home/rchaud/openmpi-1.2.6 in order to get the librar

Re: [OMPI users] OK, got it installed, but... can't find libraries

2008-11-05 Thread Jeff Squyres
The errors you are seeing aren't related to using g95 vs. gfortran: 1. The warnings from configure are fairly normal. It's just configure trying to be responsible and telling you things that you might want to know (e.g., your system has no support for Fortran INTEGER*16, so OMPI is not inc

Re: [OMPI users] ompi 1.3 make distclean broken ?

2008-11-05 Thread Jeff Squyres
You are absolutely correct; looks like a typo in orte/util/Makefile.am. Thanks for reporting this! I fixed it on the trunk in r19936 and have filed a CMR to get it over to the v1.3 branch. On Nov 5, 2008, at 8:54 AM, Mehdi Bozzo-Rey wrote: Hi, It looks like make distclean will remove t

[OMPI users] program stalls in __write_nocancel()

2008-11-05 Thread Peter Beerli
On some of my larger problems (50 or more nodes, 'long' runs >5 hours), my program stalls and does not continue. My program is set up as a master-worker, and it seems that the master gets stuck in a write to stdout; see the gdb backtrace below (it took all day to get there on 50 nodes). The functi