Re: [OMPI users] Segfault on any MPI communication on head node

2011-09-28 Thread Jeff Squyres
Use --enable-debug on your configure line. This will add in some debugging code to OMPI, and it'll compile everything with -g so that you can get stack traces. Beware that the extra debugging junk makes OMPI slightly slower; don't do any benchmarking with this install, etc. On Sep 28, 2011,

Re: [OMPI users] Segfault on any MPI communication on head node

2011-09-28 Thread Phillip Vassenkov
I tried 1.4.4rc4, same problem. Where do I get a debugging version? On 9/28/11 8:32 AM, Jeff Squyres wrote: Agreed that the original program had the char*[20]/char[20] bug, but his segv is occurring before trying to use that array. So it's a bug - but he just hadn't hit it yet. :-) I'd stil

Re: [OMPI users] EXTERNAL: Re: Unresolved reference 'mbind' and 'get_mempolicy'

2011-09-28 Thread Blosch, Edwin L
Jeff, I've tried it now adding --without-libnuma. Actually that did NOT fix the problem, so I can send you the full output from configure if you want, to understand why this "hwloc" function is trying to use a function which appears to be unavailable. The answers to some of your questions:

[OMPI users] Proper way to stop MPI process

2011-09-28 Thread Xin Tong
I am wondering what the proper way is to stop an mpirun process and the child process it created. I tried sending SIGTERM, but it does not respond to it. What kind of signal should I be sending? Thanks, Xin

Re: [OMPI users] Unresolved reference 'mbind' and 'get_mempolicy'

2011-09-28 Thread Reuti
On 28.09.2011 at 18:09, Brice Goglin wrote: On 28/09/2011 17:55, Blosch, Edwin L wrote: I am getting some undefined references in building OpenMPI 1.5.4 and I would like to know how to work around it. The errors look like this: /scratch1/bloscel/builds/release/openmpi-intel/lib/

Re: [OMPI users] Role of ethernet interfaces of startup of openmpi job using IB

2011-09-28 Thread Prentice Bisbal
On 09/27/2011 05:30 PM, Jeff Squyres wrote: > On Sep 27, 2011, at 5:03 PM, Prentice Bisbal wrote: > >> To clarify, is IP/Ethernet required, or will IPoIB be used if it's >> configured on the nodes? Would this make a difference? > > IPoIB is fine, although I've heard concerns about its stability a

Re: [OMPI users] Unresolved reference 'mbind' and 'get_mempolicy'

2011-09-28 Thread Brice Goglin
On 28/09/2011 17:55, Blosch, Edwin L wrote: > > I am getting some undefined references in building OpenMPI 1.5.4 and I > would like to know how to work around it. > > > > The errors look like this: > > > > /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): > In fu

Re: [OMPI users] unresolvable R_X86_64_64 relocation against symbol `mpi_fortran_*

2011-09-28 Thread Dmitry N. Mikushin
Hi, Interestingly, the errors are gone after I removed "-g" from the app compile options. I tested again on the fresh Ubuntu 11.10 install: both 1.4.3 and 1.5.4 compile fine, but with the same error. Also I tried hard to find any 32-bit object or library and failed. They all are 64-bit. - D. 20

Re: [OMPI users] Unresolved reference 'mbind' and 'get_mempolicy'

2011-09-28 Thread Jeff Squyres
Yowza; that sounds like a configury bug. :-( What line were you using to configure Open MPI? Do you have libnuma installed? If so, do you have the .h and .so files? Do you have the .a file? Can you send the last few lines of output from a failed "make V=1" in that tree? (it'll show us the

[OMPI users] Unresolved reference 'mbind' and 'get_mempolicy'

2011-09-28 Thread Blosch, Edwin L
I am getting some undefined references in building OpenMPI 1.5.4 and I would like to know how to work around it. The errors look like this: /scratch1/bloscel/builds/release/openmpi-intel/lib/libmpi.a(topology-linux.o): In function `hwloc_linux_alloc_membind': topology-linux.c:(.text+0x1da): und

[OMPI users] orte_grpcomm_modex failed

2011-09-28 Thread devendra rai
Hello All, I have just rebuilt openmpi-1.4-3 on our cluster, and I see this error: It looks like MPI_INIT failed for some reason; your parallel process is likely to abort.  There are many reasons that a parallel process can fail during MPI_INIT; some of which are due to configuration or environme

Re: [OMPI users] Segfault on any MPI communication on head node

2011-09-28 Thread Jeff Squyres
Agreed that the original program had the char*[20]/char[20] bug, but his segv is occurring before trying to use that array. So it's a bug - but he just hadn't hit it yet. :-) I'd still like to see a debugging version so that we can get a real stack trace, and/or try the latest 1.4.4 RC (poste

Re: [OMPI users] alternate PBS_NODEFILE

2011-09-28 Thread Wiegers, Bert
Hello Reuti, > defining 12 slots and request the machines exclusive is not an option? I would like to. Unfortunately, the system has been in production for 2 years now and many scripts depend on this setup. > > The only way to get it working otherwise is to unset $JOB_ID and so > on, so that Open MPI

Re: [OMPI users] maximum size for read buffer in MPI_File_read/write

2011-09-28 Thread German Hoecht
Hi Rob, thanks for your comments. I understand that it's most probably not worth the effort to find the actual reason. Because I have to deal with very large files I preferred using "std::numeric_limits::max()" rather than a hard-coded value to split the read in case an IO request exceeds this am