Hmmm...I added that directory and tried this on odin (which is an IB-based 
machine). Any MPI proc segfaults:

Core was generated by `./hello'.
Program terminated with signal 11, Segmentation fault.
#0  _sysio_p_validate (pno=0x0, intnt=0x0, path=0x0) at src/inode.c:574
574     src/inode.c: No such file or directory.
        in src/inode.c
(gdb) where
#0  _sysio_p_validate (pno=0x0, intnt=0x0, path=0x0) at src/inode.c:574
#1  0x00002aaaabd3f3e9 in _sysio_path_walk (parent=0x0, nd=0x7fffffffd8e0) at 
src/namei.c:216
#2  0x00002aaaabd3faad in _sysio_namei (parent=0x0, path=<value optimized out>, 
flags=0, intnt=0x7fffffffd950, pnop=0x7fffffffd970) at src/namei.c:505
#3  0x00002aaaabd3fd98 in open (path=0x2aaaac24280f "/sys/devices/system/node", 
flags=<value optimized out>) at src/open.c:179
#4  0x00002aaaabd43d5b in opendir (name=0x2aaaac24280f 
"/sys/devices/system/node") at src/stddir.c:60
#5  0x00002aaaac241825 in numa_max_node () from /usr/lib64/libnuma.so.1
#6  0x00002aaaac241d13 in numa_init () from /usr/lib64/libnuma.so.1
#7  0x00002aaaaaab845b in call_init () from /lib64/ld-linux-x86-64.so.2
#8  0x00002aaaaaab8565 in _dl_init_internal () from /lib64/ld-linux-x86-64.so.2
#9  0x00002aaaaaaabaaa in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#10 0x0000000000000001 in ?? ()
#11 0x00007fffffffe03c in ?? ()
#12 0x0000000000000000 in ?? ()

I got the same thing whether I excluded openib or not. I then ran on my Linux 
cluster, which doesn't have IB at all - and it ran fine. Also runs clean on the 
Mac. However, in both those cases, I had left IO romio enabled.

On odin, I always configure with --disable-io-romio. So I tried deliberately 
enabling ROMIO there, and everything works. So this appears to be something 
that the IO work has broken.

Edgar: can you please fix --disable-io-romio?

Thanks
Ralph




On Oct 29, 2012, at 11:55 AM, Edgar Gabriel <gabr...@cs.uh.edu> wrote:

> I'm sorry to add one more thing to the list, but beyond this file, it
> looks like the entire ompi/mca/common/verbs/ directory is also
> missing in the 1.7 branch, but is required to compile the bcol
> framework.  It is there in the trunk, but missing in the 1.7 branch...
> 
> Thanks
> Edgar
> 
> 
> On 10/26/2012 5:31 PM, Ralph Castain wrote:
>> Okay, I'll fix for tonight's tarball.
>> 
>> Thanks!
>> 
>> On Oct 26, 2012, at 3:28 PM, "Shamis, Pavel" <sham...@ornl.gov> wrote:
>> 
>>> There is a bug in the makefile. The file exists in svn, but it is not 
>>> listed in the Makefile.am. As a result, it wasn't pulled into the tarball.
>>> 
>>> Pavel (Pasha) Shamis
>>> ---
>>> Computer Science Research Group
>>> Computer Science and Math Division
>>> Oak Ridge National Laboratory
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Oct 26, 2012, at 2:33 PM, Edgar Gabriel wrote:
>>> 
>>> We have trouble compiling the 1.7 series on a machine in Dresden.
>>> Specifically, we receive an error message when compiling the
>>> bcol/iboffload component (other infiniband components compile fine).
>>> 
>>> Any idea/suggestions what we might be doing wrong or what to look for?
>>> 
>>> make[2]: Entering directory
>>> `/home/h2/gabriel/openmpi-1.7rc4/ompi/mca/bcol/iboffload'
>>> CC       bcol_iboffload_module.lo
>>> CC       bcol_iboffload_mca.lo
>>> CC       bcol_iboffload_endpoint.lo
>>> CC       bcol_iboffload_frag.lo
>>> In file included from bcol_iboffload_frag.c:16:0:
>>> bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
>>> file or directory
>>> compilation terminated.
>>> make[2]: *** [bcol_iboffload_frag.lo] Error 1
>>> make[2]: *** Waiting for unfinished jobs....
>>> In file included from bcol_iboffload_mca.c:18:0:
>>> bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
>>> file or directory
>>> compilation terminated.
>>> make[2]: *** [bcol_iboffload_mca.lo] Error 1
>>> In file included from bcol_iboffload_endpoint.c:23:0:
>>> bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
>>> file or directory
>>> compilation terminated.
>>> make[2]: *** [bcol_iboffload_endpoint.lo] Error 1
>>> In file included from bcol_iboffload_module.c:39:0:
>>> bcol_iboffload.h:46:36: fatal error: bcol_iboffload_qp_info.h: No such
>>> file or directory
>>> compilation terminated.
>>> make[2]: *** [bcol_iboffload_module.lo] Error 1
>>> make[2]: Leaving directory
>>> `/home/h2/gabriel/openmpi-1.7rc4/ompi/mca/bcol/iboffload'
>>> make[1]: *** [all-recursive] Error 1
>>> make[1]: Leaving directory `/home/h2/gabriel/openmpi-1.7rc4/ompi'
>>> make: *** [all-recursive] Error 1
>>> 
>>> Thanks
>>> Edgar
>>> 
>>> --
>>> Edgar Gabriel
>>> Associate Professor
>>> Parallel Software Technologies Lab      http://pstl.cs.uh.edu
>>> Department of Computer Science          University of Houston
>>> Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
>>> Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> 
>>> 
>> 
>> 
>> 
> 
> -- 
> Edgar Gabriel
> Associate Professor
> Parallel Software Technologies Lab      http://pstl.cs.uh.edu
> Department of Computer Science          University of Houston
> Philip G. Hoffman Hall, Room 524        Houston, TX-77204, USA
> Tel: +1 (713) 743-3857                  Fax: +1 (713) 743-3335
> 

