Only takes <30 seconds of typing to start the test and I get email when it is done. Typing these emails takes more of my time than the actual testing does.
-Paul

On Wed, Jan 8, 2014 at 8:35 PM, Ralph Castain <r...@open-mpi.org> wrote:

> If you have the time, it might be worth nailing it down. However, I'm
> mindful of all the things you need to do, so please only if you have the
> time.
>
> Thanks
> Ralph
>
> On Jan 8, 2014, at 8:23 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
> Ralph,
>
> Building with gcc-4.1.2 fixed the problem for me. I also removed an old
> install of ompi-1.4 that was in LD_LIBRARY_PATH at build time and might
> have been a contributing factor. If I'd known earlier that it was there, I
> wouldn't have reported the problem without first removing it.
>
> I can build again with gcc-4.0.0 and --enable-debug if you are still
> interested in trying to get a line number. This would also determine if
> LD_LIBRARY_PATH was the true culprit.
>
> -Paul [Sent from my phone]
> On Jan 8, 2014 8:02 PM, "Ralph Castain" <r...@open-mpi.org> wrote:
>
>> Most likely problem is a bad backing store site - any chance you could
>> give me a line number from this? There are a lot of calls to register
>> params in that code and I'd need some help in figuring out which one wasn't
>> right.
>>
>>
>> On Jan 8, 2014, at 6:59 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> I am still testing the current 1.7.4rc tarball on my various systems.
>> The latest failure (shown below) is a SEGV somewhere below MPI_Init on a
>> old, but otherwise fairly normal, Linux/x86 (32-bit) system.
>>
>> $ /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/bin/mpirun -np 1 examples/ring_c
>> [pcp-j-6:29031] *** Process received signal ***
>> [pcp-j-6:29031] Signal: Segmentation fault (11)
>> [pcp-j-6:29031] Signal code: Address not mapped (1)
>> [pcp-j-6:29031] Failing at address: 0x6c6c6f63
>> [pcp-j-6:29031] [ 0] [0xbe4440]
>> [pcp-j-6:29031] [ 1] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(mca_base_var_enum_create+0x15d) [0x2b11ed]
>> [pcp-j-6:29031] [ 2] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/openmpi/mca_coll_ml.so(mca_coll_ml_register_params+0x639) [0x440909]
>> [pcp-j-6:29031] [ 3] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(mca_base_framework_components_register+0x14e) [0x2b2cce]
>> [pcp-j-6:29031] [ 4] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(mca_base_framework_register+0x1b5) [0x2b32a5]
>> [pcp-j-6:29031] [ 5] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(mca_base_framework_open+0x4e) [0x2b333e]
>> [pcp-j-6:29031] [ 6] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libmpi.so.1(ompi_mpi_init+0x53d) [0xaf359d]
>> [pcp-j-6:29031] [ 7] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libmpi.so.1(MPI_Init+0x13d) [0xb10d6d]
>> [pcp-j-6:29031] [ 8] examples/ring_c [0x80486e9]
>> [pcp-j-6:29031] [ 9] /lib/libc.so.6(__libc_start_main+0xdc) [0x125ebc]
>> [pcp-j-6:29031] [10] examples/ring_c [0x8048631]
>> [pcp-j-6:29031] *** End of error message ***
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 0 with PID 29031 on node pcp-j-6 exited on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>>
>> The failure shown is for a singleton run, but np=2 fails as well.
>>
>> System info:
>> $ uname -a
>> Linux pcp-j-6 2.6.18-238.1.1.el5PAE #1 SMP Tue Jan 18 19:28:42 EST 2011 i686 athlon i386 GNU/Linux
>> $ gcc --version
>> gcc (GCC) 4.0.0
>> Copyright (C) 2005 Free Software Foundation, Inc.
>> This is free software; see the source for copying conditions. There is NO
>> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
>>
>> The only configure argument used was --prefix.
>>
>> I was going to attach output from "ompi_info --all", but it SEGV's too!
>>
>> $ ompi_info --all
>> [pcp-j-6:29092] *** Process received signal ***
>> [pcp-j-6:29092] Signal: Segmentation fault (11)
>> [pcp-j-6:29092] Signal code: Address not mapped (1)
>> [pcp-j-6:29092] Failing at address: 0x6c6c6f63
>> [pcp-j-6:29092] [ 0] [0xd8a440]
>> [pcp-j-6:29092] [ 1] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(mca_base_var_enum_create+0x15d) [0x2db1ed]
>> [pcp-j-6:29092] [ 2] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/openmpi/mca_coll_ml.so(mca_coll_ml_register_params+0x639) [0x48d909]
>> [pcp-j-6:29092] [ 3] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(mca_base_framework_components_register+0x14e) [0x2dccce]
>> [pcp-j-6:29092] [ 4] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(mca_base_framework_register+0x1b5) [0x2dd2a5]
>> [pcp-j-6:29092] [ 5] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libopen-pal.so.6(opal_info_register_project_frameworks+0x57) [0x2b83d7]
>> [pcp-j-6:29092] [ 6] /home/pcp1/phargrov/OMPI/openmpi-1.7-latest-linux-x86/INST/lib/libmpi.so.1(ompi_info_register_framework_params+0x81) [0xa69251]
>> [pcp-j-6:29092] [ 7] ompi_info(main+0x2ba) [0x8049a2a]
>> [pcp-j-6:29092] [ 8] /lib/libc.so.6(__libc_start_main+0xdc) [0x125ebc]
>> [pcp-j-6:29092] [ 9] ompi_info [0x80496e1]
>> [pcp-j-6:29092] *** End of error message ***
>> Segmentation fault (core dumped)
>>
>> I will try again with a newer gcc and report back.
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Paul H. Hargrove                          phhargr...@lbl.gov
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
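
A side observation on the two backtraces, offered as a guess rather than a diagnosis: the failing address 0x6c6c6f63 consists entirely of printable ASCII bytes. Read in x86 (little-endian) byte order it spells "coll", which would be consistent with string data being dereferenced as a pointer somewhere under mca_coll_ml_register_params. Nothing in the thread confirms this. The short Python sketch below only shows how such an address can be decoded; the helper name address_as_ascii is made up for illustration and is not part of Open MPI.

```python
def address_as_ascii(addr: int, width: int = 4) -> str:
    """Decode an address as ASCII text in little-endian byte order.

    Non-printable bytes are shown as '.'. The default width of 4 bytes
    matches the 32-bit x86 system in the report (an assumption of this sketch).
    """
    raw = addr.to_bytes(width, byteorder="little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # "Failing at address" value taken from both backtraces in the report.
    print(address_as_ascii(0x6C6C6F63))  # prints: coll
```

If a faulting address decodes to readable text like this, a build with --enable-debug (as proposed earlier in the thread) is the natural next step for finding which register call handed over the bad data.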