Hi Paul,

The binding stuff was in there, but the limit protection code just went in today. Jeff has since regenerated the tarball for the web site, so the one up there should have most (if not all) of these problems fixed.
Have a great holiday!
Ralph

On Dec 20, 2013, at 11:40 AM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Ralph,
>
> I see the same behavior w/ last night's 1.7 tarball (openmpi-1.7.4rc2r30002).
> The very next commit, r30003, is your addition (on trunk) of guards for
> RLIMIT_AS, etc..
> So, I DON'T think any fix for this behavior is in the 1.7 branch as you
> thought (maybe just CMR'ed?)
>
> Let me know if there is additional information about the platform or error
> which I should collect.
>
> -Paul
>
> P.S.
> You may see my email vacation auto-responder message.
> My vacation has started (no *paid* work) but I am still reading email today.
> I plan to re-test tonight's 1.7 tarball on all the systems where I reported
> issues on Thu night.
>
>
> On Thu, Dec 19, 2013 at 7:19 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I believe this one has already been fixed and is in the nightly (1.7.4rc2) -
> for now, you can just set "--bind-to none" on the cmd line to get past it
>
>
> On Dec 19, 2013, at 6:42 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>
>> Testing with Solaris 10 on SPARC, I was expecting to encounter the bus error
>> reported previously by Siegman Gross. Instead I see the following
>> hwloc-related abort:
>>
>> $ env
>> PATH=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin:$PATH
>>
>> LD_LIBRARY_PATH_64=/home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/lib:$LD_LIBRARY_PATH_64
>> OMPI_MCA_shmem_mmap_enable_nfs_warning=0
>> /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/INST/bin/mpirun
>> -mca btl sm,self -np 2 examples/ring_c
>> --------------------------------------------------------------------------
>> Open MPI tried to bind a new process, but something went wrong. The
>> process was killed without launching the target application. Your job
>> will now abort.
>>
>> Local host: niagara1
>> Application name: examples/ring_c
>> Error message: hwloc indicates cpu binding cannot be enforced
>> Location:
>> /home/hargrove/OMPI/openmpi-1.7.4rc1-solaris10-sparcT2-ss12u3-v9/openmpi-1.7.4rc1/orte/mca/odls/default/odls_default_module.c:478
>> --------------------------------------------------------------------------
>> 2 total processes failed to start
>>
>>
>> I am assuming I just need some magic pixie dust to disable cpu binding.
>> I'd appreciate some corresponding instructions.
>>
>> However, if this is NOT an expected/desired/known behavior please let me
>> know what I can/should do to help determine the root cause.
>>
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Future Technologies Group
>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/devel