Re: [OMPI devel] callback debugging

2014-01-20 Thread Josh Hursey
If it is the application, then there is probably a barrier in the app_coord_init() to make sure all the applications are up and running. After this point, the global coordinator knows that the application can be checkpointed. I don't think orte-checkpoint should be calling a barrier - from wha

Re: [OMPI devel] callback debugging

2014-01-20 Thread Ralph Castain
Is it orte-checkpoint that is hanging, or the app you are trying to checkpoint? On Jan 20, 2014, at 2:10 PM, Adrian Reber wrote: > Thanks for your help. I tried initializing the barrier correctly (see > attached patch) but now, instead of crashing, it just hangs on the > barrier while running o

Re: [OMPI devel] callback debugging

2014-01-20 Thread Adrian Reber
Thanks for your help. I tried initializing the barrier correctly (see attached patch) but now, instead of crashing, it just hangs on the barrier while running orte-checkpoint [dcbz:20150] [[41665,0],0] grpcomm:bad entering barrier [dcbz:20150] [[41665,0],0] ACTIVATING GRCPCOMM OP 0 at ../../../..

Re: [OMPI devel] [EXTERNAL] 1.7.4rc: linux/ppc32/xlc-11.1 build failure

2014-01-20 Thread Barrett, Brian W
On 1/17/14 6:28 PM, "Paul Hargrove" <phhargr...@lbl.gov> wrote: I am trying to build the 1.7 nightly tarball (1.7.4rc2r30303) on a Linux/PPC system with the xlc-11.1 compilers configured for 32-bit output: $ export OBJECT_MODE=32 $ [pathto]/configure CC=xlc CXX=xlC FC=xlf90 --enable-debu

Re: [OMPI devel] [EXTERNAL] 1.7.4rc: build failure on mips32

2014-01-20 Thread Barrett, Brian W
On 1/17/14 8:00 PM, "Paul Hargrove" <phhargr...@lbl.gov> wrote: Trying to build 1.7.4rc2r30303 with gcc on linux/mips32 yields the following failure: CXX mpicxx.lo /home/phargrov/OMPI/openmpi-1.7.4-latest-linux-mips32/openmpi-1.7.4rc2r30303/ompi/mpi/cxx/mpicxx.cc:31:2: warning: #

Re: [OMPI devel] SC13 birds of a feather - thermal monitoring

2014-01-20 Thread Ralph Castain
Just as a follow-up to this: I have added a sensor module to monitor core temperatures per this email thread. I haven't added the cooling devices from this last bit as the info I could find under there didn't seem all that helpful right now - mostly just how fast the fan is running on a scale of