Re: [OMPI devel] NetPIPE performance curves

2017-05-04 Thread Dave Turner
George, I think if you get rid of the --start 100 you'll see the curves I'm getting on my local cluster and on Comet. I don't know exactly how the memory registration is handled so it isn't clear to me why there would be less of a penalty for non factors of 8 when starting the tests at 1

[OMPI devel] count = -1 for reduce

2017-05-04 Thread Dahai Guo
Hi, Using Open MPI 2.1, the following code resulted in a core dump, although only a simple error message was expected. Any idea what is wrong? It seems related to the errhandler somewhere. D.G. *** An error occurred in MPI_Reduce *** reported by process [3645440001,0] *** on communicator MPI_CO

Re: [OMPI devel] count = -1 for reduce

2017-05-04 Thread Nathan Hjelm
By default MPI errors are fatal and abort. The error message says it all: *** An error occurred in MPI_Reduce *** reported by process [3645440001,0] *** on communicator MPI_COMM_WORLD *** MPI_ERR_COUNT: invalid count argument *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort

Re: [OMPI devel] count = -1 for reduce

2017-05-04 Thread Dahai Guo
Those messages are what I would like to see. But there are some other error messages and a core dump that I don't like, as attached in my previous email. I think something might be wrong with the errhandler in openmpi. A similar thing happened for Bcast, etc. Dahai On Thu, May 4, 2017 at 4:32 PM, Nathan Hjelm

Re: [OMPI devel] count = -1 for reduce

2017-05-04 Thread George Bosilca
Dahai, You are right, the segfault is unexpected. I can't replicate this on my Mac. On what architecture are you seeing this issue? How was your OMPI compiled? Please post the output of ompi_info. Thanks, George. On Thu, May 4, 2017 at 5:42 PM, Dahai Guo wrote: > Those messages are what I lik

Re: [OMPI devel] count = -1 for reduce

2017-05-04 Thread Dahai Guo
Hi, George: attached is the ompi_info output. I built it on the Power8 architecture. The configure line is also simple. ../configure --prefix=${installdir} \ --enable-orterun-prefix-by-default Dahai On Thu, May 4, 2017 at 4:45 PM, George Bosilca wrote: > Dahai, > > You are right the segfault is unexpected. I can't

Re: [OMPI devel] count = -1 for reduce

2017-05-04 Thread Jeff Squyres (jsquyres)
Can you get a stack trace? > On May 4, 2017, at 6:44 PM, Dahai Guo wrote: > > Hi, George: > > attached is the ompi_info. I built it on Power8 arch. The configure is also > simple. > > ../configure --prefix=${installdir} \ > --enable-orterun-prefix-by-default > > Dahai > > On Thu, May 4, 2

Re: [OMPI devel] count = -1 for reduce

2017-05-04 Thread George Bosilca
I was able to reproduce it (with the correct version of OMPI, i.e. the v2.x branch). The problem seems to be that we are lacking part of the fe68f230991 commit, which removes a free on a statically allocated array. Here is the corresponding patch: diff --git a/ompi/errhandler/errhandler_predefined
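
The patch itself is cut off in this archive listing, so the following is not the Open MPI change. It is only a minimal C sketch of the general class of bug described above: calling free() on storage that did not come from malloc() is undefined behavior and can crash. All names here (message_t, message_from_static, message_from_heap, message_release) are hypothetical and chosen for illustration; none of them come from the Open MPI source.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical illustration only; not the Open MPI code or its patch. */

#define MSG_LEN 64

/* A message whose record says whether its storage came from the heap. */
typedef struct {
    char *text;
    int   heap_allocated;  /* 1 if text came from malloc(), 0 otherwise */
} message_t;

/* Storage with static duration: it must never be passed to free(). */
static char static_storage[MSG_LEN];

/* Build a message in the single shared static buffer (truncates if needed). */
message_t message_from_static(const char *s)
{
    message_t m;
    snprintf(static_storage, sizeof(static_storage), "%s", s);
    m.text = static_storage;
    m.heap_allocated = 0;
    return m;
}

/* Build a message in freshly allocated heap storage. */
message_t message_from_heap(const char *s)
{
    message_t m;
    size_t n = strlen(s) + 1;
    m.text = malloc(n);
    m.heap_allocated = (m.text != NULL);
    if (m.text != NULL) {
        memcpy(m.text, s, n);
    }
    return m;
}

/* Release only what malloc() returned; returns 1 if free() was called. */
int message_release(message_t *m)
{
    int freed = 0;
    if (m->heap_allocated && m->text != NULL) {
        free(m->text);
        freed = 1;
    }
    m->text = NULL;
    m->heap_allocated = 0;
    return freed;
}
```

With this kind of ownership flag, releasing a message built by message_from_static() skips free() entirely, while one built by message_from_heap() is freed exactly once. An unconditional free(m->text) in message_release() would reproduce the kind of error the thread describes.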