[OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet
Folks, I (sometimes) get failures with the c_accumulate test from the IBM test suite on one host with 4 MPI tasks. So far, I was only able to observe this on linux/sparc with the vader btl. Here is a snippet of the test: MPI_Win_create(, sizeOfInt, 1, MPI_INFO_NULL,

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Rolf vandeVaart
Hi Gilles: Is your failure similar to this ticket? https://github.com/open-mpi/ompi/issues/393 Rolf

[OMPI devel] binding output error

2015-04-20 Thread Elena Elkina
Hi guys, I ran into an issue on our cluster related to mapping & binding policies on 1.8.5. The matter is that the --report-bindings output doesn't correspond to the locale. It looks like there is a mistake in the output itself, because it just prints the serial core number while that core can be on

Re: [OMPI devel] 1.8.5rc1 is ready for testing

2015-04-20 Thread Jeff Squyres (jsquyres)
I looked at this thread in a little more detail... The question below is a little moot because of the change that was done to v1.8, but please humor me anyway. :-) Marco: I think you told me before, but I forget, so please refresh my memory: I seem to recall that there's a reason you're

Re: [OMPI devel] 1.8.5rc1 is ready for testing

2015-04-20 Thread Marco Atzeri
On 4/20/2015 5:16 PM, Jeff Squyres (jsquyres) wrote: I looked at this thread in a little more detail...

Re: [OMPI devel] 1.8.5rc1 is ready for testing

2015-04-20 Thread Jeff Squyres (jsquyres)
Got it; I knew there was a reason -- I just couldn't remember what it was. If you care, the problem was actually a bug in Libtool's libltdl embedding machinery. We "fixed" the problem by not embedding libltdl by default any more (and went a different way...). If you care:

[OMPI devel] Master appears broken on the Mac

2015-04-20 Thread Ralph Castain
Hit this error with current HEAD:

checking if threads have different pids (pthreads on linux)...
configure: WARNING: Found configure shell variable clash!
configure: WARNING: OPAL_VAR_SCOPE_PUSH called on "LDFLAGS_save",
configure: WARNING: but it is already defined with value "

Re: [OMPI devel] Master appears broken on the Mac

2015-04-20 Thread Nathan Hjelm
Shoot. That would be my configure changes. Looks like I should rename that temporary variable or push/pop it. Will get you a fix soon. -Nathan On Mon, Apr 20, 2015 at 01:57:45PM -0700, Ralph Castain wrote:
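For readers unfamiliar with the macro: OPAL_VAR_SCOPE_PUSH declares shell temporaries for one configure test and refuses a name that is already in use, which is exactly the clash Ralph hit. A toy Python model of that discipline (the class, method names, and the renamed temporary are illustrative; the real implementation is m4/shell in Open MPI's configure machinery):

```python
class VarScope:
    """Toy model of OPAL_VAR_SCOPE_PUSH / OPAL_VAR_SCOPE_POP semantics."""

    def __init__(self):
        self.env = {}  # the "shell" variable namespace

    def push(self, *names):
        for name in names:
            if name in self.env:
                # The warning from Ralph's configure run: a nested test
                # reused a name the outer scope already owns.
                raise RuntimeError(
                    f'OPAL_VAR_SCOPE_PUSH called on "{name}", but it is '
                    f'already defined with value "{self.env[name]}"')
            self.env[name] = ""

    def pop(self, *names):
        for name in names:
            del self.env[name]

shell = VarScope()
shell.push("LDFLAGS_save")            # outer macro owns the name
shell.env["LDFLAGS_save"] = "-lm"
try:
    shell.push("LDFLAGS_save")        # inner macro reuses it: clash
except RuntimeError as err:
    msg = str(err)
# The fix Nathan describes: give the inner temporary a unique name
# ("pthread_LDFLAGS_save" here is hypothetical).
shell.push("pthread_LDFLAGS_save")
shell.pop("pthread_LDFLAGS_save")
shell.pop("LDFLAGS_save")
print(msg)
```

Renaming the temporary (or properly push/popping it) removes the clash because each scope then owns a distinct name.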

[OMPI devel] noticing odd message

2015-04-20 Thread Howard Pritchard
Hi Folks, working on master, I'm getting an odd message whenever I launch a job: malloc debug: Request for 1 zeroed elements of size 0 (mca_base_var.c, 170). It looks like this can be traced back to this line in orte_ess_singleton_component_register: mca_base_var_register_synonym(ret, "orte",

Re: [OMPI devel] noticing odd message

2015-04-20 Thread Jeff Squyres (jsquyres)
+1 -- I saw this today/over the weekend. I didn't bisect to see where it started; I assume it was one of the MCA var base updates. > On Apr 20, 2015, at 6:34 PM, Howard Pritchard wrote:

Re: [OMPI devel] noticing odd message

2015-04-20 Thread Nathan Hjelm
Tracking it down now. Probably a typo in a component initialization. -Nathan On Mon, Apr 20, 2015 at 04:34:10PM -0600, Howard Pritchard wrote:

Re: [OMPI devel] binding output error

2015-04-20 Thread Jeff Squyres (jsquyres)
Ralph's the authority on this one, but just to be sure: are all nodes the same topology? E.g., does adding "--hetero-nodes" to the mpirun command line fix the problem? > On Apr 20, 2015, at 9:29 AM, Elena Elkina wrote:

Re: [OMPI devel] binding output error

2015-04-20 Thread Ralph Castain
Also, was this with HTs enabled? I'm wondering if the print code is incorrectly computing the core because it isn't correctly accounting for HT cpus. On Mon, Apr 20, 2015 at 3:49 PM, Jeff Squyres (jsquyres) wrote:

Re: [OMPI devel] noticing odd message

2015-04-20 Thread Nathan Hjelm
Fixed in 359a282e7d31a8a7af3a69ead518ff328862b801. mca_base_var does not currently allow a component to be registered with NULL for both the framework and component. -Nathan On Mon, Apr 20, 2015 at 04:34:10PM -0600, Howard Pritchard wrote:
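A rough sketch of the constraint the fix enforces, assuming (as the zero-sized allocation in Howard's message suggests) that the variable's group is built from the framework and component parts. This is an illustrative Python model with hypothetical names, not the actual C code in mca_base_var.c:

```python
def register_synonym(framework, component, name):
    """Model of the new validation: a variable may not be registered with
    NULL (None here) for both the framework and the component."""
    if framework is None and component is None:
        raise ValueError("framework and component may not both be NULL")
    # Build the group/variable name from the non-NULL parts, e.g. the
    # singleton component registering an "orte"-level synonym.
    parts = [p for p in (framework, component) if p is not None]
    return "_".join(parts + [name])

print(register_synonym("orte", None, "ess_name"))  # -> orte_ess_name

try:
    register_synonym(None, None, "ess_name")       # rejected after the fix
except ValueError as e:
    print(e)
```

Before the fix, the all-NULL case produced an empty group and hence the zero-element allocation that malloc debug flagged.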

Re: [OMPI devel] noticing odd message

2015-04-20 Thread Ralph Castain
I confirmed it is cleaned up for me - thanks Nathan! On Mon, Apr 20, 2015 at 3:52 PM, Nathan Hjelm wrote:

Re: [OMPI devel] noticing odd message

2015-04-20 Thread Jeff Squyres (jsquyres)
Thanks! > On Apr 20, 2015, at 6:52 PM, Nathan Hjelm wrote:

Re: [OMPI devel] binding output error

2015-04-20 Thread Devendar Bureddy
HT is not enabled. All nodes are the same topology. This is reproducible even on a single node. I ran osu latency to see whether it really is mapped to the other socket or not with --map-by socket. It looks like the mapping is correct as per the latency test. $mpirun -np 2 -report-bindings -map-by socket

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet
Hi Rolf, yes, same issue ... I attached a patch to the github issue (the issue might be in the test). From the standard (11.5 Synchronization Calls): "The MPI_WIN_FENCE collective synchronization call supports a simple synchronization pattern that is often used in parallel computations:

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Hi Gilles, Nathan, I read the MPI standard, but I think the standard doesn't require a barrier in the test program. From the standard (11.5.1 Fence): A fence call usually entails a barrier synchronization: a process completes a call to MPI_WIN_FENCE only after all other processes in

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet
Kawashima-san, Nathan reached the same conclusion (see the github issue) and I fixed the test by manually adding an MPI_Barrier. Cheers, Gilles On 4/21/2015 10:20 AM, Kawashima, Takahiro wrote:

Re: [OMPI devel] binding output error

2015-04-20 Thread tmishima
Hi Devendar, As far as I know, the report-bindings option shows the logical cpu order. On the other hand, you are talking about the physical one, I guess. Regards, Tetsuya Mishima On 2015/04/21 9:04:37, "devel" wrote in "Re: [OMPI devel] binding output error":
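The distinction Tetsuya draws can be made concrete: --report-bindings prints logical (hwloc-ordered) core numbers, while top or /proc/cpuinfo show physical OS ids, so a correct binding can look wrong. A small Python sketch with a made-up two-socket node whose OS interleaves socket numbering (the mapping is hypothetical, purely for illustration):

```python
# Hypothetical 2-socket, 8-core node: logical core i (hwloc order)
# maps to a physical OS core id interleaved across sockets.
logical_to_physical = {0: 0, 1: 2, 2: 4, 3: 6,   # socket 0
                       4: 1, 5: 3, 6: 5, 7: 7}   # socket 1

def socket_of_logical(core):
    """Logical cores 0-3 live on socket 0, 4-7 on socket 1."""
    return 0 if core < 4 else 1

rank0_logical = 4
# "--report-bindings"-style output prints the logical number ...
print(f"rank 0 bound to logical core {rank0_logical} "
      f"(socket {socket_of_logical(rank0_logical)})")
# ... but the OS (top, /proc/cpuinfo) reports the physical id:
print(f"rank 0 runs on physical core {logical_to_physical[rank0_logical]}")
```

In this sketch logical core 4 really is on socket 1 even though the OS calls it core 1, which is consistent with Devendar's latency measurement showing the mapping itself is correct.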

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Hi Gilles, Nathan, No, my conclusion is that the MPI program does not need an MPI_Barrier, but MPI implementations need some synchronization. Thanks, Takahiro Kawashima > Kawashima-san,

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Gilles Gouaillardet
Kawashima-san, I am confused ... as you wrote: "In the MPI_MODE_NOPRECEDE case, a barrier is not necessary in the MPI implementation to end access/exposure epochs." And the test case calls MPI_Win_fence with MPI_MODE_NOPRECEDE. Are you saying the Open MPI implementation of MPI_Win_fence should

Re: [OMPI devel] c_accumulate

2015-04-20 Thread Kawashima, Takahiro
Gilles, Sorry for confusing you. My understanding is: MPI_WIN_FENCE has four roles regarding access/exposure epochs:
- end access epoch
- end exposure epoch
- start access epoch
- start exposure epoch
In order to end access/exposure epochs, a barrier is not needed in the MPI
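Putting the thread's conclusion together: the test initializes the window buffer after MPI_Win_create, and a fence opened with MPI_MODE_NOPRECEDE need not synchronize, so a fast peer's MPI_Accumulate can arrive before the local initialization and be overwritten by it. Below is a Python sketch of the fixed pattern, with threads standing in for ranks and a threading.Barrier standing in for the MPI_Barrier Gilles added; it is an illustration of the ordering argument only, not the real MPI C test:

```python
import threading

NRANKS = 4
win = [None] * NRANKS                 # each "rank" exposes one int window
barrier = threading.Barrier(NRANKS)   # plays the role of the added MPI_Barrier
lock = threading.Lock()               # serializes the accumulates
result = []

def rank(r):
    win[r] = 0                        # initialize the local window buffer
    # Without this synchronization, another rank's accumulate could land
    # in win[r] before the initialization above and be silently lost;
    # MPI_Win_fence(MPI_MODE_NOPRECEDE) does not promise to order this.
    barrier.wait()                    # the MPI_Barrier added to the test
    with lock:                        # "MPI_Accumulate(..., MPI_SUM)" to rank 0
        win[0] += 1
    barrier.wait()                    # the closing MPI_Win_fence
    if r == 0:
        result.append(win[0])

threads = [threading.Thread(target=rank, args=(r,)) for r in range(NRANKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(result)                         # [4] -- no contribution is lost
```

With the barrier, every initialization completes before any accumulate starts, so all four contributions survive; remove it and the outcome depends on scheduling, which matches the intermittent failures Gilles observed.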

[hwloc-devel] Create success (hwloc git 1.10.1-24-g684fdcd)

2015-04-20 Thread MPI Team
Creating nightly hwloc snapshot git tarball was a success. Snapshot: hwloc 1.10.1-24-g684fdcd Start time: Mon Apr 20 21:03:04 EDT 2015 End time: Mon Apr 20 21:04:29 EDT 2015 Your friendly daemon, Cyrador