Re: [OMPI devel] binding output error

2015-04-23 Thread Elena Elkina
Thanks guys, you're right. This is the output of lstopo on our system, which confirms that logical CPU numbering is used in report bindings: lstopo -l Machine (256GB) NUMANode L#0 (P#0 128GB) + Socket L#0 + L3 L#0 (35MB) L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0 + PU L#0 (P#0) L2 L#1

Re: [OMPI devel] binding output error

2015-04-21 Thread Elena Elkina
(256KB) + L1 L#22 (32KB) + Core L#22 + PU L#22 (P#22)
> >> L2 L#23 (256KB) + L1 L#23 (32KB) + Core L#23 + PU L#23 (P#23)
> >> L2 L#24 (256KB) + L1 L#24 (32KB) + Core L#24 + PU L#24 (P#24)
> >> L2 L#25 (256

[OMPI devel] binding output error

2015-04-20 Thread Elena Elkina
Hi guys, I ran into an issue on our cluster related to mapping & binding policies on 1.8.5. The problem is that the --report-bindings output doesn't correspond to the locale. It looks like the mistake is in the output itself, because it just prints a serial core number while that core can be on

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch revert-520-valgrind_cleanness created. dev-1504-g7a8a4a0

2015-04-15 Thread Elena Elkina
at 6:36 PM, Ralph Castain <r...@open-mpi.org> wrote: > S….are you going to restore the rest of it? Or are we asking Nathan to > refile it without that one piece? > > > On Apr 15, 2015, at 7:26 AM, Elena Elkina <elena.elk...@itseez.com> wrote: > > Hi Ralph. >

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch revert-520-valgrind_cleanness created. dev-1504-g7a8a4a0

2015-04-15 Thread Elena Elkina
Hi Ralph. We don't need to revert the whole commit, just fix this small part. I proposed a quick fix for that in the PR, but we probably need to fix it more intelligently. Best regards, Elena On Wed, Apr 15, 2015 at 6:08 PM, Ralph Castain wrote: > I’m really puzzled - I

Re: [OMPI devel] [OMPI commits] Git: open-mpi/ompi branch master updated. dev-612-g05af80b

2014-12-24 Thread Elena Elkina
Hi Ralph, As I remember, the idea of this code was to create a reply once (and set the flag stored to true) but send this reply multiple times (to each process from the list of requests). The flag stored is set to false earlier in the code. It means that once (for the first request in the loop

Re: [OMPI devel] simple_spawn test fails using different set of btls.

2014-11-06 Thread Elena Elkina
I believe it is a bug - I provided some initial values for the modex scope >> with the expectation (and request when we committed it) that people would >> review and modify them as appropriate. I recall setting the openib scope as >> “remote” only because I wasn’t aware of anyo

[OMPI devel] simple_spawn test fails using different set of btls.

2014-11-05 Thread Elena Elkina
Hi, It looks like there is a problem in the trunk which reproduces with the simple_spawn test (orte/test/mpi/simple_spawn.c). It seems to be an issue with pmix. It doesn't reproduce with the default set of btls, but it does reproduce with several btls specified. For example, salloc -N5

Re: [OMPI devel] OMPI BCOL hang with PMI1

2014-10-17 Thread Elena Elkina
Hi Artem, Actually some time ago there was a known issue with coll ml. I used to run my command lines with -mca coll ^ml to avoid these problems, so I don't know if it was fixed or not. It looks like you have the same problem. Best regards, Elena On Fri, Oct 17, 2014 at 7:01 PM, Artem Polyakov

Re: [OMPI devel] regression with derived datatypes

2014-05-08 Thread Elena Elkina
Hi, My reproducer failed even with one port enabled (-mca btl_openib_if_include mlx4_0:1 ). I tried with trunk as well - the same issue. Best, Elena On Thu, May 8, 2014 at 11:49 AM, Gilles Gouaillardet < gilles.gouaillar...@iferc.org> wrote: > Nathan and George, > > here are the output files

Re: [OMPI devel] regression with derived datatypes

2014-05-07 Thread Elena Elkina
Yes, this commit is also in the trunk. Best, Elena On Wed, May 7, 2014 at 5:45 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > Is this also happening on the trunk? > > > Sent from my phone. No type good. > > On May 7, 2014, at 9:44 AM, "Elena Elkina"

Re: [OMPI devel] regression with derived datatypes

2014-05-07 Thread Elena Elkina
regards, Elena On Wed, May 7, 2014 at 5:43 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > Can you cite the branch and SVN r number? > > Sent from my phone. No type good. > > > On May 7, 2014, at 9:24 AM, "Elena Elkina"

[OMPI devel] regression with derived datatypes

2014-05-07 Thread Elena Elkina
Hi, I've found that commit b531973419a056696e6f88d813769aa4f1f1aee6 (Author: Jeff Squyres, Date: Tue Apr 22 19:48:56 2014 +) doesn't work: it caused new failures with derived datatypes. Collectives return incorrect

[OMPI devel] -mca coll "ml" cause segv or hangs with different command lines.

2014-03-04 Thread Elena Elkina
Hi, Recently I have often run into hangs and seg faults with different command lines, and there are "ml" functions in the stack trace. When I just turn "ml" off by passing -mca coll ^ml, the problems disappear. For example, oshrun -np 4 --map-by node --display-map ./ring_oshmem fails with a seg fault while oshrun