[OMPI devel] btl_openib_receive_queues mca param not always taken into account

2014-07-11 Thread Nadia Derbey
, -- Nadia Derbey # HG changeset patch # Parent 4cb09323aca44faec7d027586ffa94e7d9681989 btl/openib: when specifying the receive_queues as an mca param to bypass the XRC settings, the XRC settings in the .ini file are taken into account nevertheless if we use the default QPs value diff -r

Re: [OMPI devel] bug in opal_generic_simple_pack_function()

2013-11-25 Thread Nadia Derbey
ur example, I don't think they are related. I guess we should look at all the patches in the opal/datatype and ompi/datatype over the last 13 months (the starting point of the 1.6.3). George. On Nov 25, 2013, at 14:10 , Nadia Derbey <nadia.der...@bull.net <mailto:nadia.der...@

Re: [OMPI devel] bug in opal_generic_simple_pack_function()

2013-11-25 Thread Nadia Derbey
drawn by hand. George. On Nov 25, 2013, at 11:40 , Nadia Derbey <nadia.der...@bull.net> wrote: Hi, I'm currently working on a bug occurring at the client site with openmpi when calling MPI_Sendreceive() on datatypes built by the application.

[OMPI devel] bug in opal_generic_simple_pack_function()

2013-11-25 Thread Nadia Derbey
- pConvertor->pBaseBuf ); +source_base - pStack->disp - pConvertor->pBaseBuf - pData->lb ); DO_DEBUG( opal_output( 0, "pack save stack stack_pos %d pos_desc %d count_desc %d disp %ld\n", pConvertor->stack_pos, pStack->i

Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun

2012-02-17 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 02/17/2012 08:36:54 AM: > From: Brice Goglin > To: de...@open-mpi.org > Date: 02/17/2012 08:37 AM > Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see > processes as bound if the job has been launched by srun > Sent

Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun

2012-02-16 Thread nadia . derbey
no need for any other patch: the fix you committed was the only one needed to fix the issue. Could you please move it to v1.5 (do I need to file a CMR)? Thanks! -- Nadia Derbey devel-boun...@open-mpi.org wrote on 02/09/2012 06:00:48 PM: > From: Jeff Squyres <jsquy...@cisco.com> >

Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun

2012-02-09 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 02/09/2012 01:32:31 PM: > From: Ralph Castain > To: Open MPI Developers > Date: 02/09/2012 01:32 PM > Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see > processes as bound if the job has been

Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun

2012-02-09 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 02/09/2012 12:20:41 PM: > From: Brice Goglin > To: Open MPI Developers > Date: 02/09/2012 12:20 PM > Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see > processes as bound if the job has been

Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun

2012-02-09 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 02/09/2012 12:18:20 PM: > From: Jeff Squyres > To: Open MPI Developers > Date: 02/09/2012 12:18 PM > Subject: Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see > processes as bound if the job has been

Re: [OMPI devel] btl/openib: get_ib_dev_distance doesn't see processes as bound if the job has been launched by srun

2012-02-06 Thread nadia . derbey
Resending, as I didn't get any answer... Regards, Nadia -- Nadia Derbey devel-boun...@open-mpi.org wrote on 01/27/2012 05:38:34 PM: > From: "nadia.derbey" <nadia.der...@bull.net> > To: Open MPI Developers <de...@open-mpi.org> > Date: 01/27/2012 05:35 PM >

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-30 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 08/29/2011 06:59:49 PM: > From: Brice Goglin > To: Open MPI Developers > Date: 08/29/2011 07:00 PM > Subject: Re: [OMPI devel] known limitation or bug in hwloc? > Sent by: devel-boun...@open-mpi.org > > I am

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-30 Thread nadia . derbey
Thanks a lot Ralph! Regards, -- Nadia Derbey Phone: +33 (0)4 76 29 77 62 devel-boun...@open-mpi.org wrote on 08/29/2011 06:12:13 PM: > From: Ralph Castain <r...@open-mpi.org> > To: Open MPI Developers <de...@open-mpi.org> > Date: 08/29/2011 06:12 PM > Subject

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 08/29/2011 05:57:59 PM: > From: Ralph Castain > To: Open MPI Developers > Date: 08/29/2011 05:58 PM > Subject: Re: [OMPI devel] known limitation or bug in hwloc? > Sent by: devel-boun...@open-mpi.org > > On Aug 29,

Re: [OMPI devel] known limitation or bug in hwloc?

2011-08-29 Thread nadia . derbey
devel-boun...@open-mpi.org wrote on 08/29/2011 04:20:30 PM: > From: Ralph Castain > To: Open MPI Developers > Date: 08/29/2011 04:26 PM > Subject: Re: [OMPI devel] known limitation or bug in hwloc? > Sent by: devel-boun...@open-mpi.org > > Actually,

Re: [OMPI devel] Fix a hang in carto_base_select() if carto_module_init() fails

2011-07-08 Thread nadia . derbey
Yes, sure! Agreed. Regards, -- Nadia Derbey Phone: +33 (0)4 76 29 77 62 devel-boun...@open-mpi.org wrote on 07/08/2011 02:10:22 AM: > From: Jeff Squyres <jsquy...@cisco.com> > To: Open MPI Developers <de...@open-mpi.org> > Date: 07/08/2011 02:10 AM > Subject: Re:

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-13 Thread Nadia Derbey
On Tue, 2010-04-13 at 01:27 -0600, Ralph Castain wrote: > On Apr 13, 2010, at 1:02 AM, Nadia Derbey wrote: > > > On Mon, 2010-04-12 at 10:07 -0600, Ralph Castain wrote: > >> By definition, if you bind to all available cpus in the OS, you are > >> bound to nothing (i.

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-13 Thread Nadia Derbey
> > > nodes, so the test needs to be there. Again, if the user said "bind to > > > socket", but none of that socket's cores are assigned for our use, that > > > is an error. > > > > > > I haven't looked at your specific fix, but I agree with Terry's ques

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-12 Thread Nadia Derbey
t whether or not we were externally bound is irrelevant. Even > if the overall result is what you want, I think a more logically > understandable test would help others reading the code. > > But first we need to resolve the question: should this scenario return an > error or

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-12 Thread Nadia Derbey
r path, what we are doing is checking if we have set one or more bits in a mask after having actually set them: don't you think it's useless? That's why I'm suggesting to call the last check only if orte_odls_globals.bound is true. Regards, Nadia > > --td > > > > > > O

Re: [OMPI devel] problem when binding to socket on a single socket node

2010-04-09 Thread Nadia Derbey
looks like 1. the call to OPAL_PAFFINITY_PROCESS_IS_BOUND is still there in odls_default_fork_local_proc() 2. OPAL_PAFFINITY_PROCESS_IS_BOUND() is defined the same way But, I'll give it a try with the latest trunk. Regards, Nadia > On Apr 9, 2010, at 3:39 AM, Nadia Derbey wrote: >

[OMPI devel] problem when binding to socket on a single socket node

2010-04-09 Thread Nadia Derbey
ification to this test (see attached patch). And maybe both solutions could be mixed. Regards, Nadia -- Nadia Derbey <nadia.der...@bull.net> Do not test actual process binding in obvious cases diff -r 0b851b2e7934 orte/mca/odls/default/odls_default_module.c --- a/orte/mca/odls/default/odls

Re: [OMPI devel] RFC 1/1: improvements to the "notifier" framework and ORTE WDC

2010-03-30 Thread Nadia Derbey
bring the SOS and WDC > > branches > > to the trunk. This only brings in the "notifier" changes from the > > SOS > > branch, while the rest of the branch will be brought over after the > > timeout of the second RFC. > > > > ___ > > devel mailing list > > de...@open-mpi.org > > http://www.open-mpi.org/mailman/listinfo.cgi/devel -- Nadia Derbey <nadia.der...@bull.net>

Re: [OMPI devel] typo in opal/event/evutil.h ?

2010-02-26 Thread Nadia Derbey
file is for me: changeset: 17413:32687831ca9e user: brbarret date: Thu Feb 04 05:38:30 2010 + summary: Update libevent to 1.4.13 But maybe something got messed up here in our repo, will check. Regards, Nadia > > On Fri, Feb 26, 2010 at 3:48 AM, Nadia Derbey <nadia.

[OMPI devel] typo in opal/event/evutil.h ?

2010-02-26 Thread Nadia Derbey
igned __int64 -#elif _EVENT_SIZEOF_LONG_LONG == 8 +#elif SIZEOF_LONG_LONG == 8 #define ev_uint64_t unsigned long long #define ev_int64_t long long #elif SIZEOF_LONG == 8 Regards, Nadia -- Nadia Derbey <nadia.der...@bull.net>

Re: [OMPI devel] PATCH: remove trailing colon at the end of the generated LD_LIBRARY_PATH

2010-02-18 Thread Nadia Derbey
On Wed, 2010-02-17 at 17:14 -0500, Jeff Squyres wrote: > Looks good to me! > > Please commit and file CMRs for v1.4 and v1.5 (assuming this patch applies > cleanly to both branches). Not sure I have the rights to do these things? Regards, Nadia > > > On Feb 16, 2010, at

[OMPI devel] PATCH: remove trailing colon at the end of the generated LD_LIBRARY_PATH

2010-02-16 Thread Nadia Derbey
Hi, The mpivars.sh generated in openmpi.spec might in some cases lead to a LD_LIBRARY_PATH that contains a trailing ":". This happens if the LD_LIBRARY_PATH is originally unset. This means that the current directory is included in the search path for the loader, which might not be the desired result.
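
The trailing-colon problem described above comes from unconditionally joining the new directory and the old value with ":". The following is a minimal C sketch of the conditional join, not the patch from this thread; the helper name prepend_lib_dir and the example paths are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not from the thread's patch): build a library
 * search path that starts with `dir`. The ":" separator is emitted
 * only when `existing` is non-empty, so an unset or empty
 * LD_LIBRARY_PATH never produces a trailing colon.
 * Returns 0 on success, -1 if `out` is too small. */
int prepend_lib_dir(const char *dir, const char *existing,
                    char *out, size_t out_len)
{
    int n;

    assert(dir != NULL && out != NULL);

    if (existing == NULL || existing[0] == '\0') {
        n = snprintf(out, out_len, "%s", dir);
    } else {
        n = snprintf(out, out_len, "%s:%s", dir, existing);
    }
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

A caller would pass getenv("LD_LIBRARY_PATH") as `existing`. In a POSIX shell script such as mpivars.sh, the same condition can be written with the `${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}` parameter expansion, which expands to nothing when the variable is unset or empty; whether the submitted patch used that form is not shown in this snippet.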

Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Nadia Derbey
ely mess things up for > > more than OMPI. > > > > Are you sure that SLURM is propagating the environment (something I have > > never seen before)? Or is OMPI mistakenly picking it up and propagating it? > > > > On Jan 22, 2010, at 7:25 AM, Nadia Der

Re: [OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Nadia Derbey
e - and as soon as Nadia confirms, on SLURM as well. > > I know that on Torque it was an innocent mistake where a line got added to > the launch code that shouldn't have... > > On Jan 22, 2010, at 8:07 AM, N.M. Maclaren wrote: > > > On Jan 22 2010, Nadia Derbey wrot

[OMPI devel] HOSTNAME environment variable

2010-01-22 Thread Nadia Derbey
Hi, I'm wondering whether the HOSTNAME environment variable shouldn't be handled as a "special case" when the orted daemons launch the remote jobs. This particularly applies to batch schedulers where the caller's environment is copied to the remote job: we are inheriting a $HOSTNAME which is the

Re: [OMPI devel] VT config.h.in

2010-01-19 Thread Nadia Derbey
-- Nadia Derbey <nadia.der...@bull.net>

Re: [OMPI devel] mca_btl_openib_post_srr() posts to an uncreated SRQ when ibv_resize_cq() has failed

2009-11-26 Thread Nadia Derbey
ies to be useful (e.g., 0 or some higher number > > that is still "too small"), or fail the BTL altogether...? > > > > On Oct 23, 2009, at 10:10 AM, Nadia Derbey wrote: > > > >> Hi, > >> > >> Yesterday I had to analyze a SIGSEGV o

Re: [OMPI devel] RFC - "system-wide-only" MCA parameters

2009-09-04 Thread Nadia Derbey
and where those system-wide params should not be *unintentionally* set to different values. Regards, Nadia > > :-( > > > On Sep 4, 2009, at 12:42 AM, Jeff Squyres wrote: > > > On Sep 4, 2009, at 8:26 AM, Nadia Derbey wrote: > > > >> > Can the file nam

Re: [OMPI devel] RFC - "system-wide-only" MCA parameters

2009-09-04 Thread Nadia Derbey
even a set of discrete values) for any such parameter. Then, any higher priority setting will be done only if the new value belongs to the declared set. But actually, maybe that extension is not desirable at all. In that case, I agree that your proposal is a very good compromise: . single parser (though it should be enhanced) . single configuration file Regards, Nadia -- Nadia Derbey <nadia.der...@bull.net>
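
The "declared set" idea in the message above (a higher-priority setting is accepted only if the new value belongs to a set declared for that parameter) can be sketched in C as follows. This is an illustration under assumptions, not code from Open MPI: the struct param_rule type, both function names, and the sample values are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one restricted parameter: its name and
 * the discrete set of values the administrator declared as acceptable. */
struct param_rule {
    const char *name;
    const char *const *allowed;
    size_t n_allowed;
};

/* Returns 1 if `value` belongs to the declared set, 0 otherwise. */
int value_is_allowed(const struct param_rule *rule, const char *value)
{
    size_t i;

    assert(rule != NULL && value != NULL);

    for (i = 0; i < rule->n_allowed; i++) {
        if (strcmp(rule->allowed[i], value) == 0) {
            return 1;
        }
    }
    return 0;
}

/* A higher-priority source asks for `requested`; the request takes
 * effect only if it is in the declared set, otherwise the current
 * (system-wide) value is kept. Returns the value that wins. */
const char *apply_override(const struct param_rule *rule,
                           const char *current, const char *requested)
{
    return value_is_allowed(rule, requested) ? requested : current;
}
```

With a rule whose declared set is {"a", "b"} and a current value of "a", a request for "b" would be accepted and a request for "c" would leave "a" in place.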

Re: [OMPI devel] RFC - "system-wide-only" MCA parameters

2009-09-04 Thread Nadia Derbey
, or do you get a single-line version number? > I get the same. The reason is simple: > > $ hg tip > changeset: 9:f11244ed72b5 > tag: tip > user: Nadia Derbey <nadia.der...@bull.net> > date: Thu Sep 03 14:21:47 2009 +0200 > summary: up

Re: [OMPI devel] RFC - "system-wide-only" MCA parameters

2009-09-04 Thread Nadia Derbey
On Fri, 2009-09-04 at 10:05 +0300, Jeff Squyres wrote: > On Sep 3, 2009, at 12:23 PM, Nadia Derbey wrote: > > > What: Define a way for the system administrator to prevent users from > > overwriting the default system-wide MCA parameter settings. > > > > In

Re: [OMPI devel] RFC - "system-wide-only" MCA parameters

2009-09-04 Thread Nadia Derbey
ten as usual. > > Can the file name ( openmpi-priv-mca-params.conf ) also be configurable ? No, it isn't, presently, but this can be changed if needed. Regards, Nadia > > Rich > > > On 9/3/09 5:23 AM, "Nadia Derbey" <nadia.der...@bull.net> wrote: >

Re: [OMPI devel] problem in the ORTE notifier framework

2009-05-28 Thread Nadia Derbey
tion will be how to do this with zero performance > impact when it is not being used. This has always been the > difficult issue when trying to implement any kind of > monitoring inside the core OMPI performance-sensitive paths. >

Re: [OMPI devel] problem in the ORTE notifier framework

2009-05-28 Thread Nadia Derbey
May 27, 2009, at 06:59 , Ralph Castain wrote: > > > > > ORTE_NOTIFIER_VERBOSE(api, counter, threshold,...) > > > > > > #if WANT_NOTIFIER_VERBOSE > > > opal_atomic_increment(counter); > > > if (counter > threshold) { > > > orte_notifier.api(.) > > > } > > > #endif -- Nadia Derbey <nadia.der...@bull.net>
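
The quoted pseudocode gates a notification on a counter crossing a threshold and compiles the whole check away when WANT_NOTIFIER_VERBOSE is off. The following is a simplified, standalone C sketch of that pattern. It is not the ORTE implementation: demo_notify, notifications_sent and NOTIFIER_VERBOSE are hypothetical names, and a plain (non-atomic) increment replaces opal_atomic_increment, so the sketch assumes a single thread.

```c
#include <assert.h>
#include <stdio.h>

#ifndef WANT_NOTIFIER_VERBOSE
#define WANT_NOTIFIER_VERBOSE 1   /* assumption: enabled for this demo */
#endif

/* Hypothetical stand-in for a notifier API call; it only records and
 * prints the event so the behavior can be observed. */
int notifications_sent = 0;

void demo_notify(const char *msg)
{
    assert(msg != NULL);
    notifications_sent++;
    printf("notify: %s\n", msg);
}

#if WANT_NOTIFIER_VERBOSE
/* Count the event; notify only once the counter exceeds the threshold.
 * Plain increment: single-threaded assumption, unlike the atomic
 * increment in the quoted pseudocode. */
#define NOTIFIER_VERBOSE(counter, threshold, msg)   \
    do {                                            \
        (counter)++;                                \
        if ((counter) > (threshold)) {              \
            demo_notify(msg);                       \
        }                                           \
    } while (0)
#else
/* Disabled build: the macro expands to nothing, so the hot path pays
 * no cost at all. */
#define NOTIFIER_VERBOSE(counter, threshold, msg) do { } while (0)
#endif
```

With a threshold of 3, invoking the macro five times on the same counter would call demo_notify on the fourth and fifth invocations only. The compile-time switch is what addresses the "zero performance impact when it is not being used" concern raised in the thread.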

[OMPI devel] problem in the ORTE notifier framework

2009-05-26 Thread Nadia Derbey
Hi, While having a look at the notifier framework under orte, I noticed that the way it is written, the init routine for the selected module cannot be called. Attached is a small patch that fixes this issue. Regards, Nadia ORTE notifier module init routine is never called: orte_notifier.init

Re: [OMPI devel] RFC: Diagnostic framework for MPI

2009-05-26 Thread Nadia Derbey
ditional (probably better) > ways of implementing this extension. My point here was simply to > ensure you knew that the basic mechanism already exists, and to > stimulate some thought as to how to use it for your proposed purpose. > > I would be happy to help you do so as this is some

[OMPI devel] RFC: Diagnostic framework for MPI

2009-05-26 Thread Nadia Derbey
have your opinion about its usefulness, or even to know if there's an already existing mechanism to do this job. Regards, Nadia -- Nadia Derbey <nadia.der...@bull.net>