[OMPI devel] cosmetic misleading mpirun error message

2015-08-25 Thread Cabral, Matias A
Hi, Playing with the 1.10.0 (just released) build I found a cosmetic misleading error message in mpirun. If by mistake you type -hosts (with an extra "s"), the error message complains about an unknown "-o" option that is actually not being used. Typing the parameters correctly fixes the issue

Re: [OMPI devel] Problem running from ompi master

2015-09-01 Thread Cabral, Matias A
9/1/2015 9:35 AM, Cabral, Matias A wrote: Hi, Before submitting a pull req I decided to test some changes on ompi master branch but I'm facing an unrelated runtime error with ess pmi not being found. I confirmed PATH and LD_LIBRARY_PATH are set correctly and also that mca_ess_pmi.so where

Re: [OMPI devel] Problem running from ompi master

2015-09-01 Thread Cabral, Matias A
? Are you doing a VPATH build, or doing the build in the repo location? Also, I assume you remembered to run autogen.pl before configure, yes? On Sep 1, 2015, at 10:11 AM, Cabral, Matias A <matias.a.cab...@intel.com> wrote: Hi Gilles, I deleted everythin

[OMPI devel] orted hangs on SLES12 when running 80 ranks per node

2016-02-03 Thread Cabral, Matias A
Hi, I have hit an issue in which orted hangs during the finalization of a job. This is reproduced by running 80 ranks per node (yes, oversubscribed) on a 4-node cluster that runs SLES12 with OMPI 1.10.2 (I also see it with 1.10.0). I found that it is independent of the binary used (I used a

Re: [OMPI devel] psm2 and psm2_ep_open problems

2016-04-14 Thread Cabral, Matias A
Hi Howard, I suspect this is the known issue with using SLURM with OMPI and PSM that is discussed here: https://www.open-mpi.org/community/lists/users/2010/12/15220.php As of today, orte generates the psm_key, so when using SLURM this does not happen and it is necessary to set it in the

Re: [OMPI devel] PSM2 Intel folks question

2016-04-19 Thread Cabral, Matias A
Hi Howard, A couple more questions to better understand the context: - What type of job is running? - Is this also under srun? For PSM2 you may find more details in the programmer’s guide:

Re: [OMPI devel] PSM2 Intel folks question

2016-04-19 Thread Cabral, Matias A
Errata: PSM2_DEVICES="self,hfi" _MAC From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of Cabral, Matias A Sent: Tuesday, April 19, 2016 11:25 AM To: Open MPI Developers <de...@open-mpi.org> Subject: Re: [OMPI devel] PSM2 Intel folks question Hi Howard, Coupl

Re: [OMPI devel] PSM2 Intel folks question

2016-04-20 Thread Cabral, Matias A
fixes to get the PSM2 MTL working on our omnipath clusters. I don't think this problem has anything to do with SLURM except for the jobid manipulation to generate the unique key. Howard 2016-04-19 17:18 GMT-06:00 Cabral, Matias A <matias.a.cab...@intel.com<mailto:matias.a.cab...@intel.com&

Re: [OMPI devel] 2.0.0 is coming: what do we need to communicate to users?

2016-04-29 Thread Cabral, Matias A
How about for “developers that have not been following the transition from 1.x to 2.0”? Particularly myself ☺. I started contributing to some specific parts (psm2 mtl) and following changes. However, I don’t have details of what is changing in 2.0. I see there could be different levels of

Re: [OMPI devel] 2.0.0 is coming: what do we need to communicate to users?

2016-04-29 Thread Cabral, Matias A
he > v2.x series -- what kinds of user-noticeable things will they see? > > >> On Apr 29, 2016, at 12:34 PM, Cabral, Matias A <matias.a.cab...@intel.com> >> wrote: >> >> How about for “developers that have not been following the transition from >> 1

[OMPI devel] MPI_Init() affecting rand()

2016-07-14 Thread Cabral, Matias A
Hi All, Doing a quick test with rand()/srand(), I found that MPI_Init() seems to be calling a function in that family, which affects the values seen by the user application. Please see below my simple test and the results. Yes, moving the second call to srand() after MPI_Init() solves the
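
The behaviour reported in this message is what one would expect whenever a library touches the single, process-wide state shared by rand() and srand(). A minimal sketch in C, with no MPI involved: fake_library_init() is a hypothetical stand-in for a library routine that reseeds the generator (an assumption, not Open MPI code), and rand_r() from POSIX is shown only as one possible way to keep caller-owned state, not as something the thread proposes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a library init routine that touches the
 * process-wide rand() state (an assumption; not Open MPI code). */
static void fake_library_init(void)
{
    srand(12345u);
    (void)rand();
}

/* First rand() value after seeding; optionally run the "library" in between. */
static int first_global_value(unsigned seed, int call_library)
{
    srand(seed);
    if (call_library) {
        fake_library_init();
    }
    return rand();
}

/* Same experiment with rand_r(): the state lives in a caller-owned variable. */
static int first_private_value(unsigned seed, int call_library)
{
    unsigned state = seed;
    if (call_library) {
        fake_library_init();
    }
    return rand_r(&state);
}

/* Checks the two properties the sketch is meant to show. */
static void check_rand_isolation(void)
{
    /* Global state: the application's seed is lost once the library reseeds. */
    assert(first_global_value(1u, 1) == first_global_value(2u, 1));
    /* Private state: the library call makes no difference. */
    assert(first_private_value(1u, 0) == first_private_value(1u, 1));
}
```

Calling check_rand_isolation() from a test program exercises both cases. Reseeding after the library call, as the message describes, is the other obvious workaround.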

Re: [OMPI devel] mtl/psm2 and $PSM2_DEVICES

2016-10-03 Thread Cabral, Matias A
_error_register_handler(NULL /* no ep */, > PSM2_ERRHANDLER_NOP); @@ -230,8 > +244,10 @@ ompi_mtl_psm2_component_init(bool enable_progress_threads, > return NULL; > } > > -if (num_local_procs == num_total_procs) { > - setenv("

Re: [OMPI devel] mtl/psm2 and $PSM2_DEVICES

2016-09-29 Thread Cabral, Matias A
al_procs) && (num_max_procs <= num_total_procs)) { +if (NULL == getenv("PSM2_DEVICES")) { +setenv("PSM2_DEVICES", "self,shm", 0); +} } err = psm2_init(_major, _minor); On 9/30/2016 12:38 AM, Cabral, Matia
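
The patch fragment quoted above sets PSM2_DEVICES only when the user has not already exported it. A minimal sketch of that pattern in C follows; set_env_default is a hypothetical helper name, and the condition under which the default is applied is left out because the quoted fragment is truncated. Note that POSIX setenv() called with a zero overwrite argument already leaves an existing value untouched, so the explicit getenv() check is redundant but harmless.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: give `name` a default value without overriding
 * anything the user exported before launching the job. */
static int set_env_default(const char *name, const char *value)
{
    assert(name != NULL && value != NULL);
    if (NULL == getenv(name)) {
        /* overwrite == 0: setenv() itself would also keep an existing value */
        return setenv(name, value, 0);
    }
    return 0;
}
```

Under these assumptions a caller would run set_env_default("PSM2_DEVICES", "self,shm") before initializing the library, and a user-supplied value such as "self,hfi" would be preserved.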

Re: [OMPI devel] Last call: v1.10.5

2016-12-19 Thread Cabral, Matias A
Hi Ralph, Should v1.10.5 release wait to include the fix for #2591? Thanks, _MAC -Original Message- From: devel [mailto:devel-boun...@lists.open-mpi.org] On Behalf Of r...@open-mpi.org Sent: Monday, December 19, 2016 8:57 AM To: OpenMPI Devel Subject:

Re: [OMPI devel] Default tag for OFI MTL

2018-03-03 Thread Cabral, Matias A
imit via the MPI_TAG_UB. I personally would prefer a solution where we can alter the distribution of bits between bits in the cid and tag at compile time. We can also envision this selection to be driven by an MCA parameter, but this might be too costly. George. On Sat, Mar 3, 2018

[OMPI devel] Default tag for OFI MTL

2018-03-02 Thread Cabral, Matias A
Hi all, I'm working on extending the OFI MTL to support FI_REMOTE_CQ_DATA (1) to extend the number of ranks currently supported by the MTL. That number is currently limited by the 16 bits included in the OFI tag (2). After the feature is implemented there will be no limitation for providers that support
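
To illustrate why a fixed-width rank field caps the job size, here is a sketch in C that packs a communicator id, a source rank and an MPI tag into one 64-bit match tag. Only the 16-bit rank width comes from the message; the other widths, the field order and every name below are assumptions made for illustration, not the OFI MTL's actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths; only the 16-bit rank field comes from the post. */
#define SKETCH_RANK_BITS 16
#define SKETCH_TAG_BITS  32
#define SKETCH_CID_BITS  (64 - SKETCH_RANK_BITS - SKETCH_TAG_BITS)

#define SKETCH_MASK(bits) ((UINT64_C(1) << (bits)) - 1)

/* Largest number of ranks addressable with the rank field. */
static uint64_t sketch_max_ranks(void)
{
    return UINT64_C(1) << SKETCH_RANK_BITS;
}

/* Pack [cid | rank | tag] into one 64-bit match tag. */
static uint64_t sketch_pack(uint64_t cid, uint64_t rank, uint64_t tag)
{
    assert(cid  <= SKETCH_MASK(SKETCH_CID_BITS));
    assert(rank <= SKETCH_MASK(SKETCH_RANK_BITS));
    assert(tag  <= SKETCH_MASK(SKETCH_TAG_BITS));
    return (cid << (SKETCH_RANK_BITS + SKETCH_TAG_BITS))
         | (rank << SKETCH_TAG_BITS)
         | tag;
}

/* Recover the rank field from a packed match tag. */
static uint64_t sketch_rank_of(uint64_t match_tag)
{
    return (match_tag >> SKETCH_TAG_BITS) & SKETCH_MASK(SKETCH_RANK_BITS);
}
```

With a 16-bit field, sketch_max_ranks() is 65536. Because the widths are compile-time macros, redistributing bits between the fields only requires changing the defines, which is the kind of compile-time choice suggested in the reply on this thread.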

Re: [OMPI devel] Default tag for OFI MTL

2018-03-05 Thread Cabral, Matias A
s <devel@lists.open-mpi.org> Subject: Re: [OMPI devel] Default tag for OFI MTL On Sat, Mar 3, 2018 at 6:35 PM, Cabral, Matias A <matias.a.cab...@intel.com> wrote: Hi George, Thanks for the feedback, appreciated. Few questions/comments: > Rega

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-19 Thread Cabral, Matias A
Hi Edgar, I also saw some similar issues, not exactly the same, but they look very similar (maybe because of a different version of libpsm2). 1 and 2 are related to the introduction of the OFI BTL and the fact that it opens an OFI EP in its init function. I see that all btls call the init function

Re: [OMPI devel] Announcing Open MPI v4.0.0rc1

2018-09-19 Thread Cabral, Matias A
nent_init to component_open will solve the problem? Arm On Sep 19, 2018, at 1:08 PM, Cabral, Matias A <matias.a.cab...@intel.com> wrote: Hi Edgar, I also saw some similar issues, not exactly the same, but they look very similar (maybe because of a different version of libpsm2). 1 an