Re: [OMPI devel] OMPI v2.0.1rc1 available for test

2016-08-24 Thread Gilles Gouaillardet
If my git is correct:
$ git rev-list --boundary v2.x...master/master | grep '^-'
-acc2c7937cd3b50de16044b673399e4c4a7456bc
-6772d32b85b836438eeca72e4a8fda026ea67a55
-ec44a25070f99b6e1d96886fe3990ad560ee63c0
$ git show ec44a25070f99b6e1d96886fe3990ad560ee63c0
commit

Re: [OMPI devel] [2.0.1.rc1] Solaris MPIX failure

2016-08-24 Thread Gilles Gouaillardet
that patch. -Paul On Tue, Aug 23, 2016 at 10:16 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Paul, you can download a patch at https://patch-diff.githubusercontent.com/raw/open-mpi/om

Re: [OMPI devel] Upgrading our (openSUSE) Open MPI version

2016-08-24 Thread Gilles Gouaillardet
Karol, is there any place I can download glibc 2.24 for Tumbleweed? I'd like to have a look at what is going wrong with Open MPI 1.10. Cheers, Gilles On 8/25/2016 10:24 AM, Karol Mroz wrote: Greetings! I would like to upgrade our (openSUSE Tumbleweed) version of Open MPI from 1.10.3

Re: [OMPI devel] Upgrading our (openSUSE) Open MPI version

2016-08-24 Thread Gilles Gouaillardet
Karol, that is correct: VampirTrace was removed from Open MPI 2.0.0. Are you using your own (or openSUSE's) .spec file, or the one provided by Open MPI? FWIW, Open MPI 1.10.4 and 2.0.1 should be released in the near future. Cheers, Gilles On 8/25/2016 10:24 AM, Karol Mroz wrote:

Re: [OMPI devel] [2.0.1.rc1] Solaris MPIX failure

2016-08-23 Thread Gilles Gouaillardet
_CRED; } #else return PMIX_ERR_NOT_SUPPORTED; #endif I can only surmise, therefore, that Solaris doesn’t pass either of the two #if define’d tests. Is there a Solaris alternative? On Aug 23, 2016, at 5:55 AM, r...@open-mpi.org wrote: Thanks Gilles!
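
For context, the guarded paths in question follow the usual peer-credential pattern on a local socket; here is a minimal sketch (the preview does not show the exact macros PMIx tests, so treat these as illustrative), with Solaris's getpeerucred(3C) from <ucred.h> noted as the likely third branch:

    #define _GNU_SOURCE               /* for struct ucred on Linux */
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <unistd.h>

    /* sketch: fetch the peer's effective uid on a connected local socket;
       returns -1 where neither mechanism exists (the Solaris case here) */
    static int peer_uid(int sd, uid_t *uid)
    {
    #if defined(SO_PEERCRED)                     /* Linux */
        struct ucred cred;
        socklen_t len = sizeof(cred);
        if (getsockopt(sd, SOL_SOCKET, SO_PEERCRED, &cred, &len) < 0) return -1;
        *uid = cred.uid;
        return 0;
    #elif defined(HAVE_GETPEEREID)               /* BSD / OS X */
        uid_t euid; gid_t egid;
        if (getpeereid(sd, &euid, &egid) < 0) return -1;
        *uid = euid;
        return 0;
    #else                                        /* maps to PMIX_ERR_NOT_SUPPORTED */
        (void)sd; (void)uid;
        return -1;
    #endif
    }

Solaris exposes the same information through getpeerucred(), so a dedicated branch could be added for it.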

Re: [OMPI devel] stdin issue with master

2016-08-23 Thread Gilles Gouaillardet
wrote: > Fixed in 9210230 > > > On Aug 22, 2016, at 8:49 PM, r...@open-mpi.org wrote: > > > > Yeah, I started working on it earlier this evening - will look some more > tomorrow > > > >> On Aug 22, 2016, at 7:57 PM, Gilles G

Re: [OMPI devel] [2.0.1.rc1] Solaris MPIX failure

2016-08-23 Thread Gilles Gouaillardet
Thanks Paul, at first glance something is going wrong in the sec module under Solaris. I will keep digging tomorrow. Cheers, Gilles On Tuesday, August 23, 2016, Paul Hargrove wrote: > On Solaris 11.3 on x86-64: > > $ mpirun -mca btl sm,self,openib -np 2 -host

[OMPI devel] stdin issue with master

2016-08-22 Thread Gilles Gouaillardet
Folks, I made a trivial test: echo hello | mpirun -np 1 cat. With v2.x and v1.10 the output is "hello", as expected, but with master there is no output at all (!). I was able to fix that with the dirty workaround below. The root cause (on master) is that orte_cmd_options.stdin_target

Re: [OMPI devel] Coll/sync component missing???

2016-08-22 Thread Gilles Gouaillardet
Folks, I was reviewing the sources of the coll/sync module, and 1) I noticed the same pattern is used in *every* source file: if (s->in_operation) { return s->c_coll.coll_xxx(...); } else { COLL_SYNC(s, s->c_coll.coll_xxx(...)); } is there any rationale for not
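
For readers outside the thread: coll/sync wraps the collectives so it can periodically inject a barrier, and the in_operation flag guards against recursing when the injected barrier itself re-enters the module. A minimal compilable sketch of the pattern under discussion (names and the injection policy are simplified, not the real component):

    #define SYNC_THRESHOLD 1000        /* in OMPI this is an MCA parameter */

    typedef struct {
        int in_operation;   /* nonzero while a wrapped collective is running */
        int count;          /* collectives seen since the last injected barrier */
    } sync_state_t;

    extern int underlying_barrier(void);
    extern int underlying_bcast(void *buf, int n);

    static int sync_bcast(sync_state_t *s, void *buf, int n)
    {
        if (s->in_operation) {
            /* re-entered (e.g. from the injected barrier): plain pass-through */
            return underlying_bcast(buf, n);
        }
        s->in_operation = 1;
        if (++s->count >= SYNC_THRESHOLD) {
            s->count = 0;
            underlying_barrier();      /* drain pending unexpected messages */
        }
        int rc = underlying_bcast(buf, n);
        s->in_operation = 0;
        return rc;
    }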

Re: [OMPI devel] Coll/sync component missing???

2016-08-20 Thread Gilles Gouaillardet
app to run. > > > On Aug 20, 2016, at 7:38 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: > > Ralph, > > in the meantime, and if not done already, your user can simply red

Re: [OMPI devel] Coll/sync component missing???

2016-08-20 Thread Gilles Gouaillardet
Ralph, in the meantime, and if not done already, your user can simply redefine MPI_Bcast in the app. int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm) { PMPI_Barrier(comm); return PMPI_Bcast(buffer, count, datatype, root, comm); } the root causes are -
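
Spelled out as a self-contained shim over the PMPI profiling interface (the barrier-before-broadcast is the workaround from this message, just wrapped so it compiles standalone):

    #include <mpi.h>

    /* Interpose on MPI_Bcast: synchronize first so no rank races ahead,
       then forward to the real broadcast via its PMPI entry point. */
    int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype,
                  int root, MPI_Comm comm)
    {
        PMPI_Barrier(comm);
        return PMPI_Bcast(buffer, count, datatype, root, comm);
    }

Compile this into the application (or into an LD_PRELOAD'ed shared object) and every MPI_Bcast call picks up the barrier without rebuilding Open MPI.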

Re: [OMPI devel] about MPI_Reduce_local

2016-08-18 Thread Gilles Gouaillardet
Thanks Jeff for the lengthy explanation; FWIW, I have no strong opinion either. Cheers, Gilles On 8/18/2016 8:20 PM, Jeff Squyres (jsquyres) wrote: On Aug 18, 2016, at 12:44 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: when reading the dev meeting minutes, I saw MPI_Reduce

[OMPI devel] about MPI_Reduce_local

2016-08-17 Thread Gilles Gouaillardet
Folks, when reading the dev meeting minutes, I saw MPI_Reduce_local was discussed and might be moved to the coll framework, so here are my 0.02 US$: the prototype of MPI_Reduce_local is int MPI_Reduce_local(const void *inbuf, void *inoutbuf, int count, MPI_Datatype
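
The preview truncates the prototype; per the MPI standard it is MPI_Reduce_local(inbuf, inoutbuf, count, datatype, op), which applies op element-wise as inoutbuf[i] = inbuf[i] op inoutbuf[i] with no communication at all (hence the question of whether it belongs in the coll framework). A small worked example:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int in[4]    = { 1,  2,  3,  4};
        int inout[4] = {10, 20, 30, 40};
        /* purely local reduction, identical on every rank */
        MPI_Reduce_local(in, inout, 4, MPI_INT, MPI_SUM);
        /* inout is now {11, 22, 33, 44} */
        MPI_Finalize();
        return 0;
    }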

Re: [OMPI devel] Open MPI 2.0.0: Fortran with NAG compiler (nagfor)

2016-08-15 Thread Gilles Gouaillardet
Franz-Joseph, the excellent NAG compiler is a commercial product, and as such, not all developers can afford it. This issue should be fixed in master (feel free to give the latest nightly tarball a try); a fix for v2.x is pending review at

Re: [OMPI devel] OPAL_PMIX_NODEID is not set by orted

2016-08-11 Thread Gilles Gouaillardet
ce as no specialized hierarchies can be built without the RTE information. George. On Wed, Aug 10, 2016 at 3:57 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Ralph, I noticed dist-graph/distgraph_test_4 from the ibm test suite

[OMPI devel] OPAL_PMIX_NODEID is not set by orted

2016-08-10 Thread Gilles Gouaillardet
Ralph, I noticed dist-graph/distgraph_test_4 from the ibm test suite fails when using a hostfile and running no task on the host running mpirun: n0$ mpirun --host n1:1,n2:1 -np 2 ./dist-graph/distgraph_test_4. The root cause is that OPAL_PMIX_NODEID is correctly set (0, 1, 2) by mpirun, but

Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0

2016-07-26 Thread Gilles Gouaillardet
Also, btl/vader has a higher exclusivity than btl/sm, so if you do not manually specify any btl, vader should be used. You can run with --mca btl_base_verbose 10 to confirm which btl is used. Cheers, Gilles On 7/27/2016 9:20 AM, Nathan Hjelm wrote: sm is deprecated in 2.0.0 and will

Re: [OMPI devel] sm BTL performace of the openmpi-2.0.0

2016-07-26 Thread Gilles Gouaillardet
Hi, can you please run again with --mca pml ob1? If Open MPI was built with MXM support, pml/cm and mtl/mxm are used instead of pml/ob1 and btl/openib. Cheers, Gilles On 7/27/2016 8:56 AM, tmish...@jcity.maeda.co.jp wrote: Hi folks, I saw a performance degradation of openmpi-2.0.0

Re: [OMPI devel] OpenMPI 2.0 and Petsc 3.7.2

2016-07-26 Thread Gilles Gouaillardet
of days. Let me know if you figure it out before I get to it. -Nathan On Jul 25, 2016, at 8:38 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Eric, where can your test case be downloaded? How many nodes and tasks do you need to reproduce the bug? FWIW, currently there are two Op

Re: [OMPI devel] OpenMPI 2.0 and Petsc 3.7.2

2016-07-25 Thread Gilles Gouaillardet
Eric, where can your test case be downloaded? How many nodes and tasks do you need to reproduce the bug? FWIW, currently there are two Open MPI repositories: https://github.com/open-mpi/ompi has only one branch, the 'master' branch; today, this can be seen as Open MPI 3.0 pre

Re: [OMPI devel] PGI built Open MPI vs GNU built slurm

2016-07-25 Thread Gilles Gouaillardet
s to be replaced by "-lpthread", or similar. -Paul On Mon, Jul 25, 2016 at 6:03 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Folks, This is a followup of a thread that initially started at http://www.open-mpi.org/community/lists/users

[OMPI devel] PGI built Open MPI vs GNU built slurm

2016-07-25 Thread Gilles Gouaillardet
Folks, this is a follow-up of a thread that initially started at http://www.open-mpi.org/community/lists/users/2016/07/29635.php The user is trying to build Open MPI with the PGI compiler and libslurm.la/libpmi.la support, and Slurm was built with the gcc compiler. At first, it fails because the

Re: [OMPI devel] Jenkins setup

2016-07-23 Thread Gilles Gouaillardet
Ralph, I am not sure what you mean by "hook"; there are basically two ways to trigger a build: - a GitHub webhook contacts your Jenkins server; your server must be publicly accessible, and you need to register its URL in GitHub. A new build starts right after a PR is created. - polling from the

Re: [OMPI devel] about Mellanox Jenkins

2016-07-21 Thread Gilles Gouaillardet
that tomorrow though Cheers, Gilles On Thursday, July 21, 2016, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > On Jul 21, 2016, at 3:53 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: > > > Folks, > > > > Mellanox Jenkins

Re: [OMPI devel] about Mellanox Jenkins

2016-07-21 Thread Gilles Gouaillardet
nmpi" directory. My guess that > it is somehow removed during jenkins execution. > > I'm checking now. > > 2016-07-21 20:11 GMT+06:00 Jeff Squyres (jsquyres) <jsquy...@cisco.com > <javascript:_e(%7B%7D,'cvml','jsquy...@cisco.com');>>: > >> On Jul

[OMPI devel] singleton broken on master

2016-07-21 Thread Gilles Gouaillardet
Ralph, I noted singletons are broken on master. git bisect points to the commit in which PMIx_tool was introduced: if you revert to this commit, the orted forked by the singleton crashes. IIRC, the latest master does not work, but orted does not crash either. Sorry for the lack of details, I am afk

[OMPI devel] about Mellanox Jenkins

2016-07-21 Thread Gilles Gouaillardet
Folks, Mellanox Jenkins marks recent PRs as failed for very surprising reasons: mpirun --mca btl sm,self ... failed because processes could not contact each other. I was able to reproduce this once on my workstation, and found the root cause was a dirty build and/or install dir. I

Re: [OMPI devel] tcp btl rendezvous performance question

2016-07-19 Thread Gilles Gouaillardet
Howard, did you bump both btl_tcp_rndv_eager_limit and btl_tcp_eager_limit? You might also need to bump btl_tcp_sndbuf, btl_tcp_rcvbuf, and btl_tcp_max_send_size to get the max performance out of your 100Gb Ethernet cards. Last but not least, you might also need to bump btl_tcp_links to
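
Gathered onto one command line, the knobs named in this message would look like this (the parameter names are from the thread; the values and the ./osu_bw benchmark binary are illustrative placeholders, not recommendations):

    # sketch: enlarge TCP BTL limits/buffers for a fat pipe; tune values to taste
    mpirun --mca btl tcp,self \
           --mca btl_tcp_eager_limit      524288  \
           --mca btl_tcp_rndv_eager_limit 524288  \
           --mca btl_tcp_max_send_size    1048576 \
           --mca btl_tcp_sndbuf           4194304 \
           --mca btl_tcp_rcvbuf           4194304 \
           --mca btl_tcp_links            2       \
           -np 2 -host n1,n2 ./osu_bw

ompi_info --param btl tcp --level 9 lists the current defaults for all of these.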

Re: [OMPI devel] MPI_Comm_spawn broken on master on RHEL7

2016-07-16 Thread Gilles Gouaillardet
<r...@open-mpi.org> wrote: > Okay, I’ll investigate why that is happening - thanks! > > On Jul 16, 2016, at 7:45 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: > > Th

Re: [OMPI devel] MPI_Comm_spawn broken on master on RHEL7

2016-07-16 Thread Gilles Gouaillardet
uzzles me is that no debugger_release > message should be sent unless a debugger is attached - in which case, the > event should be registered. > > So why is it being sent? Is it the child job that is receiving it? Or is > it the parent? > > > On Jul 16, 2016, at 7:19 AM, Gilles

Re: [OMPI devel] MPI_Comm_spawn broken on master on RHEL7

2016-07-16 Thread Gilles Gouaillardet
, July 15, 2016, Ralph Castain <r...@open-mpi.org> wrote: > Okay, I’ll take a look - thanks! > > On Jul 15, 2016, at 7:08 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

Re: [OMPI devel] MPI_Comm_spawn broken on master on RHEL7

2016-07-15 Thread Gilles Gouaillardet
> On Jul 15, 2016, at 1:20 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: > > Ralph, > > I noticed MPI_Comm_spawn is broken on master and on RHEL7 > > for some reason I cannot yet explain, it works just fine on RH

[OMPI devel] MPI_Comm_spawn broken on master on RHEL7

2016-07-15 Thread Gilles Gouaillardet
Ralph, I noticed MPI_Comm_spawn is broken on master on RHEL7; for some reason I cannot yet explain, it works just fine on RHEL6 (!). mpirun -np 1 ./dynamic/intercomm_create from the ibm test suite can be used to reproduce the issue. I dug a bit and found OPAL_ERR_DEBUGGER_RELEASE

Re: [OMPI devel] SHMEM, "mpp/shmem.fh", CMake and infinite loops

2016-07-13 Thread Gilles Gouaillardet
OpenMPI so it avoids a CMake bug, let's follow up at https://github.com/open-mpi/ompi/issues/1868 Cheers, Gilles On 7/13/2016 8:30 PM, Paul Kapinos wrote: Hi Gilles, On 07/13/16 01:10, Gilles Gouaillardet wrote: Paul, The two header files in include/mpp simply include the file

Re: [OMPI devel] 2.0.0rc4 Crash in MPI_File_write_all_end

2016-07-13 Thread Gilles Gouaillardet
Eric, Open MPI 2.0.0 has been released, so the fix should land in the v2.x branch shortly. If I understand correctly, your script downloads/compiles Open MPI and then downloads/compiles PETSc. If this is correct, and for the time being, feel free to patch Open MPI v2.x before compiling it,

Re: [OMPI devel] SHMEM, "mpp/shmem.fh", CMake and infinite loops

2016-07-12 Thread Gilles Gouaillardet
Paul, the two header files in include/mpp simply include the file with the same name in the upper directory. A simple workaround is to replace these two files in include/mpp with symbolic links to files with the same name in the upper directory. Would you mind giving this a try? Cheers,
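
As a concrete sketch of that workaround (assuming the two headers are the OpenSHMEM shmem.h and shmem.fh, and that $PREFIX is the Open MPI install prefix; adjust to whatever actually lives in include/mpp):

    cd $PREFIX/include/mpp
    rm -f shmem.h shmem.fh
    ln -s ../shmem.h  shmem.h
    ln -s ../shmem.fh shmem.fh

The idea is that CMake's header dependency scanner apparently loops when mpp/shmem.fh includes a file of the same name, so symlinking straight to the real headers removes the self-referential include.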

Re: [OMPI devel] v2.0.0rc4 is released

2016-07-07 Thread Gilles Gouaillardet
This is a warning that can be safely ignored. That being said, it can be seen as a false positive (unless we fix flex or its generated output). Also, generally speaking, these kinds of warnings are for developers only (e.g. end users can do nothing about them). That raises the

Re: [OMPI devel] Posting To Group

2016-06-22 Thread Gilles Gouaillardet
Note Open MPI 1.10.3 has just been released, and you should use it; the install procedure for 1.8.1 should work too. What exactly is the issue? The Fortran wrapper is mpifort; mpif90 is just an alias. Is there a Fortran compiler on your system? If not, just install one and re-run configure/make install

Re: [OMPI devel] [2.0.0rc3] OpenBSD/ROMIO

2016-06-15 Thread Gilles Gouaillardet
Paul, https://github.com/open-mpi/ompi-release/pull/1178 is ready to be merged, but the milestone was set to v2.0.1 Cheers, Gilles On 6/16/2016 10:40 AM, Paul Hargrove wrote: I still see failures to build ROMIO on OpenBSD-5.9:

Re: [OMPI devel] [2.0.0rc3] NAG Fortran failures

2016-06-15 Thread Gilles Gouaillardet
Paul, NAG-related stuff was added in https://github.com/open-mpi/ompi/pull/1295 The milestone was set to v2.0.1, so no PR was even issued (yet) for the v2.x branch. If there is a consensus to update the milestone to v2.0.0, I'll be happy to PR. Cheers, Gilles On 6/16/2016 10:32 AM, Paul

[OMPI devel] openib btl and 10 GbE port

2016-06-12 Thread Gilles Gouaillardet
Folks, this is a follow-up of a user report available at http://www.open-mpi.org/community/lists/users/2016/06/29423.php Basically, one node has a dual-port ConnectX3 card, with one IB port and one 10 GbE port. When diagnosing some RDMA errors (not the point of this email), the user was

Re: [OMPI devel] MPI_T and coll/tuned module

2016-06-10 Thread Gilles Gouaillardet
especially for reduction operations. It is clearly specified that if you execute a collective multiple times between the same processes with the same values, in the context of the same run, you should get the exact same result. George. On Friday, June 10, 2016, Gilles Gouaillardet <gil.

[OMPI devel] MPI_T and coll/tuned module

2016-06-10 Thread Gilles Gouaillardet
Folks, I was thinking of using the MPI_T interface in order to try, within the same MPI test program, *all* the available algorithms of a given collective. That cannot currently be done because the MCA parameter is registered with {flag=0, scope=MCA_BASE_VAR_SCOPE_READONLY}. I made a proof of
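
For reference, here is what driving that from the test program's side looks like through MPI_T control variables; a sketch (the cvar name "coll_tuned_bcast_algorithm" follows Open MPI's coll/tuned naming, but whether it is writable is exactly the READONLY-scope problem raised here):

    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    /* look up an MPI_T control variable by name and attempt to write it */
    static int set_cvar_int(const char *name, int val)
    {
        int i, num, nlen, verb, bind, scope, count, rc;
        char vname[256];
        MPI_T_cvar_get_num(&num);
        for (i = 0; i < num; i++) {
            nlen = sizeof(vname);
            MPI_T_cvar_get_info(i, vname, &nlen, &verb, NULL, NULL,
                                NULL, NULL, &bind, &scope);
            if (0 == strcmp(vname, name)) {
                MPI_T_cvar_handle handle;
                MPI_T_cvar_handle_alloc(i, NULL, &handle, &count);
                rc = MPI_T_cvar_write(handle, &val);  /* fails if read-only */
                MPI_T_cvar_handle_free(&handle);
                return rc;
            }
        }
        return MPI_ERR_OTHER;  /* no such cvar */
    }

    int main(int argc, char **argv)
    {
        int provided;
        MPI_T_init_thread(MPI_THREAD_SINGLE, &provided);
        MPI_Init(&argc, &argv);
        if (MPI_SUCCESS != set_cvar_int("coll_tuned_bcast_algorithm", 3))
            printf("cvar not writable: the READONLY scope discussed here\n");
        MPI_Finalize();
        MPI_T_finalize();
        return 0;
    }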

Re: [OMPI devel] Jenkins testing - what purpose are we striving to achieve?

2016-06-07 Thread Gilles Gouaillardet
My 0.02 US$ from an implementation point of view: the canonical way of using Jenkins with GitHub is 1) receive/poll a new PR, 2) create a "check" and mark it pending, 3) run a script, 4) update the "check" status (OK/Failed) based on the exit status of the script. That being said, it is
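
Steps 2) and 4) map onto GitHub's commit-status API; a sketch of the Jenkins-side script (the context string, the build script name, and the $SHA/$TOKEN variables are placeholders):

    # 2) mark the PR's head commit as pending before the build ...
    curl -s -H "Authorization: token $TOKEN" \
         -d '{"state":"pending","context":"jenkins/ompi-build"}' \
         https://api.github.com/repos/open-mpi/ompi/statuses/$SHA

    # 3) + 4) ... then report the script's exit status
    if ./build_and_test.sh; then STATE=success; else STATE=failure; fi
    curl -s -H "Authorization: token $TOKEN" \
         -d "{\"state\":\"$STATE\",\"context\":\"jenkins/ompi-build\"}" \
         https://api.github.com/repos/open-mpi/ompi/statuses/$SHA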

Re: [OMPI devel] Seldom deadlock in mpirun

2016-06-02 Thread Gilles Gouaillardet
FWIW, the onesided/c_fence_lock test from the ibm test suite hangs (mpirun -np 2 ./c_fence_lock). I ran a git bisect and it incriminates commit b90c83840f472de3219b87cd7e1a364eec5c5a29 commit b90c83840f472de3219b87cd7e1a364eec5c5a29 Author: bosilca

Re: [OMPI devel] 1.10.3rc status

2016-05-27 Thread Gilles Gouaillardet
for Edgar or someone who cares about MPI-IO. Should we worry about this for 1.10? I’m inclined to not delay 1.10.3 over this one, but am open to contrary opinions On May 26, 2016, at 7:22 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: In my environme

Re: [OMPI devel] 1.10.3rc status

2016-05-26 Thread Gilles Gouaillardet
Ralph, the cxx_win_attr issue is dealt with at https://github.com/open-mpi/ompi/pull/1473 IIRC, only big endian and/or sizeof(Fortran integer) > sizeof(int) is impacted. The second error seems a bit weirder: once in a while, MPI_File_open fails, and when it fails, it always fails

Re: [OMPI devel] [1.10.3rc3] Linux static link failure of fortran examples

2016-05-25 Thread Gilles Gouaillardet
Paul, here it is: https://github.com/open-mpi/ompi-release/pull/1192 Thanks! Gilles On 5/25/2016 4:42 PM, Gilles Gouaillardet wrote: Paul, please disregard this PR for now, I will make a new one shortly ... Cheers, Gilles On 5/25/2016 4:25 PM, Paul Hargrove wrote: Thanks Gilles

Re: [OMPI devel] [1.10.3rc3] Linux static link failure of fortran examples

2016-05-25 Thread Gilles Gouaillardet
Paul, please disregard this PR for now, I will make a new one shortly ... Cheers, Gilles On 5/25/2016 4:25 PM, Paul Hargrove wrote: Thanks Gilles. I will retest and follow up in the PR. -Paul On Tue, May 24, 2016 at 11:56 PM, Gilles Gouaillardet <gil...@rist.or.jp>

Re: [OMPI devel] [1.10.3rc3] Linux static link failure of fortran examples

2016-05-25 Thread Gilles Gouaillardet
Paul, this works fine on RHEL7 but not on RHEL6; here is the relevant configure output:
< checking for library containing clock_gettime... -lrt
---
> checking for library containing clock_gettime... none required
The reason is clock_gettime was moved from librt into libc between RHEL6 and
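
The user-visible consequence: code calling clock_gettime() must link -lrt on the older glibc but not on the newer one (glibc folded the POSIX clock functions into libc proper in 2.17, the version RHEL7 ships; pinning it to that version is my addition, not the thread's):

    #include <stdio.h>
    #include <time.h>

    int main(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);  /* needs -lrt on RHEL6, plain libc on RHEL7 */
        printf("%ld.%09ld\n", (long)ts.tv_sec, ts.tv_nsec);
        return 0;
    }

This is exactly why configure's "checking for library containing clock_gettime" probe answers "-lrt" on one distro and "none required" on the other.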

Re: [OMPI devel] [1.10.3rc2] Build failure with Studio 12.5-beta

2016-05-21 Thread Gilles Gouaillardet
Paul, the v2.x commit message says it was cherry-picked from master. I can PR this for v1.10 too. Cheers, Gilles On Saturday, May 21, 2016, Paul Hargrove wrote: > The Solaris Studio 12.5 Fortran compiler is not being identified correctly > by libtool. > This is the same

Re: [OMPI devel] [1.10.3rc2] OpenBSD build failure

2016-05-21 Thread Gilles Gouaillardet
Paul, I was hoping to get some feedback from the mpich folks before I commit it into master and PR to the release branches. I am fine with merging it and PRing now if needed. Cheers, Gilles On Saturday, May 21, 2016, Paul Hargrove wrote: > As before, this RC cannot build ROMIO on

Re: [OMPI devel] Fwd: Errored: open-mpi/ompi#1160 (master - 50b3775)

2016-05-19 Thread Gilles Gouaillardet
Note this affects OS X only: in opal/util/ethtool.c, ethtool_cmd_speed must not be defined when HAVE_STRUCT_ETHTOOL_CMD is not defined. I will not be able to push a fix until tomorrow. Cheers, Gilles On Friday, May 20, 2016, George Bosilca wrote: > Travis seems to be broken due to a
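
The shape of the fix being described, as a sketch (the fallback body mirrors the Linux kernel's own ethtool_cmd_speed helper, which combines speed_hi and speed; the HAVE_* macro names are assumptions, not the exact configure symbols OMPI uses):

    /* compat shim: only meaningful where struct ethtool_cmd exists at all */
    #if defined(HAVE_STRUCT_ETHTOOL_CMD) && !defined(HAVE_ETHTOOL_CMD_SPEED)
    static inline unsigned int ethtool_cmd_speed(const struct ethtool_cmd *ep)
    {
        return ((unsigned int)ep->speed_hi << 16) | ep->speed;
    }
    #endif  /* defining it unconditionally is what broke the OS X build */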

Re: [OMPI devel] default mapping on master vs v2.x

2016-05-18 Thread Gilles Gouaillardet
, Gilles Gouaillardet wrote: Folks, currently the default mapping policy on master is different from v2.x. My preliminary question is: when will the master mapping policy land in the release branch? v2.0.0? v2.x? v3.0.0? Here are some commands and their output (both n0 and n1 have 16

Re: [OMPI devel] Github pricing plan changes announced today

2016-05-17 Thread Gilles Gouaillardet
Samuel, the (main) reason is that none of us are lawyers, and none of us know whether all the test suites can be redistributed for general public use or not. If someone has any interest in the test suites, he/she is free to request an access grant. Cheers, Gilles On 5/18/2016 8:46 AM,

[OMPI devel] default mapping on master vs v2.x

2016-05-16 Thread Gilles Gouaillardet
Folks, currently the default mapping policy on master is different from v2.x. My preliminary question is: when will the master mapping policy land in the release branch? v2.0.0? v2.x? v3.0.0? Here are some commands and their output (both n0 and n1 have 16 cores each; mpirun runs on

Re: [OMPI devel] Process connectivity map

2016-05-16 Thread Gilles Gouaillardet
and for non-local peers with the reachability argument set to NULL (because the bitmask doesn't provide any benefit when adding only 1 peer). -Nathan On Tue, May 17, 2016 at 12:00:38AM +0900, Gilles Gouaillardet wrote: Jeff, this is not what I observed (tcp btl, 2 to 4 nodes with one task

Re: [OMPI devel] Process connectivity map

2016-05-16 Thread Gilles Gouaillardet
> In short, only BTLs with the same exclusivity level will be considered > (e.g., this is how we exclude TCP when using HPC-class networks), and then > the BTL modules with the highest priority will be used for a given peer. > > > > On May 16, 2016, at 7:19 AM, Gilles Gouaillarde

Re: [OMPI devel] Process connectivity map

2016-05-16 Thread Gilles Gouaillardet
{ >> MPI_Recv(buf, 6, MPI_CHAR, 0, 99, MPI_COMM_WORLD, MPI_STATUS_IGNORE); >> printf("%s received %s, rank %d\n", hostname, buf, world_rank); >> } >> else >> { >> strcpy(buf, "haha!"); >> MPI_Send(bu

Re: [OMPI devel] Process connectivity map

2016-05-16 Thread Gilles Gouaillardet
vises you to eat right, exercise regularly and quit ageing. On Sun, May 15, 2016 at 10:49 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: At first glance, that seems a bit odd... are you sure you corre

Re: [OMPI devel] Process connectivity map

2016-05-15 Thread Gilles Gouaillardet
rgeon general advises you to eat right, exercise regularly and quit > ageing. > > On Sun, May 15, 2016 at 10:20 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: > >> did you check th

Re: [OMPI devel] Process connectivity map

2016-05-15 Thread Gilles Gouaillardet
Did you check the add_procs callbacks (e.g. mca_btl_tcp_add_procs() for the tcp btl)? This is where the reachable bitmap is set, and I guess this is what you are looking for. Keep in mind that if several btls can be used, the one with the higher exclusivity is used (e.g. tcp is never used if
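
For orientation, the callback has this general shape in the 2.x-era BTL interface (sketched from memory; treat the exact types as approximate rather than authoritative):

    /* each BTL reports which of the nprocs peers it can reach by setting the
       corresponding bit in 'reachable' and filling in an endpoint for it */
    static int mca_btl_tcp_add_procs(struct mca_btl_base_module_t *btl,
                                     size_t nprocs,
                                     struct opal_proc_t **procs,
                                     struct mca_btl_base_endpoint_t **peers,
                                     opal_bitmap_t *reachable);

The PML then derives each process's connectivity map from the union of the bits set by the eligible (highest-exclusivity) BTLs.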

Re: [OMPI devel] OMPIO vs ROMIO

2016-05-11 Thread Gilles Gouaillardet
to use ompio: mpirun --mca io ompio ...
to use romio (v2.x): mpirun --mca io romio314 ...
to use romio (v1.10): mpirun --mca io romio ...
Cheers, Gilles On Wednesday, May 11, 2016, Michael Rezny wrote: > Hi Sreenidhi, > you need to specify --collective as an input

Re: [OMPI devel] Jenkins mindist test now failing in 2.x

2016-05-10 Thread Gilles Gouaillardet
Folks, I found a bug in the mindist_test.c test (a missing strdup() causes free() to crash) and filed https://github.com/mellanox-hpc/jenkins_scripts/pull/30 to get it fixed. Cheers, Gilles On 5/11/2016 7:10 AM, Ralph Castain wrote: Ick - nothing I can do with that blast. I can’t find

Re: [OMPI devel] [2.0.0rc2] Solaris Studio 12.5-beta build failure (libtool, w/ patch)

2016-05-10 Thread Gilles Gouaillardet
Thanks Paul, I pushed your patch at https://github.com/open-mpi/ompi/commit/ef3ee027b07fa8cd447e4fffec56ecfe3332548e and will PR it now. Cheers, Gilles On 5/6/2016 1:13 PM, Paul Hargrove wrote: Disclaimer first: Yes, I am testing a *beta* compiler but this is NOT about a compiler bug.

Re: [OMPI devel] Master broken for ILP32

2016-05-09 Thread Gilles Gouaillardet
Paul, which distro are you running? Are you compiling on a 64-bit distro to generate a 32-bit library? It seems we currently only test an atomic on a long (32 bits on a 32-bit arch) and then incorrectly assume it also works on 64 bits (!) Cheers, Gilles On 5/9/2016 3:59

Re: [OMPI devel] [2.0.0rc2] build failures on OpenBSD-5.7 (romio)

2016-05-06 Thread Gilles Gouaillardet
Paul, can you please give https://patch-diff.githubusercontent.com/raw/open-mpi/ompi/pull/1643.patch a try? Cheers, Gilles On 5/3/2016 2:21 PM, Paul Hargrove wrote: This is NOT a new issue, but I wanted to mention it explicitly once again since no progress has been made since I

Re: [OMPI devel] Warnings in 2.0 release candidate

2016-04-30 Thread Gilles Gouaillardet
Ralph, the OMPI_ENABLE_MPI_PROFILING-related warnings are fixed in https://github.com/open-mpi/ompi-release/pull/1056 Cheers, Gilles On Saturday, April 30, 2016, Ralph Castain wrote: > On CentOS-7 using gcc 4.8: > > > btl_tcp.c: In function ‘mca_btl_tcp_add_procs’: >

Re: [OMPI devel] modex receive

2016-04-29 Thread Gilles Gouaillardet
Thank you > Durga > > > > The surgeon general advises you to eat right, exercise regularly and quit > ageing. > > On Thu, Apr 28, 2016 at 2:34 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: >

Re: [OMPI devel] modex receive

2016-04-28 Thread Gilles Gouaillardet
the add_procs subroutine of the btl should be called. /* I added a printf in mca_btl_tcp_add_procs and it *is* invoked */ Can you try again with --mca pml ob1 --mca pml_base_verbose 100? Maybe the add_procs subroutine is not invoked because Open MPI uses cm instead of ob1. Cheers, Gilles

Re: [OMPI devel] 1.10.3rc MTT failures

2016-04-26 Thread Gilles Gouaillardet
12:03 AM, Jeff Squyres (jsquyres) wrote: On Apr 25, 2016, at 9:50 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: and FWIW, Jeff uses an internally mirrored repo for ompi-tests, so the Cisco clusters should use the latest test suites. Correct. My local git mirrors update n

Re: [OMPI devel] 1.10.3rc MTT failures

2016-04-25 Thread Gilles Gouaillardet
ithub.com/open-mpi/ompi-tests.git > scm_subdir = ibm > > Not sure Ralph meant those errors. But they only happen on ppc64 and not > on x86_64 with a very similar mtt configuration file. > > Adrian > > On Mon, Apr 25, 2016 at 10:50:03PM +0900, Gilles Gouaillar

[OMPI devel] 1.10.3rc MTT failures

2016-04-25 Thread Gilles Gouaillardet
> I don’t know - this isn’t on my machine, but rather in the weekend and > nightly MTT reports. I’m assuming folks are running the latest test suite, > but... > > > On Apr 25, 2016, at 6:20 AM, Gilles Gouaillardet < > gilles.gouaillar...@gmail.com> wrote: > > Ralph, > >

Re: [OMPI devel] 1.10.3rc MTT failures

2016-04-25 Thread Gilles Gouaillardet
Ralph, can you make sure the ibm test suite is up to date? I pushed a fix for datatypes a few days ago, and it should be fine now. I will double-check this tomorrow anyway. Cheers, Gilles On Monday, April 25, 2016, Ralph Castain wrote: > I’m seeing some consistent errors

[OMPI devel] psm mtl and no link

2016-04-24 Thread Gilles Gouaillardet
Folks, this is a follow-up on a question initially posted on the users ML at http://www.open-mpi.org/community/lists/users/2016/04/29018.php. In this environment, there is no link on the InfiniPath card. However, it seems the psm mtl is trying to use it instead of disqualifying itself at
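
A plausible interim workaround while the selection logic is sorted out (the ^ prefix is standard MCA syntax for excluding a component; whether this is the advice ultimately given in the thread is not visible in the preview):

    # skip the PSM MTL entirely, falling back to ob1 and the remaining BTLs
    mpirun --mca mtl ^psm -np 2 -host n1,n2 ./a.out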

Re: [OMPI devel] Missing support for 2 types in MPI_Sizeof()

2016-04-15 Thread Gilles Gouaillardet
Nadia, by any chance, could this test suite be contributed to the ompi-tests repository? Cheers, Gilles On Friday, April 15, 2016, DERBEY, NADIA wrote: > Jeff, > > Actually, we have a functional test suite that used to pass for these > types and it fails now with

Re: [OMPI devel] Process placement

2016-04-12 Thread Gilles Gouaillardet
George, about the process binding part On 4/13/2016 7:32 AM, George Bosilca wrote: Also my processes, despite the fact that I asked for 1 per node, are not bound to the first core. Shouldn’t we release the process binding when we know there is a single process per node (as in the above case)

Re: [OMPI devel] pal_installdirs_base_framework declaration

2016-04-09 Thread Gilles Gouaillardet
David, This is declared via the MCA_BASE_FRAMEWORK_DECLARE macro https://svn.open-mpi.org/source/xref/ompi_master/opal/mca/installdirs/base/installdirs_base_components.c#175 Cheers, Gilles On Sunday, April 10, 2016, David Froger wrote: > Hello, > > Looking at Open
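
For readers who haven't met it: the macro expands to the framework's mca_base_framework_t definition, which is why no literal opal_installdirs_base_framework declaration appears anywhere in the sources. Its usage looks roughly like this (argument names and order sketched from memory of opal/mca/base/mca_base_framework.h; the function names are assumptions):

    MCA_BASE_FRAMEWORK_DECLARE(opal,         /* project */
                               installdirs,  /* framework name */
                               "OPAL installation directories", /* description */
                               NULL,         /* register function */
                               opal_installdirs_base_open,      /* assumed name */
                               opal_installdirs_base_close,     /* assumed name */
                               mca_installdirs_base_static_components,
                               0);           /* flags */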

Re: [OMPI devel] IP address to verb interface mapping

2016-04-08 Thread Gilles Gouaillardet
learn from history. On Fri, Apr 8, 2016 at 12:12 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Hi, the hostnames (or their IPs) are only used to ssh orted. If you use only the tcp btl: TCP *MPI* communications (vs OOB man

Re: [OMPI devel] IP address to verb interface mapping

2016-04-08 Thread Gilles Gouaillardet
Hi, the hostnames (or their IPs) are only used to ssh orted. If you use only the tcp btl: TCP *MPI* communications (vs OOB management communications) are handled by btl/tcp. By default, all usable interfaces are used; messages are split (IIRC, by the ob1 pml) and the "fragments" are sent
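
To pin the MPI traffic to specific interfaces instead of letting every usable one participate, both the tcp btl and the OOB layer can be restricted (the interface name is an example):

    # MPI traffic over eth0 only; management (OOB) traffic likewise
    mpirun --mca btl tcp,self \
           --mca btl_tcp_if_include eth0 \
           --mca oob_tcp_if_include eth0 \
           -np 2 -host n1,n2 ./a.out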

[OMPI devel] Fwd: [OMPI users] Collective MPI-IO + MPI_THREAD_MULTIPLE

2016-03-29 Thread Gilles Gouaillardet
an On 03/25/2016 01:58 AM, Gilles Gouaillardet wrote: > Sebastian, > > at first glance, the global lock in the romio glue is not necessary. > > feel free to give the attached patch a try > (it works with your example, but I did no further testing) > > Cheers, >

Re: [OMPI devel] Confusion about slots

2016-03-24 Thread Gilles Gouaillardet
Scott, out of curiosity... generally speaking, when HT is more efficient, how is it used? - flat MPI, with one task per hardware thread - hybrid MPI+OpenMP, with a task bound to a core or socket, but never to a hardware thread. Cheers, Gilles On Thursday, March 24, 2016, Atchley, Scott

Re: [OMPI devel] OMPI devel] MPI Error

2016-03-23 Thread Gilles Gouaillardet
gets the first > ids. > > > If MPI_Type_contiguous would work better I am open to switching to that. > > > On Tue, Mar 22, 2016 at 11:06 PM, Gilles Gouaillardet <gil...@rist.or.jp> > wrote: > > Dominik, > > with MPI_Type_indexed, array_of_d

Re: [OMPI devel] MPI Error

2016-03-23 Thread Gilles Gouaillardet
eger kinds? 4. What does the blocklength parameter specify exactly? I played with this some, and changing the blocklength did not seem to change anything. Thanks for the help. -Dominic Kedelty On Wed, Mar 16, 2016 at 12:02 AM, Gilles Gouaillardet <gil...@rist.or.jp>
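
Since the blocklength question recurs in this thread: in MPI_Type_indexed, entry i describes a run of array_of_blocklengths[i] consecutive elements starting at offset array_of_displacements[i], both measured in units of oldtype. A compilable illustration:

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        /* two blocks: 3 ints starting at index 0, and 2 ints starting at index 8 */
        int blocklens[2] = {3, 2};
        int displs[2]    = {0, 8};
        MPI_Datatype t;
        MPI_Type_indexed(2, blocklens, displs, MPI_INT, &t);
        MPI_Type_commit(&t);
        /* one element of t covers data[0..2] and data[8..9]; changing a
           blocklength changes how many consecutive ints each block grabs */
        MPI_Type_free(&t);
        MPI_Finalize();
        return 0;
    }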

Re: [OMPI devel] Scaling down open mpi for embedded application

2016-03-19 Thread Gilles Gouaillardet
Monika, is there any reason why you use Open MPI 1.4.5? It is quite antique today; 1.10.2 is the latest stable version. Strictly speaking, Open MPI does not require Linux: it works fine on Solaris, BSD variants, Cygwin, and other platforms. The memory footprint is made up of the library size, and the

Re: [OMPI devel] How to 'hook' a new BTL to OMPI call chain?

2016-03-17 Thread Gilles Gouaillardet
If you did not configure with --disable-dlopen, you need to make install from opal/mca/btl/… If not, you need to make install from your top builddir. You can mpirun --mca btl_base_verbose 100 ... to see if your btl was found and did not somehow disqualify itself (for example because its priority is
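
Put concretely (the component name "mybtl" is a stand-in for whatever the new BTL is called):

    # dlopen build (the default): install just the new component into the tree
    cd opal/mca/btl/mybtl && make install

    # --disable-dlopen build: the component is linked into the library itself,
    # so reinstall from the top of the build tree instead
    cd $TOP_BUILDDIR && make install

    # then check that the btl is found and survives selection
    mpirun --mca btl mybtl,self --mca btl_base_verbose 100 -np 2 ./a.out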

Re: [OMPI devel] MPI Error

2016-03-16 Thread Gilles Gouaillardet
that a configuration file is needed. all processes return: STOP A configuration file is required I am attaching the subroutine of the code that I believe is where the problem is occurring. On Mon, Mar 14, 2016 at 6:25 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com>

Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Gilles Gouaillardet
actually do think they work fine if you do a GNU build and use them to specify the Intel compilers. I also think it works fine when you do an Intel build and compile with gcc. So to me it just looks like that one include file is the problem. Dave

Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Gilles Gouaillardet
Samuel, there is clearly no hope when you use mpi.mod and mpi_f08.mod; my point was, it is not even possible to expect "legacy" mpif.h to work with different compilers. And by the way, if the application is compiled with -i8 (Fortran INTEGER is 8 bytes by default), then Open MPI must have been

Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Gilles Gouaillardet
Larry, currently OpenMPI generates mpif-sizeof.h with up to 15 dimensions with Intel compilers, but up to 7 dimensions with "recent" gcc (for example gcc 5.2 and higher). So I guess the logic behind this is "give the compiler all it can handle": if Intel somehow "extended" the standard to

Re: [OMPI devel] mpif.h on Intel build when run with OMPI_FC=gfortran

2016-03-03 Thread Gilles Gouaillardet
Dave, you should not expect anything when mixing Fortran compilers (and to be on the safe side, you might not expect much when mixing C/C++ compilers either; for example, if you built ompi with Intel and use gcc for your app, gcc might complain about unresolved symbols from the Intel runtime)

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-02 Thread Gilles Gouaillardet
include > Jeff’s warning about debug builds being used for performance testing > +1 I’m increasingly feeling that we shouldn’t output that message every time > someone executes a debug-based operation, even if we add a param to turn > off the warning. > +1 > > On

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-02 Thread Gilles Gouaillardet
t more convenient for those developers who > don’t would be nice. > > > On Mar 2, 2016, at 4:51 AM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote: > > > On Mar 2, 2016, at 6:30 AM, Mark Santcroos <mark.santcr...@rutgers.edu>

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
a performance benchmark, then I will not get the warning I need (and yes, I will be the only one to blame... but isn't it something we want to avoid here?) Cheers, Gilles On 3/2/2016 1:43 PM, George Bosilca wrote: On Mar 1, 2016, at 22:27 , Gilles Gouaillardet <gil...@rist.or.jp> wrote:

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
ng opinion, and I am fine with setting a parameter (I will likely soon forget I set that) in a config file. Cheers, Gilles On 3/2/2016 1:21 PM, Jeff Squyres (jsquyres) wrote: On Mar 1, 2016, at 10:17 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: In this case, should we only dis

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
(jsquyres) wrote: On Mar 1, 2016, at 10:06 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: what about *not* issuing this warning if OpenMPI is built from git? that would be friendlier for OMPI developers, and should basically *not* affect end users, since they would rather build OMPI

Re: [OMPI devel] RFC: warn if running a debug build

2016-03-01 Thread Gilles Gouaillardet
Jeff, what about *not* issuing this warning if OpenMPI is built from git? That would be friendlier for OMPI developers, and should basically *not* affect end users, since they would rather build OMPI from a tarball. Cheers, Gilles On 3/2/2016 1:00 PM, Jeff Squyres (jsquyres) wrote: WHAT:

Re: [OMPI devel] MTT setup updated to gcc-6.0 (pre)

2016-03-01 Thread Gilles Gouaillardet
FWIW, in a previous thread Jeff Hammond explained this is why mpich relies on C89 instead of C99, since C89 appears to be a subset of C++11. Cheers, Gilles On 3/2/2016 1:02 AM, Nathan Hjelm wrote: I will add to how crazy this is. The C standard has been very careful to not break

Re: [OMPI devel] Segmentation fault in opal_fifo (MTT)

2016-03-01 Thread Gilles Gouaillardet
Adrian, about bitness: it is correctly set when the MPI install succeeds. See https://mtt.open-mpi.org/index.php?do_redir or even your successful install on x86_64. I suspect it is queried once the installation is successful, and I'll try to have a look at it. Cheers, Gilles On Tuesday, March 1,

Re: [OMPI devel] Confused topic for developer's meeting

2016-02-26 Thread Gilles Gouaillardet
Ralph, the goal here is to allow vendors to distribute binary orte frameworks (on top of the binary components they can already distribute) that can be used by a user-compiled "stock" Open MPI library. Did I get it right so far? I gave it some thought and found this could be simplified. My