[OMPI devel] rml/ofi component broken in v4.0.x and v3.1.x

2019-02-14 Thread Gilles Gouaillardet
Folks, The rml/ofi component has been removed from master. common/ofi was later removed from master as well, and the mtl/ofi configury was revamped so it no longer depends on the common/ofi configury stuff. Only the latter change was backported to the release branches. The issue is that rml/ofi is

Re: [OMPI devel] Did someone enable Travis?

2019-01-15 Thread Gilles Gouaillardet
Cheers, Gilles On 1/9/2019 11:45 AM, Gilles Gouaillardet wrote: I do not know how/why Travis was enabled. That being said, the errors look legit to me, and there are two: 1) with clang 5.0   CC   opal_convertor_raw.lo In file included from opal_convertor_raw.c:21: In file included

Re: [OMPI devel] Did someone enable Travis?

2019-01-08 Thread Gilles Gouaillardet
I do not know how/why Travis was enabled. That being said, the errors look legit to me, and there are two: 1) with clang 5.0   CC   opal_convertor_raw.lo In file included from opal_convertor_raw.c:21: In file included from ../../opal/datatype/opal_convertor_internal.h:21: In file included

[OMPI devel] btl/uct exclusivity and reachability

2018-12-10 Thread Gilles Gouaillardet
Nathan, I noticed that btl/uct has the highest exclusivity and considers all procs reachable. As a consequence: - btl/uct is used together with btl/vader for intra-node communications - btl/uct is used together with btl/self for self communications. This

Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-14 Thread Gilles Gouaillardet
Chris, I am a bit puzzled by your logs. As far as I understand, ldd libhhgttg.so.1 reports that libopen-rte.so.40 and libopen-pal.so.40 are both dependencies, but that does not say anything about which library depends on them. They could be directly needed by libhhgttg.so.1 (I hope / do not think it

Re: [OMPI devel] 3.1.2: Datatype errors and segfault in MPI_Allgatherv

2018-11-02 Thread Gilles Gouaillardet
Hi Gilles, On 2 Nov 2018, at 11:03 am, Gilles Gouaillardet wrote: I noted the stack trace refers to opal_cuda_memcpy(). Is this issue specific to CUDA environments ? No, this is just on normal CPU-only nodes. But memcpy always goes thr

Re: [OMPI devel] 3.1.2: Datatype errors and segfault in MPI_Allgatherv

2018-11-01 Thread Gilles Gouaillardet
Hi Ben, I noted the stack trace refers to opal_cuda_memcpy(). Is this issue specific to CUDA environments ? The coll/tuned default collective module is known not to work when tasks use matching but different signatures. For example, one task sends one vector of N elements, and the other
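To make the "matching but different signatures" point concrete, here is a minimal hedged sketch (generic C, not Ben's code): both sides transfer N doubles, so the signatures match, but the type maps differ, which is the situation the coll/tuned decision logic is known to mishandle for collectives such as MPI_Allgatherv.

    /* Sketch only: the sender describes the data as one vector of n doubles,
     * the receiver posts a flat count of n MPI_DOUBLE. Same signature,
     * different datatype definitions. */
    #include <mpi.h>

    void example(double *buf, int n, int peer, int rank, MPI_Comm comm)
    {
        MPI_Datatype vec;
        MPI_Type_vector(n, 1, 1, MPI_DOUBLE, &vec);
        MPI_Type_commit(&vec);

        if (rank == 0)
            MPI_Send(buf, 1, vec, peer, 0, comm);       /* one vector of n elements */
        else
            MPI_Recv(buf, n, MPI_DOUBLE, peer, 0, comm, MPI_STATUS_IGNORE);

        MPI_Type_free(&vec);
    }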

Re: [OMPI devel] Open MPI vs msys2

2018-10-23 Thread Gilles Gouaillardet
path for headers (stat.h is found). On the other hand, statfs.h is not found, as if /usr/include were not in the search path for headers. If I try export CPATH=/usr/include export C_INCLUDE_PATH=/usr/include export CPLUS_INCLUDE_PATH=/us

Re: [OMPI devel] Open MPI vs msys2

2018-10-22 Thread Gilles Gouaillardet
Oct 22, 2018 at 5:26 AM Gilles Gouaillardet wrote: Santiago, the config.log reports there is no /usr/include/sys/statfs.h on your system. On my system, this file exists and is provided by the msys2-runtime-devel package, so the fi

Re: [OMPI devel] (no subject)

2018-10-22 Thread Gilles Gouaillardet
Santiago, the config.log reports there is no /usr/include/sys/statfs.h on your system. On my system, this file exists and is provided by the msys2-runtime-devel package, so the first steps are to check whether this package is installed and, if not, install it. Note I ran pacman -Syu a few times to

Re: [MTT devel] complaints about github pages being generated with every PR

2018-10-17 Thread Gilles Gouaillardet
Howard, each commit triggers a Travis build, and the deploy section of Travis builds the doc and pushes it if the HTML has changed (the PDF contains timestamps, that's why we only focus on the HTML). Bottom line, the doc will be updated on gh-pages only if it has to be. If the gh-pages repo is

Re: [OMPI devel] Hints for using an own pmix server

2018-10-08 Thread Gilles Gouaillardet
Stephan, Have you already checked https://github.com/pmix/prrte ? This is the PMIx Reference RunTime Environment (PRRTE), which was built on top of orted. Long story short, it deploys the PMIx server and then you start your MPI app with prun. An example is available at

[OMPI devel] btl/vader: race condition in finalize on OS X

2018-10-02 Thread Gilles Gouaillardet
Folks, When running a simple helloworld program on OS X, we can end up with the following error message: A system call failed during shared memory initialization that should not have.  It is likely that your MPI job will now either abort or experience performance degradation.   Local host: 

Re: [OMPI devel] MTT Perl client

2018-09-14 Thread Gilles Gouaillardet
IIRC mtt-relay is not only a proxy (squid can do that too). MTT results can be manually copied from a cluster behind a firewall, and then mtt-relay can “upload” these results to mtt.open-mpi.org. My 0.02US$ Gilles On Saturday, September 15, 2018, Jeff Squyres (jsquyres) via devel <

Re: [OMPI devel] [p]ompi_foo_f symbols in mpi_f08.mod

2018-07-17 Thread Gilles Gouaillardet
:46 AM Jeff Squyres (jsquyres) via devel wrote: On Jul 17, 2018, at 8:49 PM, Gilles Gouaillardet wrote: I noted the internal Fortran bindings (e.g. [p]ompi_barrier_f and friends) are defined in the user-facing mpi_f08.mod. My impressio

[OMPI devel] [p]ompi_foo_f symbols in mpi_f08.mod

2018-07-17 Thread Gilles Gouaillardet
Jeff, When working on https://github.com/open-mpi/ompi/pull/5430, I noted the internal Fortran bindings (e.g. [p]ompi_barrier_f and friends) are defined in the user-facing mpi_f08.mod. My impressions are:  1. pompi_barrier_f and friends are never used (e.g. pbarrier_f08.F90 calls

Re: [OMPI devel] Open MPI: Undefined reference to pthread_atfork

2018-07-01 Thread Gilles Gouaillardet
I was unable to reproduce this on Ubuntu 14.04.5. Note the default gcc is 4.8; gcc-4.9 can be installed, but not g++ nor gfortran. Did you build Open MPI with the same compiler used to build libUtils.so and a.out? What do type gcc, ls -l /usr/bin/gcc, gcc --version and g++ --version say ? On top of the info

Re: [OMPI devel] Shared object dependencies

2018-06-12 Thread Gilles Gouaillardet
uyres) Subject: Re: [OMPI devel] Shared object dependencies How is it that Edgar is not running into these issues? Edgar: are you compiling with --disable-dlopen, perchance? On Jun 12, 2018, at 6:04 AM, Gil

Re: [OMPI devel] Shared object dependencies

2018-06-12 Thread Gilles Gouaillardet
f_set_aggregator_props), and the same with the MCA parameters, we access them through a function that is stored as a function pointer on the file handle structure. Thanks Edgar -Original Message- From: devel [mailto:devel-boun...@

Re: [OMPI devel] Shared object dependencies

2018-06-12 Thread Gilles Gouaillardet
Tyson, thanks for taking the time to do some more tests. This is really a bug in Open MPI, and unlike what I thought earlier, there are still some abstraction violations here and there related to ompio. I filed https://github.com/open-mpi/ompi/pull/5263 in order to address them

Re: [OMPI devel] Shared object dependencies

2018-06-10 Thread Gilles Gouaillardet
Edgar, I checked the various release branches, and I think this issue was fixed by https://github.com/open-mpi/ompi/commit/ccf76b779130e065de326f71fe6bac868c565300 This was back-ported into the v3.0.x branch, and that was before the v3.1.x branch was created. This has *not* been backported

Re: [OMPI devel] Master warnings?

2018-06-03 Thread Gilles Gouaillardet
Nathan, there could be another issue: with gcc 8.1.0, I get some warnings (see logs at the end). From https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-error-function-attribute, on error ("message") and warning ("message"): If the error or warning attribute is used on a function
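For readers unfamiliar with these attributes, a tiny standalone illustration of the GCC documentation quoted above (not Open MPI code; the diagnostics only fire when a call survives dead-code elimination, so build with optimization, e.g. gcc -O2):

    void old_call(void)   __attribute__((warning("old_call() is deprecated here")));
    void never_call(void) __attribute__((error("never_call() must not be reachable")));

    void old_call(void)   { }
    void never_call(void) { }

    int main(void)
    {
        old_call();        /* call remains after optimization: the warning is emitted */
        if (0)
            never_call();  /* dead code, eliminated at -O2: no error is emitted */
        return 0;
    }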

Re: [OMPI devel] openmpi-3.1.0 cygwin patch

2018-05-23 Thread Gilles Gouaillardet
Marco, Have you tried to build Open MPI with an external (e.g. Cygwin provided) libevent library ? If that works, I think that would be the preferred method. Cheers, Gilles On Wednesday, May 23, 2018, Marco Atzeri wrote: > The attached patch allows the compilation of

Re: [OMPI devel] Open MPI 3.1.0rc4 posted

2018-04-17 Thread Gilles Gouaillardet
Do we honestly care for 3.1.0? I mean, we went 6 months without it working and no one cared. We can’t fix all bugs, and I’m a little concerned about making changes right before release. Brian On Apr 17, 2018, at 7:49 AM, Gilles Gou

Re: [OMPI devel] Open MPI 3.1.0rc4 posted

2018-04-17 Thread Gilles Gouaillardet
Brian, https://github.com/open-mpi/ompi/pull/5081 fixes support for external PMIx v2.0 Support for external PMIx v1 is broken (same in master) and extra dev would be required to fix it. The easiest path, if acceptable, is to simply drop support for PMIx v1 Cheers, Gilles "Barrett, Brian

Re: [OMPI devel] Guidance on How to Submit PRs to Fix GitHub Issue #5000?

2018-04-01 Thread Gilles Gouaillardet
Bryce, The current status on OS X is half baked: the /usr/bin/javah symlink is still there, but it points to nothing, and a direct consequence is that AC_PATH_PROG detects /usr/bin/javah when I wish it would not. Since javac -h already works in Java 8, I guess we do not care about older Java

Re: [OMPI devel] Running on Kubernetes

2018-03-16 Thread Gilles Gouaillardet
Hi Rong, SSH is safe when properly implemented. That being said, some sites do not allow end users to SSH directly into compute nodes because they do not want them to do anything without the resource manager knowing about it. What is your concern with SSH ? You can run a resource manager (such

[OMPI devel] about cross-version mpirun interoperability

2018-02-04 Thread Gilles Gouaillardet
Folks, At the SC'17 Open MPI BoF, we presented slide 74 about cross-version mpirun interoperability (I attached a screenshot for your convenience). The topic is documented on the wiki at https://github.com/open-mpi/ompi/wiki/Container-Versioning. If I oversimplify, we have two use-cases to consider

Re: [OMPI devel] cannot push directly to master anymore

2018-01-31 Thread Gilles Gouaillardet
.@cisco.com> wrote: On Jan 31, 2018, at 10:14 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: I tried to push some trivial commits directly to the master branch and w

[OMPI devel] cannot push directly to master anymore

2018-01-31 Thread Gilles Gouaillardet
Folks, I tried to push some trivial commits directly to the master branch and was surprised that this is no longer allowed. The error message is not crystal clear, but I guess the root cause is that the two newly required checks (Commit email checker and Signed-off-by-checker) were not performed. As a kind

Re: [OMPI devel] Poor performance when compiling with --disable-dlopen

2018-01-23 Thread Gilles Gouaillardet
memory management hooks provided using patcher/overwrite, leave pinned can give incorrect results. -Paul On Tue, Jan 23, 2018 at 9:17 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: Dave, here is what I found

Re: [OMPI devel] Poor performance when compiling with --disable-dlopen

2018-01-23 Thread Gilles Gouaillardet
at 1:29 PM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: Dave, i can reproduce the issue with btl/openib and the IMB benchmark, which is known to call MPI_Init_thread(MPI_THREAD_MULTIPLE). Note performance is ok with the OSU benchmark that does not requi

Re: [OMPI devel] Poor performance when compiling with --disable-dlopen

2018-01-23 Thread Gilles Gouaillardet
Dave, I can reproduce the issue with btl/openib and the IMB benchmark, which is known to call MPI_Init_thread(MPI_THREAD_MULTIPLE). Note performance is ok with the OSU benchmark, which does not require MPI_THREAD_MULTIPLE. Cheers, Gilles On Wed, Jan 24, 2018 at 1:16 PM, Gilles Gouaillardet <

Re: [OMPI devel] Poor performance when compiling with --disable-dlopen

2018-01-23 Thread Gilles Gouaillardet
Dave, one more question: are you using the openib btl, or other libraries such as MXM or UCX ? Cheers, Gilles On 1/24/2018 12:55 PM, Dave Turner wrote:    We compiled OpenMPI 2.1.1 using the EasyBuild configuration for CentOS as below and tested on Mellanox QDR hardware.

Re: [OMPI devel] Poor performance when compiling with --disable-dlopen

2018-01-23 Thread Gilles Gouaillardet
Dave, At first glance, that looks pretty odd, and I'll have a look at it. Which benchmark are you using to measure the bandwidth ? Does your benchmark call MPI_Init_thread(MPI_THREAD_MULTIPLE) ? Have you tried without --enable-mpi-thread-multiple ? Cheers, Gilles On Wed, Jan 24, 2018 at 12:55 PM,
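For context, this is what "call MPI_Init_thread(MPI_THREAD_MULTIPLE)" refers to; a generic sketch of how a benchmark requests and checks the threading level (not IMB or OSU source):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        /* Requesting MPI_THREAD_MULTIPLE is what selects the slower,
         * thread-safe code path discussed in this thread. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            fprintf(stderr, "MPI_THREAD_MULTIPLE not available (provided=%d)\n", provided);
        /* ... benchmark loop ... */
        MPI_Finalize();
        return 0;
    }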

Re: [MTT devel] Send results to multiple databases?

2017-12-20 Thread Gilles Gouaillardet
My 0.02 US$ Another approach would be to run the MTT tests, store the results in a (text ? xml ?) file, and then submit them from an internet-connected machine. Not only would that allow you to submit to multiple databases, it would also allow you to run the MTT test suite on an

Re: [OMPI devel] ROMIO support in OpenMPI

2017-11-08 Thread Gilles Gouaillardet
, Josh Hursey wrote: We (IBM) are interested in maintaining support for ROMIO in Open MPI. We are investigating upgrading the ROMIO version inside Open MPI, but I have no ETA on when that work will be complete/available. On Wed, Nov 8, 2017 at 4:37 AM, Gilles Gouaillardet <gilles.gouail

Re: [OMPI devel] ROMIO support in OpenMPI

2017-11-08 Thread Gilles Gouaillardet
Farouk, as far as I am concerned, there is no plan to stop supporting ROMIO. That being said, I am not aware of any plans to upgrade to the latest ROMIO in the near future. Also, please note ROMIO is the default module when the Lustre filesystem is used. Cheers, Gilles On Wed, Nov 8,

Re: [OMPI devel] subcommunicator OpenMPI issues on K

2017-11-07 Thread Gilles Gouaillardet
Vanilla Open MPI has no support for the Tofu interconnect, nor for the Fujitsu batch manager. Also, and IIRC, TCP communication between two compute nodes is not always possible on the K computer. So the simple answer is no. Cheers, Gilles On 11/8/2017 10:37 AM, Christopher Samuel wrote: On

Re: [OMPI devel] subcommunicator OpenMPI issues on K

2017-11-07 Thread Gilles Gouaillardet
-fv (not -fe). They were using the flux variants (requires local.mk build operators.flux.c instead of operators.fv4.c) and they are a couple commits behind. Regardless, this issue has persisted on K for several years. By default, it will build log(N) subc

Re: [OMPI devel] subcommunicator OpenMPI issues on K

2017-11-07 Thread Gilles Gouaillardet
Samuel, The default MPI library on the K computer is Fujitsu MPI, and yes, it is based on Open MPI. /* fwiw, an alternative is RIKEN MPI, and it is MPICH based */ From a support perspective, this should be reported to the HPCI helpdesk http://www.hpci-office.jp/pages/e_support As far as i

Re: [OMPI devel] Enable issue tracker for ompi-www repo?

2017-11-04 Thread Gilles Gouaillardet
Chris, feel free to issue a PR, or fully describe the issue so a developer can update the FAQ accordingly. Cheers, Gilles On Sat, Nov 4, 2017 at 4:44 PM, Chris Samuel wrote: > Hi folks, > > I was looking to file an issue against the website for the FAQ about XRC >

Re: [OMPI devel] how to disable memory/patcher build ?

2017-10-31 Thread Gilles Gouaillardet
Marco, can you please give the attached patch a try ? So far, it compiles for me. BTW, I faced some issues (conflicting definitions between windows.h and netdb.h); did you need some patches in order to solve these issues ? Cheers, Gilles On 10/31/2017 1:25 AM, Marco Atzeri wrote:

Re: [OMPI devel] Open MPI3.0

2017-10-22 Thread Gilles Gouaillardet
George, since this is an automatically generated file (at configure time), this is likely a packaging issue in upstream PMIx. I made https://github.com/pmix/pmix/pull/567 in order to fix that. FWIW, nightly tarballs for v3.0.x, v3.1.x and master are affected. Cheers, Gilles On

Re: [OMPI devel] OMPI devel] mpi_yield_when_idle=1 and still 100%CPU

2017-10-12 Thread Gilles Gouaillardet
be back-ported to 1.10.x? Best Paul Kapinos On 10/12/2017 09:31 AM, Gilles Gouaillardet wrote: Paul, I made PR #4431 https://github.com/open-mpi/ompi/pull/4431 in order to implement this.

Re: [OMPI devel] mpi_yield_when_idle=1 and still 100%CPU

2017-10-12 Thread Gilles Gouaillardet
Paul, I made PR #4431 https://github.com/open-mpi/ompi/pull/4431 in order to implement this. In order to enable passive wait, you simply need to mpirun --mca mpi_poll_when_idle true ... FWIW, when you use mpi_yield_when_idle, Open MPI does (highly oversimplified) for (...)
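The pseudocode is truncated by the archive; the idea, as a hedged sketch with hypothetical helper names (not the actual opal progress engine), is roughly:

    #include <sched.h>
    #include <stdbool.h>

    /* Hypothetical stand-ins for Open MPI internals, for illustration only. */
    static bool request_complete(void) { static int n; return ++n > 1000; }
    static void progress_once(void)    { /* poke the network, run callbacks */ }

    void wait_yielding(void)
    {
        /* mpi_yield_when_idle: still a busy loop, but the CPU is yielded on
         * every iteration so other runnable threads get scheduled. The process
         * still shows ~100% CPU, which is what this thread is about. A truly
         * passive wait would instead block in poll()/epoll until a file
         * descriptor becomes ready. */
        while (!request_complete()) {
            progress_once();
            sched_yield();
        }
    }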

Re: [OMPI devel] Open MPI 2.1.2rc3 available for testing

2017-09-02 Thread Gilles Gouaillardet
Marco, can you please detail how you built Open MPI ? I guess you downloaded a tarball and built from that. In this case, there is no need to run autogen.pl --force, and unless something is wrong with the timestamps of the tarball, autoreconf should never be invoked. Cheers, Gilles On Sat,

Re: [OMPI devel] KNL/hwloc funny message question

2017-09-01 Thread Gilles Gouaillardet
Howard, I faced the same issue after an MPSS (Intel software stack for many core) update. As Brice explained, the issue is that the embedded (and older) hwloc does not understand the file format written by the (more recent) hwloc shipped with MPSS. I simply rebuilt Open MPI with the external hwloc provided by

Re: [OMPI devel] profiling interface for Fortran executables in OpenMPI 2.1.1

2017-08-04 Thread Gilles Gouaillardet
Phil, In previous versions of Open MPI, the Fortran bindings called the corresponding C MPI_* function, which could be wrapped in C. Now, the Fortran bindings directly invoke the C PMPI_* function, and hence they can no longer be wrapped in C. The solution is to wrap C subroutines in C, and to wrap
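In other words, a classic C wrapper such as the sketch below keeps intercepting calls made from C, but is now bypassed by Fortran applications, whose bindings jump straight to PMPI_*; the Fortran entry points have to be wrapped in Fortran. (Generic example, not Phil's tool.)

    #include <mpi.h>
    #include <stdio.h>

    /* C profiling wrapper: only sees MPI_Send calls coming from C code. */
    int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
                 int dest, int tag, MPI_Comm comm)
    {
        fprintf(stderr, "intercepted MPI_Send to rank %d\n", dest);
        return PMPI_Send(buf, count, datatype, dest, tag, comm);
    }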

Re: [OMPI devel] Memory leak

2017-07-25 Thread Gilles Gouaillardet
Samuel, FWIW, the issue is fixed in the upcoming Open MPI 3.0. Cheers, Gilles On Wed, Jul 26, 2017 at 3:43 AM, Samuel Poncé wrote: > Dear OpenMPI developers, > > I would like to report a bug for openmpi/2.0.2 > > This bug might have been corrected in earlier version.

Re: [OMPI devel] LD_LIBRARY_PATH and environment variables not getting set in remote hosts

2017-07-20 Thread Gilles Gouaillardet
Hi, you meant Open MPI 1.8.2, right ? As far as I am concerned, I always configure Open MPI with --enable-mpirun-prefix-by-default, so I do not need to set LD_LIBRARY_PATH in my .bashrc. If you want us to investigate this issue, please post the full error message - is the issue reported by

Re: [OMPI devel] Signed-off-by-checker: now ignores merge commits

2017-07-13 Thread Gilles Gouaillardet
form on which to base our bots (vs. the existing signed-off-by-checker and email checker that are custom written by me). Something to look into... On Jul 13, 2017, at 12:21 AM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Thanks Jeff,

Re: [OMPI devel] Signed-off-by-checker: now ignores merge commits

2017-07-12 Thread Gilles Gouaillardet
Thanks Jeff, one more improvement: could you please have the bot also ignore revert commits ? Cheers, Gilles On 7/7/2017 11:56 PM, Jeff Squyres (jsquyres) wrote: FYI: the "signed-off-by-checker" CI on Github now ignores merge commits (because those usually aren't signed -- e.g., if

Re: [OMPI devel] MTT / Open MPI Visibility of New Failures

2017-07-11 Thread Gilles Gouaillardet
Josh and all, another or complementary idea is to use MTT from Jenkins with the JUnit plugin. IIRC, the Python client can generate a JUnit XML report, and FWIW, I made a simplistic proof of concept in the Perl client. The idea is that the JUnit Jenkins plugin displays a summary of all tests (ok,

Re: [OMPI devel] [3.0.0rc1] ILP32 build failures

2017-07-03 Thread Gilles Gouaillardet
Thanks Paul, I made https://github.com/open-mpi/ompi/pull/3802 to fix this issue. Once it passes CI, I will merge it and PR it against the release branches. Cheers, Gilles On 7/4/2017 8:19 AM, Paul Hargrove wrote: Below is the corresponding failure at build time (rather than link time) for

Re: [OMPI devel] Use of gethostbyname

2017-07-03 Thread Gilles Gouaillardet
Thanks Philipp, You can open a Pull Request against the master branch on the GitHub repository https://github.com/open-mpi/ompi.git All you need to do is sign off your commits (please make sure you understand what that means). Then one of us will backport it to the release branches once it gets merged.

Re: [OMPI devel] Coverity strangeness

2017-06-15 Thread Gilles Gouaillardet
Ralph, my 0.02 US$: I noted the error message mentions 'holding lock "pmix_mutex_t.m_lock_pthread"', but it does not explicitly mention 'pmix_global_lock' (!) At line 446, PMIX_WAIT_THREAD() does release 'cb.lock', which has the same type as 'pmix_global_lock', but is not the very same

Re: [OMPI devel] ompi_info "developer warning"

2017-06-04 Thread Gilles Gouaillardet
Ralph, in your environment, pml/monitoring is disabled. So instead of displaying "MCA pml monitoring", ompi_info --all displays "MCA (disabled) pml monitoring", which is longer than 24 characters. FWIW, you can observe the same behavior with OMPI_MCA_sharedfp=^lockedfile ompi_info --all

Re: [OMPI devel] about ompi_datatype_is_valid

2017-06-02 Thread Gilles Gouaillardet
, 2017 at 10:41 PM, Dahai Guo <dahai@gmail.com> wrote: so you are saying that a user should NOT define send/recv data type as -1, in openmpi? Dahai On Thu, Jun 1, 2017 at 6:59 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote:

Re: [OMPI devel] about ompi_datatype_is_valid

2017-06-01 Thread Gilles Gouaillardet
+1 MPI_Datatype is an opaque handle, and in Open MPI, this is an ompi_datatype_t *, so we can only test for NULL pointers or MPI_DATATYPE_NULL, which cannot be used per the standard. FWIW, and IIRC, MPICH made a different design choice and MPI_Datatype is a number, so the MPICH equivalent of

[OMPI devel] mapper issue with heterogeneous topologies

2017-05-31 Thread Gilles Gouaillardet
Hi Ralph, this is a follow-up on Siegmar's post that started at https://www.mail-archive.com/users@lists.open-mpi.org/msg31177.html mpiexec -np 3 --host loki:2,exin hello_1_mpi -- There are not enough slots available

Re: [OMPI devel] openib oob module

2017-04-21 Thread Gilles Gouaillardet
Folks, FWIW, I made https://github.com/open-mpi/ompi/pull/3393 and it works for me on an mlx4 cluster (Mellanox QDR). Cheers, Gilles On 4/21/2017 1:31 AM, r...@open-mpi.org wrote: I’m not seeing any problem inside the OOB - the problem appears to be in the info being given to it:

Re: [OMPI devel] openib oob module

2017-04-20 Thread Gilles Gouaillardet
Ralph, in the v1.10 series, the BTL is still in the OMPI layer. From v2 on, could the BTL directly use PMIx instead of rml (orte) ? Cheers, Gilles On Thursday, April 20, 2017, r...@open-mpi.org wrote: > Hi Shiqing! > > Been a long time - hope you are doing well. > > I see no way to

Re: [OMPI devel] Problem with bind-to

2017-04-13 Thread Gilles Gouaillardet
Ralph, I can simply reproduce the issue with two nodes and the latest master. All commands are run on n1, which has the same topology (2 sockets * 8 cores each) as n2. 1) everything works $ mpirun -np 16 -bind-to core --report-bindings true [n1:29794] MCW rank 0 bound to socket 0[core

Re: [OMPI devel] Problem with bind-to

2017-04-13 Thread Gilles Gouaillardet
Is your compute node included in your machine file ? If yes, what if you invoke mpirun from a compute node not listed in your machine file ? It can also be helpful to post your machinefile Cheers, Gilles On Thursday, April 13, 2017, Cyril Bordage wrote: > When I run

Re: [OMPI devel] Unable to complete a TCP connection

2017-04-13 Thread Gilles Gouaillardet
There are several kinds of communications: - ssh from mpirun to compute nodes, and also between compute nodes (assuming you use a machine file and no supported batch manager), to spawn the orted daemons - oob/tcp connections between orted daemons - btl/tcp connections between MPI tasks. You can restrict the port

Re: [OMPI devel] external hwloc causing libevent problems?

2017-04-06 Thread Gilles Gouaillardet
Brian, there used to be two hwloc.h files: the one from the hwloc library, and the one from the OMPI MCA framework. As a side effect, we could not --with-hwloc=external, but we had to --with-hwloc=/usr, which leads to adding -I/usr/include to the CPPFLAGS. So you might end up having

Re: [OMPI devel] anybody ported OMPI to hwloc 2.0 API?

2017-04-06 Thread Gilles Gouaillardet
Folks, I started that quite a while ago, but did not get very far. We did discuss this at https://github.com/ggouaillardet/ompi/commit/5410027247194be66679c0f9335ccf0f59fffebf and my patch is at https://github.com/ggouaillardet/ompi/commit/892ff1d94072fca0f468228c10d13be58a9400b2 we

Re: [OMPI devel] Print-related error when using f2py-wrapped, mpif90-compiled module in python

2017-04-03 Thread Gilles Gouaillardet
Hi, on the system where compilation fails with mpif90, you can run mpif90 --showme ... in order to get the gfortran command line that is used by the wrapper. Then you can compare it to your own gfortran command line that is working, and try to figure out which difference is involved in the crash.

Re: [OMPI devel] 2.0.2 SRPM in "install_in_opt" configuration creates file outside /opt

2017-03-30 Thread Gilles Gouaillardet
Kevin, you should only use the tarballs from www.open-mpi.org; they are generated with our scripts. If I understand correctly, github.com has its own way of generating a tarball from a git tag. Cheers, Gilles On 3/31/2017 1:24 PM, Kevin Buckley wrote: On 29 March 2017 at 13:49, Jeff

Re: [OMPI devel] Multi-threading support for openib

2017-03-22 Thread Gilles Gouaillardet
cluster and executed MPI code using nodes only from this cluster. Again I had the same problem. I selected only eth0, but again there is no result. The error message says it should be reported to the developer. Best regards, Emin Nuriyev On 22 March 2017 at 15:45, Gilles

Re: [OMPI devel] Multi-threading support for openib

2017-03-22 Thread Gilles Gouaillardet
Enrico, this is fixed in Open MPI 2.1.0. FWIW, you only need MPI_THREAD_MULTIPLE if you invoke MPI subroutines within an OpenMP parallel region. If MPI is only used outside of OpenMP parallel regions, then MPI_THREAD_SINGLE is very likely enough. Cheers, Gilles On Thursday, March 23, 2017, Enrico Calore
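A small hedged illustration of that distinction (generic code, not Enrico's application):

    #include <mpi.h>
    #include <omp.h>

    void needs_thread_multiple(int *buf, int peer, MPI_Comm comm)
    {
        /* Several OpenMP threads call MPI concurrently:
         * MPI_THREAD_MULTIPLE is required here. */
        #pragma omp parallel
        {
            MPI_Send(buf, 1, MPI_INT, peer, omp_get_thread_num(), comm);
        }
    }

    void single_is_likely_enough(int *buf, int n, int peer, MPI_Comm comm)
    {
        /* MPI is only called outside the parallel region, so a lower
         * threading level is very likely enough, as noted above. */
        #pragma omp parallel for
        for (int i = 0; i < n; i++)
            buf[i] *= 2;

        MPI_Send(buf, n, MPI_INT, peer, 0, comm);
    }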

Re: [OMPI devel] weird error message (you'll be puzzled!)

2017-03-03 Thread Gilles Gouaillardet
Thanks Paul, It looks like we (indirectly) call MPI_Abort() when the argument is invalid. That would explain the counter-intuitive error message. Cheers, Gilles Paul Kapinos wrote: >Dear Open MPI developer, >please take a look at the attached 'hello MPI world'

[OMPI devel] Open MPI, ssh and limits

2017-03-03 Thread Gilles Gouaillardet
Folks, this is a follow-up on https://www.mail-archive.com/users@lists.open-mpi.org//msg30715.html On my cluster, the core file size is 0 by default, but it can be set to unlimited by any user. I think this is a pretty common default. $ ulimit -c 0 $ bash -c 'ulimit -c' 0 $ mpirun -np

Re: [OMPI devel] v2.1.0rc1 has been released

2017-02-28 Thread Gilles Gouaillardet
Hi Gilles, Is this the same issue I reported 4/29/2014: 'Wrong Endianness in Open MPI for external32 representation'? https://www.mail-archive.com/devel@lists.open-mpi.org/msg14698.html Best Christoph ----- Original Message - From: "Gilles

Re: [OMPI devel] v2.1.0rc1 has been released

2017-02-27 Thread Gilles Gouaillardet
Jeff, about the external32 thing: "Fix external32 representation in the romio314 module. Note that for now, the external32 representation is not correctly supported by the ompio module. Thanks Thomas Gastine for bringing this to our attention." The external32 representation makes MPI-IO write
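For context, external32 is the portable (big-endian, fixed-size) data representation a user can request when setting the file view; a minimal hedged example of selecting it (generic code, not from this thread):

    #include <mpi.h>

    void write_portable(const char *path, int *data, int n, MPI_Comm comm)
    {
        MPI_File fh;
        MPI_File_open(comm, path, MPI_MODE_CREATE | MPI_MODE_WRONLY,
                      MPI_INFO_NULL, &fh);
        /* "external32" asks MPI-IO to write the portable representation
         * instead of the native one; per the changelog entry quoted above,
         * this is handled by romio314 but not (yet) by ompio. */
        MPI_File_set_view(fh, 0, MPI_INT, MPI_INT, "external32", MPI_INFO_NULL);
        MPI_File_write_all(fh, data, n, MPI_INT, MPI_STATUS_IGNORE);
        MPI_File_close(&fh);
    }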

Re: [OMPI devel] No Preset Parameters found

2017-02-20 Thread Gilles Gouaillardet
Kumar, that means this device is not known by your Open MPI lib, so I strongly encourage you to upgrade to the latest stable Open MPI 2.0.2. If you will not do that, here is the definition from the .ini file from master [Mellanox ConnectX5] vendor_id =

Re: [OMPI devel] Segfault on MPI init

2017-02-14 Thread Gilles Gouaillardet
pen MPI (e.g., the new version in your local machine, and some other version on the remote machines). Sent from my phone. No type good. On Feb 13, 2017, at 8:14 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com

Re: [OMPI devel] Segfault on MPI init

2017-02-13 Thread Gilles Gouaillardet
Cyril, Are you running your jobs via a batch manager? If yes, was support for it correctly built ? If you were able to get a core dump, can you post the gdb stacktrace ? I guess your nodes have several IP interfaces; you might want to try mpirun --mca oob_tcp_if_include eth0 ... (replace eth0

Re: [OMPI devel] OMPI devel] Travis: one thing that might help

2017-02-09 Thread Gilles Gouaillardet
Jeff, Or maybe it used to be the case ... see https://github.com/jenkinsci/ghprb-plugin/issues/379 for how to activate this feature. Back to Travis, this feature is scheduled to happen in 2017Q1; see https://github.com/grosser/travis_dedup Cheers, Gilles Gilles Gouaillardet <

Re: [OMPI devel] Travis: one thing that might help

2017-02-09 Thread Gilles Gouaillardet
Jeff, I made the test and it seems I got it wrong ... no Travis build is cancelled when new commits are pushed into a PR :-( I could only note that Mellanox Jenkins has a "stop" icon, so a build can be manually cancelled. Neither LANL Jenkins nor Travis offers this option. Sorry for the confusion,

Re: [OMPI devel] [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
is spawn_master generating? On Jan 11, 2017, at 7:39 PM, r...@open-mpi.org wrote: Sigh - yet another corner case. Lovely. Will take a poke at it later this week. Thx for tracking it down On Jan 11, 2017, at 5:27 PM, Gilles Gouaillardet <gil...@rist.or.jp

[OMPI devel] Fwd: Re: [OMPI users] still segmentation fault with openmpi-2.0.2rc3 on Linux

2017-01-11 Thread Gilles Gouaillardet
ers] still segmentation fault with openmpi-2.0.2rc3 on Linux Date: Wed, 11 Jan 2017 20:39:02 +0900 From: Gilles Gouaillardet <gilles.gouaillar...@gmail.com> Reply-To: Open MPI Users <us...@lists.open-mpi.org> To: Open MPI Users <us...@lists.open-mpi.org> Siegmar

Re: [OMPI devel] OMPI devel] hwloc missing NUMANode object

2017-01-05 Thread Gilles Gouaillardet
gog...@inria.fr> wrote: On 05/01/2017 07:07, Gilles Gouaillardet wrote: Brice, things would be much easier if there were an HWLOC_OBJ_NODE object in the topology. Could you please consider backporting the relevant changes

[OMPI devel] hwloc missing NUMANode object

2017-01-04 Thread Gilles Gouaillardet
Ralph and Brice, since https://github.com/open-mpi/ompi/commit/fe68f2309912ea2afdc3339ff9a3b697f69a2dd1 we likely set the default binding policy to OPAL_BIND_TO_NUMA. Unfortunately, that does not work on my VM (VirtualBox, single socket, 4 cores) since there is no HWLOC_OBJ_NODE here

Re: [OMPI devel] Wtime is 0.0

2016-12-19 Thread Gilles Gouaillardet
++) if not, then you can configure with FFLAGS='-fdefault-real-8 -fdefault-double-8', that will very likely work with clang/clang++/gfortran Cheers, Gilles On 12/19/2016 11:26 PM,  Jan Hegewald wrote: On 19 Dec 2016, at 15:19, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: Jeff,

Re: [OMPI devel] Wtime is 0.0

2016-12-19 Thread Gilles Gouaillardet
Jeff, I am not aware of such a flag for C/C++ compilers. Jan, I noticed "checking size of Fortran DOUBLE PRECISION... 16". At first glance, this looks surprising to me. Which compiler (vendor and version) are you using ? Cheers, Gilles On Monday, December 19, 2016, Jeff Squyres (jsquyres)

Re: [OMPI devel] LD_PRELOAD a C-coded shared object with a FORTRAN application

2016-12-12 Thread Gilles Gouaillardet
Clement, Ideally, your LD_PRELOAD'able library should be written in Fortran so you do not even run into this kind of issue (name mangling + parameter types). If you really want to write it in C, you have to do it all manually: SUBROUTINE MPI_INIT(ierror) INTEGER IERROR can become void
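The preview cuts the sentence off; presumably it continues into the C prototype. A hedged sketch of doing it "all manually", assuming gfortran-style name mangling (lowercase plus a trailing underscore) and that the Fortran profiling symbol pmpi_init_ is available to forward to:

    #include <stdio.h>

    /* Assumption: Fortran INTEGER maps to C int on this platform. */
    void pmpi_init_(int *ierror);     /* Fortran PMPI entry point (assumption) */

    /* LD_PRELOAD'ed replacement for the Fortran MPI_INIT. */
    void mpi_init_(int *ierror)
    {
        fprintf(stderr, "Fortran MPI_INIT intercepted\n");
        pmpi_init_(ierror);           /* forward to the real implementation */
    }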

Re: [OMPI devel] heads up about OMPI/master

2016-12-01 Thread Gilles Gouaillardet
2/2016 9:41 AM, Paul Hargrove wrote: On Thu, Dec 1, 2016 at 4:25 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: [...] git checkout master git merge --ff-only topic/misc_fixes git push origin master [...] Gilles, You characterized th

Re: [OMPI devel] heads up about OMPI/master

2016-12-01 Thread Gilles Gouaillardet
er to pull in commits via the PR process. Howard On Thursday, December 1, 2016, Gilles Gouaillardet wrote: FWIW, the major change is in https://github.com/open-mpi/ompi/commit/c9aeccb84e4626c350af4daa974d37775db5b25e

Re: [OMPI devel] heads up about OMPI/master

2016-12-01 Thread Gilles Gouaillardet
On Friday, December 2, 2016, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote: Howard, I pushed a bunch of commits yesterday, and that was not an accident. You might be referring to https://github.com/open-mpi/ompi/commit/cb55c88a8b7817d5891ff06a447ea190b0e774

Re: [OMPI devel] heads up about OMPI/master

2016-12-01 Thread Gilles Gouaillardet
Howard, I pushed a bunch of commits yesterday, and that was not an accident. You might be referring to https://github.com/open-mpi/ompi/commit/cb55c88a8b7817d5891ff06a447ea190b0e77479 but it was already reverted 9 days ago with

Re: [OMPI devel] [OMPI users] funny SIGSEGV in 'ompi_info'

2016-11-22 Thread Gilles Gouaillardet
g., MPI_Init or ompi_info) can return it to the user, who can then decide what to do. Disregarding the parameter is not an option as it violates our “do what the user said to do, else return an error” policy On Nov 21, 2016, at 9:23 PM, Gilles Gouaillardet <gil...@rist.or.j

Re: [OMPI devel] Current progress threads status in Open MPI

2016-11-22 Thread Gilles Gouaillardet
Christoph, if you need progress threads on other interconnects, you might want to consider an external approach such as APSM https://www.osc.edu/~kmanalo/asyncrhonousmpi I was able to download it (or a similar library) a few years ago, but I cannot recall where ... Cheers, Gilles On Wednesday,

Re: [OMPI devel] MPI_Win_lock semantic

2016-11-21 Thread Gilles Gouaillardet
nd osc/pt2pt. -Nathan On Nov 21, 2016, at 8:54 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Thanks Nathan, any thoughts about my modified version of the test ? do i need to MP

Re: [OMPI devel] [OMPI users] funny SIGSEGV in 'ompi_info'

2016-11-21 Thread Gilles Gouaillardet
Paul, a SIGSEGV is always a bad idea, even after having displayed a comprehensive and user-friendly error message -- MCA framework parameters can only take a single negation operator ("^"), and it must be at the beginning

Re: [OMPI devel] QE, mpif.h and the Intel compiler

2016-11-21 Thread Gilles Gouaillardet
Paul, short answer, I have no clue. That being said, consider the following simple program: program test_mpi_sizeof implicit none include 'mpif.h' integer i double precision d integer sze,ierr call MPI_Sizeof(i, sze, ierr) write (*,*) 'MPI_Sizeof(integer) = ', sze call MPI_Sizeof(d, sze,

Re: [OMPI devel] MPI_Win_lock semantic

2016-11-21 Thread Gilles Gouaillardet
call MPI_Win_flush. I think this should work even if you have not started any RMA operations inside the epoch. -Nathan On Nov 21, 2016, at 7:53 PM, Gilles Gouaillardet <gil...@rist.or.jp> wrote: Nathan, we briefly discussed the test_lock1 test from the onesided test suite using osc
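A minimal sketch of the idiom Nathan suggests (generic code, not test_lock1.c): issuing a flush right after the lock forces the lock acquisition to complete at the target before proceeding.

    #include <mpi.h>

    void lock_then_synchronize(MPI_Win win, int target)
    {
        MPI_Win_lock(MPI_LOCK_EXCLUSIVE, target, 0, win);
        /* MPI_Win_lock may return before the lock is actually granted;
         * per the suggestion above, a flush should block until the lock
         * (and any outstanding RMA operations) have completed at the target. */
        MPI_Win_flush(target, win);

        /* ... RMA operations, MPI_Send to a third rank, etc. ... */

        MPI_Win_unlock(target, win);
    }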

[OMPI devel] MPI_Win_lock semantic

2016-11-21 Thread Gilles Gouaillardet
Nathan, we briefly discussed the test_lock1 test from the onesided test suite using osc/pt2pt https://github.com/open-mpi/ompi-tests/blob/master/onesided/test_lock1.c#L57-L70 task 0 does MPI_Win_lock(MPI_LOCK_EXCLUSIVE, rank=1,...); MPI_Send(...,dest=2,...) and task 2 does

Re: [OMPI devel] Makefile.am configuration

2016-10-26 Thread Gilles Gouaillardet
Clement, For your first question, you can search for inspiration in https://github.com/open-mpi/ompi/blob/master/opal/mca/btl/sm/Makefile.am btl/sm uses common/sm (and common/cuda too), so it looks pretty similar to what you are trying to achieve. For your second question, and even if I do not
