[OMPI users] question about the Open-MPI ABI

2023-02-01 Thread Jeff Hammond via users
Why do the null handles not follow a consistent scheme, at least in
Open-MPI 4.1.2?

ompi_mpi_<handle>_null is used except when handle={request,message}, which
drop the "mpi_".

Each of the above has an associated ..._null_addr symbol, except for
ompi_mpi_datatype_null and ompi_message_null.

Why?

Jeff

Open MPI v4.1.2, package: Debian OpenMPI, ident: 4.1.2, repo rev: v4.1.2,
Nov 24, 2021

$ nm -gD /usr/lib/x86_64-linux-gnu/libmpi.so | grep ompi | grep null | grep
-v fn
00134040 B ompi_message_null
00126300 B ompi_mpi_comm_null
00115898 D ompi_mpi_comm_null_addr
00120f40 D ompi_mpi_datatype_null
0012cb00 B ompi_mpi_errhandler_null
00116030 D ompi_mpi_errhandler_null_addr
00134660 B ompi_mpi_file_null
001163e8 D ompi_mpi_file_null_addr
00126200 B ompi_mpi_group_null
00115890 D ompi_mpi_group_null_addr
0012cf80 B ompi_mpi_info_null
00116038 D ompi_mpi_info_null_addr
00133720 B ompi_mpi_op_null
001163c0 D ompi_mpi_op_null_addr
00135740 B ompi_mpi_win_null
00117c80 D ompi_mpi_win_null_addr
0012d080 B ompi_request_null
00116040 D ompi_request_null_addr
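
For anyone who wants to reproduce the check programmatically instead of
with nm, here is a minimal sketch (it assumes the same Debian libmpi.so
path as above, only probes symbol presence via dlsym, and needs -ldl):

/* Probe which predefined-null symbols the Open MPI shared library
   exports, mirroring the nm output above. */
#include <dlfcn.h>
#include <stdio.h>

int main(void) {
    const char *syms[] = {
        "ompi_mpi_comm_null",     "ompi_mpi_comm_null_addr",
        "ompi_mpi_datatype_null", "ompi_mpi_datatype_null_addr",
        "ompi_request_null",      "ompi_request_null_addr",
        "ompi_message_null",      "ompi_message_null_addr",
    };
    void *h = dlopen("/usr/lib/x86_64-linux-gnu/libmpi.so", RTLD_LAZY);
    if (!h) { fprintf(stderr, "%s\n", dlerror()); return 1; }
    for (unsigned i = 0; i < sizeof(syms) / sizeof(syms[0]); i++)
        printf("%-30s %s\n", syms[i], dlsym(h, syms[i]) ? "present" : "absent");
    dlclose(h);
    return 0;
}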


-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] Disabling barrier in MPI_Finalize

2022-09-10 Thread Jeff Hammond via users
You can use MPI_Abort(MPI_COMM_SELF,0) to exit a process locally.

This may abort the whole world if errors are fatal, but either way it’s not
going to synchronize before processes go poof.
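
A minimal sketch of the pattern (assuming it is acceptable that, depending
on the implementation, MPI_Abort on MPI_COMM_SELF may take down more than
the calling process):

#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* hypothetical condition: this rank is done and wants out now */
    if (rank == 1)
        MPI_Abort(MPI_COMM_SELF, 0);   /* no synchronization with other ranks */

    MPI_Finalize();                    /* the remaining ranks finalize normally */
    return 0;
}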

Jeff

On Fri 9. Sep 2022 at 21.34 Mccall, Kurt E. (MSFC-EV41) via users <
users@lists.open-mpi.org> wrote:

> Hi,
>
>
>
> If a single process needs to exit, MPI_Finalize will pause at a barrier,
> possibly waiting for pending communications to complete.  Does OpenMPI have
> any means to disable this behavior, so that a single process can exit
> normally if the application calls for it?
>
>
>
> Thanks,
>
> Kurt
>
-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] Segfault in ucp_dt_pack function from UCX library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI 4.0.3 and 4.1.2

2022-06-05 Thread Jeff Hammond via users
Alltoallv has both a large-count and a large-displacement problem in its
API.  You can work around the latter with the neighborhood alltoall family,
using a duplicate of your original communicator that carries a fully
connected graph topology.  MPI_Neighbor_alltoallw takes MPI_Aint
displacements instead of int.
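
A rough sketch of the communicator setup (an illustration only; it assumes
the per-rank counts still fit in int and that it is the byte displacements
that overflow):

/* Build a distributed-graph communicator in which every rank lists every
   rank as both source and destination, so MPI_Neighbor_alltoallw has the
   same communication pattern as MPI_Alltoallv but takes MPI_Aint
   displacements. */
#include <mpi.h>
#include <stdlib.h>

MPI_Comm make_fully_connected_graph(MPI_Comm comm)
{
    int np;
    MPI_Comm_size(comm, &np);
    int *ranks = malloc(np * sizeof(int));
    for (int i = 0; i < np; i++) ranks[i] = i;

    MPI_Comm graph;
    MPI_Dist_graph_create_adjacent(comm,
                                   np, ranks, MPI_UNWEIGHTED,  /* sources */
                                   np, ranks, MPI_UNWEIGHTED,  /* destinations */
                                   MPI_INFO_NULL, 0 /* no reorder */, &graph);
    free(ranks);
    return graph;
}

/* Then, with int counts and MPI_Aint byte displacements:
   MPI_Neighbor_alltoallw(sbuf, scounts, sdispls, stypes,
                          rbuf, rcounts, rdispls, rtypes, graph);  */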

If you need tests, the https://github.com/jeffhammond/BigMPI test suite is
nothing but large-count MPI calls using derived datatypes.

Jeff

On Thu 2. Jun 2022 at 22.28 Eric Chamberland via users <
users@lists.open-mpi.org> wrote:

> Hi Josh,
>
> ok, thanks for the suggestion.  We are in the process of testing with
> IntelMPI right now.  I hope to do it with a newer version of OpenMPI too.
>
> Do you suggest a minimum version for UCX lib?
>
> Thanks,
>
> Eric
> On 2022-06-02 04:05, Josh Hursey via users wrote:
>
> I would suggest trying OMPI v4.1.4 (or the v5 snapshot)
>  * https://www.open-mpi.org/software/ompi/v4.1/
>  * https://www.mail-archive.com/announce@lists.open-mpi.org//msg00152.html
>
> We fixed some large payload collective issues in that release which might
> be what you are seeing here with MPI_Alltoallv with the tuned collective
> component.
>
>
>
> On Thu, Jun 2, 2022 at 1:54 AM Mikhail Brinskii via users <
> users@lists.open-mpi.org> wrote:
>
>> Hi Eric,
>>
>>
>>
>> Yes, UCX is supposed to be stable for large sized problems.
>>
>> Did you see the same crash with both OMPI-4.0.3 + UCX 1.8.0 and
>> OMPI-4.1.2 + UCX 1.11.2?
>>
>> Have you also tried to run large sized problems test with OMPI-5.0.x?
>>
>> Regarding the application, at some point it invokes MPI_Alltoallv sending
>> more than 2GB to some of the ranks (using derived dt), right?
>>
>>
>>
>> //WBR, Mikhail
>>
>>
>>
>> From: users  On Behalf Of Eric
>> Chamberland via users
>> Sent: Thursday, June 2, 2022 5:31 AM
>> To: Open MPI Users 
>> Cc: Eric Chamberland ; Thomas
>> Briffard ; Vivien Clauzon <
>> vivien.clau...@michelin.com>; dave.mar...@giref.ulaval.ca; Ramses van
>> Zon ; charles.coulomb...@ulaval.ca
>> Subject: [OMPI users] Segfault in ucp_dt_pack function from UCX
>> library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI
>> 4.0.3 and 4.1.2
>>
>>
>>
>> Hi,
>>
>> In the past, we have successfully launched large sized (finite elements)
>> computations using PARMetis as mesh partitioner.
>>
>> We first succeeded in 2012 with OpenMPI (v2.?) and again in March 2019
>> with OpenMPI 3.1.2.
>>
>> Today, we have a bunch of nightly (small) tests running nicely and
>> testing all of OpenMPI (4.0.x, 4.1.x and 5.0x), MPICH-3.3.2 and IntelMPI
>> 2021.6.
>>
>> Preparing to launch the same computation we did in 2012, and even
>> larger ones, we compiled with both OpenMPI 4.0.3+ucx-1.8.0 and OpenMPI
>> 4.1.2+ucx-1.11.2 and launched computations from small to large problems
>> (meshes).
>>
>> For small meshes, it goes fine.
>>
>> But when we reach near 2^31 faces into the 3D mesh we are using and call
>> ParMETIS_V3_PartMeshKway, we always get a segfault with the same backtrace
>> pointing into ucx library:
>>
>> Wed Jun  1 23:04:54
>> 2022:chrono::InterfaceParMetis::ParMETIS_V3_PartMeshKway::debut
>> VmSize: 1202304 VmRSS: 349456 VmPeak: 1211736 VmData: 500764 VmHWM: 359012
>> 
>> Wed Jun  1 23:07:07 2022:Erreur:  MEF++ Signal recu : 11 :
>>  segmentation violation
>> Wed Jun  1 23:07:07 2022:Erreur:
>> Wed Jun  1 23:07:07 2022:-- (Début
>> des informations destinées aux développeurs C++)
>> --
>> Wed Jun  1 23:07:07 2022:La pile d'appels contient 27 symboles.
>> Wed Jun  1 23:07:07 2022:# 000:
>> reqBacktrace(std::__cxx11::basic_string,
>> std::allocator >&)  >>>  probGD.opt
>> (probGD.opt(_Z12reqBacktraceRNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x71)
>> [0x4119f1])
>> Wed Jun  1 23:07:07 2022:# 001: attacheDebugger()  >>>
>>  probGD.opt (probGD.opt(_Z15attacheDebuggerv+0x29a) [0x41386a])
>> Wed Jun  1 23:07:07 2022:# 002:
>> /gpfs/fs0/project/d/deteix/ericc/GIREF/lib/libgiref_opt_Util.so(traitementSignal+0x1f9f)
>> [0x2ab3aef0e5cf]
>> Wed Jun  1 23:07:07 2022:# 003: /lib64/libc.so.6(+0x36400)
>> [0x2ab3bd59a400]
>> Wed Jun  1 23:07:07 2022:# 004:
>> /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(ucp_dt_pack+0x123)
>> [0x2ab3c966e353]
>> Wed Jun  1 23:07:07 2022:# 005:
>> /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(+0x536b7)
>> [0x2ab3c968d6b7]
>> Wed Jun  1 23:07:07 2022:# 006:
>> /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/ucx/libuct_ib.so.0(uct_dc_mlx5_ep_am_bcopy+0xd7)
>> [0x2ab3ca712137]
>> Wed Jun  1 23:07:07 2022:# 007:
>> /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(+0x52d3c)
>> [0x2ab3c968cd3c]
>> Wed Jun  1 23:07:07 2022:# 008:
>> /scinet/niagara/software/2022a/opt/gcc-11.2.0/ucx/1.11.2/lib/libucp.so.0(ucp_tag_send_nbx+0x5ad)
>> [0x2ab3c9696dcd]
>> Wed Jun  1 23:07:07 2022:# 009:
>> 

[OMPI users] please fix your attributes implementation in v5.0.0rc3+, which is broken by GCC 11

2022-04-30 Thread Jeff Hammond via users
Calling attribute functions segfaults the application starting in
v5.0.0rc3.  This is really bad for users, because the segfault happens in
application code, so it takes a while to figure out what is wrong.  I
spent an entire day bisecting your tags before I figured out what was
happening.

https://jenkins.open-mpi.org/jenkins/job/open-mpi.build.compilers/8370/
indicates you are not testing GCC 11.  Please test this compiler.

https://github.com/open-mpi/ompi/pull/10343 has details.
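
For context, the failure shows up in ordinary attribute usage, e.g.
something like the following minimal sketch (an illustration of the kind
of call involved, not the actual reproducer, which is in the PR):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int *tag_ub, flag;
    /* query a predefined communicator attribute */
    MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &tag_ub, &flag);
    if (flag)
        printf("MPI_TAG_UB = %d\n", *tag_ub);
    MPI_Finalize();
    return 0;
}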

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] cross-compilation documentation seems to be missing

2021-09-07 Thread Jeff Hammond via users
Thanks.  In half an hour I'll know how this works (my RISC-V system is
quite slow).

For others who stumble on this thread, it is "-C" not "-c".

Jeff

On Tue, Sep 7, 2021 at 3:35 PM Gilles Gouaillardet via users <
users@lists.open-mpi.org> wrote:

> Hi Jeff,
>
>
> Here is a sample file I used some times ago (some definitions might be
> missing though ...)
>
>
> In order to automatically generate this file - this is a bit of a
> chicken-and-egg problem -
>
> you can run
>
> configure -c
>
> on the RISC-V node. It will generate a config.cache file.
>
> Then you can
>
> grep ^ompi_cv_fortran_ config.cache
>
> to generate the file you can pass to --with-cross when cross compiling
> on your x86 system
>
>
> Cheers,
>
>
> Gilles
>
>
> On 9/7/2021 7:35 PM, Jeff Hammond via users wrote:
> > I am attempting to cross-compile Open-MPI for RISC-V on an x86
> > system.  I get this error, with which I have some familiarity:
> >
> > checking size of Fortran CHARACTER... configure: error: Can not
> > determine size of CHARACTER when cross-compiling
> >
> > I know that I need to specify the size explicitly using a
> > cross-compilation file.  According to configure, this is documented.
> >
> > --with-cross=FILE   Specify configure values that can not be
> > determined in a cross-compilation environment. See the Open MPI FAQ.
> >
> > Where is this documented?
> > https://www.open-mpi.org/faq/?category=building
> > <https://www.open-mpi.org/faq/?category=building> contains nothing
> > relevant.
> >
> > Thanks,
> >
> > Jeff
> >
> > --
> > Jeff Hammond
> > jeff.scie...@gmail.com <mailto:jeff.scie...@gmail.com>
> > http://jeffhammond.github.io/ <http://jeffhammond.github.io/>
>


-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


[OMPI users] cross-compilation documentation seems to be missing

2021-09-07 Thread Jeff Hammond via users
I am attempting to cross-compile Open-MPI for RISC-V on an x86 system.  I
get this error, with which I have some familiarity:

checking size of Fortran CHARACTER... configure: error: Can not determine
size of CHARACTER when cross-compiling

I know that I need to specify the size explicitly using a cross-compilation
file.  According to configure, this is documented.

--with-cross=FILE   Specify configure values that can not be determined
in a cross-compilation environment. See the Open MPI FAQ.

Where is this documented?  https://www.open-mpi.org/faq/?category=building
contains nothing relevant.

Thanks,

Jeff

--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


[OMPI users] how to suppress "libibverbs: Warning: couldn't load driver ..." messages?

2021-06-23 Thread Jeff Hammond via users
I am running on a single node and do not need any network support.  I am
using the NVIDIA build of Open-MPI 3.1.5.  How do I tell it to never use
anything related to IB?  It seems that ^openib is not enough.

Thanks,

Jeff

$ OMP_NUM_THREADS=1
/proj/nv/Linux_aarch64/21.5/comm_libs/openmpi/openmpi-3.1.5/bin/mpirun
--mca btl ^openib -n 40
/local/home/jehammond/NWCHEM/nvhpc-mpi-pr/bin/LINUX64/nwchem
w12_b3lyp_cc-pvtz_energy.nw | tee
w12_b3lyp_cc-pvtz_energy.nvhpc-mpi-pr.n40.log

libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so':
libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libsiw-rdmav25.so':
libsiw-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'librxe-rdmav25.so':
librxe-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libqedr-rdmav25.so':
libqedr-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libmlx5-rdmav25.so':
libmlx5-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libmlx4-rdmav25.so':
libmlx4-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libi40iw-rdmav25.so':
libi40iw-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libhns-rdmav25.so':
libhns-rdmav25.so: cannot open shared object file: No such file or directory
libibverbs: Warning: couldn't load driver 'libhfi1verbs-rdmav25.so':
libhfi1verbs-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libcxgb4-rdmav25.so':
libcxgb4-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libbnxt_re-rdmav25.so':
libbnxt_re-rdmav25.so: cannot open shared object file: No such file or
directory
libibverbs: Warning: couldn't load driver 'libvmw_pvrdma-rdmav25.so':
libvmw_pvrdma-rdmav25.so: cannot open shared object file: No such file or
directory


Re: [OMPI users] Books/resources to learn (open)MPI from

2020-08-20 Thread Jeff Hammond via users
It's not about Open-MPI but I know of only one book on the internals of
MPI: "Inside the Message Passing Interface: Creating Fast Communication
Libraries" by Alexander Supalov.

I found it useful for understanding how MPI libraries are implemented.  It
is no substitute for spending hours reading source code or discussing
implementation details over coffee, but I think it is a useful guide to
understand what one sees in the source code of MPICH and Open-MPI.

Jeff

On Thu, Aug 6, 2020 at 6:27 AM Jeff Squyres (jsquyres) via users <
users@lists.open-mpi.org> wrote:

> FWIW, we didn't talk too much about the internals of Open MPI -- but it's
> a good place to start (i.e., you won't understand the internals until you
> understand the externals).
>
> You can find all the videos and slides for all 3 parts here:
> https://www.open-mpi.org/video/?category=general
>
> In addition, there's a now-several-years-old set of videos on the
> internals here: https://www.open-mpi.org/video/?category=internals
>
> Some of that information is dated, but the broad strokes are still very
> relevant.
>
>
>
> On Aug 6, 2020, at 7:13 AM, Oddo Da via users 
> wrote:
>
> Thank you!
>
> On Thu, Aug 6, 2020 at 5:26 AM Gilles Gouaillardet via users <
> users@lists.open-mpi.org> wrote:
>
>> You can start with the recent talks given by Jeff Squyres and Ralph
>> Castain for Easybuild
>>
>> EasyBuild Tech Talk I - The ABCs of Open MPI, part 1 (by Jeff Squyres
>> & Ralph Castain)
>> https://www.youtube.com/watch?v=WpVbcYnFJmQ
>>
>> (there are three parts)
>>
>> Then the source code and interacting with the developers via github
>> and/or the devel mailing list
>>
>> Cheers,
>>
>> Gilles
>>
>> On Thu, Aug 6, 2020 at 5:47 PM Oddo Da via users
>>  wrote:
>> >
>> > On Wed, Aug 5, 2020 at 11:06 PM Gilles Gouaillardet via users <
>> users@lists.open-mpi.org> wrote:
>> >>
>> >> Assuming you want to learn about MPI (and not the Open MPI internals),
>> >> the books by Bill Gropp et al. are the reference :
>> >> https://www.mcs.anl.gov/research/projects/mpi/usingmpi/
>> >>
>> >> (Using MPI 3rd edition is affordable on amazon)
>> >
>> >
>> > Thanks! Yes, this is what I was after. However, if I wanted to learn
>> about OpenMPI internals, what would be the go-to resource?
>>
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
>

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] OMPI 4.0.4 how to use mpirun properly in numa architecture

2020-08-20 Thread Jeff Hammond via users
On Thu, Aug 20, 2020 at 3:22 AM Carlo Nervi via users <
users@lists.open-mpi.org> wrote:

> Dear OMPI community,
> I'm a simple end-user with no particular experience.
> I compile quantum chemical programs and use them in parallel.
>

Which code?  Some QC codes behave differently than traditional MPI codes in
a NUMA context and it is worth mentioning it explicitly if you are using
NWChem, GAMESS, MOLPRO, or other code that uses GA or DDI.  If you are
running VASP, CP2K, or other code that uses MPI in a more conventional
manner, don't worry about it.

Jeff

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] OpenMPI 4.0.2 with PGI 19.10, will not build with hcoll

2020-01-25 Thread Jeff Hammond via users
To be more strictly equivalent, you will want to add -D_REENTRANT to the
substitution, but this may not affect hcoll.

https://stackoverflow.com/questions/2127797/significance-of-pthread-flag-when-compiling/2127819#2127819

The proper fix here is, of course, a change in the OMPI build system to not
set -pthread when PGI is used.

Jeff

On Fri, Jan 24, 2020 at 11:31 AM Åke Sandgren via users <
users@lists.open-mpi.org> wrote:

> PGI needs this in its, for instance, siterc or localrc:
> # replace unknown switch -pthread with -lpthread
> switch -pthread is replace(-lpthread) positional(linker);
>
>
> On 1/24/20 8:12 PM, Raymond Muno via users wrote:
> > I am having issues building OpenMPI 4.0.2 using the PGI 19.10
> > compilers.  OS is CentOS 7.7, MLNX_OFED 4.7.3
> >
> > It dies at:
> >
> > PGC/x86-64 Linux 19.10-0: compilation completed with warnings
> >   CCLD mca_coll_hcoll.la
> > pgcc-Error-Unknown switch: -pthread
> > make[2]: *** [mca_coll_hcoll.la] Error 1
> > make[2]: Leaving directory
> > `/project/muno/OpenMPI/PGI/openmpi-4.0.2/ompi/mca/coll/hcoll'
> > make[1]: *** [all-recursive] Error 1
> > make[1]: Leaving directory `/project/muno/OpenMPI/PGI/openmpi-4.0.2/ompi'
> > make: *** [all-recursive] Error 1
> >
> > I tried with PGI 19.9 and had the same issue.
> >
> > If I do not include hcoll, it builds.  I have successfully built OpenMPI
> > 4.0.2 with GCC, Intel and AOCC compilers, all using the same options.
> >
> > hcoll is provided by MLNX_OFED 4.7.3 and configure is run with
> >
> > --with-hcoll=/opt/mellanox/hcoll
> >
> >
>
> --
> Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
> Internet: a...@hpc2n.umu.se   Phone: +46 90 7866134 Fax: +46 90-580 14
> Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se
>
-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] problem with cancelling Send-Request

2019-10-02 Thread Jeff Hammond via users
“Supposedly faster” isn’t a particularly good reason to change MPI
implementations, but canceling sends is hard for reasons that have nothing
to do with performance.

Also, I’d not be so eager to question the effectiveness of Open-MPI on
InfiniBand. Check the commit logs for Mellanox employees some time.

Jeff

On Wed, Oct 2, 2019 at 7:46 AM Emyr James via users <
users@lists.open-mpi.org> wrote:

> Hi Christian,
>
>
> I would suggest using mvapich2 instead. It is supposedly faster than
> OpenMPI on InfiniBand and it seems to have fewer options under the hood,
> which means fewer things you have to tweak to get it working for you.
>
>
> Regards,
>
>
> Emyr James
> Head of Scientific IT
> CRG -Centre for Genomic Regulation
> C/ Dr. Aiguader, 88
> 
> Edif. PRBB
> 08003 Barcelona, Spain
> Phone Ext: #1098
>
> --
> From: users  on behalf of Christian
> Von Kutzleben via users 
> Sent: 02 October 2019 16:14:24
> To: users@lists.open-mpi.org
> Cc: Christian Von Kutzleben
> Subject: [OMPI users] problem with cancelling Send-Request
>
> Hi,
>
> I’m currently evaluating openmpi (4.0.1) for use in our application.
>
> We are using a construct like this for some cleanup functionality, to
> cancel some Send requests:
>
> if (*req != MPI_REQUEST_NULL) {
>     MPI_Cancel(req);
>     MPI_Wait(req, MPI_STATUS_IGNORE);
>     assert(*req == MPI_REQUEST_NULL);
> }
>
> However the MPI_Wait hangs indefinitely and I’ve debugged into it and I
> came across this in pml_ob1_sendreq.c, eventually invoked from MPI_Cancel
> in my scenario:
>
> static int mca_pml_ob1_send_request_cancel(struct ompi_request_t*
> request, int complete)
> {
>     /* we dont cancel send requests by now */
>     return OMPI_SUCCESS;
> }
>
> The man page for MPI_Cancel does not mention that cancelling Send requests
> does not work, so I’m wondering whether this is a current limitation or
> whether we are not supposed to end up in this specific …_request_cancel
> implementation?
>
> Thank you in advance!
>
> Christian
>
-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] problem with cancelling Send-Request

2019-10-02 Thread Jeff Hammond via users
Don’t try to cancel sends.

https://github.com/mpi-forum/mpi-issues/issues/27 has some useful info.

Jeff

On Wed, Oct 2, 2019 at 7:17 AM Christian Von Kutzleben via users <
users@lists.open-mpi.org> wrote:

> Hi,
>
> I’m currently evaluating openmpi (4.0.1) for use in our application.
>
> We are using a construct like this for some cleanup functionality, to
> cancel some Send requests:
>
> if (*req != MPI_REQUEST_NULL) {
>     MPI_Cancel(req);
>     MPI_Wait(req, MPI_STATUS_IGNORE);
>     assert(*req == MPI_REQUEST_NULL);
> }
>
> However the MPI_Wait hangs indefinitely and I’ve debugged into it and I
> came across this in pml_ob1_sendreq.c, eventually invoked from MPI_Cancel
> in my scenario:
>
> static int mca_pml_ob1_send_request_cancel(struct ompi_request_t*
> request, int complete)
> {
>     /* we dont cancel send requests by now */
>     return OMPI_SUCCESS;
> }
>
> The man page for MPI_Cancel does not mention that cancelling Send requests
> does not work, so I’m wondering whether this is a current limitation or
> whether we are not supposed to end up in this specific …_request_cancel
> implementation?
>
> Thank you in advance!
>
> Christian
>
-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/


Re: [OMPI users] silent failure for large allgather

2019-08-11 Thread Jeff Hammond via users
On Tue, Aug 6, 2019 at 9:54 AM Emmanuel Thomé via users <
users@lists.open-mpi.org> wrote:

> Hi,
>
> In the attached program, the MPI_Allgather() call fails to communicate
> all data (the amount it communicates wraps around at 4G...).  I'm running
> on an omnipath cluster (2018 hardware), openmpi 3.1.3 or 4.0.1 (tested
> both).
>
> With the OFI mtl, the failure is silent, with no error message reported.
> This is very annoying.
>
> With the PSM2 mtl, we have at least some info printed that 4G is a limit.
>
> I have tested it with various combinations of mca parameters. It seems
> that the one config bit that makes the test pass is the selection of the
> ob1 pml. However I have to select it explicitly, because otherwise cm is
> selected instead (priority 40 vs 20, it seems), and the program fails. I
> don't know to which extent the cm pml is the root cause, or whether I'm
> witnessing a side-effect of something.
>
> openmpi-3.1.3 (debian10 package openmpi-bin-3.1.3-11):
>
> node0 ~ $ mpiexec -machinefile /tmp/hosts --map-by node  -n 2 ./a.out
> MPI_Allgather, 2 nodes, 0x10001 chunks of 0x1 bytes, total 2 *
> 0x10001 bytes: ...
> Message size 4295032832 bigger than supported by PSM2 API. Max =
> 4294967296
> MPI error returned:
> MPI_ERR_OTHER: known error not in list
> MPI_Allgather, 2 nodes, 0x10001 chunks of 0x1 bytes, total 2 *
> 0x10001 bytes: NOK
> [node0.localdomain:14592] 1 more process has sent help message
> help-mtl-psm2.txt / message too big
> [node0.localdomain:14592] Set MCA parameter "orte_base_help_aggregate"
> to 0 to see all help / error messages
>
> node0 ~ $ mpiexec -machinefile /tmp/hosts --map-by node  -n 2 --mca
> mtl ofi ./a.out
> MPI_Allgather, 2 nodes, 0x10001 chunks of 0x1 bytes, total 2 *
> 0x10001 bytes: ...
> MPI_Allgather, 2 nodes, 0x10001 chunks of 0x1 bytes, total 2 *
> 0x10001 bytes: NOK
> node 0 failed_offset = 0x10002
> node 1 failed_offset = 0x1
>
> I attached the corresponding outputs with some mca verbose
> parameters on, plus ompi_info, as well as variations of the pml layer
> (ob1 works).
>
> openmpi-4.0.1 gives essentially the same results (similar files
> attached), but with various doubts on my part as to whether I've run this
> check correctly. Here are my doubts:
> - whether I should or not have an ucx build for an omnipath cluster
>   (IIUC https://github.com/openucx/ucx/issues/750 is now fixed ?),
>

UCX is not optimized for Omni Path.  Don't use it.


> - which btl I should use (I understand that openib goes to
>   deprecation and it complains unless I do --mca btl openib --mca
>   btl_openib_allow_ib true ; fine. But then, which non-openib non-tcp
>   btl should I use instead ?)
>

OFI->PSM2 and PSM2 are the right conduits for Omni Path.


> - which layers matter, which ones matter less... I tinkered with btl
>   pml mtl.  It's fine if there are multiple choices, but if some
>   combinations lead to silent data corruption, that's not really
>   cool.
>

It sounds like Open-MPI doesn't properly support the maximum transfer size
of PSM2.  One way to work around this is to wrap your MPI collective calls
and do <4G chunking yourself.
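
A rough sketch of the chunking idea (an illustration only; it assumes each
rank contributes the same number of contiguous bytes and that recvbuf is
laid out rank-major with bytes_per_rank bytes per rank):

#include <mpi.h>

static int allgather_chunked(const void *sendbuf, void *recvbuf,
                             MPI_Aint bytes_per_rank, MPI_Comm comm)
{
    const MPI_Aint chunk = (MPI_Aint)1 << 30;   /* 1 GiB per call, well under 4G */
    for (MPI_Aint off = 0; off < bytes_per_rank; off += chunk) {
        MPI_Aint n = bytes_per_rank - off;
        if (n > chunk) n = chunk;

        /* receive type: n bytes, resized to an extent of bytes_per_rank so
           rank r's piece lands at recvbuf + r*bytes_per_rank + off */
        MPI_Datatype contig, slice;
        MPI_Type_contiguous((int)n, MPI_BYTE, &contig);
        MPI_Type_create_resized(contig, 0, bytes_per_rank, &slice);
        MPI_Type_commit(&slice);

        int rc = MPI_Allgather((const char *)sendbuf + off, (int)n, MPI_BYTE,
                               (char *)recvbuf + off, 1, slice, comm);
        MPI_Type_free(&slice);
        MPI_Type_free(&contig);
        if (rc != MPI_SUCCESS) return rc;
    }
    return MPI_SUCCESS;
}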

Jeff


> Could the error reporting in this case be somehow improved ?
>
> I'd be glad to provide more feedback if needed.
>
> E.



-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/

Re: [OMPI users] When is it safe to free the buffer after MPI_Isend?

2019-08-11 Thread Jeff Hammond via users
The snippets suggest you were storing a reference to an object on the
stack. Stack variables go out of scope when the function returns. Using a
reference to them out-of-scope is illegal but often fails
nondeterministically. Good compilers will issue a warning about this under
the right conditions (i.e., compiler flags).
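
In other words, a minimal sketch of the two patterns (an illustration only,
not the original code; task_push stands in for the poster's own list
helper):

#include <mpi.h>
#include <stdlib.h>

void task_push(MPI_Request *req);   /* the poster's bookkeeping list */

/* Broken pattern: the request lives on this function's stack, but a
   pointer to it is stored and tested after the function has returned. */
void send_broken(const void *buf, int n, int dst, MPI_Comm comm) {
    MPI_Request req;                /* stack storage */
    MPI_Isend(buf, n, MPI_BYTE, dst, 0, comm, &req);
    task_push(&req);                /* dangling as soon as we return */
}

/* Working pattern: give the request storage that outlives the function,
   and free it only after MPI_Test reports completion. */
void send_ok(const void *buf, int n, int dst, MPI_Comm comm) {
    MPI_Request *req = malloc(sizeof *req);
    MPI_Isend(buf, n, MPI_BYTE, dst, 0, comm, req);
    task_push(req);                 /* freed later, after completion */
}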

Jeff

On Sat, Aug 10, 2019 at 10:59 AM carlos aguni via users <
users@lists.open-mpi.org> wrote:

> Hi all,
>
> Sorry no reply.
>
> I just figured out the solution.
>
> The problem was that I had a function that would MPI_Isend a message on
> every call to it. Then I'd store its request pointer to a list.
> My MPI_Isend snippet:
> MPI_Request req;
> MPI_Isend(blabla, &req);
> task_push(&req);
>
> From time to time at the beginning of that function I'd call another
> function that would iterate over that list MPI_Testing if that message
> had completed to then free the buffers used.
> The problem was that even though the flag passed to MPI_Test(req, &flag,
> ...), previously assigned to 0, came back as 1, my guess is that C had
> already deallocated it from the heap (idk much about C though..)
> Snippet of my clean function:
> 
> int flag = 0;
> MPI_Test(req, &flag, ...);
> if (flag) { // then free..
> ...
>
> My solution that worked was to malloc it before the MPI_Isend call
> like:
> MPI_Request *rr = (MPI_Request *)malloc(sizeof(MPI_Request));
> MPI_Isend(blabla, rr);
> task_push(rr);
>
> All I know is that it's working now..
>
> Thanks to all.
>
> Regards,
> C.
>
> On Sun, Jul 28, 2019 at 11:53 AM Jeff Squyres (jsquyres) via users <
> users@lists.open-mpi.org> wrote:
>
>> On Jul 27, 2019, at 10:43 PM, Gilles Gouaillardet via users <
>> users@lists.open-mpi.org> wrote:
>> >
>> > MPI_Isend() does not automatically frees the buffer after it sends the
>> message.
>> > (it simply cannot do it since the buffer might be pointing to a global
>> > variable or to the stack).
>>
>> Gilles is correct: MPI_Isend does not free the buffer.  I was wondering
>> if you had somehow used that same buffer -- or some subset of that buffer
>> -- in other non-blocking MPI API calls, and freeing it triggered Bad Things
>> because MPI was still using (some of) that buffer because of other pending
>> MPI requests.
>>
>> > Can you please extract a reproducer from your program ?
>>
>> Yes, please do this.
>>
>> > Out of curiosity, what if you insert a (useless) MPI_Wait() like this ?
>> >
>> > MPI_Test(req, &flag, ...);
>> > if (flag){
>> >     MPI_Wait(req, MPI_STATUS_IGNORE);
>> >     free(buffer);
>> > }
>>
>> That should be a no-op, because "req" should have been turned into
>> MPI_REQUEST_NULL if flag==true.
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>>

-- 
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/