[OMPI users] OpenMPI 1.6.5 & 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-11-25 Thread Eric Chamberland
Hi, I have random segmentation violations (signal 11) in the mentioned function when testing MPI I/O calls with 2 processes on a single machine. Most of the time (1499/1500), it works perfectly. Here are the call stacks (for 1.6.3) on both processes: process 0:
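A minimal sketch of the kind of 2-process collective I/O sequence being described (the original test code is not part of the archive, so the file name, buffer contents, and the particular write call below are assumptions):

/* Hypothetical minimal reproducer shape for the MPI_File_close issue above;
 * the real application code is not public, this only sketches the call
 * sequence: collective open / write / close with 2 ranks. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, "testfile.bin",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* Each rank writes a small block at its own offset. */
    int data[4] = { rank, rank, rank, rank };
    MPI_Offset offset = (MPI_Offset)rank * (MPI_Offset)sizeof(data);
    MPI_File_write_at_all(fh, offset, data, 4, MPI_INT, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);   /* the crash is reported here, ~1 run out of 1500 */

    MPI_Finalize();
    return 0;
}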

[OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Eric Chamberland
Hi, we used to oversubscribe just for code validation in nightly automated parallel runs of our code. I just compiled openmpi 1.8.3 and launched the whole suite of sequential/parallel tests and noticed a *major* slowdown in oversubscribed parallel tests with 1.8.3 compared to

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Eric Chamberland
: "mpirun -np 32 myprog" Maybe the result of "-mca mpi_show_mca_params all" would be insightful? Eric On Dec 9, 2014, at 9:14 AM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> wrote: Hi, we were used to do oversubscribing just to do code validation in nightl

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Eric Chamberland
The diff may be interesting but I can't interpret everything that is written... The files are attached... Thanks, Eric On 12/09/2014 01:02 PM, Eric Chamberland wrote: On 12/09/2014 12:24 PM, Ralph Castain wrote: Can you provide an example cmd line you use to launch one of these tests using 1

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-09 Thread Eric Chamberland
On 12/09/2014 04:19 PM, Nathan Hjelm wrote: yield when idle is broken on 1.8. Fixing now. ok, thanks a lot! will wait for the fix! Eric

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
to dig into that notion a bit. On Dec 9, 2014, at 10:39 AM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> wrote: Hi again, I sorted and "seded" (cat output.1.00 |sed 's/default/default value/g'|sed 's/true/1/g' |sed 's/false/0/g') the output.1.00 file from: mpirun --o

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 10:40 AM, Ralph Castain wrote: You should be able to apply the patch - I don’t think that section of code differs from what is in the 1.8 repo. It compiles and links, but gives me a segmentation violation now: #0 0x7f1827b00e91 in mca_allocator_component_lookup () from

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 12:55 PM, Ralph Castain wrote: Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/ On Dec 10, 2014, at 9:40 AM, Ralph Castain wrote: I’ll run the tarball generator now so you can try the nightly tarball.

Re: [OMPI users] Oversubscribing in 1.8.3 vs 1.6.5

2014-12-10 Thread Eric Chamberland
On 12/10/2014 12:55 PM, Ralph Castain wrote: Tarball now available on web site http://www.open-mpi.org/nightly/v1.8/ I’ll run the tarball generator now so you can try the nightly tarball. ok, retrieved openmpi-v1.8.3-236-ga21cb20 and it compiled, linked, and executed nicely when

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Eric Chamberland
Hi, I finally (thanks for fixing oversubscribing) tested with 1.8.4rc3 for my problem with collective MPI I/O. The problem is still there. In this 2-process example, process rank 1 dies with a segfault while process rank 0 waits indefinitely... Running with valgrind, I found these errors which

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Eric Chamberland
Hi Gilles, On 12/14/2014 09:20 PM, Gilles Gouaillardet wrote: Eric, can you make your test case (source + input file + howto) available so I can try to reproduce and fix this? I would like to, but the complete app is big (and not public), sits on top of PETSc with MKL, and is written in C++... :-( I

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Eric Chamberland
which could be using shared memory on the same node? Thanks, Eric On 12/14/2014 02:06 PM, Eric Chamberland wrote: Hi, I finally (thanks for fixing oversubscribing) tested with 1.8.4rc3 for my problem with collective MPI I/O. The problem is still there. In this 2-process example, process rank 1 dies wit

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Eric Chamberland
ed in some manner... at least an error message!... Thanks, Eric I am now checking if the overflow is correctly detected (that could explain the one-byte overflow reported by valgrind) Cheers, Gilles On 2014/12/15 11:52, Eric Chamberland wrote: Hi again, some new hints that migh

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-14 Thread Eric Chamberland
between 1 and 10 characters. Could you please give this patch a try and let us know the results? Cheers, Gilles On 2014/12/15 11:52, Eric Chamberland wrote: Hi again, some new hints that might help: 1- With valgrind: If I run the same test case, same data, but moved to a shorter path

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-15 Thread Eric Chamberland
Hi Gilles, here is a simple setup to make valgrind complain now: export

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2014-12-17 Thread Eric Chamberland
romio is currently imported from mpich. Cheers, Gilles On 2014/12/16 0:16, Eric Chamberland wrote: Hi Gilles, I just created a very simple test case! With this setup, you will see the bug with valgrind: export too_long=./this/is/a_very/long/path/that/contains/a/not/so/long/filename/but/tr

Re: [OMPI users] OpenMPI 1.8.4rc3, 1.6.5 and 1.6.3: segmentation violation in mca_io_romio_dist_MPI_File_close

2015-01-16 Thread Eric Chamberland
On 01/14/2015 05:57 PM, Rob Latham wrote: On 12/17/2014 07:04 PM, Eric Chamberland wrote: Hi! Here is a "poor man's fix" that works for me (the idea is not from me, thanks to Thomas H.): #1- char* lCwd = getcwd(0,0); #2- chdir(lPathToFile); #3- MPI_File_open(...,lFileNameWithoutT
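The three numbered steps quoted above, expanded into a compilable sketch (the names lCwd, lPathToFile, and lFileNameWithoutPath come from the message; the wrapper function, open flags, and omitted error handling are assumptions):

/* Sketch of the "poor man's fix" quoted above: chdir() into the directory
 * first so MPI_File_open() only ever sees a short file name, then chdir()
 * back.  getcwd(0,0) relies on the glibc extension that allocates the
 * buffer, as in the original message; error handling is omitted. */
#include <mpi.h>
#include <unistd.h>
#include <stdlib.h>

static int open_via_chdir(const char *lPathToFile,
                          const char *lFileNameWithoutPath,
                          MPI_File *fh)
{
    char *lCwd = getcwd(0, 0);               /* #1: remember where we are    */
    chdir(lPathToFile);                      /* #2: enter the long directory */
    int rc = MPI_File_open(MPI_COMM_WORLD,   /* #3: open with the short name */
                           (char *)lFileNameWithoutPath,
                           MPI_MODE_CREATE | MPI_MODE_WRONLY,
                           MPI_INFO_NULL, fh);
    chdir(lCwd);                             /* restore the original cwd     */
    free(lCwd);
    return rc;
}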

[OMPI users] Help on getting CMA to work

2015-02-18 Thread Eric Chamberland
Hi, I have configured with "--with-cma" on 2 different OSes (RedHat 6.6 and OpenSuse 12.3), but in both cases, I have the following error when launching a simple mpi_hello_world.c example: /opt/openmpi-1.8.4_cma/bin/mpiexec --mca btl_sm_use_cma 1 -np 2 /tmp/hw

Re: [OMPI users] Help on getting CMA to work

2015-02-19 Thread Eric Chamberland
: I recommend using vader for CMA. It has code to get around the ptrace setting. Run with mca_btl_vader_single_copy_mechanism cma (should be the default). Ok, I tried it, but it gives exactly the same error message! Eric -Nathan On Wed, Feb 18, 2015 at 02:56:01PM -0500, Eric Chamberland wrote

Re: [OMPI users] Help on getting CMA to work

2015-02-19 Thread Eric Chamberland
e to pass any options to "mpicc" when compiling/linking an MPI application to use cma? Thanks, Eric echo 1 > /proc/sys/kernel/yama/ptrace_scope as root. -Nathan On Thu, Feb 19, 2015 at 11:06:09AM -0500, Eric Chamberland wrote: By the way, I have tried two other things: #1

Re: [OMPI users] Help on getting CMA to work

2015-02-19 Thread Eric Chamberland
On 02/19/2015 02:58 PM, Nathan Hjelm wrote: On Thu, Feb 19, 2015 at 12:16:49PM -0500, Eric Chamberland wrote: On 02/19/2015 11:56 AM, Nathan Hjelm wrote: If you have yama installed you can try: Nope, I do not have it installed... is it absolutely necessary? (and would it change something

Re: [OMPI users] Help on getting CMA to work

2015-02-19 Thread Eric Chamberland
On 02/19/2015 03:53 PM, Nathan Hjelm wrote: Great! I will add an MCA variable to force CMA and also enable it if 1) no yama and 2) no PR_SET_PTRACER. cool, thanks again! You might also look at using xpmem. You can find a version that supports 3.x @ https://github.com/hjelmn/xpmem . It is a

Re: [OMPI users] Help on getting CMA to work

2015-02-19 Thread Eric Chamberland
application. See: http://blogs.cisco.com/performance/the-vader-shared-memory-transport-in-open-mpi-now-featuring-3-flavors-of-zero-copy -Nathan On Thu, Feb 19, 2015 at 03:32:43PM -0500, Eric Chamberland wrote: On 02/19/2015 02:58 PM, Nathan Hjelm wrote: On Thu, Feb 19, 2015 at 12:16:49PM -0500

[OMPI users] assert in opal_datatype_is_contiguous_memory_layout

2013-04-05 Thread Eric Chamberland
Hi all, I have a large code that works well with openmpi 1.6.3 (see config.log here: http://www.giref.ulaval.ca/~ericc/bug_openmpi/config.log_nodebug) (I have used it successfully for reading with MPI I/O over 1500 procs with very large files). However, when I use openmpi compiled

Re: [OMPI users] assert in opal_datatype_is_contiguous_memory_layout

2013-04-05 Thread Eric Chamberland
le-g=yes So, is this a wrong "assert" in openmpi? Is there a real problem with using this code in "release" mode? Thanks, Eric On 04/05/2013 12:57 PM, Eric Chamberland wrote: Hi all, I have a large code that works well with openmpi 1.6.3 (see config.log h

[OMPI users] Misuse or bug with nested types?

2013-04-22 Thread Eric Chamberland
Hi, I have a problem receiving a vector of an MPI datatype constructed via MPI_Type_create_struct. It looks like MPI_Send or MPI_Recv doesn't work as expected: some parts of a nested struct in the received buffer are not filled at all! I tested the code under mpich 3.0.3 and it worked
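The actual code is attached to the original post and not reproduced in the archive; the sketch below only illustrates the general MPI_Type_create_struct pattern for a nested struct, with made-up field names:

/* Illustrative (made-up) nested struct built with MPI_Type_create_struct;
 * the key points are that the inner datatype is itself a struct type, that
 * offsets are taken with offsetof() to account for padding, and that the
 * outer type is resized to the true extent of the C struct. */
#include <mpi.h>
#include <stddef.h>

typedef struct { int   id;  double x[3]; } Inner;
typedef struct { Inner in;  long   tag;  } Outer;

static MPI_Datatype build_outer_type(void)
{
    /* Inner type: one int followed by three doubles. */
    MPI_Datatype inner_t;
    int          in_len[2]  = { 1, 3 };
    MPI_Aint     in_disp[2] = { offsetof(Inner, id), offsetof(Inner, x) };
    MPI_Datatype in_type[2] = { MPI_INT, MPI_DOUBLE };
    MPI_Type_create_struct(2, in_len, in_disp, in_type, &inner_t);

    /* Outer type nests the inner type; resize so arrays of Outer work. */
    MPI_Datatype outer_t, outer_resized;
    int          out_len[2]  = { 1, 1 };
    MPI_Aint     out_disp[2] = { offsetof(Outer, in), offsetof(Outer, tag) };
    MPI_Datatype out_type[2] = { inner_t, MPI_LONG };
    MPI_Type_create_struct(2, out_len, out_disp, out_type, &outer_t);
    MPI_Type_create_resized(outer_t, 0, (MPI_Aint)sizeof(Outer), &outer_resized);
    MPI_Type_commit(&outer_resized);

    MPI_Type_free(&inner_t);
    MPI_Type_free(&outer_t);
    return outer_resized;
}

Forgetting the resize step (or using mismatched offsets) is a common way to end up with partially filled receive buffers, though whether that was the issue in this thread is not stated in the archive.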

Re: [OMPI users] MPIIO max record size

2013-05-22 Thread Eric Chamberland
I have experienced the same problem... and worse, I have discovered a bug in MPI I/O... look here: http://trac.mpich.org/projects/mpich/ticket/1742 and here: http://www.open-mpi.org/community/lists/users/2012/10/20511.php Eric On 05/21/2013 03:18 PM, Tom Rosmond wrote: Hello: A colleague

Re: [OMPI users] MPIIO max record size

2013-05-22 Thread Eric Chamberland
for all distributions... It has been written by Rob Latham. Maybe some developers could confirm this? Eric T. Rosmond On Wed, 2013-05-22 at 11:21 -0400, Eric Chamberland wrote: I have experienced the same problem... and worse, I have discovered a bug in MPI I/O... look here: http

Re: [OMPI users] MPIIO max record size

2013-05-22 Thread Eric Chamberland
On 05/22/2013 12:37 PM, Ralph Castain wrote: Well, ROMIO was written by Argonne/MPICH (unfair to point the finger solely at Rob) and picked up by pretty much everyone. The issue isn't a bug in MPIIO, but rather Ok, sorry about that! Thanks for the historical and technical information!

[OMPI users] MPI process hangs if OpenMPI is compiled with --enable-thread-multiple -- part II

2013-12-02 Thread Eric Chamberland
Hi, I just opened a new "chapter" with the same subject. ;-) We are using OpenMPI 1.6.5 (compiled with --enable-thread-multiple) with PETSc 3.4.3 (on the Colosse supercomputer: http://www.calculquebec.ca/en/resources/compute-servers/colosse). We observed a deadlock with threads within the openib

[OMPI users] opal_os_dirpath_create: Error: Unable to create the, sub-directory

2014-02-03 Thread Eric Chamberland
Hi, with OpenMPI 1.6.3 I have encountered this error which "randomly" appears: [compile:20089] opal_os_dirpath_create: Error: Unable to create the sub-directory (/tmp/openmpi-sessions-cmpbib@compile_0/55528/0) of (/tmp/openmpi-sessions-cmpbib@compile_0/55528/0/0), mkdir failed [1]

Re: [OMPI users] opal_os_dirpath_create: Error: Unable to create the, sub-directory

2014-02-03 Thread Eric Chamberland
On 02/03/2014 02:49 PM, Ralph Castain wrote: Seems rather odd - is your /tmp by any chance network mounted? No, it is a "normal" /tmp: "cd /tmp; df -h ." gives: Filesystem Size Used Avail Use% Mounted on /dev/sda1 49G 17G 30G 37% / And there is plenty of disk space... I

Re: [OMPI users] opal_os_dirpath_create: Error: Unable to create the, sub-directory

2014-02-03 Thread Eric Chamberland
ve the error message to tell there is a file with the same name as the directory chosen? Or add a new entry to the FAQ to help users find the workaround you proposed... ;-) Thanks again! Eric HTH Ralph On Feb 3, 2014, at 12:31 PM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> wrot

Re: [OMPI users] opal_os_dirpath_create: Error: Unable to create the, sub-directory

2014-02-03 Thread Eric Chamberland
Hi Ralph, On 02/03/2014 04:20 PM, Ralph Castain wrote: On Feb 3, 2014, at 1:13 PM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> wrote: On 02/03/2014 03:59 PM, Ralph Castain wrote: Very strange - even if you kill the job with SIGTERM, or have processes that segfault, OMPI

Re: [OMPI users] opal_os_dirpath_create: Error: Unable to create the, sub-directory

2014-02-20 Thread Eric Chamberland
Hi Ralph, some new information about this "bug": we got a defective disk on this computer! Then filesystem errors occurred... The disk was replaced 2 days ago and everything seems to work well (the problem had re-occurred since the last time I wrote about it). Sorry for bothering! Eric

[OMPI users] Newbie question about MPI_Wait vs MPI_Waitany

2012-02-29 Thread Eric Chamberland
Hi, I would like to know which of "waitone" vs "waitany" is optimal and, of course, will never produce deadlocks. Let's say we have "lNp" processes and they want to send an array of ints of length "lNbInt" to process "0" with a non-blocking MPI_Isend (instead of MPI_Gather). Let's say the
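A root-side sketch of the pattern being asked about (lNp and lNbInt are the names from the post; the buffers, tag, and communicator are assumptions), using MPI_Waitany so each message can be consumed as soon as it completes:

/* Rank 0 posts one MPI_Irecv per sender and completes them with MPI_Waitany,
 * processing each buffer in whatever order the messages actually arrive. */
#include <mpi.h>
#include <stdlib.h>

void gather_with_waitany(int lNp, int lNbInt, int **buffers /* [lNp][lNbInt] */)
{
    MPI_Request *reqs = malloc((size_t)(lNp - 1) * sizeof(MPI_Request));

    for (int p = 1; p < lNp; ++p)
        MPI_Irecv(buffers[p], lNbInt, MPI_INT, p, /*tag=*/0,
                  MPI_COMM_WORLD, &reqs[p - 1]);

    for (int done = 0; done < lNp - 1; ++done) {
        int idx;
        MPI_Waitany(lNp - 1, reqs, &idx, MPI_STATUS_IGNORE);
        /* buffers[idx + 1] is now complete and can be consumed here,
         * without waiting for the slower senders. */
    }
    free(reqs);
}

Both variants are deadlock-free as long as every request is eventually completed by some MPI_Wait*/MPI_Test* call; the difference is only in the order in which the root gets to process the incoming buffers.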

Re: [OMPI users] Newbie question about MPI_Wait vs MPI_Waitany

2012-03-11 Thread Eric Chamberland
On 2012-03-09 11:16, Jeffrey Squyres wrote: Sorry for the delay. Answers inline. No problem, thank you for taking the time to read the long example... #4- MPI_WAIT_ANY_VERSION always received the data from processes on the same host. I'm not sure what you mean by that statement.

[OMPI users] 2 GB limitation of MPI_File_write_all

2012-10-19 Thread Eric Chamberland
Hi, I get this error when trying to write 360,000,000,000 MPI_LONG elements: with Openmpi-1.4.5: ERROR Returned by MPI_File_write_all: 35 ERROR_string Returned by MPI_File_write_all: MPI_ERR_IO: input/output error with Openmpi-1.6.2: ERROR Returned by MPI_File_write_all: 13 ERROR_string Returned by
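360,000,000,000 elements is far beyond the 2^31-1 limit of the int count argument (and, per the MPICH ticket referenced elsewhere in this archive, beyond the internal int byte counts older ROMIO versions used). One common workaround, sketched below under the assumption that every rank performs the same number of collective calls, is to chunk the write:

/* Sketch (not the fix eventually adopted upstream): split the write into
 * chunks small enough that both the element count and the byte count of each
 * MPI_File_write_at_all call stay well below 2^31. */
#include <mpi.h>

int write_longs_chunked(MPI_File fh, MPI_Offset file_off_bytes,
                        const long *buf, MPI_Offset nelems)
{
    /* 128M longs = 1 GiB per call: comfortably below both limits. */
    const MPI_Offset chunk = 128 * 1024 * 1024;

    /* NOTE: MPI_File_write_at_all is collective, so in a real code every
     * rank must execute the same number of iterations (e.g. agree on the
     * maximum chunk count first); that bookkeeping is omitted here. */
    for (MPI_Offset done = 0; done < nelems; done += chunk) {
        MPI_Offset n = (nelems - done < chunk) ? nelems - done : chunk;
        int rc = MPI_File_write_at_all(fh,
                     file_off_bytes + done * (MPI_Offset)sizeof(long),
                     (void *)(buf + done), (int)n, MPI_LONG,
                     MPI_STATUS_IGNORE);
        if (rc != MPI_SUCCESS) return rc;
    }
    return MPI_SUCCESS;
}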

Re: [OMPI users] 2 GB limitation of MPI_File_write_all

2012-10-21 Thread Eric Chamberland
/02/8100.php http://www.open-mpi.org/community/lists/users/2010/11/14816.php Also, not MPI but C. I wonder if you need to declare "size" as 'long int', or maybe 'long long int', to represent/hold correctly the large value that you want (360,000,000,000 > 2,147,483,647). I hope this

[OMPI users] Lustre hints via environment variables/runtime parameters

2012-12-01 Thread Eric Chamberland
Hi, I am using openmpi 1.6.3 with lustre. I can change the stripe count via "striping_unit" but if I try to change the stripe size via "striping_factor", all my options are ignored and fall back to the default values. Here is what I do: = setenv ROMIO_HINTS
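An equivalent way to pass the same hints without the ROMIO_HINTS file is an MPI_Info object at open time; a sketch, using the hint names as clarified in the reply below (the values are only examples, and striping hints only take effect when the file is created):

/* Pass the Lustre striping hints programmatically instead of via a
 * ROMIO_HINTS file.  striping_unit = stripe size in bytes,
 * striping_factor = number of OSTs to stripe over. */
#include <mpi.h>

MPI_File open_with_lustre_hints(const char *path)
{
    MPI_Info info;
    MPI_Info_create(&info);
    MPI_Info_set(info, "striping_unit",   "1048576"); /* 1 MiB stripe size   */
    MPI_Info_set(info, "striping_factor", "16");      /* stripe over 16 OSTs */

    MPI_File fh;
    MPI_File_open(MPI_COMM_WORLD, (char *)path,
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);
    MPI_Info_free(&info);
    return fh;
}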

Re: [OMPI users] Lustre hints via environment variables/runtime parameters

2012-12-03 Thread Eric Chamberland
On 12/03/2012 03:23 AM, pascal.dev...@bull.net wrote: try with: striping_unit 1048576 striping_factor 16 (stripe_size means striping_unit and stripe_count means striping_factor) Shame on me! ;-) Thank you, it works perfectly now!... Eric

Re: [OMPI users] Romio and OpenMPI builds

2012-12-03 Thread Eric Chamberland
On 12/03/2012 05:37 PM, Brock Palen wrote: I was trying to use hints with ROMIO and lustre, prompted by another post on this list. I have a simple MPI-IO code and, using the notes I found, I cannot set the lustre striping using the config file and setting ROMIO_HINTS. Question: How can I

Re: [OMPI users] Romio and OpenMPI builds

2012-12-07 Thread Eric Chamberland
direction, but I am not an "expert"... some expert advice would be welcome. Eric Brock Palen www.umich.edu/~brockp CAEN Advanced Computing bro...@umich.edu (734)936-1985 On Dec 3, 2012, at 7:12 PM, Eric Chamberland wrote: On 12/03/2012 05:37 PM, Brock Palen wrote: I was

[OMPI users] Invalid filename?

2013-01-21 Thread Eric Chamberland
Hi, If you try to open a file with a ":" in the filename (e.g., "file:o"), you get an MPI_ERR_NO_SUCH_FILE. ERROR Returned by MPI: 42 ERROR_string Returned by MPI: MPI_ERR_NO_SUCH_FILE: no such file or directory Just launch the simple test code attached to see the problem. MPICH has the
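The colon behaviour comes from ROMIO parsing anything before the first ':' as a filesystem prefix (such as "ufs:" or "nfs:"). An untested sketch of a possible workaround is to supply an explicit, recognized prefix so the remainder of the string, colon included, is used as the file name:

/* "file:o" alone is parsed as filesystem type "file" plus name "o", which
 * fails.  Prefixing a known filesystem type may leave "file:o" to be used
 * literally as the name -- this is an assumption, not verified behaviour. */
#include <mpi.h>

int open_file_with_colon(MPI_File *fh)
{
    return MPI_File_open(MPI_COMM_WORLD, "ufs:file:o",
                         MPI_MODE_CREATE | MPI_MODE_RDWR,
                         MPI_INFO_NULL, fh);
}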

Re: [OMPI users] Invalid filename?

2013-01-21 Thread Eric Chamberland
On 01/21/2013 01:00 PM, Reuti wrote: although you can create such files in Linux, it's not portable. http://en.wikipedia.org/wiki/Filename (Reserved characters and words) Best is to use only characters from POSIX portable character set for filenames. Especially as this syntax with a colon is

[OMPI users] Questions about non-blocking collective calls...

2015-10-21 Thread Eric Chamberland
Hi, A long time ago (in 2002) we implemented a non-blocking MPI_Igather here with equivalent calls to MPI_Isend/MPI_Irecv (see the 2 attached files). A very convenient advantage of this version is that I can do some work on the root process as soon as it starts receiving data... Then, it

Re: [OMPI users] Questions about non-blocking collective calls...

2015-10-22 Thread Eric Chamberland
Hi Gilles and Josh, I think my reply applies to both of your answers, which I thank you for. On 21/10/15 08:31 PM, Gilles Gouaillardet wrote: Eric, #2 maybe not ... a tree-based approach has O(log(n)) scaling (compared to O(n) scaling with your linear method). So at scale, MPI_Igather will

Re: [OMPI users] Questions about non-blocking collective calls...

2015-12-17 Thread Eric Chamberland
Hi Gilles, On 2015-10-21 20:31, Gilles Gouaillardet wrote: #3 difficult question ... first, keep in mind there is currently no progress thread in Open MPI. That means messages can be received only when MPI_Wait* or MPI_Test* is invoked. You might hope messages are received when doing some
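Given that there is no progress thread (as explained above), overlap has to be driven from the application side; a sketch of the usual pattern of poking the library with MPI_Test between chunks of work (the request and the compute_chunk stub are placeholders):

/* With no progress thread, a pending non-blocking operation only advances
 * inside MPI calls, so interleave computation with occasional MPI_Test
 * calls to keep the message moving while the work is being done. */
#include <mpi.h>

static void compute_chunk(int i) { (void)i; /* placeholder for real work */ }

void overlap(MPI_Request *req, int nchunks)
{
    int done = 0;
    for (int i = 0; i < nchunks; ++i) {
        compute_chunk(i);                            /* real work          */
        if (!done)
            MPI_Test(req, &done, MPI_STATUS_IGNORE); /* drive progression  */
    }
    if (!done)
        MPI_Wait(req, MPI_STATUS_IGNORE);            /* ensure completion  */
}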

Re: [OMPI users] Questions about non-blocking collective calls...

2015-12-17 Thread Eric Chamberland
On 2015-12-17 12:45, Jeff Squyres (jsquyres) wrote: On Dec 17, 2015, at 8:57 AM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> wrote: But I would like to know if the MPI I am using is able to do message progression or not: So how can an end-user like me know that? Does-i

[OMPI users] Continuous integration question...

2016-06-22 Thread Eric Chamberland
Hi, I would like to compile+test our code each night with the "latest" openmpi v2 release (or the nightly, if stable enough). Just to ease the process, I would like to "wget" the latest archive with a "permanent" link... Is it feasible for you to just put a symlink or something like it so I

Re: [OMPI users] Continuous integration question...

2016-06-22 Thread Eric Chamberland
On 22/06/16 01:49 PM, Jeff Squyres (jsquyres) wrote: We have a similar mechanism already (that is used by the Open MPI community for nightly regression testing), but with the advantage that it will give you a unique download filename (vs. "openmpi-v2.x-latest.bz2" every night). Do this:

Re: [OMPI users] Continuous integration question...

2016-06-22 Thread Eric Chamberland
Excellent! I will put all in place, then try both URLs and see which one is "manageable" for me! Thanks, Eric On 22/06/16 02:10 PM, Jeff Squyres (jsquyres) wrote: On Jun 22, 2016, at 2:06 PM, Eric Chamberland <eric.chamberl...@giref.ulaval.ca> wrote: We have a similar m

[OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
Hi, just testing the 3.x branch... I launch: mpirun -n 8 echo "hello" and I get: -- There are not enough slots available in the system to satisfy the 8 slots that were requested by the application: echo Either request

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
Ok, here it is: === first, with -n 8: === mpirun -mca ras_base_verbose 10 --display-allocation -n 8 echo "Hello" [zorg:22429] [[INVALID],INVALID] plm:rsh_lookup on agent ssh : rsh path NULL [zorg:22429] plm:base:set_hnp_name: initial bias 22429 nodename hash

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
Oh, I forgot something important: since OpenMPI 1.8.x I am using: export OMPI_MCA_hwloc_base_binding_policy=none Also, I have been exporting this since 1.6.x(?): export OMPI_MCA_mpi_yield_when_idle=1 Eric On 25/04/17 04:31 PM, Eric Chamberland wrote: Ok, here it is: === first

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
ed by excluding the localhost RAS component by specifying # the value "^localhost" [without the quotes] to the "ras" MCA # parameter). (15:53:52) [zorg]:~> Thanks! Eric On 25/04/17 03:52 PM, r...@open-mpi.org wrote: What is in your hostfile? On Apr 25, 2017, at 11:39 AM, Er

Re: [OMPI users] In preparation for 3.x

2017-04-25 Thread Eric Chamberland
On 25/04/17 04:36 PM, r...@open-mpi.org wrote: add --oversubscribe to the cmd line Good, it works! :) Is there an environment variable equivalent to the --oversubscribe argument? I can't find this option in the related FAQ entries, should it be added here? :

[OMPI users] SLURM seems to ignore --output-filename option of OpenMPI

2019-09-30 Thread Eric Chamberland via users
Hi, I am using OpenMPI 3.1.2 with slurm 17.11.12 and it looks like I can't have the "--output-filename" option taken into account. All my outputs are going into slurm's output files. Can it be imposed or ignored by a slurm configuration? How is it possible to bypass that? Strangely, the

Re: [OMPI users] SLURM seems to ignore --output-filename option of OpenMPI

2019-10-10 Thread Eric Chamberland via users
Thanks, Eric On 2019-09-30 at 3:34 p.m., Eric Chamberland via users wrote: Hi, I am using OpenMPI 3.1.2 with slurm 17.11.12 and it looks like I can't have the "--output-filename" option taken into account. All my outputs are going into slurm's output files. Can it be imposed o

[OMPI users] Error code for I/O operations

2021-06-30 Thread Eric Chamberland via users
; on those error codes? Thanks, Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Université Laval (418) 656-2131 poste 41 22 42

[OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Eric Chamberland via users
supported? Thanks, Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Université Laval (418) 656-2131 poste 41 22 42

Re: [OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Eric Chamberland via users
From: users On Behalf Of Eric Chamberland via users Sent: Thursday, September 23, 2021 9:28 AM To: Open MPI Users Cc: Eric Chamberland ; Vivien Clauzon Subject: [OMPI users] Status of pNFS, CephFS and MPI I/O Hi, I am looking around for information about parallel filesystems supported for M

[OMPI users] Segfault in ucp_dt_pack function from UCX library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI 4.0.3 and 4.1.2

2022-06-01 Thread Eric Chamberland via users
k I could export the data for a 512-process reproducer with a PARMetis call only... Thanks for helping, Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Université Laval (418) 656-2131 poste 41 22 42

Re: [OMPI users] Segfault in ucp_dt_pack function from UCX library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI 4.0.3 and 4.1.2

2022-06-02 Thread Eric Chamberland via users
problems test with OMPI-5.0.x? Regarding the application, at some point it invokes MPI_Alltoallv sending more than 2GB to some of the ranks (using derived datatypes), right? //WBR, Mikhail From: users On Behalf Of Eric Chamberland via users Sent: Thursday, June 2, 2022 5:31
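For reference, the usual "derived datatype" trick being alluded to: when the number of basic elements destined for one rank would overflow the int count argument, they can be grouped into a contiguous derived type so the count stays small. This is only an illustration of the idea (with an assumed block size and a plain MPI_Send instead of the application's MPI_Alltoallv), not the application's code:

/* Send 'nelems' doubles, where nelems may exceed INT_MAX, as nelems/BLOCK
 * elements of a BLOCK-wide contiguous type (remainder handling omitted). */
#include <mpi.h>

#define BLOCK 1048576

int send_large(const double *buf, MPI_Offset nelems, int dest, MPI_Comm comm)
{
    MPI_Datatype blk;
    MPI_Type_contiguous(BLOCK, MPI_DOUBLE, &blk);
    MPI_Type_commit(&blk);

    int count = (int)(nelems / BLOCK);   /* assumed to fit in an int now */
    int rc = MPI_Send((void *)buf, count, blk, dest, /*tag=*/0, comm);

    MPI_Type_free(&blk);
    return rc;
}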

Re: [OMPI users] Segfault in ucp_dt_pack function from UCX library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI 4.0.3 and 4.1.2

2022-06-02 Thread Eric Chamberland via users
pecific call, but I am not sure it is sending 2GB to a specific rank; maybe it has 2GB divided between many ranks. The fact is that this part of the code, when it works, does not create such a bump in memory usage... But I have to dig a bit more... Regards, Eric //WBR, Mikhail From: users

Re: [OMPI users] Segfault in ucp_dt_pack function from UCX library 1.8.0 and 1.11.2 for large sized communications using both OpenMPI 4.0.3 and 4.1.2

2022-06-10 Thread Eric Chamberland via users
Eric On 2022-06-01 23:31, Eric Chamberland via users wrote: Hi, In the past, we have successfully launched large-sized (finite element) computations using PARMetis as the mesh partitioner. It was first in 2012 with OpenMPI (v2.?) and again in March 2019 with OpenMPI 3.1.2 that we succeed

Re: [OMPI users] MPI I/O, Romio vs Ompio on GPFS

2022-06-11 Thread Eric Chamberland via users
module ompio What else can I do to dig into this? Are there parameters ompio is aware of with GPFS? Thanks, Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Université Laval (418) 656-2131 poste 41 22 42 On 2022-06-10 16:23, Eric Chamberland via users wrote: Hi,

[OMPI users] MPI I/O, ROMIO and showing io mca parameters at run-time

2022-06-10 Thread Eric Chamberland via users
"ompi_info --all" gives me... I have tried this: mpiexec --mca io romio321  --mca mca_verbose 1  --mca mpi_show_mca_params 1 --mca io_base_verbose 1 ... But I cannot see anything about io coming out... With "ompi_info" I do... Is it possible? Thanks, Eric --

[OMPI users] CephFS and striping_factor

2022-11-28 Thread Eric Chamberland via users
e/cephfs/libcephfs.h Is it a possible future enhancement? Thanks, Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Université Laval (418) 656-2131 poste 41 22 42

[OMPI users] How to force striping_factor (on lustre or other FS)?

2022-11-25 Thread Eric Chamberland via users
d   objid   group    67   357367195 0x154cfd9b    0 but I still get only a striping_factor of 1 on the created file... Thanks, Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Université Laval (418) 656-2131 poste 41 22 42