Re: [OMPI users] mca_sharedfp_lockfile issues

2021-11-02 Thread Gabriel, Edgar via users
What file system are you running your code on? And is the same directory shared across all nodes? I have seen this error when users try to use a non-shared directory for MPI I/O operations (e.g. /tmp, which is a different drive/folder on each node). Thanks Edgar
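
A quick way to check that the output directory really is the same shared mount everywhere (the path and node count below are placeholders) is to have mpirun report the backing file system once per node:

    # a shared file system should report the same device/export on every node,
    # while a local directory such as /tmp will differ from node to node
    mpirun --map-by node -np <number_of_nodes> df -h /path/to/output_dir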

Re: [OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Gabriel, Edgar via users
collective I/O for example).

Re: [OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Gabriel, Edgar via users
data in order to improve the performance of their code, since most I/O patterns do not require this super-strict locking behavior. This is the fs_ufs_lock_algorithm parameter. Thanks Edgar
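
The parameter, its current value, and its accepted settings can be inspected with ompi_info, e.g.:

    # show the parameters of the ufs fs component, including fs_ufs_lock_algorithm
    ompi_info --param fs ufs --level 9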

Re: [OMPI users] Status of pNFS, CephFS and MPI I/O

2021-09-23 Thread Gabriel, Edgar via users
Eric, generally speaking, ompio should be able to operate correctly on all file systems that have support for POSIX functions. The generic ufs component is, for example, being used on BeeGFS parallel file systems without problems; we are using that on a daily basis. For GPFS, the only reason we

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-19 Thread Gabriel, Edgar via users
work on an update of the FAQ section.

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-15 Thread Gabriel, Edgar via users
I would like to correct one of my statements: > The entire infrastructure

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-15 Thread Gabriel, Edgar via users
> How should we know that's expected to fail? It at least shouldn

Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-14 Thread Gabriel, Edgar via users
I will have a look at those tests. The recent fixes were not correctness fixes, but performance fixes. Nevertheless, we used to pass the mpich tests, but I admit that it is not a testsuite that we run regularly; I will have a look at them. The atomicity tests are expected to fail, since this is the one c

Re: [OMPI users] Parallel HDF5 low performance

2020-12-03 Thread Gabriel, Edgar via users
The reasons for potential performance issues on NFS are very different from Lustre. Basically, depending on your use case and the NFS configuration, you have to enforce a different locking policy to ensure correct output files. The default value chosen for ompio is the most conservative setting
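
If that conservative default turns out to be the bottleneck on a given NFS setup, the policy can be overridden at run time. The value shown below is only an assumption; check its meaning with ompi_info for the installed version before relying on it:

    # hypothetical example: relax the ompio locking policy on NFS
    # (verify the value meanings with "ompi_info --param fs ufs --level 9")
    mpirun --mca fs_ufs_lock_algorithm 3 -np 16 ./my_io_app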

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-26 Thread Gabriel, Edgar via users
I will have a look at the t_bigio tests on Lustre with ompio. We had some reports from collaborators about performance problems similar to the one that you mentioned here (which was the reason we were hesitant to make ompio the default on Lustre), but part of the problem is that we were not

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-16 Thread Gabriel, Edgar via users
the --with-lustre option twice, once inside the "--with-io-romio-flags=" (along with the option that you provided), and once outside (for ompio). Thanks Edgar
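
Put together, that might look roughly like the following sketch; <existing-romio-flags> stands for whatever ROMIO flags were already being passed, and other configure options go in as usual:

    # sketch only: pass --with-lustre once for ompio (outside) and once for
    # ROMIO (inside --with-io-romio-flags)
    ./configure --with-lustre \
        --with-io-romio-flags="--with-lustre <existing-romio-flags>" \
        <other configure options>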

Re: [OMPI users] MPI-IO on Lustre - OMPIO or ROMIO?

2020-11-16 Thread Gabriel, Edgar via users
This is in theory still correct: the default MPI I/O library used by Open MPI on Lustre file systems is ROMIO in all release versions. That being said, ompio also has support for Lustre starting from the 2.1 series, so you can use that as well. The main reason that we did not switch to
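
Selecting ompio instead of the ROMIO default on Lustre is a run-time switch; for example (the application name is a placeholder):

    # ask Open MPI to use the ompio implementation of MPI I/O
    mpirun --mca io ompio -np 16 ./my_mpiio_app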

Re: [OMPI users] ompe support for filesystems

2020-11-04 Thread Gabriel, Edgar via users
The ompio software infrastructure has multiple frameworks.
- fs framework: abstracts out file-system-level operations (open, close, etc.)
- fbtl framework: provides the abstractions and implementations of *individual* file I/O operations (seek, read, write, iread, iwrite)
- fcoll framework: provides the
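
The components actually built for each of these frameworks can be listed from an installation, for example:

    # list the fs, fbtl, fcoll (and sharedfp) components available in this build
    ompi_info | grep -E "MCA (fs|fbtl|fcoll|sharedfp)"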

Re: [OMPI users] MPI I/O question using MPI_File_write_shared

2020-06-05 Thread Gabriel, Edgar via users
Your code looks correct, and based on your output I would actually suspect that the I/O part finished correctly; the error message that you see is not an I/O error, but comes from the btl (which is communication related). What version of Open MPI are you using, and on what file system? Thanks Edgar

Re: [OMPI users] Slow collective MPI File IO

2020-04-06 Thread Gabriel, Edgar via users
achieved in this scenario, as long as the number of processes is moderate. Thanks Edgar

Re: [OMPI users] Slow collective MPI File IO

2020-04-06 Thread Gabriel, Edgar via users
Hi, A couple of comments. First, if you use MPI_File_write_at, this is usually not considered collective I/O, even if executed by multiple processes. MPI_File_write_at_all would be collective I/O. Second, MPI I/O cannot do ‘magic’, but is bound by the hardware that you are providing. If already a
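
A minimal sketch of the difference (file name and buffer size are arbitrary): both calls write each rank's block at an explicit offset, but only the _all variant is collective and lets the library coordinate the ranks.

    #include <mpi.h>

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int N = 1024;                 /* elements per rank (arbitrary) */
        int buf[1024];
        for (int i = 0; i < N; i++) buf[i] = rank;

        MPI_File fh;
        MPI_File_open(MPI_COMM_WORLD, "out.dat",
                      MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

        MPI_Offset offset = (MPI_Offset)rank * N * sizeof(int);

        /* independent I/O: each rank writes on its own, no coordination */
        /* MPI_File_write_at(fh, offset, buf, N, MPI_INT, MPI_STATUS_IGNORE); */

        /* collective I/O: every rank must call this, which lets the library
           aggregate the requests (the fcoll optimizations in ompio's case) */
        MPI_File_write_at_all(fh, offset, buf, N, MPI_INT, MPI_STATUS_IGNORE);

        MPI_File_close(&fh);
        MPI_Finalize();
        return 0;
    }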

Re: [OMPI users] How to prevent linking in GPFS when it is present

2020-03-30 Thread Gabriel, Edgar via users
ompio only recently added support for GPFS, and it is only available in master (so far). If you are using any of the released versions of Open MPI (2.x, 3.x, 4.x) you will not find this feature in ompio yet. Thus, the issue is only how to disable gpfs in romio. I could not find right away an optio
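
One candidate (my suggestion, not something from the thread) is ROMIO's --with-file-system flag, passed through Open MPI's configure, which selects the file-system drivers ROMIO builds; leaving gpfs out of the list might achieve this:

    # sketch: build ROMIO without its GPFS driver by not listing it
    # (adjust the file-system list to what is actually needed)
    ./configure --with-io-romio-flags="--with-file-system=ufs+nfs" ...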

Re: [OMPI users] Read from file performance degradation whenincreasing number of processors in some cases

2020-03-06 Thread Gabriel, Edgar via users
How is the performance if you leave a few cores for the OS, e.g. running with 60 processes instead of 64? The reasoning being that the file read operation is really executed by the OS, and could potentially be quite resource intensive. Thanks Edgar

Re: [OMPI users] Help with One-Sided Communication: Works in Intel MPI, Fails in Open MPI

2020-02-24 Thread Gabriel, Edgar via users
I am not an expert on the one-sided code in Open MPI, but I wanted to comment briefly on the potential MPI I/O-related item. As far as I can see, the error message “Read -1, expected 48, errno = 1” does not stem from MPI I/O, at least not from the ompio library. What file system did you use for t

Re: [OMPI users] Deadlock in netcdf tests

2019-10-26 Thread Gabriel, Edgar via users
Orion, It might be a good idea. This bug is triggered in the fcoll/two_phase component (and having spent just two minutes looking at it, I have a suspicion what triggers it, namely an int vs. long conversion issue), so it is probably unrelated to the other one. I need to add running the ne
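
As a stop-gap while the fix is pending, one plausible workaround (my assumption, not something confirmed in the thread) is to steer ompio away from the two_phase component at run time:

    # workaround sketch: select a different collective-I/O (fcoll) component;
    # check "ompi_info | grep 'MCA fcoll'" for the names available in your build
    # (the component and test binary below are only placeholders)
    mpirun --mca fcoll dynamic -np 8 ./netcdf_parallel_test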

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gabriel, Edgar via users
Never mind, I see it in the backtrace :-) Will look into it, but am currently traveling. Until then, Gilles' suggestion is probably the right approach. Thanks Edgar

Re: [OMPI users] Deadlock in netcdf tests

2019-10-25 Thread Gabriel, Edgar via users
Orion, I will look into this problem. Is there a specific code or testcase that triggers it? Thanks Edgar

Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3

2019-02-21 Thread Gabriel, Edgar

Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3

2019-02-21 Thread Gabriel, Edgar

Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3

2019-02-20 Thread Gabriel, Edgar

Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3

2019-02-18 Thread Gabriel, Edgar
> I will also run our testsuite and the HDF5 testsuite on GPFS, I have access

Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3

2019-02-17 Thread Gabriel, Edgar
Ryan Novosielski wrote: > I verified that it makes it through to a bash prompt, but I'm a little less confident that something make test does doesn't clear it. Any recommendation for a way to verify?

Re: [OMPI users] HDF5 1.10.4 "make check" problems w/OpenMPI 3.1.3

2019-02-16 Thread Gabriel, Edgar
What file system are you running on? I will look into this, but it might be later next week. I just wanted to emphasize that we are regularly running the parallel hdf5 tests with ompio, and I am not aware of any outstanding items that do not work (and are supposed to work). That being said, I r

Re: [OMPI users] Building OpenMPI with Lustre support using PGI fails

2018-11-27 Thread Gabriel, Edgar
Gilles submitted a patch for that, and I approved it a couple of days back; I *think* it has not been merged yet, however. This was a bug in the Open MPI Lustre configure logic; it should be fixed after this one: https://github.com/open-mpi/ompi/pull/6080 Thanks Edgar

Re: [OMPI users] ompio on Lustre

2018-10-15 Thread Gabriel, Edgar
Dave, Thank you for your detailed report and testing; that is indeed very helpful. We will definitely have to do something. Here is what I think would be potentially doable. a) if we detect a Lustre file system without flock support, we can print out an error message. Completely disabling MPI I/O

Re: [OMPI users] ompio on Lustre

2018-10-10 Thread Gabriel, Edgar

Re: [OMPI users] ompio on Lustre

2018-10-09 Thread Gabriel, Edgar
in place, but we didn't have the manpower (and requests, to be honest) to finish that work. Thanks Edgar

Re: [OMPI users] ompio on Lustre

2018-10-08 Thread Gabriel, Edgar
Hm, thanks for the report; I will look into this. I did not run the romio tests, but the hdf5 tests are run regularly, and with 3.1.2 you should not have any problems on a regular Unix file system. How many processes did you use, and which tests did you run specifically? The main tests that I execute from

Re: [OMPI users] ompio on Lustre

2018-10-05 Thread Gabriel, Edgar
It was originally for performance reasons, but this should be fixed at this point. I am not aware of correctness problems. However, let me try to clarify your question: what precisely do you mean by "MPI I/O on Lustre mounts without flock"? Was the Lustre file system mounted without flock?
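
Whether flock was enabled can be checked directly in the client's mount options, e.g.:

    # look for "flock" (or "localflock") among the Lustre mount options
    mount -t lustre
    # or equivalently:
    grep lustre /proc/mounts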