Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-18 Thread Gilles Gouaillardet via users

Dave,


On 1/19/2021 2:13 AM, Dave Love via users wrote:


Generally it's not surprising if there's a shortage
of effort when outside contributions seem unwelcome.  I've tried to
contribute several times.  The final attempt wasted two or three days,
after being encouraged to get the port of current romio into a decent
state when it was being done separately "behind the scenes", but that
hasn't been released.


External contributions are not only welcome, they are encouraged.

All Pull Requests will be considered for inclusion upstream (as long as
the commits are properly signed off).

You could not be more wrong on that part, and since you chose to bring
this to the public mailing list, let me recap the facts:



ROMIO is refreshed when needed (and when time allows).

All code changes come from public Pull Requests. For example:

 - ROMIO 3.3.2 refresh (https://github.com/open-mpi/ompi/pull/8249 -
   issued on November 24th, 2020)

 - ROMIO 3.4b1 refresh (https://github.com/open-mpi/ompi/pull/8279 -
   issued on December 10th, 2020)

 - ROMIO 3.4 refresh (https://github.com/open-mpi/ompi/pull/8343 -
   issued on January 6th, 2021)



On the other hand, this is what you did:

On December 2nd you wrote to the ML:

> In the meantime I've hacked in romio from mpich-4.3b1 without really
> understanding what I'm doing;

and finally posted a link to your code on December 11th (and detailed a
shortcut you took), before deleting your repository (!) around December
16th.


Unless I missed it, you never issued a Pull Request.





It took some time to figure out that upstream ROMIO 3.3.2 did not pass
the HDF5 test on Lustre, and that a newer ROMIO (3.4b1 at that time, 3.4
now) had to be used in order to fix the issue in the long term.

All the heavy lifting was already done in #8249, very likely before you
even started hacking, and moving to 3.4b1 and then 3.4 was very
straightforward.


The ROMIO 3.4 refresh will be merged into the master branch once properly
tested and reviewed, and the goal is to have it available in Open MPI 5.

ROMIO fixes will be applied to the release branches (they are available
at https://github.com/open-mpi/ompi/pull/8371) once tested and reviewed.



Bottom line, your "hack" is the only one that was actually done behind
the scenes, and it has returned there since.

All pull requests are welcome, with - as far as I am concerned - the
following caveat (besides signed-off commits):


Open MPI is a meritocracy.

If you had issued a proper PR (you did not, but chose to post a - now
broken - link to your code instead), it would likely have been rejected
based on its (lack of) merits.


There are many ways to contribute to Open MPI, and in this case,
testing/discussing the Pull Requests/Issues on GitHub would have been
(and will be) very helpful to the Open MPI community.

On the contrary, ranting and bragging on a public ML are - in my not so
humble opinion - counterproductive, but I have a pretty high threshold
for this kind of BS. However, I have a much lower threshold for your
gross mischaracterization of the Open MPI community, its values, and how
the work gets done.




Cheers,


Gilles



Re: [OMPI users] 4.1 mpi-io test failures on lustre

2021-01-18 Thread Dave Love via users
"Gabriel, Edgar via users"  writes:

>> How should we know that's expected to fail?  It at least shouldn't fail like 
>> that; set_atomicity doesn't return an error (which the test is prepared for 
>> on a filesystem like pvfs2).  
>> I assume doing nothing, but appearing to, can lead to corrupt data, and I'm 
>> surprised that isn't being seen already.
>> HDF5 requires atomicity -- at least to pass its tests -- so presumably 
>> anyone like us who needs it should use something mpich-based with recent or 
>> old romio, and that sounds like most general HPC systems.  
>> Am I missing something?
>> With the current romio everything I tried worked, but we don't get that 
>> option with openmpi.
>
> First of all, it is mentioned on the FAQ sites of Open MPI, although
> admittedly it is not entirely up to date (it also lists external32
> support as missing, which is however now available since 4.1).

Yes, the FAQ was full of confusing obsolete material when I last looked.
Anyway, users can't be expected to check whether any particular
operation is expected to fail silently.  I should have said that
MPI_File_set_atomicity(3) explicitly says the default is true for
multiple nodes, and doesn't say the call is a no-op with the default
implementation.  I don't know whether the MPI spec allows not
implementing it, but I at least expect an error return if it doesn't.
As far as I remember, that's what romio does on a filesystem like pvfs2
(or lustre when people know better than implementers and insist on
noflock); I mis-remembered from before, thinking that ompio would be
changed to do the same.  From that thread, I did think atomicity was on
its way.

Presumably an application requests atomicity for good reason, and can
take appropriate action if the status indicates it's not available on
that filesystem.
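
For what it's worth, here is a minimal sketch of the check I have in
mind, using nothing beyond standard MPI-IO calls (the filename is made
up for illustration): request atomic mode, look at the return code, and
read the setting back with MPI_File_get_atomicity to see what actually
took effect.

  #include <mpi.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      MPI_File fh;
      int rc, flag = 0;

      MPI_Init(&argc, &argv);
      MPI_File_open(MPI_COMM_WORLD, "atomic_test.dat",
                    MPI_MODE_CREATE | MPI_MODE_RDWR, MPI_INFO_NULL, &fh);
      /* be explicit that file errors should be returned, not fatal */
      MPI_File_set_errhandler(fh, MPI_ERRORS_RETURN);

      rc = MPI_File_set_atomicity(fh, 1);   /* request atomic mode */
      MPI_File_get_atomicity(fh, &flag);    /* what did we really get? */

      if (rc != MPI_SUCCESS || !flag)
          fprintf(stderr, "atomic mode not available on this filesystem\n");

      MPI_File_close(&fh);
      MPI_Finalize();
      return 0;
  }

If set_atomicity either returned an error or left the flag off, the
application could then serialise its writes itself or refuse to run,
rather than silently producing possibly-corrupt data.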

> You don't need atomicity for the HDF5 tests, we are passing all of them to 
> the best my knowledge, and this is one of the testsuites that we do run 
> regularly as part of our standard testing process.

I guess we're just better at breaking things.

> I am aware that they have an atomicity test - which we pass for whatever
> reason. This also highlights, btw, the issue(s) that I am having with the
> atomicity option in MPI I/O.

I don't know what the application of atomicity in HDF5 is.  Maybe it
isn't required for typical operations, but I assume it's not used
blithely.  However, I'd have thought HDF5 should be prepared for
something like pvfs2, and at least not abort the test at that stage.

I've learned to be wary of declaring concurrent systems working after a
few tests.  In fact, the phdf5 test failed for me like this when I tried
across four lustre client nodes with 4.1's defaults.  (I'm confused
about the striping involved, because I thought I set it to four, and now
it shows as one on that directory.)

  ...
  Testing  -- dataset atomic updates (atomicity) 
  Proc 9: *** Parallel ERRProc 54: *** Parallel ERROR ***
  VRFY (H5Sset_hyperslab succeeded) failed at line 4293 in t_dset.c
  aborting MPI proceProc 53: *** Parallel ERROR ***

Unfortunately I hadn't turned on backtracing, and I wouldn't get another
job through for a while.

> The entire infrastructure to enforce atomicity is actually in place in ompio,
> and I can give you the option on how to enforce strict atomic behavior for
> all files in ompio (just not on a per-file basis), just be aware that the
> performance will nose-dive. This is not just the case with ompio, but also
> in romio; you can read up on that on various discussion boards, look at
> NFS-related posts (where you need the atomicity for correctness in
> basically all scenarios).

I'm fairly sure I accidentally ran tests successfully on NFS4, at least
single-node.  I never found a good discussion of the topic, and what I
have seen about "NFS" was probably specific to NFS3 and non-POSIX
compliance, though I don't actually care about parallel i/o on NFS.  The
information we got about lustre was direct from Rob Latham, as nothing
showed up online.

I don't like fast-but-wrong, so I think there should be the option of
correctness, especially as it's the documented default.

> Just as another data point, in the 8+ years that ompio has been available, 
> there was not one issue reported related to correctness due to missing the 
> atomicity option.

Yes, I forget some history over the years, like that one on a local
filesystem:
.

> That being said, if you feel more comfortable using romio, it is completely 
> up to you. Open MPI offers this option, and it is incredibly easy to set the 
> default parameters on a platform for all users such that romio is being used.

Unfortunately that option fails the tests.
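
For reference, and with the caveat that the exact component name is an
assumption on my part (romio321 in the 4.x series, as far as I know),
the selection really is just an MCA parameter: "mpirun --mca io
romio321", an "io = romio321" line in a site-wide
etc/openmpi-mca-params.conf, or, as a sketch, from the environment
before MPI_Init:

  #include <mpi.h>
  #include <stdlib.h>

  int main(int argc, char **argv)
  {
      /* MCA parameters are read from OMPI_MCA_* environment variables at
       * init time, so this has to happen before MPI_Init */
      setenv("OMPI_MCA_io", "romio321", 1);
      MPI_Init(&argc, &argv);
      /* ... MPI-IO calls now go through ROMIO rather than OMPIO ... */
      MPI_Finalize();
      return 0;
  }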

> We are doing with our limited resources the best we can, and while ompio is 
> by no means perfect, we try to be responsive to issues reported by users and 
> value constructive feedback and