Again, John, I'm not convinced your last statement is true. However, I think it
is "good enough" for now as it seems to work for you and it isn't seen outside
of a debugger scenario.
On Nov 12, 2019, at 3:13 PM, John DelSignore via devel
<devel@lists.open-mpi.org> wrote:
Hi Austen,
Thanks very much, the issues you show below do indeed describe what I am seeing.
Using printfs and breakpoints I inserted into the _release_fn() function, I was
able to see that with OMPI 4.0.1, at most one of the MPI processes called the
function. Most of the time rank 0 would be
George beat me to the response - I agree entirely with his statement. Let's not
go down a dead end here.
Personally, I have never been entirely comfortable with the claim that the PMIx
modification was the solution to the problem being discussed here. We have
never seen a report of an
As indicated by this discussion, the proper usage of volatile is certainly
misunderstood.
However, our use of volatile in this particular
instance is correct and valid even in multi-threaded cases. We are using it
for a __single__ trigger, __one-way__ synchronization similar
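For illustration, a minimal sketch of that single-trigger pattern, with
hypothetical names (this is not the actual Open MPI code):

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

/* One-shot flag: written exactly once by the signaling thread,
   only read by the waiting thread. */
static volatile int event_active = 1;

/* Hypothetical callback, e.g. fired from a progress loop. */
static void release_fn(void)
{
    event_active = 0;   /* the single trigger: fires exactly once */
}

static void *signaler(void *arg)
{
    (void)arg;
    sleep(1);           /* stand-in for the real event */
    release_fn();
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, signaler, NULL);

    /* volatile forces a fresh load on every iteration. */
    while (event_active)
        usleep(100);

    pthread_join(t, NULL);
    puts("released");
    return 0;
}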
I agree that the use of volatile is insufficient if we want to adhere to
proper multi-threaded programming standards:
"Note that volatile variables are not suitable for communication between
threads; they do not offer atomicity, synchronization, or memory ordering.
A read from a volatile
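For comparison, a sketch of the same one-shot wait written with C11 atomics,
which is the direction the quoted advice points in (names are illustrative,
not the Open MPI code):

#include <stdatomic.h>
#include <stdbool.h>
#include <unistd.h>

static atomic_bool event_active = true;

/* Writer: a release store makes all prior writes visible to a
   reader that acquires the flag. */
void release_fn(void)
{
    atomic_store_explicit(&event_active, false, memory_order_release);
}

/* Waiter: the acquire load pairs with the release store above. */
void wait_for_release(void)
{
    while (atomic_load_explicit(&event_active, memory_order_acquire))
        usleep(100);
}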
"allowing us to weakly synchronize two threads" concerns me if the
synchronization is important or must be reliable. I do not understand how
volatile alone provides reliable synchronization without a mechanism to order
visible changes to memory. If the flag(s) in question are supposed to
If the issue was some kind of memory consistency problem between threads, then
printing that variable in the context of the debugger would show the value
of debugger_event_active being false.
volatile is not a memory barrier; it simply forces a load for each access
of the data, allowing us to weakly
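A sketch of the gap being described: volatile keeps the compiler from caching
the flag, but by itself nothing orders a payload write relative to the flag
write; that takes a fence or a release/acquire pair. (Strictly conforming C11
would also make the flag itself atomic, as in the atomics sketch above; names
here are illustrative.)

#include <stdatomic.h>

int payload;                /* data the flag is meant to publish */
volatile int ready = 0;

void writer(void)
{
    payload = 42;
    /* Without this fence, the store to 'ready' may become visible
       before the store to 'payload' on weakly ordered hardware. */
    atomic_thread_fence(memory_order_release);
    ready = 1;
}

int reader(void)
{
    while (!ready)          /* volatile: reloaded on every pass */
        ;
    atomic_thread_fence(memory_order_acquire);
    return payload;         /* with the fences, observes 42 */
}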
I think you are hitting this issue here in 4.0.1:
https://github.com/open-mpi/ompi/issues/6613
MPIR was broken in 4.0.1 due to a race condition in PMIx. It looks to me
like it was patched for 4.0.2. Here is the openpmix issue:
https://github.com/openpmix/openpmix/issues/1189
I think this lines up
Yes that was an omission on my part.
Regarding volatile being sufficient - I don't think that is the case in all
situations. It might work under most conditions - but it can lead to the
"it works on my machine..." type of bugs. In particular it doesn't
guarantee that the waiting thread will ever
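The proposal itself is not shown in this thread, but a condition-variable
wait is presumably the shape of the "pthread method" referred to later; a
hedged sketch with illustrative names (a later message explains why it cannot
be used here, since the waiter must keep driving opal_progress):

#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static bool event_active = true;

void release_fn(void)
{
    pthread_mutex_lock(&lock);
    event_active = false;
    pthread_cond_signal(&cond);   /* wakes the waiter reliably */
    pthread_mutex_unlock(&lock);
}

void wait_for_release(void)
{
    pthread_mutex_lock(&lock);
    while (event_active)
        pthread_cond_wait(&cond, &lock);
    pthread_mutex_unlock(&lock);
}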
Hi Austen,
Thanks for the reply. What I am seeing is consistent with your thought, in that
when I see the hang, one or more processes did not have a flag updated. I don't
understand how the Open MPI code works well enough to say if it is a memory
barrier problem or not. It almost looks like a
Just to be clear as well: you cannot use the pthread method you propose because
you must loop over opal_progress - the "usleep" is in there simply to avoid
consuming 100% CPU while we wait.
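For illustration, a sketch of the kind of loop being described; the real
OMPI_LAZY_WAIT_FOR_COMPLETION macro lives in the Open MPI sources and may
differ in detail:

#include <unistd.h>

extern void opal_progress(void);  /* Open MPI's progress engine
                                     (declaration sketched here) */

/* Spin until the (volatile) flag is cleared by a callback that
   opal_progress() itself drives; sleep briefly each pass so the
   wait does not consume 100% CPU. */
#define LAZY_WAIT_FOR_COMPLETION(flag)  \
    do {                                \
        while (flag) {                  \
            opal_progress();            \
            usleep(100);                \
        }                               \
    } while (0)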
On Nov 12, 2019, at 8:52 AM, George Bosilca via devel
<devel@lists.open-mpi.org> wrote:
I don't think there is a need for any protection around that variable. It will
change value only once (in a callback triggered from opal_progress), and
the volatile guarantees that loads will be issued for every access, so the
waiting thread will eventually notice the change.
George.
On Tue, Nov
Could it be that some processes are not seeing the flag get updated? I
don't think just using a simple while loop with a volatile variable is
sufficient in all cases in a multi-threaded environment. It's my
understanding that the volatile keyword just tells the compiler to not
optimize or do
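A sketch of what the keyword does and does not buy (illustrative only):

/* Without volatile, the compiler may hoist the load out of the
   loop, effectively turning this into 'if (!flag) for (;;);',
   an infinite spin even after another thread sets the flag. */
int flag_plain;

void spin_broken(void)
{
    while (!flag_plain)
        ;               /* load may happen only once */
}

/* With volatile, every iteration performs a real load, so a store
   from another thread is eventually observed. But volatile still
   gives no atomicity and no memory ordering. */
volatile int flag_volatile;

void spin_visible(void)
{
    while (!flag_volatile)
        ;               /* fresh load each iteration */
}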
> On Nov 11, 2019, at 4:53 PM, Gilles Gouaillardet via devel
> <devel@lists.open-mpi.org> wrote:
>
> John,
>
> OMPI_LAZY_WAIT_FOR_COMPLETION(active)
>
>
> is a simple loop that periodically checks the (volatile) "active" condition,
> which is expected to be updated by another thread.
> So if you set your breakpoint