On 2008-09-10, David Roundy <[EMAIL PROTECTED]> wrote:
> On Wed, Sep 10, 2008 at 03:30:50PM +0200, Jed Brown wrote:
>> On Wed 2008-09-10 09:05, David Roundy wrote:
>> > I should point out, however, that in my experience MPI programming
>> > involves deadlocks and synchronization handling that are at least as
>> > nasty as any I've run into doing shared-memory threading.
>>
>> Absolutely, avoiding deadlock is the first priority (before error
>> handling). If you use the non-blocking interface, you have to be very
>> conscious of whether a buffer is being used or the call has completed.
>> Regardless, the API requires the programmer to maintain a very clear
>> distinction between locally owned and remote memory.
>
> Even with the blocking interface, you had subtle bugs that I found
> pretty tricky to deal with. E.g. the reduce functions in lam3 (or was
> it lam4) at one point didn't actually manage to result in the same
> values on all nodes (with differences caused by roundoff error), which
> led to rare deadlocks when it so happened that two nodes disagreed as
> to when a loop was completed. Perhaps someone made the mistake of
> assuming that addition was associative, or maybe it was something
> triggered by the non-IEEE floating point we were using. But in any
> case, it was pretty nasty. And it was precisely the kind of bug that
> won't show up except when you're doing something like MPI where you
> are pretty much forced to assume that the same (pure!) computation has
> the same effect on each node.
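To make Jed's point about the non-blocking interface concrete, here is a
minimal C sketch (not from either mail; the function name send_row and
the buffer are made up, only the MPI calls themselves are standard).
The send buffer belongs to MPI until MPI_Wait reports that the request
has completed:

    #include <mpi.h>

    /* Post a non-blocking send and wait for it before reusing the buffer. */
    void send_row(double *buf, int n, int peer, MPI_Comm comm)
    {
        MPI_Request req;
        MPI_Isend(buf, n, MPI_DOUBLE, peer, 0 /* tag */, comm, &req);

        /* Writing to buf here would be a bug: the send may still be
           reading from it. */

        MPI_Wait(&req, MPI_STATUS_IGNORE);  /* only now is buf safe to reuse */
    }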
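The deadlock David describes comes from a termination test that depends
on a reduction. A rough sketch of the pattern (again made up, not LAM's
actual code; do_one_iteration stands in for whatever local work produces
a residual): if the reduction does not return bit-identical results on
every rank, ranks sitting right at the tolerance can disagree about
whether to go around the loop again, and the next collective call hangs.

    #include <mpi.h>

    double do_one_iteration(void);   /* stand-in for the local work */

    void solve(MPI_Comm comm)
    {
        double local_err, global_err;
        do {
            local_err = do_one_iteration();
            /* Every rank must see exactly the same global_err here;
               a reduce that combines values in a different order on
               different ranks can return slightly different sums. */
            MPI_Allreduce(&local_err, &global_err, 1, MPI_DOUBLE,
                          MPI_SUM, comm);
        } while (global_err > 1e-10);
    }

Floating-point addition is not associative, so a reduction that combines
partial sums in a different order on different ranks can legitimately
produce slightly different answers; the property a loop like this relies
on is that all ranks at least see the same answer.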
Ah, okay. I think that's a real edge case, and probably not how most
people use MPI. I've used both threads and MPI; MPI, while cumbersome,
never gave me any hard-to-debug deadlock problems.

-- 
Aaron Denney
-><-