Re: Semantics of IORefs in GHC

2016-03-14 Thread Ben Lippmeier

> On 14 Mar 2016, at 8:06 pm, Simon Peyton Jones wrote:
> 
> But my rough answer would be: IORefs are really only meant for 
> single-threaded work.  Use STM for concurrent communication.

You can also use atomicModifyIORef for simple things. 
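
As a minimal sketch of what "simple things" might look like, here is a shared
counter bumped from several threads with atomicModifyIORef' (the strict
variant from Data.IORef). The names increment and countConcurrently are made up
for illustration; the MVar is only there so main can wait for the workers.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import Data.IORef (IORef, newIORef, atomicModifyIORef', readIORef)

-- Atomically add 1 to the counter. The strict variant forces the new
-- value, so thunks do not pile up inside the IORef.
increment :: IORef Int -> IO ()
increment ref = atomicModifyIORef' ref (\n -> (n + 1, ()))

-- Hypothetical helper: fork `workers` threads, each incrementing
-- `perWorker` times, then return the final count.
countConcurrently :: Int -> Int -> IO Int
countConcurrently workers perWorker = do
  ref  <- newIORef 0
  done <- newEmptyMVar
  forM_ [1 .. workers] $ \_ -> forkIO $ do
    replicateM_ perWorker (increment ref)
    putMVar done ()
  -- Wait for every worker to signal completion before reading.
  replicateM_ workers (takeMVar done)
  readIORef ref

main :: IO ()
main = countConcurrently 4 1000 >>= print  -- prints 4000
```

With a plain readIORef followed by writeIORef in place of increment, updates
could be lost when two threads interleave.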


> Why can’t GHC tighten the semantics of IORefs so that the bind operation 
> simply means sequential composition?

Part of the problem is that IO encompasses all sorts of computational effects, 
including ones that don’t have a well-defined notion of time. File and network 
effects are also a problem, not just IORefs. The IO instance of bind composes 
two IO computations, but the computations themselves could do anything.

A first step is to split the IO type into more fine grained effects, perhaps 
ones that can be properly sequentialized and those which can’t (or should not 
be). Many people have done work on more expressive effect systems, though no 
system so far has been good enough to want to refactor the GHC base libraries 
using it.

On a more philosophical level, Haskell types are statements in a simple 
predicate logic which does not natively know anything about time. Functions 
don’t know about time either, so it’s a bit odd to ask a functional operator to 
do something sequential (at least relative to the real world). Note that in 
the “awkward squad” paper the functional encoding of the bind operator 
*passes* the world from one place to another -- it is not part of the world, 
and does not act upon the world itself.
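
To make the world-passing point concrete, here is a toy model along the lines
of the paper's "IO a = World -> (a, World)" reading. The names World, ToyIO,
returnToy and bindToy are invented for this sketch; GHC's real IO type is built
differently (on an unboxed state token), so treat this as an illustration only.

```haskell
-- Opaque stand-in for "the state of the world" (hypothetical).
data World = World

-- A toy IO action: a pure function that is handed a world and hands
-- back a result together with a world.
newtype ToyIO a = ToyIO { runToyIO :: World -> (a, World) }

returnToy :: a -> ToyIO a
returnToy x = ToyIO (\w -> (x, w))

-- Bind only threads the world from the first action to the second.
-- It never inspects or changes the world itself.
bindToy :: ToyIO a -> (a -> ToyIO b) -> ToyIO b
bindToy (ToyIO m) k = ToyIO $ \w0 ->
  let (a, w1) = m w0
  in  runToyIO (k a) w1

main :: IO ()
main = print (fst (runToyIO prog World))  -- prints 21
  where
    prog = returnToy (20 :: Int) `bindToy` \x -> returnToy (x + 1)
```

Nothing in bindToy mentions time or ordering of real-world events; any
ordering comes purely from the data dependency on the world value.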

In recent work on effect systems there are a lot of embeddings of modal logics 
into the ambient Haskell/predicate logic, and the embeddings then suffer an 
encoding overhead. AFAIK the future lies in type systems that natively express 
temporal concepts, rather than needing tricky encodings of them, but we’re not 
there yet.

Ben.


___
ghc-devs mailing list
ghc-devs@haskell.org
http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs


Re: Semantics of IORefs in GHC

2016-03-14 Thread Ryan Newton
Hi Madan,

Yes, GHC, like Java, is an "unsafe language" currently ;-).  It does not
protect its abstractions against buggy programs.  Namely, neither Haskell nor
Java protects semicolon/(>>).

Really, we should probably have some additional monad to distinguish
data-race-free (DRF) IO programs from other IO programs that may have data
races.

Or we could just make sequential consistency the law of the land for IO, as
you propose.  One thing I'd really like to work on -- if someone were
interested in collaborating -- is testing the overhead of making sequential
composition the norm in this way.

I had a nice conversation with your SNAPL co-author Satish about this
recently.  It seems like we could survey a broad swath of Haskell code to
see if fine-grained use of "writeIORef" and "readIORef" is at all common.
My hunch is that IORefs used in concurrent apps (web servers, etc.) all use
atomicModifyIORef anyway.

Thus like you I think we could fence read/writeIORef with acceptable perf
for most real apps.  You can still have "reallyUnsafeReadIORef" for
specific purposes like implementing concurrent data structures.
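
For a rough feel of what "fenced" accessors could look like from user code
today, here is a sketch built only on Data.IORef. The names fencedWriteIORef
and fencedReadIORef are hypothetical, and this is not how a fence would be
implemented inside GHC: the read goes through atomicModifyIORef', which is
heavier than a plain fence and also forces the stored value to weak head
normal form.

```haskell
import Data.IORef (IORef, newIORef, atomicWriteIORef, atomicModifyIORef')

-- Hypothetical fenced write: atomicWriteIORef is documented as having the
-- same "barrier to reordering" property as atomicModifyIORef.
fencedWriteIORef :: IORef a -> a -> IO ()
fencedWriteIORef = atomicWriteIORef

-- Hypothetical fenced read: an atomic modify that stores the value back
-- unchanged and returns it. Unlike readIORef, this forces the contents
-- to WHNF.
fencedReadIORef :: IORef a -> IO a
fencedReadIORef ref = atomicModifyIORef' ref (\x -> (x, x))

main :: IO ()
main = do
  ref <- newIORef "initial"
  fencedWriteIORef ref "published"
  fencedReadIORef ref >>= putStrLn  -- prints published
```

Measuring wrappers like these against plain readIORef/writeIORef would only
give an upper bound on the overhead, since a compiler-inserted fence could be
cheaper than the atomic modify used for the read here.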

Best,
 -Ryan



On Mon, Mar 14, 2016 at 10:06 AM, Simon Peyton Jones <simo...@microsoft.com>
wrote:

> Madan
>
>
>
> I’m glad you enjoyed the awkward squad paper.
>
>
>
> I urge you to write to the Haskell Café mailing list and/or ghc-devs.
> Lots of smart people there.  Ryan Newton is working on this kind of stuff;
> I’ve cc’d him.
>
>
>
> But my rough answer would be: IORefs are really only meant for
> single-threaded work.  Use STM for concurrent communication.
>
>
>
> That’s not to say that we are done!  The Haskell community doesn’t have
> many people like you, who care about the detail of the memory model.  So
> please do help us :).   For example, perhaps we could guarantee a simple
> sequential memory model without much additional cost?
>
>
>
> Simon
>
>
>
> *From:* Madan Musuvathi
> *Sent:* 11 March 2016 19:35
> *To:* Simon Peyton Jones <simo...@microsoft.com>
> *Subject:* Semantics of IORefs in GHC
>
>
>
> Dear Simon,
>
> I really enjoyed reading your awkward squad paper
> <http://research.microsoft.com/en-us/um/people/simonpj/papers/marktoberdorf/mark.pdf>.
> Thank you for writing such an accessible paper.
>
>
>
> My current understanding is that the implementation of IORefs in GHC
> breaks the simple semantics you develop in this paper. In particular, by
> not inserting sufficient fences around reads and writes of IORefs, a
> Haskell program is exposed to the weak-memory-consistency effects of the
> underlying hardware and possibly the backend C compiler. As a result, the
> monadic bind operator no longer has the simple semantics of sequential
> composition. Is my understanding correct?
>
>
>
> This is very troublesome as this weaker semantics can lead to unforeseen
> consequences even in pure functional parts of a program. For example, when
> a reference to an object is passed through an IORef to another thread, the
> latter thread is not guaranteed to see the updates of the first thread. So,
> it is quite possible for some (pure functional) code to be processing
> objects with broken invariants or partially-constructed objects. In the
> extreme, this could lead to type-unsafety unless the GHC compiler is taking
> careful precautions to avoid this. (Many of these problems are unlikely to
> show up on x86 machines, but will be common on ARM.)
>
>
>
> I am sure the GHC community is addressing these problems one way or the
> other. But, my question is WHY?  Why can’t GHC tighten the semantics of
> IORefs so that the bind operation simply means sequential composition?
> Given that Haskell has a clean separation between pure functional parts and
> “awkward” parts of the program, the overheads of these fenced IORefs should
> be acceptable.
>
>
>
> My coauthors and I wrote a recent SNAPL article
> <http://research.microsoft.com/apps/pubs/default.aspx?id=252150> about
> this problem for other (“less-beautiful” :)) imperative languages like C#
> and Java. I really believe we should support sequential composition in our
> programming languages.
>
>
>
> madan
>
>
>


RE: Semantics of IORefs in GHC

2016-03-14 Thread Simon Peyton Jones
Madan

I’m glad you enjoyed the awkward squad paper.

I urge you to write to the Haskell Café mailing list and/or ghc-devs.  Lots of 
smart people there.  Ryan Newton is working on this kind of stuff; I’ve cc’d 
him.

But my rough answer would be: IORefs are really only meant for single-threaded 
work.  Use STM for concurrent communication.
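
As a small sketch of the STM route, here is a one-slot mailbox between two
threads. It assumes the stm package (Control.Concurrent.STM), which ships with
GHC; the names send and receive are made up for this example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
  (TVar, atomically, newTVarIO, readTVar, writeTVar, retry)
import Control.Monad (replicateM)

-- Hypothetical one-slot mailbox: block (via retry) while the slot is full.
send :: TVar (Maybe a) -> a -> IO ()
send box x = atomically $ do
  slot <- readTVar box
  case slot of
    Nothing -> writeTVar box (Just x)
    Just _  -> retry

-- Block while the slot is empty, then take the value and clear the slot.
receive :: TVar (Maybe a) -> IO a
receive box = atomically $ do
  slot <- readTVar box
  case slot of
    Nothing -> retry
    Just x  -> writeTVar box Nothing >> pure x

main :: IO ()
main = do
  box <- newTVarIO Nothing
  _   <- forkIO (mapM_ (send box) [1 .. 3 :: Int])
  xs  <- replicateM 3 (receive box)
  print xs  -- prints [1,2,3]
```

Each transaction appears to happen atomically to other threads, so the
receiver never observes a half-updated slot.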

That’s not to say that we are done!  The Haskell community doesn’t have many 
people like you, who care about the detail of the memory model.  So please do 
help us :).   For example, perhaps we could guarantee a simple sequential 
memory model without much additional cost?

Simon

From: Madan Musuvathi
Sent: 11 March 2016 19:35
To: Simon Peyton Jones <simo...@microsoft.com>
Subject: Semantics of IORefs in GHC

Dear Simon,
I really enjoyed reading your awkward squad paper
<http://research.microsoft.com/en-us/um/people/simonpj/papers/marktoberdorf/mark.pdf>.
Thank you for writing such an accessible paper.

My current understanding is that the implementation of IORefs in GHC breaks the 
simple semantics you develop in this paper. In particular, by not inserting 
sufficient fences around reads and writes of IORefs, a Haskell program is 
exposed to the weak-memory-consistency effects of the underlying hardware and 
possibly the backend C compiler. As a result, the monadic bind operator no 
longer has the simple semantics of sequential composition. Is my understanding 
correct?

This is very troublesome as this weaker semantics can lead to unforeseen 
consequences even in pure functional parts of a program. For example, when a 
reference to an object is passed through an IORef to another thread, the latter 
thread is not guaranteed to see the updates of the first thread. So, it is 
quite possible for some (pure functional) code to be processing objects with 
broken invariants or partially-constructed objects. In the extreme, this could 
lead to type-unsafety unless the GHC compiler is taking careful precautions to 
avoid this. (Many of these problems are unlikely to show up on x86 machines, 
but will be common on ARM.)

I am sure the GHC community is addressing these problems one way or the other. 
But, my question is WHY?  Why can’t GHC tighten the semantics of IORefs so that 
the bind operation simply means sequential composition? Given that Haskell has 
a clean separation between pure functional parts and “awkward” parts of the 
program, the overheads of these fenced IORefs should be acceptable.

My coauthors and I wrote a recent SNAPL article
<http://research.microsoft.com/apps/pubs/default.aspx?id=252150> about 
this problem for other (“less-beautiful” :)) imperative languages like C# and 
Java. I really believe we should support sequential composition in our 
programming languages.

madan
