Re: Where STM is unstable at the moment, and how we can fix it

Simon Marlow Mon, 01 Sep 2008 02:39:20 -0700

Sterling Clover wrote:

This email is inspired by the discussion here: http://hackage.haskell.org/trac/ghc/ticket/2401
As the ticket discusses, unsafeIOToSTM is, unlike unsafePerformIO or unsafeInterleaveIO, genuinely completely unsafe in that there is no way to use it such that a segfault or deadlock is not at least somewhat encouraged. The code attached to the ticket creates a deadlock solely through using it to write to stdout. But, for the same reason that unsafeIOToSTM is unstable, unsafeInterleaveIO now is very unstable as well -- conceivably, data generated from functions with lazy IO (including those in the prelude) could cause deadlocks within STM, and even segfaults.
In summary, a "validation" step is performed on all threads inside atomically blocks during garbage collection. This validation step will, on encountering invalid threads (i.e. ones which should be rolled back) immediately kill them dead and retry. This is different from the implementation described in the STM paper, where rollbacks only occur on commit. However, it does add a measure of efficiency.
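
A much milder hazard than the deadlock in the ticket, but one that shows why IO inside a transaction is awkward at all: an IO effect performed through unsafeIOToSTM is not undone when the surrounding transaction code is rolled back. The sketch below is only an illustration under assumptions; leakedEffect and doomedBranch are hypothetical names, and it imports the STM primitives from GHC.Conc in base (the stm package re-exports them).

```haskell
import Data.IORef (atomicModifyIORef', newIORef, readIORef)
import GHC.Conc (atomically, newTVarIO, orElse, readTVarIO, retry,
                 unsafeIOToSTM, writeTVar)

-- Hypothetical helper: run one transaction whose first alternative is
-- always abandoned, then report what survived the rollback.
leakedEffect :: IO (Int, Int)
leakedEffect = do
  tv  <- newTVarIO 0
  ref <- newIORef 0
  let doomedBranch = do
        writeTVar tv 1                      -- STM effect: undone on rollback
        unsafeIOToSTM                       -- IO effect: not undone
          (atomicModifyIORef' ref (\n -> (n + 1, ())))
        retry                               -- abandon this alternative
  -- orElse discards doomedBranch's TVar writes and runs the second branch.
  atomically (doomedBranch `orElse` return ())
  tvFinal  <- readTVarIO tv
  refFinal <- readIORef ref
  return (tvFinal, refFinal)
```

With no other threads involved, the TVar still holds 0 afterwards while the IORef holds 1: the TVar write was rolled back, the IO effect was not.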

It's not just an efficiency trick, in fact. The validation step is absolutely necessary for correctness. The problem is that a transaction may have seen an inconsistent view of memory, and as a result it may have gone into an infinite loop; the only way to catch and recover from this situation is to validate at regular intervals, say before a GC (this suffers from the problem that the transaction has to be allocating in order to be stopped, but that's another matter). For example, the code might be something like


  atomically $ do
    a <- readTVar ta
    b <- readTVar tb
    if a == b then loop else return ()

Now, we might know that a is never equal to b under normal conditions: all the transactions in the program satisfy the invariant. However, since we use optimistic concurrency, it might be the case that this thread sees an inconsistent view of memory in which a==b. The case would normally be caught at commit time, but this thread isn't going to commit: it goes into an infinite loop instead.
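
To make the commit-time half of this concrete, here is a minimal sketch, not taken from the ticket. The names bump, sawEqual and countViolations are hypothetical, and the invariant b == a + 1 is an assumption chosen for the example. The reader returns what it saw instead of looping, so a run that observed a == b would fail validation at commit and be re-executed; no committed result should ever be True.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (replicateM, replicateM_)
-- GHC.Conc (in base) exports the same primitives as Control.Concurrent.STM.
import GHC.Conc (STM, TVar, atomically, newTVarIO, readTVar, writeTVar)

-- Writer: every committed transaction preserves b == a + 1, hence a /= b.
bump :: TVar Int -> TVar Int -> STM ()
bump ta tb = do
  a <- readTVar ta
  writeTVar ta (a + 1)
  writeTVar tb (a + 2)

-- Reader: reports whether it saw a == b. Unlike the looping version above,
-- it always reaches the commit, where an inconsistent run is discarded.
sawEqual :: TVar Int -> TVar Int -> STM Bool
sawEqual ta tb = do
  a <- readTVar ta
  b <- readTVar tb
  return (a == b)

-- Run n writer and n reader transactions concurrently and count how many
-- committed reads observed a == b.
countViolations :: Int -> IO Int
countViolations n = do
  ta   <- newTVarIO 0
  tb   <- newTVarIO 1
  done <- newEmptyMVar
  _ <- forkIO $ do
    replicateM_ n (atomically (bump ta tb))
    putMVar done ()
  results <- replicateM n (atomically (sawEqual ta tb))
  takeMVar done
  return (length (filter id results))
```

Replacing `return (a == b)` with a loop on equality removes the commit from that path, which is exactly the case where only periodic validation can rescue the thread.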

As Simon M. notes, the obvious solution would be to turn rollbacks into regular exceptions, but this would open a number of cans of worms.
A start, though not sufficient, would be for stm validation to respect blocked status -- not to block on it, obviously, but simply to refuse to rollback a transaction within it.

That wouldn't be correct, because the thread might be in an infinite loop inside a block. However, it would probably work in the cases you're interested in, so I wouldn't object to a patch that implemented this workaround for the time being.

I do agree that we have a problem here, and I'll re-open the ticket (sorry for leaving it closed). I think raising an (asynchronous) exception is the right solution. We have to make sure the exception cannot be caught by an STM catch, but I think that's do-able.

However, another problem we have is that when the IO system re-raises the exception, it'll be raised as a synchronous exception rather than an asynchronous exception. I've just spent an hour or so talking this over here with Simon PJ and we have some ideas for fixing it; I'll try to write it up in a ticket later.


Cheers,
        Simon

_______________________________________________
Glasgow-haskell-users mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
