Alexis King <lexi.lam...@gmail.com> writes: > Hi all, > > As some of you are likely aware, I have an open GHC proposal[1] to add > native support for delimited continuations to the RTS. I also have an > incomplete implementation,[2] and the only major remaining obstacle > concerns async exceptions. The issue is subtle, so I’ve tried to > provide all the necessary context in this email. If you’re already > familiar with the ideas at play, you can skip the context about how > delimited continuations work. > ...
> This tiny optimization actually allows not one but two savings: > > 1. The immediate benefit is that we save a stack frame. > > 2. The hidden benefit is that we never need to push the old > exception masking state onto the stack. > > If we had multiple mask_ frames on the stack simultaneously, we > wouldn’t know what to do when returning: should we unmask them, > or should they stay masked? We’d need to push that information > onto the stack in the same way we must push a return address > when calling a function. > > By skipping these redundant stack frames, we can always be > certain the right thing to do on return is to unmask async > exceptions. No need to store anything else. > Indeed. However, I don't think it would be that expensive to store the masking state with a suitable implementation strategy. Specifically, don't add a "masking state" field to the return frame. Rather, I think you want to have `mask` push one of two possible return frame types: * in the case that we are already masked push an "already masked" frame. The entry code for this would be a no-op. * in the case that we aren't currently masked push the usual stg_maskAsyncExceptionszh_ret_info return frame. This incurs minimal overhead while preserving the information that you need. We avoid adding a word to the stack frame (likely saving an instruction) and in cases where today the optimsation fires the user would only pay for a return to a trivial entry frame. It seems to me that this is as close to free as we will get. Under this scheme `control` can simply look for "already masked" frames when copying the stack. > (This explanation is a slight simplification, especially once > maskUninterruptible comes into play, but it’s close enough.) > > Now you may see the looming problem: this strategy completely breaks > down in the presence of delimited continuations. With delimited > continuations, we might have a program like this: > > mask_ $ prompt $ mask_ $ ... > > If we capture a continuation up to this prompt, we expect the inner > mask_ frame to be captured along with it. But that won’t happen if we > never pushed a mask_ frame at all due to the aforementioned > optimization. > > So now I can finally state my question: what is the right solution for > this? I see three obvious possible ways forward: > > 1. Keep track of whether or not we’re inside a prompt and skip the > optimization in that case. > > 2. Keep some bookkeeping that tracks the modifications to the > async exception masking state since the most recently pushed > prompt. > > 3. Just don’t bother with the optimization at all. > > Option 3 certainly seems the most appealing from a simplicity point of > view, and I suspect the optimization doesn’t matter much in practice. > Why? Because the real `mask` implementation from Control.Exception > already avoids re-masking exceptions if they’re masked! (And that’s > okay, because prompt# and control0# are not intended to be used > directly in IO, so code that uses them can provide its own version of > `mask`.) However, it is admittedly possible for the restore action > passed to the argument of `mask` to create redundant calls, as the > check is only performed in `mask` itself. > > Is eliminating this optimization an acceptable compromise? Or is there > reason to believe this is important for performance of real programs? > It never hurts to measure. Perhaps establish the worst-case performance impact by looking at a simple program which just masks repeatedly. For what it's worth, I rather doubt that this optimisation is going to make or break any program. Exception operations in general are somewhat rare in my experience and the cost of pushing a stack frame (or, perhaps more importantly on modern hardware, returning through it) should be reasonably cheap. By the way, when you go to implement this do add a few comments to the masking primitives. It took a bit of staring to work out what was going on in there. Cheers, - Ben
signature.asc
Description: PGP signature
_______________________________________________ ghc-devs mailing list ghc-devs@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/ghc-devs