On Tue, Aug 20, 2013 at 12:15 PM, Jonathan S. Shapiro <[email protected]> wrote:
> Let me add that long-term CLR isn't a target of interest. CLR is now 10
> years old and needs to be replaced in any case.
>

Hmm. I said that badly. In the short term, CLR is a target of interest. I'm not convinced that we should allow its shortcomings to dictate the language design for BitC.

Though I can think of several ways to speed up CLR exceptions, there's really no getting around the fact that it's a slow mechanism in the current implementation. The problem, as several people have noted, is the stack trace mechanism.

There are implementation tricks that would let a CLR-like runtime defer construction of the stack trace. Notably: any catch block that (a) performs a procedure call *and* (b) re-throws or throws a chain of exceptions using the InnerException property must perform its procedure calls on a new stack fragment so that the original stack fragment is not overwritten. Note that forming a new stack fragment is cheap - a whole lot cheaper than trying to copy the existing stack. In fact, with care, it can be *appended* on the *existing* stack:

    +-----------------+
    | Frame with      |
    | try block       |
    +=================+
    | (Orphaned)      |
    ~ Saved exception |
    | Frames          |
    +-----------------+ <- SP from first exception throw
    | New stack       |
    | fragment used   |
    | by calls in the |
    | catch block     |
    +-----------------+

The only part of this that is remotely tricky is the need to implement a "dummy" frame at the top of the catch block stack so that the SP can be properly restored if the calls made from the catch block return without producing further exceptions. If those calls *do* produce further exceptions, then the orphaned stack fragments can be extended, such that each newly thrown exception in the chain appends a new chunk of orphaned frames.

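The appended-fragment layout in the diagram can be mimicked with a toy model. This is only a sketch under loud assumptions: a Python list stands in for the machine stack, its length stands in for the SP, and every name here (ToyStack, enter_try, leave_catch, the "<dummy>" marker) is made up for illustration and is not part of CLR or BitC.

```python
class ToyStack:
    """Toy model: a list of frame labels; len(frames) plays the role of SP."""

    def __init__(self):
        self.frames = []        # oldest frame first
        self.try_sp = None      # SP just above the frame holding the try block
        self.throw_sp = None    # SP at the point of the first throw

    @property
    def sp(self):
        return len(self.frames)

    def call(self, name):
        self.frames.append(name)

    def ret(self):
        return self.frames.pop()

    def enter_try(self, name):
        self.call(name)
        self.try_sp = self.sp

    def throw(self):
        # Nothing is unwound or copied: just remember where the throw happened.
        self.throw_sp = self.sp
        # The dummy frame marks the base of the catch block's new fragment.
        self.call("<dummy>")

    def stack_trace(self):
        # Built lazily from the orphaned frames, and only if someone asks.
        return list(reversed(self.frames[self.try_sp:self.throw_sp]))

    def leave_catch(self):
        # The catch block's calls returned without a further exception:
        # drop the dummy frame and the orphaned frames in one SP adjustment.
        assert self.frames[self.throw_sp] == "<dummy>"
        del self.frames[self.try_sp:]
        self.try_sp = self.throw_sp = None


def demo():
    s = ToyStack()
    s.enter_try("main")
    s.call("parse")
    s.call("read_token")
    s.throw()                   # read_token throws
    s.call("log_error")         # a call made from inside the catch block
    during = list(s.frames)     # snapshot while the catch block is running
    trace = s.stack_trace()     # orphaned frames are still intact
    s.ret()                     # log_error returns normally
    s.leave_catch()
    return during, trace, s.frames
```

In this model the frames between try_sp and throw_sp are the orphaned frames: calls made from the catch block land above them, so a stack trace can still be read out afterwards without having been built at throw time.
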
The only real problem with this is that an *unchained* exception will be newer on the stack than the [earlier] exception it replaces, and we won't be able to reclaim the earlier orphan frames until the later unchained exception is handled. I don't really see that as a major issue.

Alternatively, the stack can be implemented in a distinguished "stack fragment heap", and a more conventional spaghetti stack can be used. In that case the unreachable stack chunks can be copied into the conventional heap at need by the collector, and collected in the usual way. This is a known set of stack implementation techniques due to Appel.

But the main point I'm trying to make here is that if the implementation is done intelligently, the cost of throwing an exception is:

  1. Save any arguments to the exception object.
  2. Save the PC and SP at which the exception is thrown.
  3. Branch to the exceptional return address.

Note that if the stack is going to be handled orphan-style as described above, then the exception object can (and should) be stack allocated and the exception object's *payload* should be implemented in an unboxed type that resides on the orphaned stack. It's actually important here that exceptions cannot be captured to side variables, because that *would* require copying the orphaned stack fragment *unless* the spaghetti stack implementation is used.

Now if all of this is done, here's the cost of returning an exception:

    PUSH arguments to exception constructor
    CALL exception constructor
    CALL exception return address, thereby saving PC and SP

Note that the exception constructor is lexically visible to the compiler, can be inlined, and once inlined is generally a NO-OP (because all the constructor does is put the constructor arguments into exception object slots, and this will all peephole away if the exception constructor is inlined).

So what we really end up with is

    PUSH exception fields to stack-allocated exception object
    CALL exception return address, thereby saving PC and SP

If there are only one or two exception arguments (which is typical, the second being the exception object's type tag), this cost compares very favorably with the cost of normal (or error code) return:

    MOVE return value to return register
    ADD constant to SP, erasing callee frame
    RETURN to caller

and is significantly faster than the pattern in which the caller must discriminate an error return.

So yes, folks, it's certainly possible to design a seriously boogered exception mechanism, and I think both Java and CLR managed to accomplish that. But it's not *necessary* to do so.

In my mind, the real question at hand isn't whether exceptions are slower than error codes. Implemented correctly, they aren't. The real question is whether the admittedly poor performance of CLR should push us to adopt non-exception conventions for BitC standard library design. I think the answer to that should be "no". CLR is, in my view, a transient target of convenience. If we can achieve performance in CLR that is comparable to C#, I think we've met all the goals we need to meet.


Jonathan

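The difference between branching to an exceptional return address and testing an error code after every return can be modelled roughly as follows. This is a sketch under stated assumptions, not a description of any real runtime: a "return address" is modelled as a Python callable, the exception is a (type tag, argument) pair, and all function names are hypothetical.

```python
DIGITS = "0123456789"


def parse_digit_exc(ch, normal_return, exceptional_return):
    """Two return addresses: the callee picks which one to jump to."""
    if ch in DIGITS:
        return normal_return(int(ch))
    # Throw: store the exception fields, then branch to the handler.
    return exceptional_return(("NotADigit", ch))


def sum_digits_exc(text, i=0, total=0):
    """Caller in the two-address style: no status test after the call."""
    if i == len(text):
        return ("ok", total)
    return parse_digit_exc(
        text[i],
        lambda value: sum_digits_exc(text, i + 1, total + value),  # normal path
        lambda exc: ("error", exc[1]),                             # handler
    )


def parse_digit_code(ch):
    """Error-code style: one return path, status packed with the value."""
    if ch in DIGITS:
        return (0, int(ch))
    return (1, ch)


def sum_digits_code(text):
    """Caller in the error-code style: every return is followed by a test."""
    total = 0
    status_tests = 0
    for ch in text:
        status, value = parse_digit_code(ch)
        status_tests += 1           # one discrimination per call, even on success
        if status != 0:
            return ("error", value, status_tests)
        total += value
    return ("ok", total, status_tests)
```

For an all-digit input such as "123", the error-code version performs three status tests that never fire, while the two-address version performs none; that per-call test is the overhead the comparison above refers to. The recursion in sum_digits_exc is only a modelling convenience and limits the sketch to short inputs.
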
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
