On Thu, Jan 07, 2021 at 12:01:23AM +0000, sighoya via Digitalmars-d-learn wrote:
> On Wednesday, 6 January 2021 at 21:27:59 UTC, H. S. Teoh wrote:
[...]
> > You don't need to box anything.  The unique type ID already tells
> > you what type the context is, whether it's integer or pointer and
> > what the type of the latter is.
> 
> The question is how can a type id as integer value do that, is there
> any mask to retrieve this kind of information from the type id field,
> e.g. the first three bits say something about the context data type or
> did we use some kind of log2n hashing of the typeid to retrieve that
> kind of information.

Your catch block either knows exactly what type value(s) it's looking
for, or it's just a generic catch for all errors.

In the former case, you already know at compile-time how to interpret
the context information, and can cast it directly to the correct type.
(This can, of course, be implicitly inserted by the compiler.)

In the latter case, you don't actually care what the interpretation is,
so it doesn't matter.  The most you might want to do in this case is to
generate some string error message; this could be implemented in various
ways. If the type field is a pointer to a static global, it could be a
pointer to a function that takes the context argument and returns a
string, for example. Of course, it can also be a pointer to a static
global struct containing more information, if needed.
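
A minimal sketch of the "pointer to a static global struct" idea,
under stated assumptions: `FastError` stands in for the proposed Error
struct (renamed so it does not clash with `object.Error`), and
`ErrorType`, `describeCode`, `describeMessage`, `codeError`,
`messageError`, `diskFull` and `toMessage` are hypothetical names, not
part of any existing API. The address of each static `ErrorType`
doubles as the unique type ID.

```d
import std.conv : to;

// Hypothetical per-type descriptor living in static storage.
struct ErrorType
{
    string name;
    string function(size_t context) describe; // knows how to read context
}

// Stand-in for the proposed 2-pointer Error struct.
struct FastError
{
    immutable(ErrorType)* type; // unique ID: address of a static global
    size_t context;             // type-erased: an integer or a pointer
}

// For this type, context is a plain integer code.
string describeCode(size_t context)
{
    return "error code " ~ context.to!string;
}

// For this type, context is a pointer to a string in static storage.
string describeMessage(size_t context)
{
    return *cast(immutable(string)*) context;
}

immutable ErrorType codeError = ErrorType("codeError", &describeCode);
immutable ErrorType messageError = ErrorType("messageError", &describeMessage);

immutable string diskFull = "disk full";

// What a generic catch-all could do without knowing the concrete type.
string toMessage(FastError e)
{
    return e.type.name ~ ": " ~ e.type.describe(e.context);
}
```

A generic handler would call `toMessage(FastError(&codeError, 42))`
and never look at `context` itself; only the descriptor does.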


> > When you `throw` something, this is what is returned from the
> > function.  To propagate it, you just return it, using the usual
> > function return mechanisms.  It's "zero-cost" because the cost is
> > exactly the same as normal returns from a function.
> 
> Except that bit check after each call is required which is negligible
> for some function calls, but it's summing up rapidly for the whole
> amount of modularization.

But you already have to do that if you're checking error codes after the
function call.  The traditional implementation of exceptions doesn't
incur this particular overhead, but introduces (many!) others.

Optimizers are constrained, for example, when a particular function call
may throw (under the traditional unwinding implementation): they cannot
assume control flow will always return to the caller.  Handling the
exception by returning the error using normal function return mechanisms
allows the optimizer to assume control always returns to the caller,
which enables certain optimizations not possible otherwise.
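
To make the "throw is just a return" idea concrete, here is a
hand-written sketch of what the lowered code might look like. It is an
assumption about the shape of the generated code, not the actual
proposal: `FastError` stands in for the Error struct, and `Outcome`,
`divideByZero`, `divide` and `percentage` are hypothetical names. A
real compiler would presumably pass the error in spare registers plus a
flag bit rather than in a wrapper struct.

```d
// Stand-in for the proposed Error struct.
struct FastError
{
    size_t type;    // 0 means "no error" in this sketch
    size_t context; // type-erased payload
}

enum size_t divideByZero = 1; // hypothetical error ID

// Hypothetical return wrapper: the value plus the error slot.
struct Outcome
{
    int value;
    FastError error;

    bool failed() const @nogc nothrow { return error.type != 0; }
}

// "throw" lowered to an ordinary return.
Outcome divide(int a, int b) @nogc nothrow
{
    if (b == 0)
        return Outcome(0, FastError(divideByZero, 0));
    return Outcome(a / b);
}

// Propagation: check after the call, then return the same error upward.
Outcome percentage(int part, int whole) @nogc nothrow
{
    auto r = divide(part * 100, whole);
    if (r.failed)
        return r; // control still returns to our caller the normal way
    return Outcome(r.value);
}
```

Every path out of `percentage` is a plain return, so the optimizer can
treat the call to `divide` like any other call.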


> Further, the space for the return value in the caller needs to be
> widened in some cases.

Perhaps. But this should not be a big problem if the error type is at
most 2 pointers big. Most common architectures like x86 have plenty of
registers that can be used for this purpose.


> > Only if you want to use traditional dynamically-allocated
> > exceptions. If you only need error codes, no polymorphism is needed.
> 
> Checking the bit flag is runtime polymorphism, checking the type field
> against the catches is runtime polymorphism, checking what the typeid
> tells about the context type is runtime polymorphism. Checking the
> type of information behind the context pointer in case of non error
> codes is runtime polymorphism.

The catch block either knows exactly what error types it's catching, or
it's a generic catch-all.

In the former case, it already knows at compile-time what type the
context field is. So no runtime polymorphism there. Unless the error
type indicates a traditional exception class hierarchy, in which case
the context field can just be a pointer to the exception object and you
can use the traditional RTTI mechanisms to get at the information.
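
As a rough illustration of the former case, a catch block for one
specific error type could lower to a comparison followed by a cast that
is fixed at compile time. Everything named here is hypothetical
(`FastError` as a stand-in for the Error struct, `errnoError`,
`parseError`, `ParseInfo`, `catchParseError`); it is a sketch under
those assumptions.

```d
// Stand-in for the proposed Error struct.
struct FastError
{
    size_t type;
    size_t context;
}

enum size_t errnoError = 1; // context is a plain integer code
enum size_t parseError = 2; // context is a pointer to a ParseInfo

struct ParseInfo
{
    int line;
    int column;
}

// Lowered form of a hypothetical catch block for parseError.
// Returns false when the error is not ours and must be propagated.
bool catchParseError(FastError e, out ParseInfo info) @nogc nothrow
{
    if (e.type != parseError)
        return false;
    // The ID matched, so the interpretation of context is known
    // statically: no RTTI lookup is involved.
    info = *cast(const(ParseInfo)*) e.context;
    return true;
}
```

The only runtime work is the integer comparison on `e.type`.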

In the latter case, you don't care what the context field is anyway, or
only want to perform some standard operation like convert to string, as
described earlier. I suppose that's runtime polymorphism, but it's
optional.


> The only difference is it is coded somewhat more low level and is a
> bit more compact than a class object.
> What if we use structs for exceptions where the first field is the
> type and the second field the string message pointer/or error code?

That's exactly what struct Error is.


[...]
> > Turns out, the latter is not quite so simple in practice.  In order
> > to properly destroy objects on the way up to the catch block, you
> > need to store information about what to destroy somewhere.
> 
> I can't imagine why this is different in your case, this is generally
> the problem of exception handling independent of the underlying
> mechanism. Once the pointer of the first landing pad is known, the
> control flow continues as known before until the next error is thrown.

The difference is that for unwinding you need to duplicate / reflect
this information outside the function body, and you're constrained in
how you use the runtime stack (it must follow some standard stack frame
format so that the unwinder knows how to unwind it).

If exceptions are handled by normal function return mechanisms, the
optimizer is more free to change the way it uses the stack -- you can
omit stack frames for functions that don't need it, for instance. And
you don't need to duplicate dtor knowledge outside of the function body:
the function just exits via the usual return mechanism that already
handles the destruction of local variables. You don't even need to know
where the catch blocks are: this is already encoded into the catching
function via the error bit check after the function call. The exception
table can be completely elided.
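
A small sketch of the destructor point, assuming hypothetical names
throughout (`FastError` as a stand-in for the Error struct,
`openFailed`, `cleanups`, `TempBuffer`, `openFile`, `process`). The
error path is an ordinary early return, so the local's destructor runs
through the same mechanism as on the success path.

```d
// Stand-in for the proposed Error struct; type 0 means "no error".
struct FastError
{
    size_t type;
    size_t context;
}

enum size_t openFailed = 1; // hypothetical error ID

int cleanups; // counts destructor runs (thread-local by default in D)

struct TempBuffer
{
    int id;
    ~this() @nogc nothrow { if (id != 0) ++cleanups; }
}

FastError openFile(bool succeed) @nogc nothrow
{
    return succeed ? FastError.init : FastError(openFailed, 0);
}

FastError process(bool succeed) @nogc nothrow
{
    auto buf = TempBuffer(1);
    auto err = openFile(succeed);
    if (err.type != 0)
        return err; // buf is destroyed here, as on any other return
    return FastError.init;
}
```

After `process(false)` the counter has gone up by one, exactly as it
does after `process(true)`; no side table describing `buf` is needed.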


[...]
> > Why would you want to insert it into a list?  The context field is a
> > type-erased pointer-sized value. It may not even be a pointer.
> 
> Good point, I don't know if anyone tries to gather errors in an
> intermediate list which is passed to certain handlers. Sometimes
> exceptions are used as control flow elements though that isn't good
> practice.

Exceptions should never be used as control flow.  That's definitely a
code smell. :-D

But anyway, if you ever want to store errors in a list, just store the
entire Error struct.  It's only 2 pointers long, and includes all the
information necessary to interpret it.
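
For instance, a sketch along these lines (all names hypothetical:
`FastError` as a stand-in for the Error struct, `negativeInput`,
`check`, `collectErrors`) gathers errors by value into an array:

```d
// Stand-in for the proposed Error struct; type 0 means "no error".
struct FastError
{
    size_t type;
    size_t context;
}

// Two machine words, so copying it into a list is cheap.
static assert(FastError.sizeof == 2 * size_t.sizeof);

enum size_t negativeInput = 1; // hypothetical error ID

// Reports the index of the offending element in the context field.
FastError check(size_t index, int x) @nogc nothrow
{
    return x < 0 ? FastError(negativeInput, index) : FastError.init;
}

FastError[] collectErrors(const(int)[] inputs)
{
    FastError[] errors;
    foreach (i, x; inputs)
    {
        auto e = check(i, x);
        if (e.type != 0)
            errors ~= e; // store the whole struct, not a pointer to it
    }
    return errors;
}
```

Note that appending to a dynamic array allocates with the GC; a @nogc
caller would need a fixed-size buffer instead.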


> > It's not about class vs. non-class (though Error being a struct
> > rather than a class is important for @nogc support). It's about how
> > exception throwing is handled.  The current stack unwinding
> > implementation is too heavyweight for what it does; we want it
> > replaced with something simpler and more pay-as-you-go.
> 
> I agree, that fast exceptions are worthwhile for certain areas as
> opt-in, but I don't want them to replace non-fast exceptions because
> of the runtime impact of normal running code.

It will *improve* normal running code.

Please note that the proposed mechanism does NOT exclude traditional
class-based exceptions. All you need is to reserve a specific Error.type
value to mean "class-based exception", and store the class reference in
Error.context:

        enum classBasedException = ... /* some magic value */;

        // This:
        throw new Exception(...);

        // Gets translated to this:
        Error e;
        e.type = classBasedException;
        e.context = cast(size_t) cast(void*) new Exception(...);
        return e;

        // ... then in the catch block, this:
        catch(MyExceptionSubclass e) {
                handleError(e);
        }

        // gets translated to this:
        catch(Error e) {
                if (e.type == classBasedException) {
                        auto ex = cast(Exception) cast(void*) e.context;
                        auto mex = cast(MyExceptionSubclass) ex; // query RTTI
                        if (mex !is null) {
                                handleError(mex);
                                goto next;
                        }
                }
                ... // propagate to next catch block or return e
        }
        next: // continue normal control flow

Nothing breaks in traditional class-based exception code. You get, for
free, the elimination of external tables for libunwind, as well as
better optimizer friendliness.

And you get a really cheap code path if you opt to use error codes
instead of class objects.  *And* it works for @nogc.


[...]
> > This is about *replacing* the entire exception handling mechanism,
> > not adding another alternative (which would make things even more
> > complicated and heavyweight for no good reason).
> 
> Oh, no please not. Interestingly we don't use longjmp in default
> exception handling, but that would be a good alternative to Herb
> Sutter’s proposal because exceptions are likewise faster, but have
> likewise an impact on normal running code in case a new landing pad
> have to be registered.  But interestingly, the occurrence of this is
> much more seldom than checking the return value after each function.
[...]

I don't understand why you would need to register a new landing pad.
There is no need to register anything; catch blocks become just part of
the function body and are automatically handled as part of the function
call mechanism.

The reason we generally don't use longjmp is that it doesn't unwind
the stack properly (does not destruct local variables that need
destruction). You *could* make it work, e.g., each function pushes dtor
code onto a global list of dtors, and the setjmp handler just runs all
the dtors in this list.  But that just brings us back to the same
performance problems that libunwind has, just implemented differently.
(Every function has to push/pop dtors to the global list, for instance.
That's a LOT of overhead, and is very cache-unfriendly. Even libunwind
does better than this.)


T

-- 
VI = Visual Irritation
