On Wed, Jan 06, 2021 at 05:36:07PM +0000, sighoya via Digitalmars-d-learn wrote:
> On Tuesday, 5 January 2021 at 21:46:46 UTC, H. S. Teoh wrote:
> > 4) The universal error type contains two fields: a type field and a
> > context field.
> > 
> >     a) The type field is an ID unique to every thrown exception --
> >     uniqueness can be guaranteed by making this a pointer to some
> >     static global object that the compiler implicitly inserts per
> >     throw statement, so it will be unique even across shared
> >     libraries. The catch block can use this field to determine what
> >     the error was, or it can just call some standard function to
> >     turn this into a string message, print it and abort.
> 
> Why it must be unique? Doesn't it suffice to return the typeid here?

It must be unique because different functions may return different sets
of error codes. If these sets overlap, then once the error propagates up
the call stack it becomes ambiguous which error it is.

Contrived example:

        enum FuncAError { fileNotFound = 1, ioError = 2 }
        enum FuncBError { outOfMem = 1, networkError = 2 }

        int funcA() { throw FuncAError.fileNotFound; }
        int funcB() { throw FuncBError.outOfMem; }

        void main() {
                try {
                        funcA();
                        funcB();
                } catch (Error e) {
                        // without a unique type ID, e carries only the
                        // value 1: cannot distinguish between
                        // FuncAError.fileNotFound and FuncBError.outOfMem
                }
        }

Using the typeid is no good because: (1) typeid in D is a gigantic
historic hack containing cruft that even Walter doesn't fully
understand; (2) when all you want is to return an integer return code,
using typeid is overkill.
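
To make the uniqueness argument concrete, here is a rough sketch in today's D
that models the idea by hand. Everything in it is an assumption for
illustration: the struct is called ErrorValue only to avoid clashing with
D's built-in Error class, and the two tag variables stand in for the static
objects the compiler would insert per throw statement.

```d
enum FuncAError { fileNotFound = 1, ioError = 2 }
enum FuncBError { outOfMem = 1, networkError = 2 }

// Hypothetical stand-ins for the compiler-inserted static objects.
// Only their addresses matter; the contents are never read.
__gshared ubyte fileNotFoundTag;
__gshared ubyte outOfMemTag;

// Hand-written model of the universal error type.
struct ErrorValue
{
    size_t type;     // unique ID: address of a static tag object
    size_t context;  // pointer-sized payload
}

// Both functions carry the raw code 1 in the context field, but the
// type field differs because the tags live at different addresses.
ErrorValue failA()
{
    return ErrorValue(cast(size_t) &fileNotFoundTag,
                      cast(size_t) FuncAError.fileNotFound);
}

ErrorValue failB()
{
    return ErrorValue(cast(size_t) &outOfMemTag,
                      cast(size_t) FuncBError.outOfMem);
}
```

The raw codes collide, but comparing the type fields still tells the two
errors apart, with no typeid involved.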


> >     b) The context field contains exception-specific data that gives
> >     more information about the nature of the specific instance of
> >     the error that occurred, e.g., an integer value, or a pointer to
> >     a string description or block of additional information about
> >     the error (set by the thrower), or even a pointer to a
> >     dynamically-allocated exception object if the user wishes to use
> >     traditional polymorphic exceptions.
> 
> Okay, but in 99% you need dynamically allocated objects because the
> context is most of the time simply unknown.

If the context is sufficiently represented in a pointer-sized integer,
there is no need for allocation at all. E.g., if you're returning an
integer error code.

If you're in @nogc code, you can point to a statically-allocated block
that the throwing code updates with relevant information about the
error, e.g., a struct that contains further details about the error.

If you're using traditional polymorphic exceptions, you already have to
allocate anyway, so this does not add any overhead.
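
A minimal sketch of the first two cases, written against today's D. The
names ErrorValue, IoErrorInfo, lastIoError and the tag variables are all
hypothetical; the tags stand in for whatever unique ID the compiler would
generate.

```d
// Hand-written model of the universal error type (named ErrorValue to
// avoid clashing with D's built-in Error class).
struct ErrorValue
{
    size_t type;
    size_t context;
}

// Hypothetical block of extra details for the @nogc case.
struct IoErrorInfo
{
    int fd;
    int osErrno;
}

__gshared ubyte codeTag;     // stand-in type ID: context is an integer
__gshared ubyte ioErrorTag;  // stand-in type ID: context is an IoErrorInfo*

// Statically allocated (thread-local by default in D); the thrower
// overwrites it, so it only describes the most recent error.
IoErrorInfo lastIoError;

// Case 1: the context fits in a pointer-sized integer, no allocation.
ErrorValue makeCodeError(int code) @nogc nothrow
{
    return ErrorValue(cast(size_t) &codeTag, cast(size_t) code);
}

// Case 2: @nogc code points the context at the static block.
ErrorValue makeIoError(int fd, int osErrno) @nogc nothrow
{
    lastIoError = IoErrorInfo(fd, osErrno);
    return ErrorValue(cast(size_t) &ioErrorTag, cast(size_t) &lastIoError);
}
```

Both functions compile as @nogc nothrow, which is the point: neither path
touches the GC heap.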


> But yes, in specific cases a simple error code suffice, but even then
> it would be better to be aware that an error code is returned instead
> of a runtime object. It sucks to me to box over the context
> pointer/value to find out if it is an error code or not when I only
> want an error code.

You don't need to box anything.  The unique type ID already tells you
what type the context is, whether it's integer or pointer and what the
type of the latter is.
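
Roughly, the catching side could look like the sketch below. It is only an
illustration in today's D: ErrorValue, describe and the two tag variables
are made-up names, and the checks are spelled out by hand where the
compiler would presumably generate them.

```d
import std.conv : to;
import std.string : fromStringz;

// Hand-written model of the universal error type.
struct ErrorValue
{
    size_t type;
    size_t context;
}

__gshared ubyte errnoTag;    // stand-in ID: context holds an int
__gshared ubyte messageTag;  // stand-in ID: context points to a static,
                             // zero-terminated string

// The type field alone decides how the context bits are read, so the
// context needs no wrapper object around it.
string describe(ErrorValue e)
{
    if (e.type == cast(size_t) &errnoTag)
        return "error code " ~ to!string(cast(int) e.context);
    if (e.type == cast(size_t) &messageTag)
        return "message: " ~ fromStringz(cast(immutable(char)*) e.context).idup;
    return "unknown error";
}
```

For example, an ErrorValue built from &errnoTag and 42 is reported as an
integer code, while one built from &messageTag and "disk full".ptr is
reported as a string.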


> >     c) The universal error type is constrained to have trivial move
> >     semantics, i.e., propagating it up the call stack is as simple
> >     as blitting the bytes over. (Any object(s) it points to need not
> >     be thus constrained, though.)
> > 
> > The value semantics of the universal error type ensures that there
> > is no overhead in propagating it up the call stack.  The
> > universality of the universal error type allows it to represent
> > errors of any kind without needing runtime polymorphism, thus
> > eliminating the overhead the current exception implementation
> > incurs.
> 
> So it seems the universal error type just tells me if there is or
> isn't error and checking for it is just a bitflip?

No, it's a struct that represents the error. Basically:

        struct Error {
                size_t type;
                size_t context;
        }

When you `throw` something, this is what is returned from the function.
To propagate it, you just return it, using the usual function return
mechanisms.  It's "zero-cost" because the cost is exactly the same as
normal returns from a function.
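
As a quick sanity check of the "just return it" claim, here is a small
sketch in today's D. The struct is renamed ErrorValue (a hypothetical name,
since Error is already a built-in class in D), and inner/outer are made-up
functions.

```d
// Hand-written model of the two-word error struct.
struct ErrorValue
{
    size_t type;
    size_t context;
}

// Two machine words, no destructor, no postblit, no copy constructor:
// copying it is a plain bit copy.
static assert(ErrorValue.sizeof == 2 * size_t.sizeof);
static assert(__traits(isPOD, ErrorValue));

// Propagation is an ordinary return of the same two words.
ErrorValue inner() { return ErrorValue(1, 7); }
ErrorValue outer() { return inner(); }
```

The static asserts fail at compile time if someone later adds a destructor
or postblit to the struct, which is the constraint that keeps propagation
cheap.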


> > The context field, however, still allows runtime polymorphism to be
> > supported, should the user wish to.
> 
> Which in most of the cases will be required.

Only if you want to use traditional dynamically-allocated exceptions. If
you only need error codes, no polymorphism is needed.


[...]
> > Of course, this was proposed for C++, so a D implementation will
> > probably be somewhat different.  But the underlying thrust is:
> > exceptions become value types by default, thus eliminating most of
> > the overhead associated with the current exception implementation.
> 
> I didn't know exactly how this is implemented in D, but class objects
> are passed as simple pointer and pointers are likewise value types.
> Using value types itself doesn't guarantee anything about performance,
> because the context field of an exception can be anything you need
> some kind of boxing involving runtime polymorphism anyway.

You don't need boxing for POD types. Just store the value directly in
Error.context.


> >  Stack unwinding is replaced by normal function return mechanisms,
> >  which is much more optimizer-friendly.
> 
> I heard that all the time, but why is that true?

The traditional implementation of stack unwinding bypasses normal
function return mechanisms.  It's basically a glorified longjmp() to the
catch block, augmented with the automatic destruction of any objects
that might need destruction on the way up the call stack.

Turns out, the latter is not quite so simple in practice.  In order to
properly destroy objects on the way up to the catch block, you need to
store information about what to destroy somewhere.  You also need to
know where the catch blocks are so that you know where to land. Once you
land, you need to know how to match the exception type to what the catch
block expects, etc. To implement this, every function needs to set up
standard stack frames so that libunwind knows how to unwind the stack.
It also requires exception tables, an LSDA (language-specific data area)
for each function, personality functions, etc.  A whole bunch of heavy
machinery just to get things to work properly.

By contrast, when you return a POD type like the example Error above,
none of that machinery is necessary.  All that's required is:

1) A small ABI addition for an error indicator per function call (to a
throwing function). This can either be a single CPU register, or
probably better, a 1-bit CPU flag that's either set or cleared by the
called function.

2) The addition of a branch in the caller to check this error indicator:
if there's no error, continue as usual; if there's an error, propagate
it (return it) or branch to the catch block.

The catch block then checks the Error.type field to discriminate between
errors if it needs to -- if not, just bail out with a standard error
message. If it's catching a specific exception, which will be a unique
Error.type value, then it already knows at compile-time how to interpret
Error.context, so it can take whatever corresponding action is
necessary.

None of the heavy machinery would be needed.
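
Here is a hand-written approximation of that lowering in today's D. It is a
sketch under assumptions, not what a compiler would actually emit: the
bool field `failed` plays the role of the CPU flag, Outcome is a made-up
tagged union, and negativeInputTag stands in for a compiler-generated type
ID.

```d
// Hand-written model of the universal error type.
struct ErrorValue
{
    size_t type;
    size_t context;
}

__gshared ubyte negativeInputTag;  // stand-in for a unique type ID

// Models the ABI addition: an error indicator plus either the normal
// return value or the error.
struct Outcome
{
    bool failed;
    union
    {
        int value;
        ErrorValue error;
    }
}

Outcome ok(int v)          { Outcome o; o.failed = false; o.value = v; return o; }
Outcome fail(ErrorValue e) { Outcome o; o.failed = true;  o.error = e; return o; }

// "throw" becomes: return the error with the indicator set.
Outcome half(int n)
{
    if (n < 0)
        return fail(ErrorValue(cast(size_t) &negativeInputTag, cast(size_t) n));
    return ok(n / 2);
}

// Caller without a catch block: one branch, then propagate by returning.
Outcome quarter(int n)
{
    auto r = half(n);
    if (r.failed) return r;
    return half(r.value);
}

// Caller with a catch block: one branch, then match on the type field.
int quarterOrDefault(int n, int fallback)
{
    auto r = quarter(n);
    if (!r.failed) return r.value;
    if (r.error.type == cast(size_t) &negativeInputTag)
        return fallback;               // the "catch" body
    assert(0, "unhandled error");      // no matching handler
}
```

Every step is a normal call, a compare and a return; no unwinder or
exception tables are involved.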


> > This also lets us support exceptions in @nogc code.
> 
> Okay, this would be optionally great. However, if we insert the
> context pointer into a List we may get a problem of cyclicity.

Why would you want to insert it into a list?  The context field is a
type-erased pointer-sized value. It may not even be a pointer.


[...]
> > If we implement Sutter's proposal, or something similar suitably
> > adapted to D, it would eliminate the runtime overhead, solve the
> > @nogc exceptions issue, and still support traditional polymorphic
> > exception objects that some people still want.
> 
> If we don't care of the exception type nor on the kind of message of an
> exception did we have either runtime overhead excluding unwinding?
> I refer here to the kind of exception as entity. Does a class object
> really require more runtime polymorphism than a tagged union?

It's not about class vs. non-class (though Error being a struct rather
than a class is important for @nogc support). It's about how exception
throwing is handled.  The current stack unwinding implementation is too
heavyweight for what it does; we want it replaced with something simpler
and more pay-as-you-go.


> The other point is how to unify the same frontend (try catch) with
> different backends (nonlocal jumps+unwinding vs value type errors
> implicitly in return types).

That's the whole point of Sutter's proposal: they are all unified with
the universal Error struct.  There is only one "backend": normal
function return values, augmented as a tagged union to distinguish
between normal return and error return.  We are throwing out nonlocal
jumps in favor of normal function return mechanisms.  We are throwing
out libunwind and all the heavy machinery it entails.

This is about *replacing* the entire exception handling mechanism, not
adding another alternative (which would make things even more
complicated and heavyweight for no good reason).


> You can use Sutter's proposal in your whole project, but what is with
> libraries expecting the other kind of error handling backend.

We will not support a different "backend".  Having more than one
exception-handling mechanism just over-complicates things with no real
benefit.


> Did we provide an implicit conversion from one backend to another
> either by turning an error object into an exception or vice versa?

No.  Except perhaps for C++ interop, in which case we can confine the
heavy machinery to the C++/D boundary. Internally, all D code will use
the Sutter mechanism.


T

-- 
There are four kinds of lies: lies, damn lies, and statistics.
