Notes toward re-implementing EH in gimple

Richard Henderson Thu, 06 Aug 2009 12:35:44 -0700

I've also been thinking a good deal about a grand EH reorg; I'll try
to post something for you to comment on later today.


Here ya go.  Thoughts?


r~

New constructs:

  { exc_ptr, filter } = EH_LANDING_PAD;

        Placeholder for the landing-pad rtl.  Has 2 outputs
        for both the exception pointer and the filter.

  EH_DISPATCH (filter);

        Placeholder for the switch statement that we'll
        generate to jump between the alternatives for 
        different catches.

  ERT_NOP

        A new type of exception region that merely implies
        a different landing pad to the containing region.
        Inserting one of these below the "current" region
        is how we'll split critical edges, instead of copying
        regions.

  exc_ptr and filter members in each eh_region

        The code within each region will use these decls.
        When falling through from an inner region to an
        outer region, these variables must be initialized.
        These replace the EXC_PTR_EXPR and FILTER_EXPR
        constructs that we currently use.

  RESX (exc_ptr, filter);

        Of course we have this now, but we also store the
        region number in the statement directly, instead
        of via add_stmt_to_eh_region.  Which leads to special
        cases all over.  There does not seem to be a real
        point to this.  Also, we need to propagate ext_ptr
        and filter up to the next level, if we're not calling
        _Unwind_Resume.

Examples:

  Catch with nested cleanup:

        struct S { ~S(); };
        void bar();
        int x, y;
        void foo()
        {
          try {
            S s;
            bar();
          } catch (int) {
            x++;
          } catch (float) {
            y++;
          }
        }

  Compiles to (before eh_dispatch/resx expansion)

        EH Region Tree:
          3 catch int -> post-land = BB8
          4 catch float -> post-land = BB9
          2 try -> { e2, f2 }, land = BB6, post-land = BB7
            1 cleanup -> { e1, f1 }, land = BB4, post-land = BB5
          5 must_not_throw

        BB1:
          bar();                        [EH 1]
          # succ 2 4(eh)
        BB2:
          S::~S(&s);                    [EH 2]
          # succ 3
        BB3:
          return;
          # succ exit
        BB4:
          { e1, f1 } = EH_LANDING_PAD;  [EH 1]
          # succ 5
        BB5:
          S::~S(&s);                    [EH 5]
          RESX (e1, f1);
          # succ 6(eh)
        BB6:
          { e2, f2 } = EH_LANDING_PAD;  [EH 2]
          # succ 7
        BB7:
          EH_DISPATCH (f2);             [EH 2]
          # succ 8(ab) 9(ab) 10(ab)
        BB8:
          __cxa_begin_catch (e2);
          x++;
          __cxa_end_catch (e2);
          # succ 3
        BB9:
          __cxa_begin_catch (e2);
          y++;
          __cxa_end_catch (e2);
          # succ 3
        BB10:
          RESX (e2, f2);                [EH 2]

  After inlining, we have a new pass that expands both RESX and EH_DISPATCH:

        BB5:
          e2 = e1;
          f2 = f1;
          # succ 7

        BB7:
          switch (f2)
            {
            case 1:  goto BB8;
            case 2:  goto BB9;
            default: goto BB10
            }
          # succ 8 9 10

        BB10:
          _Unwind_Resume (e2);

  There are a couple of advantages that ought to be clear immediately.
  First, no more silliness that tree_empty_eh_handler_p has to clean up.
  Second, everything can be properly put into SSA form.
  Third, we're not too late to easily generate switch statements.  See

          /* ??? It is mighty inconvenient to call back into the
             switch statement generation code in expand_end_case.
             Rapid prototyping sez a sequence of ifs.  */

  which has been in except.c for nearly 10 years.

  Critical edge splitting:

        struct S { ~S() throw(); };
        void bar();
        void foo()
        {
          S s1;
          bar();
          bar();
          S s0;
          bar();
        }

  Compiles to:

        EH Region Tree:
        1 cleanup -> { e1, f1 }, land = BB7, post-land = BB8
          2 cleanup -> { e2, f2 }, land = BB5, post-land = BB6

        BB0:
          bar();                [EH 1]
          # succ 1 7(eh)
        BB1:
          bar();                [EH 1]
          # succ 2 7(eh)
        BB2:
          bar();                [EH 2]
          # succ 3 5(eh)
        BB3:
          S::~S(&s0);
          S::~S(&s1);
          # succ 4
        BB4:
          return;
        BB5:
          { e2, f2 } = EH_LANDING_PAD;
          # succ 6
        BB6:
          S::~S(&s0);
          RESX (e2, f2);
          # succ 7(eh)
        BB7:
          { e1, f1 } = EH_LANDING_PAD;
          # succ 8
        BB8:
          S::~S(&s1);
          RESX (e1, f1)

  Expand RESX and we get

        BB6:
          S::~S(&s0);
          e1 = e2;
          f1 = f2;
          # succ 8

        BB8:
          S::~S(&s1);
          _Unwind_Resume (e1);

  We decide we want to split edge BB1->BB7, so we modify:

        EH Region Tree:
        1 cleanup -> { e1, f1 }, land = BB7, post-land = BB8
          2 cleanup -> { e2, f2 }, land = BB5, post-land = BB6
          3 nop -> { e1, f1 }, land = BB9, post-land = BB8

        BB1:
          bar();                [EH 3]
          # succ 2 9(eh)

        BB9:
          { e1, f1 } = EH_LANDING_PAD;
          # succ 8

  Which, if I'm not mistaken, duplicates far less code than you
  do at present to split these edges.

  Exception optimization:

        void f();
        void g() throw();
        void h() throw();
        void foo() {
          try {
            try {
              f();
            } catch (int) {
              g();
            }
          } catch (int) {
            h();
          }
        }

  Compiles to:

        EH Region Tree:
        4 catch int
        1 try -> { e1, f1 }, land = BB6, post-land = BB7
          3 catch int
          2 try -> { e2, f2 }, land = BB2, post-land = BB3

        BB0:
          f();
          # succ 1 2(eh)                [EH 2]
        BB1:
          return;
          # succ exit
        BB2:
          { e2, f2 } = EH_LANDING_PAD;  [EH 2]
          # succ 3
        BB3:
          EH_DISPATCH (f2);             [EH 2]
          # succ 4(ab) 5(ab)
        BB4:
          __cxa_begin_catch (e2);
          g();
          __cxa_end_catch (e2);
          # succ 1
        BB5:
          RESX (e2, f2);                [EH 2]
          # succ 6(eh)
        BB6:
          { e1, f1 } = EH_LANDING_PAD;  [EH 1]
          # succ 7
        BB7:
          EH_DISPATCH (f1);             [EH 1]
          # succ 8(ab) 9(ab)
        BB8:
          __cxa_begin_catch (e1);
          h();
          __cxa_end_catch (e1);
          # succ 1
        BB9:
          RESX (e1, f1);

  Expanding RESX and EH_DISPATCH:

        BB3:
          switch (f2)
            {
            case 0:  goto BB4;
            default: goto BB5;
            }
          # succ 4 5

        BB5:
          e1 = e2;
          f1 = f2;
          # succ 7

        BB7:
          switch (f1)
            {
            case 0:  goto BB8;
            default: goto BB9;
            }
          # succ 8 9

        BB9:
          _Unwind_Resume (e1);

  See below for more discussion.

Implementation:

  EH_LANDING_PAD

        The existing code in except.c should be easily adaptable
        to expanding this gimple stmt.  All we need to do is pass
        along the two output arguments and use those instead of
        the hard-coded crtl->eh.exc_ptr and crtl->eh.filter.

  EH_DISPATCH:

        Should be expanded after inlining, and maybe before EH optimization.
        Again, the existing code within except.c can be adapted.  We
        just need to call assign_filter_values to get all the numbers
        assigned.  At which point it is trivial to build the actual
        switch.

  RESX:

        Should be expanded after inlining, and probably before EH
        optimization.  All we need to do here is either notice that
        there's no outer region and expand to _Unwind_Resume, or
        copy the exc_ptr/filter variables around and goto.

  EH optimization:

        This pass would remove shadowed exception handlers and other
        unreachable code.  It's supposed to be currently handled in
        reachable_next_level, and consists of some moderately complicated
        type checking every time we ask stmt_throws_internal.  Instead
        of having the complicated code run all the time, we'd have a
        special pass.

        (As an aside, the dead handler elimination we attempt in
        reachable_next_level doesn't work because the C++ front end
        tells us that __cxa_end_catch can throw, which is only true
        when the thrown object has a destructor that can throw.  With
        a single extra check in the front end, we can set the NOTHROW
        bit on that __cxa_end_catch call and re-enable this.)

        It could be implemented as a special form of value-range
        propagation.  We know that the runtime will only enter a
        landing pad under certain conditions, and the values that
        the FILTER variable will contain under those conditions.
        With that, it's trivial to propagate the values around.

        For instance, in the example above, we know that the only
        value that F1 and F0 will ever be assigned by EH_LANDING_PAD
        is 0.  From that we can determine that BB5 and BB9 are dead.

        If there were a cleanup handler somewhere in the middle,
        we'd have to bail, because the FILTER values for a cleanup
        can be anything (I think; but in any case it might have a
        different number as we'd generate for "int").

        We'll need something slightly different than the min/max
        values currently supported by VRP.  We'll need to track exact
        sets of values, but happily only for (generally) small numbers.

Notes toward re-implementing EH in gimple

Reply via email to