[PATCH][RFA/RFC] Stack clash mitigation patch 00/08 - V3

Jeff Law Sun, 30 Jul 2017 22:37:02 -0700

OK, so about a week later than I wanted.  Too many fires, not enough
water.  The V3 patch has expanded a bit...





  1. For constant-sized dynamic allocations we'll allocate/probe up to 4
     STACK_CLASH_PROTECTION_PROBE_INTERVAL regions inline and unrolled.

  2. For larger constant-sized dynamic allocations we rotate the loop,
     saving a compare/jump.

  3. Blockage insns added to prevent scheduler reordering, particularly
     in the inline/unrolled loop case.

  4. Generic code for dynamic allocations handles the case where the
     target makes optimistic assumptions about probing state in its
     prologue (i.e., aarch64).

  5. PARAMs to control the assumed size of the guard and the probing
     interval.  Both default to 4k.  Note that the backends may not
     support all possible values for these PARAMs.

     a. The size of the guard helps determine how large a local static
        frame can be allocated without probing on targets that have an
        implicit probe in the caller.

     b. The interval determines how often we probe once we decide
        probing is required.

     c. Backends can override the default values.  aarch64, for example,
        overrides the guard size.

  6. More aarch64 improvements based on discussions with Wilco, Richard
     and Ramana.

     a. Support for a probing interval > 4k.

     b. Assume a guard of 64k, with 1k reserved for the outgoing arglist.
        Thus frames less than 63k require no probing.

     c. Fix missed probe for outgoing arguments.

     d. Add missing notes and barriers.

     e. Some aarch64-specific testcases for issues identified by Wilco
        and some of my own.

     f. Some simplifications based on invariants I was previously
        unaware of for aarch64 prologues.

  7. Scheduler honors the stack probing notes and avoids breaking memory
     dependencies when it encounters them.

  8. PPC port takes advantage of improved generic code for dynamic
     stack allocations.

  9. Many s390 improvements from IBM.

 10. Additional tests for the unrolled inline dynamic case, rotated
     loop case and use of a large guard value to avoid probing
     (x86 and ppc only)
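
A rough sketch of the decisions described in items 1, 2, 5a and 6b,
written as standalone C rather than taken from the patch itself.  Every
name here is hypothetical and made up for illustration.  It also assumes
that "up to 4" intervals is an inclusive bound and it ignores any residual
smaller than one probing interval; the real code may differ on both.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the two PARAMs in item 5 (both default to 4k).  */
enum { DEFAULT_PROBE_INTERVAL = 4096, DEFAULT_GUARD_SIZE = 4096 };

/* Item 1: at most this many probe intervals are emitted inline.  */
enum { MAX_UNROLLED_INTERVALS = 4 };

enum probe_strategy { PROBE_UNROLLED, PROBE_ROTATED_LOOP };

/* Pick a strategy for a constant-sized dynamic allocation.
   Assumption: the "up to 4 intervals" bound is inclusive.  */
enum probe_strategy
choose_strategy (long size, long interval)
{
  return (size <= MAX_UNROLLED_INTERVALS * interval
          ? PROBE_UNROLLED : PROBE_ROTATED_LOOP);
}

/* Model of the rotated loop from item 2.  The body runs before the
   exit test, so the compare/jump at loop entry is gone.  That is only
   safe because SIZE is a known constant covering at least one interval.
   Returns the number of probes; a residual below INTERVAL is ignored.  */
long
count_probes_rotated (long size, long interval)
{
  assert (interval > 0 && size >= interval);
  long probes = 0;
  long remaining = size;
  do
    {
      probes++;               /* allocate one interval and probe it */
      remaining -= interval;
    }
  while (remaining >= interval);
  return probes;
}

/* Items 5a and 6b: a static frame needs no probing when it is smaller
   than the guard minus the space reserved for outgoing arguments.  */
bool
frame_needs_probe (long frame_size, long guard_size, long reserved)
{
  return frame_size >= guard_size - reserved;
}
```

With the aarch64 numbers from item 6b, frame_needs_probe (62 * 1024,
64 * 1024, 1024) is false, while a 63k frame is the first size that
requires probing.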


Jeff

[PATCH][RFA/RFC] Stack clash mitigation patch 00/08 - V3
