Re: [PATCH][RFA/RFC] Stack clash mitigation 0/9

Jeff Law Thu, 13 Jul 2017 16:10:51 -0700

On 07/13/2017 04:48 PM, Segher Boessenkool wrote:
> On Thu, Jul 13, 2017 at 11:28:17AM -0600, Jeff Law wrote:
>> On 07/12/2017 04:44 PM, Segher Boessenkool wrote:
>>> On Tue, Jul 11, 2017 at 03:19:36PM -0600, Jeff Law wrote:
>>>> Examples of implicit probes include
>>>
>>>>   2. ABI mandates that *sp always contain a backchain pointer (ppc)
>>>
>>> In the ELFv2 ABI a backchain is not required.  GCC still always has
>>> one afaik.  I'll find out more.
>> Please do.  I was under the impression it was mandated by the earlier
>> ABIs as well.  If it isn't, then I don't think we can depend on it for
>> the older ABIs.
> 
> I checked most ABIs, and all but ELFv2 require it.  You can assume we
> require it everywhere (we do assume it currently, and there is no
> intention to change this).  The statement in the ABI surprised me
> yesterday, sorry for panicking.
Y'all are the experts here.  It would be advisable to get the ABI
documents tweaked if indeed we are going to rely on the existence of the
backchain as an implicit probe.  Otherwise we end up in the same
scenario as aarch64 where we have to make some unpleasant assumptions.



> 
>> The code we generate for alloca was so awful it's hard to see how
>> hitting each page once would matter either.  *However* I was looking at
>> x86 in this case and due to potential stack realignments x86's alloca
>> code might be notably worse than others for constant sizes.
> 
> There is generic code that aligns too often, too.  You might be seeing
> that same thing.
Exactly.  It's the generic code that's driven by various macros in the
x86 backend.


> 
>> There's further improvements that could be made as well.   It ought to
>> be possible to write an optimizer pass that uses some of the ideas from
>> DSE and SLSR to identify explicit probes that are made redundant by
>> nearby implicit probes -- this would seem most useful for the dynamic space.
>>
>> The problem is we'd want to do that in gimple, but probing of the
>> dynamic space happens at the gimple/rtl border.  So we'd probably want
>> to make probing happen earlier to expose stuff at the gimple level.
> 
> This would just get rid of one probe per dynamic allocation, correct?
> Doesn't seem worth complicating anything for.
There's enough implicit probes lying around in the IL that I suspect we
could likely prove the first and last are unnecessary on a reasonably
consistent basis.  It didn't seem critical to address at this stage, but
something we could look at later if we feel the need.
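
As a rough illustration of what "redundant" could mean here, the sketch
below marks an explicit probe as unnecessary when an implicit store
touches the same page. It is a toy model, not GCC code: the names
(stack_access, mark_redundant_probes) are hypothetical, offsets are
assumed to be measured from a page-aligned base, and ordering and
control flow, which a real pass would have to respect, are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in GCC. */
enum access_kind { IMPLICIT_STORE, EXPLICIT_PROBE };

struct stack_access {
    enum access_kind kind;
    size_t offset;    /* bytes from an (assumed) page-aligned base */
    bool redundant;   /* filled in by mark_redundant_probes */
};

/* Mark each explicit probe that lands in a page also touched by an
   implicit store in the same sequence.  Returns how many probes were
   marked.  Ordering and control flow are deliberately ignored. */
size_t mark_redundant_probes(struct stack_access *acc, size_t n,
                             size_t page_size)
{
    assert(page_size != 0);
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        acc[i].redundant = false;
        if (acc[i].kind != EXPLICIT_PROBE)
            continue;

        for (size_t j = 0; j < n; j++) {
            if (acc[j].kind == IMPLICIT_STORE
                && acc[j].offset / page_size == acc[i].offset / page_size) {
                acc[i].redundant = true;
                count++;
                break;
            }
        }
    }
    return count;
}
```

With implicit stores in the first and last pages of an allocation and
one explicit probe per page, only the probes for the middle pages
survive, which is the "first and last" case described above.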

The other thing I've pondered lightly would be to attach frame & probe
info to decl nodes, perhaps doing some IPA propagation.

The idea here is if we have a function that is static to the CU, but it's
not a good inline candidate, we can use information about the callers to
build a less pessimistic state at function entry.  This would likely
only help aarch64 and s390.  It would also fall into something we could
explore in the future if the need arises.
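
One way to picture the caller-derived entry state is the sketch below.
It is only an assumption about how such a summary might look: the
call_site structure, the entry_unprobed_bytes function, and the notion
of a single ABI-wide worst-case value are all hypothetical and do not
correspond to anything in GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-call-site summary, not a GCC data structure. */
struct call_site {
    size_t unprobed_bytes;  /* stack the caller allocated since its last probe */
};

/* Return the number of unprobed bytes the callee must assume on entry.
   If every caller is known (function is static to the CU and its
   address does not escape), use the worst caller; otherwise fall back
   to the pessimistic ABI-wide assumption. */
size_t entry_unprobed_bytes(bool all_callers_known,
                            const struct call_site *callers,
                            size_t n_callers,
                            size_t worst_case_unprobed)
{
    assert(callers != NULL || n_callers == 0);

    if (!all_callers_known || n_callers == 0)
        return worst_case_unprobed;

    size_t worst = 0;
    for (size_t i = 0; i < n_callers; i++) {
        if (callers[i].unprobed_bytes > worst)
            worst = callers[i].unprobed_bytes;
    }

    /* Never report a state worse than the ABI-wide assumption. */
    return worst < worst_case_unprobed ? worst : worst_case_unprobed;
}
```

The smaller the returned value, the more of its own frame the callee
could allocate before it needs its first explicit probe.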


Thanks for all the feedback,
Jeff
