Re: [PATCH] Improve alloca alignment

2017-12-19 Thread Jeff Law
On 12/13/2017 04:17 PM, Eric Botcazou wrote:
>> No clue, but ISTM that it should.  Eric, can you try that and see if it
>> addresses these problems?  I'd really like to get this wrapped up, but I
>> don't have access to any sparc systems to test it myself.
> 
> Yes, the INIT_EXPANDERS trick works for SPARC (but this has nothing to do 
> with 
> SPARC_STACK_BIAS) and avoid hardcoding the bogus alignment assumption in the 
> get_dynamic_stack_size function.  As a matter of fact, this was the approach 
> originally used by Dominik Vogt last year.
> 
> Of course this doesn't address the same potential issue on other targets but 
> you don't seem to care much about that, so who am I to do it after all? ;-)
> 
> Tested on x86_64-suse-linux and SPARC/Solaris, applied on the mainline.
> 
> 
> 2017-12-13  Eric Botcazou  
> Dominik Vogt  
> 
>   PR middle-end/78468
>   * emit-rtl.c (init_emit): Remove ??? comment.
>   * explow.c (get_dynamic_stack_size): Take known alignment of stack
>   pointer + STACK_DYNAMIC_OFFSET into account in lieu of STACK_BOUNDARY
>   * config/sparc/sparc.h (INIT_EXPANDERS): In 32-bit mode, lower the
>   alignment of 3 virtual registers to BITS_PER_WORD.
> 
>   * config/sparc/sparc.c (sparc_compute_frame_size): Simplify.
> 
Thanks for taking care of this.

jeff


Re: [PATCH] Improve alloca alignment

2017-12-13 Thread Eric Botcazou
> No clue, but ISTM that it should.  Eric, can you try that and see if it
> addresses these problems?  I'd really like to get this wrapped up, but I
> don't have access to any sparc systems to test it myself.

Yes, the INIT_EXPANDERS trick works for SPARC (but this has nothing to do with 
SPARC_STACK_BIAS) and avoid hardcoding the bogus alignment assumption in the 
get_dynamic_stack_size function.  As a matter of fact, this was the approach 
originally used by Dominik Vogt last year.

Of course this doesn't address the same potential issue on other targets but 
you don't seem to care much about that, so who am I to do it after all? ;-)

Tested on x86_64-suse-linux and SPARC/Solaris, applied on the mainline.


2017-12-13  Eric Botcazou  
Dominik Vogt  

PR middle-end/78468
* emit-rtl.c (init_emit): Remove ??? comment.
* explow.c (get_dynamic_stack_size): Take known alignment of stack
pointer + STACK_DYNAMIC_OFFSET into account in lieu of STACK_BOUNDARY
* config/sparc/sparc.h (INIT_EXPANDERS): In 32-bit mode, lower the
alignment of 3 virtual registers to BITS_PER_WORD.

* config/sparc/sparc.c (sparc_compute_frame_size): Simplify.

-- 
Eric BotcazouIndex: emit-rtl.c
===
--- emit-rtl.c	(revision 255578)
+++ emit-rtl.c	(working copy)
@@ -5764,8 +5764,6 @@ init_emit (void)
   REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = STACK_BOUNDARY;
   REGNO_POINTER_ALIGN (ARG_POINTER_REGNUM) = STACK_BOUNDARY;
 
-  /* ??? These are problematic (for example, 3 out of 4 are wrong on
- 32-bit SPARC and cannot be all fixed because of the ABI).  */
   REGNO_POINTER_ALIGN (VIRTUAL_INCOMING_ARGS_REGNUM) = STACK_BOUNDARY;
   REGNO_POINTER_ALIGN (VIRTUAL_STACK_VARS_REGNUM) = STACK_BOUNDARY;
   REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM) = STACK_BOUNDARY;
Index: explow.c
===
--- explow.c	(revision 255578)
+++ explow.c	(working copy)
@@ -1206,7 +1206,6 @@ get_dynamic_stack_size (rtx *psize, unsi
 			unsigned required_align,
 			HOST_WIDE_INT *pstack_usage_size)
 {
-  unsigned extra = 0;
   rtx size = *psize;
 
   /* Ensure the size is in the proper mode.  */
@@ -1242,16 +1241,16 @@ get_dynamic_stack_size (rtx *psize, unsi
  example), so we must preventively align the value.  We leave space
  in SIZE for the hole that might result from the alignment operation.  */
 
-  /* Since the stack is presumed to be aligned before this allocation,
- we only need to increase the size of the allocation if the required
- alignment is more than the stack alignment.  */
-  if (required_align > STACK_BOUNDARY)
+  unsigned known_align = REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM);
+  if (known_align == 0)
+known_align = BITS_PER_UNIT;
+  if (required_align > known_align)
 {
-  extra = (required_align - STACK_BOUNDARY) / BITS_PER_UNIT;
+  unsigned extra = (required_align - known_align) / BITS_PER_UNIT;
   size = plus_constant (Pmode, size, extra);
   size = force_operand (size, NULL_RTX);
-  if (size_align > STACK_BOUNDARY)
-	size_align = STACK_BOUNDARY;
+  if (size_align > known_align)
+	size_align = known_align;
 
   if (flag_stack_usage_info && pstack_usage_size)
 	*pstack_usage_size += extra;
Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 255578)
+++ config/sparc/sparc.c	(working copy)
@@ -5483,10 +5483,8 @@ sparc_compute_frame_size (HOST_WIDE_INT
 frame_size = apparent_frame_size = 0;
   else
 {
-  /* We subtract TARGET_STARTING_FRAME_OFFSET, remember it's negative.  */
-  apparent_frame_size
-	= ROUND_UP (size - targetm.starting_frame_offset (), 8);
-  apparent_frame_size += n_global_fp_regs * 4;
+  /* Start from the apparent frame size.  */
+  apparent_frame_size = ROUND_UP (size, 8) + n_global_fp_regs * 4;
 
   /* We need to add the size of the outgoing argument area.  */
   frame_size = apparent_frame_size + ROUND_UP (args_size, 8);
Index: config/sparc/sparc.h
===
--- config/sparc/sparc.h	(revision 255578)
+++ config/sparc/sparc.h	(working copy)
@@ -771,13 +771,29 @@ extern enum cmodel sparc_cmodel;
 /* The soft frame pointer does not have the stack bias applied.  */
 #define FRAME_POINTER_REGNUM 101
 
-/* Given the stack bias, the stack pointer isn't actually aligned.  */
 #define INIT_EXPANDERS			 \
   do {	 \
-if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS)	 \
+if (crtl->emit.regno_pointer_align)	 \
   {	 \
-	REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT;	 \
-	REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \
+	/* The biased stack pointer is only aligned on BITS_PER_UNIT.  */\

Re: [PATCH] Improve alloca alignment

2017-10-26 Thread Richard Biener
On Thu, Oct 26, 2017 at 4:55 PM, Jeff Law  wrote:
> On 10/17/2017 06:04 AM, Wilco Dijkstra wrote:
>> Wilco Dijkstra wrote:
>>>
>>> Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other
>>> virtual frame registers. It appears it's main purpose is to enable alignment
>>> optimizations since PREFERRED_STACK_BOUNDARY is used to align
>>> local and outgoing argument area etc. So if you don't want the alignment
>>> optimizations it is feasible to set STACK_BOUNDARY to a lower value
>>> without changing the stack layout.
>>>
>>> There is also STACK_DYNAMIC_OFFSET which computes the total offset
>>> from the stack. It's not obvious whether the default version should align 
>>> (since
>>> outgoing arguments are already aligned there is no easy way to record the
>>> extra padding), but we could assert if the offset isn't aligned.
>>
>> Also there is something odd in the sparc backend:
>>
>> /* Given the stack bias, the stack pointer isn't actually aligned.  */
>> #define INIT_EXPANDERS   \
>>   do {   \
>> if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS)  \
>>   {  \
>> REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT;  \
>> REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \
>>   }  \
>>   } while (0)
>>
>> That lowers the alignment for the stack and frame pointer. So assuming that 
>> works
>> and blocks alignment optimizations, why isn't this done for the dynamic 
>> offset as well?
> No clue, but ISTM that it should.  Eric, can you try that and see if it
> addresses these problems?  I'd really like to get this wrapped up, but I
> don't have access to any sparc systems to test it myself.

Or maybe adjust all non-hardreg stack pointers by the bias so they
_are_ aligned.  And of course
make sure we always use the aligned pointers when allocating.

Weird ABI ...

Richard.

> Jeff


Re: [PATCH] Improve alloca alignment

2017-10-26 Thread Jeff Law
On 10/17/2017 06:04 AM, Wilco Dijkstra wrote:
> Wilco Dijkstra wrote:
>>
>> Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other
>> virtual frame registers. It appears it's main purpose is to enable alignment
>> optimizations since PREFERRED_STACK_BOUNDARY is used to align
>> local and outgoing argument area etc. So if you don't want the alignment
>> optimizations it is feasible to set STACK_BOUNDARY to a lower value
>> without changing the stack layout.
>>
>> There is also STACK_DYNAMIC_OFFSET which computes the total offset
>> from the stack. It's not obvious whether the default version should align 
>> (since
>> outgoing arguments are already aligned there is no easy way to record the
>> extra padding), but we could assert if the offset isn't aligned.
> 
> Also there is something odd in the sparc backend:
> 
> /* Given the stack bias, the stack pointer isn't actually aligned.  */
> #define INIT_EXPANDERS   \
>   do {   \
> if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS)  \
>   {  \
> REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT;  \
> REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \
>   }  \
>   } while (0)
> 
> That lowers the alignment for the stack and frame pointer. So assuming that 
> works
> and blocks alignment optimizations, why isn't this done for the dynamic 
> offset as well?
No clue, but ISTM that it should.  Eric, can you try that and see if it
addresses these problems?  I'd really like to get this wrapped up, but I
don't have access to any sparc systems to test it myself.

Jeff


Re: [PATCH] Improve alloca alignment

2017-10-26 Thread Jeff Law
On 10/05/2017 03:16 AM, Richard Biener wrote:
>  On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law  wrote:
>> On 10/04/2017 08:53 AM, Eric Botcazou wrote:
 This seems like a SPARC target problem to me -- essentially it's
 claiming a higher STACK_BOUNDARY than it really has.
>>>
>>> No, it is not, I can guarantee you that the stack pointer is always aligned 
>>> to
>>> 64-bit boundaries on SPARC, otherwise all hell would break loose...
>> Then something is inconsistent somehwere.  Either the stack is aligned
>> prior to that code or it is not.  If it is aligned, then Wilco's patch
>> ought to keep it aligned.  If is not properly aligned, then well, that's
>> the problem ISTM.
>>
>> Am I missing something here?
> 
> What I got from the discussion and the PR is that the stack hardregister
> is properly aligned but what GCC maps to it (virtual or frame or whatever)
> might not be at all points.
Ah!  But I'd probably claim that having the virtual unaligned is erroneous.

> 
> allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not
> sure STACK_BOUNDARY applies to it?
> 
> Not that I know anything about this here ;)
My first thought is that sure it should apply.  It just seems wrong that
STACK_BOUNDARY wouldn't apply to the virtual.  But I doubt we've ever
documented that as a requirement/assumption.

Jeff



Re: [PATCH] Improve alloca alignment

2017-10-17 Thread Wilco Dijkstra
Wilco Dijkstra wrote:
>
> Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other
> virtual frame registers. It appears it's main purpose is to enable alignment
> optimizations since PREFERRED_STACK_BOUNDARY is used to align
> local and outgoing argument area etc. So if you don't want the alignment
> optimizations it is feasible to set STACK_BOUNDARY to a lower value
> without changing the stack layout.
>
> There is also STACK_DYNAMIC_OFFSET which computes the total offset
> from the stack. It's not obvious whether the default version should align 
> (since
> outgoing arguments are already aligned there is no easy way to record the
> extra padding), but we could assert if the offset isn't aligned.

Also there is something odd in the sparc backend:

/* Given the stack bias, the stack pointer isn't actually aligned.  */
#define INIT_EXPANDERS   \
  do {   \
if (crtl->emit.regno_pointer_align && SPARC_STACK_BIAS)  \
  {  \
REGNO_POINTER_ALIGN (STACK_POINTER_REGNUM) = BITS_PER_UNIT;  \
REGNO_POINTER_ALIGN (HARD_FRAME_POINTER_REGNUM) = BITS_PER_UNIT; \
  }  \
  } while (0)

That lowers the alignment for the stack and frame pointer. So assuming that 
works
and blocks alignment optimizations, why isn't this done for the dynamic offset 
as well?

Wilco

Re: [PATCH] Improve alloca alignment

2017-10-05 Thread Wilco Dijkstra
Richard Biener wrote:
> On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law  wrote:
> > On 10/04/2017 08:53 AM, Eric Botcazou wrote:
 This seems like a SPARC target problem to me -- essentially it's
 claiming a higher STACK_BOUNDARY than it really has.
>>>
>>> No, it is not, I can guarantee you that the stack pointer is always aligned 
>>> to
>>> 64-bit boundaries on SPARC, otherwise all hell would break loose...
>> Then something is inconsistent somehwere.  Either the stack is aligned
>> prior to that code or it is not.  If it is aligned, then Wilco's patch
>> ought to keep it aligned.  If is not properly aligned, then well, that's
>> the problem ISTM.
>>
>> Am I missing something here?
>
> What I got from the discussion and the PR is that the stack hardregister
> is properly aligned but what GCC maps to it (virtual or frame or whatever)
> might not be at all points.
>
> allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not
> sure STACK_BOUNDARY applies to it?

Yes STACK_BOUNDARY applies to virtual_stack_dynamic_rtx and all other
virtual frame registers. It appears it's main purpose is to enable alignment
optimizations since PREFERRED_STACK_BOUNDARY is used to align
local and outgoing argument area etc. So if you don't want the alignment
optimizations it is feasible to set STACK_BOUNDARY to a lower value
without changing the stack layout.

There is also STACK_DYNAMIC_OFFSET which computes the total offset
from the stack. It's not obvious whether the default version should align (since
outgoing arguments are already aligned there is no easy way to record the
extra padding), but we could assert if the offset isn't aligned.

Wilco

Re: [PATCH] Improve alloca alignment

2017-10-05 Thread Richard Biener
 On Thu, Oct 5, 2017 at 1:07 AM, Jeff Law  wrote:
> On 10/04/2017 08:53 AM, Eric Botcazou wrote:
>>> This seems like a SPARC target problem to me -- essentially it's
>>> claiming a higher STACK_BOUNDARY than it really has.
>>
>> No, it is not, I can guarantee you that the stack pointer is always aligned 
>> to
>> 64-bit boundaries on SPARC, otherwise all hell would break loose...
> Then something is inconsistent somehwere.  Either the stack is aligned
> prior to that code or it is not.  If it is aligned, then Wilco's patch
> ought to keep it aligned.  If is not properly aligned, then well, that's
> the problem ISTM.
>
> Am I missing something here?

What I got from the discussion and the PR is that the stack hardregister
is properly aligned but what GCC maps to it (virtual or frame or whatever)
might not be at all points.

allocate_dynamic_stack_space uses virtual_stack_dynamic_rtx and I'm not
sure STACK_BOUNDARY applies to it?

Not that I know anything about this here ;)

Richard.

> jeff


Re: [PATCH] Improve alloca alignment

2017-10-04 Thread Jeff Law
On 10/04/2017 08:53 AM, Eric Botcazou wrote:
>> This seems like a SPARC target problem to me -- essentially it's
>> claiming a higher STACK_BOUNDARY than it really has.
> 
> No, it is not, I can guarantee you that the stack pointer is always aligned 
> to 
> 64-bit boundaries on SPARC, otherwise all hell would break loose...
Then something is inconsistent somehwere.  Either the stack is aligned
prior to that code or it is not.  If it is aligned, then Wilco's patch
ought to keep it aligned.  If is not properly aligned, then well, that's
the problem ISTM.

Am I missing something here?

jeff


Re: [PATCH] Improve alloca alignment

2017-10-04 Thread Eric Botcazou
> This seems like a SPARC target problem to me -- essentially it's
> claiming a higher STACK_BOUNDARY than it really has.

No, it is not, I can guarantee you that the stack pointer is always aligned to 
64-bit boundaries on SPARC, otherwise all hell would break loose...

> Presumably there's a good reason for this and some kind of hack may be
> needed to deal with it in dynamically allocated space.  But it does not
> seem like we should be forcing all targets to allocate unnecessary space
> to deal with this.

I agree but SPARC is presumably not the only affected platform, so I think 
that it's wrong to sureptitiously change the interface with the ~50 back-ends 
and hope that the maintainers will repair the damage; they won't and we'll 
have introduced very nasty bugs for a few wasted bytes on the stack.

-- 
Eric Botcazou


Re: [PATCH] Improve alloca alignment

2017-09-11 Thread Wilco Dijkstra
Jeff Law wrote:
> On 09/09/2017 02:51 AM, Eric Botcazou wrote:
> >> No, the stack never gets misaligned - my patch doesn't change that at all.
> > 
> > Yes, it does.  Dynamic allocation works like this: the amount to be 
> > allocated 
> > is added to VIRTUAL_STACK_DYNAMIC_REGNUM and the result is then dynamically 
> > aligned.  Your patch assumes that VIRTUAL_STACK_DYNAMIC_REGNUM is as 
> > aligned 
> > as the stack pointer, i.e. STACK_BOUNDARY, but that's wrong for 32-bit 
> > SPARC 
> > at least (that's why I put the ??? note at line 5746 in emit-rtl.c).
> This seems like a SPARC target problem to me -- essentially it's
> claiming a higher STACK_BOUNDARY than it really has.
> 
> Presumably there's a good reason for this and some kind of hack may be
> needed to deal with it in dynamically allocated space.  But it does not
> seem like we should be forcing all targets to allocate unnecessary space
> to deal with this.

It's not just STACK_BOUNDARY, the outgoing argument offset is incorrect too.
These snippets of code from PR78468 (comment 20) look very wrong:

sub %sp, %g2, %sp
add %sp, 108, %g3   ; g3 = fp - 28 (x)
sub %sp, %g2, %sp
add %sp, 108, %g2   ; g2 = fp - 44 (d)
sub %sp, %g1, %sp
add %sp, 112, %g1   ; g1 = fp - 56 (e)

There are several different outgoing argument offsets used here (108 and 112). 
This not only results in alloca blocks being unaligned (when they should be at 
least
aligned to STACK_BOUNDARY, ideally PREFERRED_STACK_BOUNDARY),
but this also means you end up with different alloca blocks overlapping and 
corrupting
each other in non-trivial ways...

Wilco

Re: [PATCH] Improve alloca alignment

2017-09-11 Thread Jeff Law
On 09/09/2017 02:51 AM, Eric Botcazou wrote:
>> No, the stack never gets misaligned - my patch doesn't change that at all.
> 
> Yes, it does.  Dynamic allocation works like this: the amount to be allocated 
> is added to VIRTUAL_STACK_DYNAMIC_REGNUM and the result is then dynamically 
> aligned.  Your patch assumes that VIRTUAL_STACK_DYNAMIC_REGNUM is as aligned 
> as the stack pointer, i.e. STACK_BOUNDARY, but that's wrong for 32-bit SPARC 
> at least (that's why I put the ??? note at line 5746 in emit-rtl.c).
This seems like a SPARC target problem to me -- essentially it's
claiming a higher STACK_BOUNDARY than it really has.

Presumably there's a good reason for this and some kind of hack may be
needed to deal with it in dynamically allocated space.  But it does not
seem like we should be forcing all targets to allocate unnecessary space
to deal with this.

Jeff


Re: [PATCH] Improve alloca alignment

2017-09-11 Thread Wilco Dijkstra
Eric Botcazou wrote:
    
>> No, the stack never gets misaligned - my patch doesn't change that at all.
>
> Yes, it does.  

No. Look at the diffs, there is not a single change in alignment anywhere for 
all
of the alloca variants. If the alignment is incorrect after my patch, it is also
incorrect before my patch. This is the diff for pr78668.c on AArch64:

>diff pr78668.before pr78668.after
<   add x0, x0, 18
---
>   add x0, x0, 15
90c90
<   add x0, x0, 18
---
>   add x0, x0, 15
120c120
<   add x0, x0, 22
---
>   add x0, x0, 15
149c149
<   add x0, x0, 22
---
>   add x0, x0, 15
179c179
<   add x0, x0, 30
---
>   add x0, x0, 15
208c208
<   add x0, x0, 30
---
>   add x0, x0, 15
238c238
<   add x0, x0, 46
---
>   add x0, x0, 31
268c268
<   add x0, x0, 46
---
>   add x0, x0, 31


The mid-end always ensures that the stack is decremented by a value that is a
multiple of STACK_BOUNDARY. My change does not make any difference here,
but if SP is not aligned to STACK_BOUNDARY then it's obviously broken before
my patch. For example the relevant instructions for t2_a8 are:

add x0, x0, 15
and x0, x0, -16
sub sp, sp, x0
add x0, sp, 32

As you can see, it always rounds up and aligns to STACK_BOUNDARY
before adjusting SP. When computing the final alloca address (the last add 
above)
you must also keep that STACK_BOUNDARY alignment.

To conclude my change just avoids inserting redundant padding based on the
alignment promise that is made by the backend. The alignment itself is unchanged
and is already incorrect on Sparc before my change.

>> The issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY and
>> neither aligns the outgoing args. Run my test which proves alloca was
>> broken before my patch.
>
> How could this have been broken for so long, realistically?  The SPARC back-
> end is parameterized according to the ABI and the documented interface 
> between 
> middle-end and back-end.

It clearly gets the ABI wrong as it sets STACK_BOUNDARY and then doesn't keep
the alignment promise it makes.

Wilco

Re: [PATCH] Improve alloca alignment

2017-09-09 Thread Eric Botcazou
> No, the stack never gets misaligned - my patch doesn't change that at all.

Yes, it does.  Dynamic allocation works like this: the amount to be allocated 
is added to VIRTUAL_STACK_DYNAMIC_REGNUM and the result is then dynamically 
aligned.  Your patch assumes that VIRTUAL_STACK_DYNAMIC_REGNUM is as aligned 
as the stack pointer, i.e. STACK_BOUNDARY, but that's wrong for 32-bit SPARC 
at least (that's why I put the ??? note at line 5746 in emit-rtl.c).

The previous attempt by Dominic Vogt was more correct in this respect:

2016-08-23  Dominik Vogt  

* explow.c (get_dynamic_stack_size): Take known alignment of stack
pointer + STACK_DYNAMIC_OFFSET into account when calculating the
size needed.

since it was using the declared alignment of VIRTUAL_STACK_DYNAMIC_REGNUM and 
not STACK_BOUNDARY directly.  But the outcome was the same, since the declared 
alignment is still wrong for 32-bit SPARC.

> The issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY and
> neither aligns the outgoing args. Run my test which proves alloca was
> broken before my patch.

How could this have been broken for so long, realistically?  The SPARC back-
end is parameterized according to the ABI and the documented interface between 
middle-end and back-end.

-- 
Eric Botcazou


Re: [PATCH] Improve alloca alignment

2017-09-08 Thread Wilco Dijkstra
Hi Rainer,

Can you post the disassembly for say the 8-byte aligned tests? It may not be 
built correctly or hit an offset that is accidentally aligned, however 
pass/fail status can't change due to my patch as it doesn't change alignment at 
all.

Wilco

Re: [PATCH] Improve alloca alignment

2017-09-08 Thread Rainer Orth
Hi Wilco,

> Eric Botcazou wrote:
>
>> The stack is aligned before the allocation but it gets misaligned during the
>> allocation because the dynamic offset is not a multiple of STACK_BOUNDARY.
>
> No, the stack never gets misaligned - my patch doesn't change that at
> all. The issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY
> and neither aligns the outgoing args. Run my test which proves alloca was
> broken before my patch.

I'm currently running my SPARC bootstraps with your patch backed out so
regressions don't go overboard.  The test PASSes this way.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Improve alloca alignment

2017-09-08 Thread Wilco Dijkstra
Eric Botcazou wrote:

> The stack is aligned before the allocation but it gets misaligned during the
> allocation because the dynamic offset is not a multiple of STACK_BOUNDARY.

No, the stack never gets misaligned - my patch doesn't change that at all. The 
issue is that Sparc backend doesn't correctly set STACK_BOUNDSARY and neither 
aligns the outgoing args. Run my test which proves alloca was broken before my 
patch.

Wilco


Re: [PATCH] Improve alloca alignment

2017-09-08 Thread Eric Botcazou
> This patch brought back PR libgomp/78468, which had caused its
> predecessor to be backed out of gcc-7.

Yes, it's exactly the same mistake:

+  /* Since the stack is presumed to be aligned before this allocation,
+ we only need to increase the size of the allocation if the required
+ alignment is more than the stack alignment.  */

The stack is aligned before the allocation but it gets misaligned during the 
allocation because the dynamic offset is not a multiple of STACK_BOUNDARY.

The code had been realigning the stack pointer for more than 2 decades to 
enforce STACK_BOUNDARY but suddenly stopped again with Wilco's patch.

The failure mode is very nasty (random corruption of the stack contents) and 
there are very likely other affected targets among the ~50 supported ones.

-- 
Eric Botcazou


Re: [PATCH] Improve alloca alignment

2017-09-06 Thread Rainer Orth
Jeff Law  writes:

> On 08/22/2017 08:15 AM, Wilco Dijkstra wrote:
>> Jeff Law wrote:
>> On 07/26/2017 05:29 PM, Wilco Dijkstra wrote:
>> 
 But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0
 seems wrong too given that round_push uses a different alignment to
 align to.
>>> I had started to dig into the history of this code, but just didn't have
>>> the time to do so fully before needing to leave for the day.  To some
>>> degree I was hoping you knew the rationale behind the test against
>>> MAX_SUPPORTED_STACK_ALIGNMENT and I wouldn't have to do a ton of digging :-)
>> 
>> I looked further into this - it appears this works correctly since it is
>> only bypassed if
>> size_align is already maximally aligned. round_push aligns to the
>> preferred alignment,
>> which may be lower or equal to MAX_SUPPORTED_STACK_ALIGNMENT (which is
>> at least STACK_BOUNDARY).
>> 
>> Here is the updated version:
>> 
>> ChangeLog:
>> 2017-08-22  Wilco Dijkstra  
>> 
>>  * explow.c (get_dynamic_stack_size): Improve dynamic alignment.
> OK.  I wonder if this will make it easier to write stack-clash tests of
> the dynamic space for boundary conditions :-)  I was always annoyed that
> I had to fiddle around with magic adjustments to the sizes of objects to
> tickle boundary cases.

This patch brought back PR libgomp/78468, which had caused its
predecessor to be backed out of gcc-7.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Improve alloca alignment

2017-08-24 Thread Jeff Law
On 08/22/2017 08:15 AM, Wilco Dijkstra wrote:
> Jeff Law wrote:
> On 07/26/2017 05:29 PM, Wilco Dijkstra wrote:
> 
>>> But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0
>>> seems wrong too given that round_push uses a different alignment to align 
>>> to. 
>> I had started to dig into the history of this code, but just didn't have
>> the time to do so fully before needing to leave for the day.  To some
>> degree I was hoping you knew the rationale behind the test against
>> MAX_SUPPORTED_STACK_ALIGNMENT and I wouldn't have to do a ton of digging :-)
> 
> I looked further into this - it appears this works correctly since it is only 
> bypassed if
> size_align is already maximally aligned. round_push aligns to the preferred 
> alignment,
> which may be lower or equal to MAX_SUPPORTED_STACK_ALIGNMENT (which is
> at least STACK_BOUNDARY).
> 
> Here is the updated version:
> 
> ChangeLog:
> 2017-08-22  Wilco Dijkstra  
> 
>   * explow.c (get_dynamic_stack_size): Improve dynamic alignment.
OK.  I wonder if this will make it easier to write stack-clash tests of
the dynamic space for boundary conditions :-)  I was always annoyed that
I had to fiddle around with magic adjustments to the sizes of objects to
tickle boundary cases.

jeff


Re: [PATCH] Improve alloca alignment

2017-08-22 Thread Wilco Dijkstra
Jeff Law wrote:
On 07/26/2017 05:29 PM, Wilco Dijkstra wrote:

> > But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0
> > seems wrong too given that round_push uses a different alignment to align 
> > to. 
> I had started to dig into the history of this code, but just didn't have
> the time to do so fully before needing to leave for the day.  To some
> degree I was hoping you knew the rationale behind the test against
> MAX_SUPPORTED_STACK_ALIGNMENT and I wouldn't have to do a ton of digging :-)

I looked further into this - it appears this works correctly since it is only 
bypassed if
size_align is already maximally aligned. round_push aligns to the preferred 
alignment,
which may be lower or equal to MAX_SUPPORTED_STACK_ALIGNMENT (which is
at least STACK_BOUNDARY).

Here is the updated version:

ChangeLog:
2017-08-22  Wilco Dijkstra  

* explow.c (get_dynamic_stack_size): Improve dynamic alignment.

diff --git a/gcc/explow.c b/gcc/explow.c
index 
50074e281edd5270c76d29feac6b7a92f598d11d..d3148273030a010ece1f8ea1c14eef64bbf4e78a
 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1234,15 +1234,20 @@ get_dynamic_stack_size (rtx *psize, unsigned size_align,
  example), so we must preventively align the value.  We leave space
  in SIZE for the hole that might result from the alignment operation.  */
 
-  extra = (required_align - BITS_PER_UNIT) / BITS_PER_UNIT;
-  size = plus_constant (Pmode, size, extra);
-  size = force_operand (size, NULL_RTX);
-
-  if (flag_stack_usage_info && pstack_usage_size)
-*pstack_usage_size += extra;
+  /* Since the stack is presumed to be aligned before this allocation,
+ we only need to increase the size of the allocation if the required
+ alignment is more than the stack alignment.  */
+  if (required_align > STACK_BOUNDARY)
+{
+  extra = (required_align - STACK_BOUNDARY) / BITS_PER_UNIT;
+  size = plus_constant (Pmode, size, extra);
+  size = force_operand (size, NULL_RTX);
+  if (size_align > STACK_BOUNDARY)
+   size_align = STACK_BOUNDARY;
 
-  if (extra && size_align > BITS_PER_UNIT)
-size_align = BITS_PER_UNIT;
+  if (flag_stack_usage_info && pstack_usage_size)
+   *pstack_usage_size += extra;
+}
 
   /* Round the size to a multiple of the required stack alignment.
  Since the stack is presumed to be rounded before this allocation,


Re: [PATCH] Improve alloca alignment

2017-07-27 Thread Jeff Law
On 07/26/2017 05:29 PM, Wilco Dijkstra wrote:
> Jeff Law wrote:
> 
>> +  if (required_align > MAX_SUPPORTED_STACK_ALIGNMENT)
>> +{
>> +  extra = (required_align - MAX_SUPPORTED_STACK_ALIGNMENT)
>> + / BITS_PER_UNIT;
>> +  size = plus_constant (Pmode, size, extra);
>> +  size = force_operand (size, NULL_RTX);
>>   
>> -  if (extra && size_align > BITS_PER_UNIT)
>> -size_align = BITS_PER_UNIT;
>> +  if (flag_stack_usage_info && pstack_usage_size)
>> + *pstack_usage_size += extra;
>> +}
> 
>> So it's unclear to me why you increase the size iff REQUIRED_ALIGN is
>> greater than MAX_SUPPORTED_STACK_ALIGNMENT.
>>
>> ISTM the safe assumption we can make in this code is that the stack is
>> aligned to STACK_BOUNDARY.  If so, isn't the right test
>>
>> (required_align > STACK_BOUNDARY)?
>>
>> And once we decide to make an adjustment, isn't the right adjustment to
>> make (required_align - STACK_BOUNDARY) / BITS_PER_UNIT?
>>
>> I feel like I must be missing something really important here.
> 
> Yes I think you're right that if STACK_BOUNDARY is the guaranteed stack
> alignment, it should check against STACK_BOUNDARY indeed.
Seems that way to me.

> 
> But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0
> seems wrong too given that round_push uses a different alignment to align to. 
I had started to dig into the history of this code, but just didn't have
the time to do so fully before needing to leave for the day.  To some
degree I was hoping you knew the rationale behind the test against
MAX_SUPPORTED_STACK_ALIGNMENT and I wouldn't have to do a ton of digging :-)

Jeff


Re: [PATCH] Improve alloca alignment

2017-07-26 Thread Wilco Dijkstra
Jeff Law wrote:

> +  if (required_align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +    {
> +  extra = (required_align - MAX_SUPPORTED_STACK_ALIGNMENT)
> + / BITS_PER_UNIT;
> +  size = plus_constant (Pmode, size, extra);
> +  size = force_operand (size, NULL_RTX);
>  
> -  if (extra && size_align > BITS_PER_UNIT)
> -    size_align = BITS_PER_UNIT;
> +  if (flag_stack_usage_info && pstack_usage_size)
> + *pstack_usage_size += extra;
> +    }

> So it's unclear to me why you increase the size iff REQUIRED_ALIGN is
> greater than MAX_SUPPORTED_STACK_ALIGNMENT.
>
> ISTM the safe assumption we can make in this code is that the stack is
> aligned to STACK_BOUNDARY.  If so, isn't the right test
>
> (required_align > STACK_BOUNDARY)?
>
> And once we decide to make an adjustment, isn't the right adjustment to
> make (required_align - STACK_BOUNDARY) / BITS_PER_UNIT?
>
> I feel like I must be missing something really important here.

Yes I think you're right that if STACK_BOUNDARY is the guaranteed stack
alignment, it should check against STACK_BOUNDARY indeed.

But then the check size_align % MAX_SUPPORTED_STACK_ALIGNMENT != 0
seems wrong too given that round_push uses a different alignment to align to. 

Wilco










Re: [PATCH] Improve alloca alignment

2017-07-26 Thread Jeff Law
On 07/26/2017 11:39 AM, Wilco Dijkstra wrote:
> This patch improves alloca alignment.  Currently alloca reserves
> too much space as it aligns twice, and generates unnecessary stack
> alignment code.  For example alloca (16) generates:
> 
>   sub sp, sp, #32   ???
>   mov x1, sp
> 
> Similarly alloca (x) generates:
> 
>   add x0, x0, 30???
>   and x0, x0, -16
>   sub sp, sp, x0
>   mov x0, sp
> 
> __builtin_alloca_with_align (x, 256):
>   add x0, x0, 78???
>   and x0, x0, -16
>   sub sp, sp, x0
>   add x0, sp, 63
>   and x0, x0, -64
> 
> As can be seen the alignment adjustment value is incorrect.
> When the requested alignment is lower than the stack alignment, no
> extra alignment is needed.  If the requested alignment is higher,
> we need to increase the size by the difference of the requested 
> alignment and the stack alignment.  As a result, the alloca alignment
> is exactly as expected:
> 
> alloca (16):
>   sub sp, sp, #16
>   mov x1, sp
> 
> alloca (x):
>   add x0, x0, 15
>   and x0, x0, -16
>   sub sp, sp, x0
>   mov x0, sp
> 
> __builtin_alloca_with_align (x, 256):
>   add x0, x0, 63
>   and x0, x0, -16
>   sub sp, sp, x0
>   add x0, sp, 63
>   and x0, x0, -64
> 
> ChangeLog:
> 2017-07-26  Wilco Dijkstra  
> 
>   * explow.c (get_dynamic_stack_size): Improve dynamic alignment.
Hehe.  This stuff is a mess.  Dominik went round and round in this code
about a year ago, I'm not sure we ever got it right for the cases he
wanted to improve.

This is something I wanted to look at but had deferred (it makes writing
some stack-clash tests of this code more painful than it really needs to
be).


> 
> --
> diff --git a/gcc/explow.c b/gcc/explow.c
> index 
> 50074e281edd5270c76d29feac6b7a92f598d11d..fbdda5fa1e303664e346f975270415b40aed252d
>  100644
> --- a/gcc/explow.c
> +++ b/gcc/explow.c
> @@ -1234,15 +1234,22 @@ get_dynamic_stack_size (rtx *psize, unsigned 
> size_align,
>   example), so we must preventively align the value.  We leave space
>   in SIZE for the hole that might result from the alignment operation.  */
>  
> -  extra = (required_align - BITS_PER_UNIT) / BITS_PER_UNIT;
> -  size = plus_constant (Pmode, size, extra);
> -  size = force_operand (size, NULL_RTX);
> -
> -  if (flag_stack_usage_info && pstack_usage_size)
> -*pstack_usage_size += extra;
> +  /* Since the stack is presumed to be aligned before this allocation,
> + we only need to increase the size of the allocation if the required
> + alignment is more than the stack alignment.
> + Note size_align doesn't need to be updated - if it is larger than the
> + stack alignment, size remains a multiple of the stack alignment, so
> + we can skip rounding up to the stack alignment.  */
> +  if (required_align > MAX_SUPPORTED_STACK_ALIGNMENT)
> +{
> +  extra = (required_align - MAX_SUPPORTED_STACK_ALIGNMENT)
> + / BITS_PER_UNIT;
> +  size = plus_constant (Pmode, size, extra);
> +  size = force_operand (size, NULL_RTX);
>  
> -  if (extra && size_align > BITS_PER_UNIT)
> -size_align = BITS_PER_UNIT;
> +  if (flag_stack_usage_info && pstack_usage_size)
> + *pstack_usage_size += extra;
> +}
So it's unclear to me why you increase the size iff REQUIRED_ALIGN is
greater than MAX_SUPPORTED_STACK_ALIGNMENT.

ISTM the safe assumption we can make in this code is that the stack is
aligned to STACK_BOUNDARY.  If so, isn't the right test

(required_align > STACK_BOUNDARY)?

And once we decide to make an adjustment, isn't the right adjustment to
make (required_align - STACK_BOUNDARY) / BITS_PER_UNIT?

I feel like I must be missing something really important here.

jeff