Re: [PR64164] drop copyrename, integrate into expand

2015-12-04 Thread Dominik Vogt
On Fri, Mar 27, 2015 at 03:04:05PM -0300, Alexandre Oliva wrote:
> This patch reworks the out-of-ssa expander to enable coalescing of SSA
> partitions that don't share the same base name.  This is done only when
> optimizing.
> 
> The test we use to tell whether two partitions can be merged no longer
> demands them to have the same base variable when optimizing, so they
> become eligible for coalescing, as they would after copyrename.  We then
> compute the partitioning we'd get if all coalescible partitions were
> coalesced, using this partition assignment to assign base vars numbers.
> These base var numbers are then used to identify conflicts, which used
> to be based on shared base vars or base types.
> 
> We now propagate base var names during coalescing proper, only towards
> the leader variable.  I'm no longer sure this is still needed, but
> something about handling variables and results led me this way and I
> didn't revisit it.  I might rework that with a later patch, or a later
> revision of this patch; it would require other means to identify
> partitions holding result_decls during merging, or allow that and deal
> with param and result decls in a different way during expand proper.
> 
> I had to fix two lingering bugs in order for the whole thing to work: we
> perform conflict detection after abnormal coalescing, but we computed
> live ranges involving only the partition leaders, so conflicts with
> other names already coalesced wouldn't be detected.  The other problem
> was that we didn't track default defs for parms as live at entry, so
> they might end up coalesced.  I guess none of these problems would have
> been exercised in practice, because we wouldn't even consider merging
> ssa names associated with different variables.
> 
> In the end, I verified that this fixed the codegen regression in the
> PR64164 testcase, that failed to merge two partitions that could in
> theory be merged, but that wasn't even considered due to differences in
> the SSA var names.
> 
> I'd agree that disregarding the var names and dropping 4 passes is too
> much of a change to fix this one problem, but...  it's something we
> should have long tackled, and it gets this and other jobs done, so...
> 
> Regstrapped on x86_64-linux-gnu native and on i686-pc-linux-gnu native
> on x86_64, so without lto.  Is this ok to install?

The patch that got committed as a result of this discussion causes
a performance regression on s390[x].  Bug report:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68695

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany



Re: [PR64164] drop copyrename, integrate into expand

2015-11-23 Thread Jeff Law

On 11/16/2015 05:07 PM, Alexandre Oliva wrote:


The check is not in my patch, indeed.  That's because the previous block
performs the runtime check, and it only lets through two cases: the one
we handle, and the one nobody uses.
That was the conclusion I was starting to come to, but expressed so 
poorly in my last message.  Sadly it was non-obvious from staring at the 
current code.  Though I must admit that after a week, I can see it 
better now.  Maybe that's a result of re-reading your message a 
half-dozen more times with the current code and your patch all visible 
in windows next to each other :-)



Prior to your change we'd just blindly copy from ENTRY_PARM to MEM, 
which would result in missing the implicit shift if MEM wasn't actually 
a memory.


You're just moving that conditional up and handling the shift 
explicitly.  You've got asserts for the cases you're not handling (and 
no, I'm not aware of the need for this on any LE architecture, while I 
am aware of BE architectures that align in both directions).




Any suggestions on how to improve the comments so that they convey
enough of this reasoning to make sense, without our having to write a
book :-) on the topic?
Refer back to this thread? :-)  Seriously though, looking at things a 
week later, I can see it much better now.  Thanks for your patience on this.


OK for the trunk,
jeff


Re: [PR64164] drop copyrename, integrate into expand

2015-11-16 Thread Alexandre Oliva
On Nov 13, 2015, Jeff Law  wrote:

> On 11/11/2015 11:10 AM, Alexandre Oliva wrote:
>> On Nov 10, 2015, Jeff Law  wrote:
>> 
 * function.c (assign_parm_setup_block): Right-shift
 upward-padded big-endian args when bypassing the stack slot.
>>> Don't you need to check the value of BLOCK_REG_PADDING at runtime?
>>> The padding is essentially allowed to vary.
>> 
>> Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether
>> upward-padding occurred and shifting is required.
>> 
>>> If you  look at the other places where BLOCK_REG_PADDING is used, it's
>>> checked in a #ifdef, then again inside a if conditional.
>> 
>> That's what I do in the patch too.
> ?  I don't see the runtime check in your patch.  I see a couple
> gcc_asserts, but no runtime check of BLOCK_REG_PADDING.

The check is not in my patch, indeed.  That's because the previous block
performs the runtime check, and it only lets through two cases: the one
we handle, and the one nobody uses.

The previous block tests this:

  if (mode != BLKmode
#ifdef BLOCK_REG_PADDING
  && (size == UNITS_PER_WORD
  || (BLOCK_REG_PADDING (mode, data->passed_type, 1)
  != (BYTES_BIG_ENDIAN ? upward : downward)))
#endif
  )

i.e., whether we know the mode of the passed value, and its word-sized,
or its padded such that the passed value is in the lowpart of the word.
Since this is in a block that runs when size <= UNITS_PER_WORD, this
catches (and works for) all cases of default padding (when
BLOCK_REG_PADDING is not defined), and for cases in which
BLOCK_REG_PADDING is defined so as to behave like the default padding,
at least for smaller-than-word modes.

So, since this handles little-endian bytes with upward padding and
big-endian bytes with downward padding, what remains to be handled is
little-endian bytes with downward padding and big-endian bytes with
upward padding.  I found no evidence that the former is ever used
anywhere, or why anyone would ever force shifting for both REG and MEM
use, and I don't see how the code would have dealt with this case
anyway, so I left it unhandled.  The other case, big-endian bytes with
upward padding, is precisely the one that my previous patch broke on
AArch64: we have the passed values pushed to the upper part of the REG
so that it could be stored in memory as a whole word and then accessed
in the smaller mode at the same address.  After checking that this is
the case at hand, we shift the value as if we stored it in memory as a
word and loaded it in the value mode.

>> That said, the initial conditions in the if/else-if/else chain for the
>> no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases
>> correctly, so that, if BLOCK_REG_PADDING is not defined, we can just
>> skip the !MEM_P block altogether.  That's also the reason why we can go
>> straight to shifting when we get there.
>> 
>> I tried to document my reasoning in the comments, but maybe it was still
>> too obscure?
> Certainly seems that way.  Is it your assertion that the new code is
> what we want regardless of the *value* of REG_BLOCK_PADDING?

Sort of.  If we get to that point, there's only one reasonable value of
BLOCK_REG_PADDING (*), although there's another possible value that we
historically haven't handled and that makes very little sense to
support.  We'd have silently corrupted it before, while now we'd get an
assertion failure.  I count that as an improvement, though it's unlikely
we'd ever hit it: anyone trying to define BLOCK_REG_PADDING so as to pad
small args downward on little-endian bytes would AFAICT soon find out it
doesn't work.

(*) unless mode is BLKmode, which the newly-added code implicitly
excludes by testing that we don't have a MEM, but rather a REG.

> Essentially meaning the check in the IF is covering both cases?

Among all cases for arguments that are word-sized or smaller, the
initial IF (not present in the patch) covers all of the "usual" cases.
The remaining blocks, including the one I added, cover the remaining
handled case, namely, BIG_ENDIAN_BYTES and upward BLOCK_REG_PADDING, or
BLKmode BIG_ENDIAN_BYTES and downward BLOCK_REG_PADDING (that needs the
opposite padding when storing to big-endian mem, so that the value can
be accessed at the address in which the full word is stored).

> What am I missing here?

I agree that the way the remaining tests are written doesn't make it
clear that they're all handling a single case, which makes things
confusing.  In part, that's because they really aren't; they also deal
with BLKmode MEMs with "usual" padding.  But that's not a case that the
patch affects, because we don't have BLKmode REGs.

Any suggestions on how to improve the comments so that they convey
enough of this reasoning to make sense, without our having to write a
book :-) on the topic?

Thanks,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change 

Re: [PR64164] drop copyrename, integrate into expand

2015-11-12 Thread Jeff Law

On 11/11/2015 11:10 AM, Alexandre Oliva wrote:

On Nov 10, 2015, Jeff Law  wrote:


* function.c (assign_parm_setup_block): Right-shift
upward-padded big-endian args when bypassing the stack slot.

Don't you need to check the value of BLOCK_REG_PADDING at runtime?
The padding is essentially allowed to vary.


Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether
upward-padding occurred and shifting is required.


If you  look at the other places where BLOCK_REG_PADDING is used, it's
checked in a #ifdef, then again inside a if conditional.


That's what I do in the patch too.
?  I don't see the runtime check in your patch.  I see a couple 
gcc_asserts, but no runtime check of BLOCK_REG_PADDING.




That said, the initial conditions in the if/else-if/else chain for the
no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases
correctly, so that, if BLOCK_REG_PADDING is not defined, we can just
skip the !MEM_P block altogether.  That's also the reason why we can go
straight to shifting when we get there.

I tried to document my reasoning in the comments, but maybe it was still
too obscure?
Certainly seems that way.  Is it your assertion that the new code is 
what we want regardless of the *value* of REG_BLOCK_PADDING? 
Essentially meaning the check in the IF is covering both cases?


What am I missing here?

Jeff



Re: [PR64164] drop copyrename, integrate into expand

2015-11-11 Thread Alexandre Oliva
On Nov 10, 2015, Jeff Law  wrote:

>> * function.c (assign_parm_setup_block): Right-shift
>> upward-padded big-endian args when bypassing the stack slot.
> Don't you need to check the value of BLOCK_REG_PADDING at runtime?
> The padding is essentially allowed to vary.

Well, yeah, it's the result of BLOCK_REG_PADDING that tells whether
upward-padding occurred and shifting is required.

> If you  look at the other places where BLOCK_REG_PADDING is used, it's
> checked in a #ifdef, then again inside a if conditional.

That's what I do in the patch too.

That said, the initial conditions in the if/else-if/else chain for the
no-larger-than-a-word case cover all of the non-BLOCK_REG_PADDING cases
correctly, so that, if BLOCK_REG_PADDING is not defined, we can just
skip the !MEM_P block altogether.  That's also the reason why we can go
straight to shifting when we get there.

I tried to document my reasoning in the comments, but maybe it was still
too obscure?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-11-10 Thread Alan Lawrence

On 05/11/15 05:08, Alexandre Oliva wrote:

[PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg
for  gcc/ChangeLog

PR rtl-optimization/67753
PR rtl-optimization/64164
* function.c (assign_parm_setup_block): Avoid allocating a
stack slot if we don't have an ABI-reserved one.  Emit the
copy to target_reg in the conversion seq if the copy from
entry_parm is in it too.  Don't use the conversion seq to copy
a PARALLEL to a REG or a CONCAT.


Since this change, we have on aarch64_be:

FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O1
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O2
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O3 -g
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -Os
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -Og -g

The difference in the assembler looks as follows (this is at -Og):

 func_return_val_10:
-   sub sp, sp, #16
-   lsr x2, x1, 48
-   lsr x1, x1, 32
+   ubfxx2, x1, 16, 16
fmovx3, d0
// Start of user assembly
 // 23 "func-ret-4.c" 1
mov x0, x30
 // 0 "" 2
// End of user assembly
adrpx3, saved_return_address
str x0, [x3, #:lo12:saved_return_address]
adrpx0, myfunc
add x0, x0, :lo12:myfunc
// Start of user assembly
 // 23 "func-ret-4.c" 1
mov x30, x0
 // 0 "" 2
// End of user assembly
bfi w0, w2, 16, 16
bfi w0, w1, 0, 16
lsl x0, x0, 32
-   add sp, sp, 16

(ubfx is a bitfield extract, the first immediate is the lsbit, the second the 
width. lsr = logical shift right.) And in the RTL dump, this (before the patch):


(insn 4 3 5 2 (set (mem/c:DI (plus:DI (reg/f:DI 68 virtual-stack-vars)
(const_int -8 [0xfff8])) [0 t+0 S8 A64])
(reg:DI 1 x1)) func-ret-4.c:23 -1
 (nil))
(insn 5 4 6 2 (set (reg:HI 78 [ t ])
(mem/c:HI (plus:DI (reg/f:DI 68 virtual-stack-vars)
(const_int -8 [0xfff8])) [0 t+0 S2 A64])) 
func-ret-4.c:23 -1

 (nil))
(insn 6 5 7 2 (set (reg:HI 79 [ t+2 ])
(mem/c:HI (plus:DI (reg/f:DI 68 virtual-stack-vars)
(const_int -6 [0xfffa])) [0 t+2 S2 A16])) 
func-ret-4.c:23 -1

 (nil))

becomes (after the patch):

(insn 4 3 5 2 (set (subreg:SI (reg:CHI 80) 0)
(reg:SI 1 x1 [ t ])) func-ret-4.c:23 -1
 (nil))
(insn 5 4 6 2 (set (reg:SI 81)
(subreg:SI (reg:CHI 80) 0)) func-ret-4.c:23 -1
 (nil))
(insn 6 5 7 2 (set (subreg:DI (reg:HI 82) 0)
(zero_extract:DI (subreg:DI (reg:SI 81) 0)
(const_int 16 [0x10])
(const_int 16 [0x10]))) func-ret-4.c:23 -1
 (nil))
(insn 7 6 8 2 (set (reg:HI 78 [ t ])
(reg:HI 82)) func-ret-4.c:23 -1
 (nil))
(insn 8 7 9 2 (set (reg:SI 83)
(subreg:SI (reg:CHI 80) 0)) func-ret-4.c:23 -1
 (nil))
(insn 9 8 10 2 (set (reg:HI 79 [ t+2 ])
(subreg:HI (reg:SI 83) 2)) func-ret-4.c:23 -1
 (nil))

--Alan



Re: [PR64164] drop copyrename, integrate into expand

2015-11-10 Thread Alexandre Oliva
On Nov 10, 2015, Alan Lawrence  wrote:

> FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O2

Ugh, sorry.  I even checked that testcase by hand before submitting the
patch, because I knew it took the paths I was changing, but I didn't
realize the stack store and load would amount to shifts when the stack
slot was bypassed.

With the following patch, we get a lsr and a ubfx, without the sp
adjustments.  Please let me know if it causes any further problems.  So
far, I've tested it on x86_64-linux-gnu, i686-linux-gnu, and
ppc64le-linux-gnu; the ppc64-linux-gnu test run is running slower and
probably won't be done before I call it a day, but I wanted to give you
something before taking off for the day.

Is this ok to install if ppc64-linux-gnu also regstraps successfully?


[PR67753] adjust for padding when bypassing memory in assign_parm_setup_block

From: Alexandre Oliva 

Storing a register in memory as a full word and then accessing the
same memory address under a smaller-than-word mode amounts to
right-shifting of the register word on big endian machines.  So, if
BLOCK_REG_PADDING chooses upward padding for BYTES_BIG_ENDIAN, and
we're copying from the entry_parm REG directly to a pseudo, bypassing
any stack slot, perform the shifting explicitly.

This fixes the miscompile of function_return_val_10 in
gcc.target/aarch64/aapcs64/func-ret-4.c for target aarch64_be-elf
introduced in the first patch for 67753.

for  gcc/ChangeLog

PR rtl-optimization/67753
PR rtl-optimization/64164
* function.c (assign_parm_setup_block): Right-shift
upward-padded big-endian args when bypassing the stack slot.
---
 gcc/function.c |   44 +---
 1 file changed, 41 insertions(+), 3 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index a637cb3..1ee092c 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3002,6 +3002,38 @@ assign_parm_setup_block (struct assign_parm_data_all 
*all,
  emit_move_insn (change_address (mem, mode, 0), reg);
}
 
+#ifdef BLOCK_REG_PADDING
+ /* Storing the register in memory as a full word, as
+move_block_from_reg below would do, and then using the
+MEM in a smaller mode, has the effect of shifting right
+if BYTES_BIG_ENDIAN.  If we're bypassing memory, the
+shifting must be explicit.  */
+ else if (!MEM_P (mem))
+   {
+ rtx x;
+
+ /* If the assert below fails, we should have taken the
+mode != BLKmode path above, unless we have downward
+padding of smaller-than-word arguments on a machine
+with little-endian bytes, which would likely require
+additional changes to work correctly.  */
+ gcc_checking_assert (BYTES_BIG_ENDIAN
+  && (BLOCK_REG_PADDING (mode,
+ data->passed_type, 1)
+  == upward));
+
+ int by = (UNITS_PER_WORD - size) * BITS_PER_UNIT;
+
+ x = gen_rtx_REG (word_mode, REGNO (entry_parm));
+ x = expand_shift (RSHIFT_EXPR, word_mode, x, by,
+   NULL_RTX, 1);
+ x = force_reg (word_mode, x);
+ x = gen_lowpart_SUBREG (GET_MODE (mem), x);
+
+ emit_move_insn (mem, x);
+   }
+#endif
+
  /* Blocks smaller than a word on a BYTES_BIG_ENDIAN
 machine must be aligned to the left before storing
 to memory.  Note that the previous test doesn't
@@ -3023,14 +3055,20 @@ assign_parm_setup_block (struct assign_parm_data_all 
*all,
  tem = change_address (mem, word_mode, 0);
  emit_move_insn (tem, x);
}
- else if (!MEM_P (mem))
-   emit_move_insn (mem, entry_parm);
  else
move_block_from_reg (REGNO (entry_parm), mem,
 size_stored / UNITS_PER_WORD);
}
   else if (!MEM_P (mem))
-   emit_move_insn (mem, entry_parm);
+   {
+ gcc_checking_assert (size > UNITS_PER_WORD);
+#ifdef BLOCK_REG_PADDING
+ gcc_checking_assert (BLOCK_REG_PADDING (GET_MODE (mem),
+ data->passed_type, 0)
+  == upward);
+#endif
+ emit_move_insn (mem, entry_parm);
+   }
   else
move_block_from_reg (REGNO (entry_parm), mem,
 size_stored / UNITS_PER_WORD);


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-11-10 Thread Jeff Law

On 11/10/2015 03:58 PM, Alexandre Oliva wrote:

On Nov 10, 2015, Alan Lawrence  wrote:


FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c execution,  -O2


Ugh, sorry.  I even checked that testcase by hand before submitting the
patch, because I knew it took the paths I was changing, but I didn't
realize the stack store and load would amount to shifts when the stack
slot was bypassed.

With the following patch, we get a lsr and a ubfx, without the sp
adjustments.  Please let me know if it causes any further problems.  So
far, I've tested it on x86_64-linux-gnu, i686-linux-gnu, and
ppc64le-linux-gnu; the ppc64-linux-gnu test run is running slower and
probably won't be done before I call it a day, but I wanted to give you
something before taking off for the day.

Is this ok to install if ppc64-linux-gnu also regstraps successfully?


[PR67753] adjust for padding when bypassing memory in assign_parm_setup_block

From: Alexandre Oliva 

Storing a register in memory as a full word and then accessing the
same memory address under a smaller-than-word mode amounts to
right-shifting of the register word on big endian machines.  So, if
BLOCK_REG_PADDING chooses upward padding for BYTES_BIG_ENDIAN, and
we're copying from the entry_parm REG directly to a pseudo, bypassing
any stack slot, perform the shifting explicitly.

This fixes the miscompile of function_return_val_10 in
gcc.target/aarch64/aapcs64/func-ret-4.c for target aarch64_be-elf
introduced in the first patch for 67753.

for  gcc/ChangeLog

PR rtl-optimization/67753
PR rtl-optimization/64164
* function.c (assign_parm_setup_block): Right-shift
upward-padded big-endian args when bypassing the stack slot.
Don't you need to check the value of BLOCK_REG_PADDING at runtime?  The 
padding is essentially allowed to vary.


If you  look at the other places where BLOCK_REG_PADDING is used, it's 
checked in a #ifdef, then again inside a if conditional.




Jeff



Re: [PR64164] drop copyrename, integrate into expand

2015-11-05 Thread Richard Biener
On Thu, Nov 5, 2015 at 6:08 AM, Alexandre Oliva  wrote:
> On Sep 23, 2015, Alexandre Oliva  wrote:
>
>> @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all 
>> *all,
> [snip]
>> +  if (GET_CODE (reg) != CONCAT)
>> +stack_parm = reg;
>> +  else
>> +/* This will use or allocate a stack slot that we'd rather
>> +   avoid.  FIXME: Could we avoid it in more cases?  */
>> +target_reg = reg;
>
> It turns out that we can, and that helps fixing PR67753.  In the end, I
> ended up using the ABI-reserved stack slot if there is one, but just
> allocating an unsplit complex pseudo fixes all remaining cases that used
> to require the allocation of a stack slot.  Yay!
>
> As for pr67753 proper, we emitted the store of the PARALLEL entry_parm
> into the stack parm only in the conversion seq, which was ultimately
> emitted after the copy from stack_parm to target_reg that was supposed
> to copy the value originally in entry_parm.  So we copied an
> uninitialized stack slot, and the subsequent store in the conversion seq
> was optimized out as dead.
>
> This caused a number of regressions on hppa-linux-gnu.  The fix for this
> is to arrange for the copy to target_reg to be emitted in the conversion
> seq if the copy to stack_parm was.  I can't determine whether this fix
> all reported regressions, but from visual inspection of the generated
> code I'm pretty sure it fixes at least gcc.c-torture/execute/pr38969.c.
>
>
> When we do NOT have an ABI-reserved stack slot, the store of the
> PARALLEL entry_parm into the intermediate pseudo doesn't need to go in
> the conversion seq (emit_group_store from a PARALLEL to a pseudo only
> uses registers, according to another comment in function.c), so I've
> simplified that case.
>
>
> This was regstrapped on x86_64-linux-gnu, i686-linux-gnu,
> ppc64-linux-gnu, ppc64el-linux-gnu, and cross-build-tested for all
> targets for which I've tested the earlier patches in the patchset.
> Ok to install?

Ok.

Thanks,
Richard.

>
>
> [PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg
>
> From: Alexandre Oliva 
>
> In assign_parms_setup_block, the copy of args in PARALLELs from
> entry_parm to stack_parm is deferred to the parm conversion insn seq,
> but the copy from stack_parm to target_reg was inserted in the normal
> copy seq, that is executed before the conversion insn seq.  Oops.
>
> We could do away with the need for an actual stack_parm in general,
> which would have avoided the need for emitting the copy to target_reg
> in the conversion seq, but at least on pa, due to the need for stack
> to copy between SI and SF modes, it seems like using the reserved
> stack slot is beneficial, so I put in logic to use a pre-reserved
> stack slot when there is one, and emit the copy to target_reg in the
> conversion seq if stack_parm was set up there.
>
> for  gcc/ChangeLog
>
> PR rtl-optimization/67753
> PR rtl-optimization/64164
> * function.c (assign_parm_setup_block): Avoid allocating a
> stack slot if we don't have an ABI-reserved one.  Emit the
> copy to target_reg in the conversion seq if the copy from
> entry_parm is in it too.  Don't use the conversion seq to copy
> a PARALLEL to a REG or a CONCAT.
> ---
>  gcc/function.c |   39 ++-
>  1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/function.c b/gcc/function.c
> index aaf49a4..156c72b 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -2879,6 +2879,7 @@ assign_parm_setup_block (struct assign_parm_data_all 
> *all,
>rtx entry_parm = data->entry_parm;
>rtx stack_parm = data->stack_parm;
>rtx target_reg = NULL_RTX;
> +  bool in_conversion_seq = false;
>HOST_WIDE_INT size;
>HOST_WIDE_INT size_stored;
>
> @@ -2895,9 +2896,23 @@ assign_parm_setup_block (struct assign_parm_data_all 
> *all,
>if (GET_CODE (reg) != CONCAT)
> stack_parm = reg;
>else
> -   /* This will use or allocate a stack slot that we'd rather
> -  avoid.  FIXME: Could we avoid it in more cases?  */
> -   target_reg = reg;
> +   {
> + target_reg = reg;
> + /* Avoid allocating a stack slot, if there isn't one
> +preallocated by the ABI.  It might seem like we should
> +always prefer a pseudo, but converting between
> +floating-point and integer modes goes through the stack
> +on various machines, so it's better to use the reserved
> +stack slot than to risk wasting it and allocating more
> +for the conversion.  */
> + if (stack_parm == NULL_RTX)
> +   {
> + int save = generating_concat_p;
> + generating_concat_p = 0;
> + stack_parm = gen_reg_rtx (mode);
> + generating_concat_p = save;
> +   }
> +   }
>

Re: [PR64164] drop copyrename, integrate into expand

2015-11-04 Thread Alexandre Oliva
On Sep 23, 2015, Alexandre Oliva  wrote:

> @@ -2982,38 +2887,39 @@ assign_parm_setup_block (struct assign_parm_data_all 
> *all,
[snip]
> +  if (GET_CODE (reg) != CONCAT)
> +stack_parm = reg;
> +  else
> +/* This will use or allocate a stack slot that we'd rather
> +   avoid.  FIXME: Could we avoid it in more cases?  */
> +target_reg = reg;

It turns out that we can, and that helps fixing PR67753.  In the end, I
ended up using the ABI-reserved stack slot if there is one, but just
allocating an unsplit complex pseudo fixes all remaining cases that used
to require the allocation of a stack slot.  Yay!

As for pr67753 proper, we emitted the store of the PARALLEL entry_parm
into the stack parm only in the conversion seq, which was ultimately
emitted after the copy from stack_parm to target_reg that was supposed
to copy the value originally in entry_parm.  So we copied an
uninitialized stack slot, and the subsequent store in the conversion seq
was optimized out as dead.

This caused a number of regressions on hppa-linux-gnu.  The fix for this
is to arrange for the copy to target_reg to be emitted in the conversion
seq if the copy to stack_parm was.  I can't determine whether this fix
all reported regressions, but from visual inspection of the generated
code I'm pretty sure it fixes at least gcc.c-torture/execute/pr38969.c.


When we do NOT have an ABI-reserved stack slot, the store of the
PARALLEL entry_parm into the intermediate pseudo doesn't need to go in
the conversion seq (emit_group_store from a PARALLEL to a pseudo only
uses registers, according to another comment in function.c), so I've
simplified that case.


This was regstrapped on x86_64-linux-gnu, i686-linux-gnu,
ppc64-linux-gnu, ppc64el-linux-gnu, and cross-build-tested for all
targets for which I've tested the earlier patches in the patchset.
Ok to install?



[PR67753] fix copy of PARALLEL entry_parm to CONCAT target_reg

From: Alexandre Oliva 

In assign_parms_setup_block, the copy of args in PARALLELs from
entry_parm to stack_parm is deferred to the parm conversion insn seq,
but the copy from stack_parm to target_reg was inserted in the normal
copy seq, that is executed before the conversion insn seq.  Oops.

We could do away with the need for an actual stack_parm in general,
which would have avoided the need for emitting the copy to target_reg
in the conversion seq, but at least on pa, due to the need for stack
to copy between SI and SF modes, it seems like using the reserved
stack slot is beneficial, so I put in logic to use a pre-reserved
stack slot when there is one, and emit the copy to target_reg in the
conversion seq if stack_parm was set up there.

for  gcc/ChangeLog

PR rtl-optimization/67753
PR rtl-optimization/64164
* function.c (assign_parm_setup_block): Avoid allocating a
stack slot if we don't have an ABI-reserved one.  Emit the
copy to target_reg in the conversion seq if the copy from
entry_parm is in it too.  Don't use the conversion seq to copy
a PARALLEL to a REG or a CONCAT.
---
 gcc/function.c |   39 ++-
 1 file changed, 34 insertions(+), 5 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index aaf49a4..156c72b 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2879,6 +2879,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
   rtx entry_parm = data->entry_parm;
   rtx stack_parm = data->stack_parm;
   rtx target_reg = NULL_RTX;
+  bool in_conversion_seq = false;
   HOST_WIDE_INT size;
   HOST_WIDE_INT size_stored;
 
@@ -2895,9 +2896,23 @@ assign_parm_setup_block (struct assign_parm_data_all 
*all,
   if (GET_CODE (reg) != CONCAT)
stack_parm = reg;
   else
-   /* This will use or allocate a stack slot that we'd rather
-  avoid.  FIXME: Could we avoid it in more cases?  */
-   target_reg = reg;
+   {
+ target_reg = reg;
+ /* Avoid allocating a stack slot, if there isn't one
+preallocated by the ABI.  It might seem like we should
+always prefer a pseudo, but converting between
+floating-point and integer modes goes through the stack
+on various machines, so it's better to use the reserved
+stack slot than to risk wasting it and allocating more
+for the conversion.  */
+ if (stack_parm == NULL_RTX)
+   {
+ int save = generating_concat_p;
+ generating_concat_p = 0;
+ stack_parm = gen_reg_rtx (mode);
+ generating_concat_p = save;
+   }
+   }
   data->stack_parm = NULL;
 }
 
@@ -2938,7 +2953,9 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
   mem = validize_mem (copy_rtx (stack_parm));
 
   /* Handle values in multiple non-contiguous locations.  */
-  if (GET_CODE (entry_parm) == PARALLEL)
+  if 

Re: [PR67828] don't unswitch loops on undefined SSA values (was: Re: [PR64164] drop copyrename, integrate into expand)

2015-10-09 Thread Richard Biener
On Fri, Oct 9, 2015 at 7:26 AM, Alexandre Oliva  wrote:
> This patch fixes a latent bug in loop unswitching exposed by the PR64164
> changes.
>
> We would move a test out of a loop that might never have been executed,
> and that accessed an uninitialized variable.  The uninitialized SSA
> name, due to uncprop, now gets coalescesd with other SSA names,
> expanding the ill effects of the undefined behavior we introduce: in
> spite of the zero initialization introduced in later rtl stages for the
> uninitialized pseudo, by then we've already expanded a PHI node that
> referenced the unitialized variable in the path coming from a path in
> which it would necessarily be zero, to a copy from the coalesced pseudo,
> that gets modified between the zero-initialization and the copy, so the
> copied zero is no longer zero.  Oops.
>
> We might want to be stricter in coalesce conflict detection to avoid
> this sort of problem, and perhaps to avoid undefined values in uncprop,
> but this would all be attempting to limit the effects of undefined
> behavior, which is probably a waste of effort.  As long as we avoid
> introducing undefined behavior ourselves, we shouldn't have to do any of
> that.  So, this patch fixes loop unswitching so as to not introduce
> undefined behavior.
>
> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

Ok.

Thanks,
Richard.

>
> [PR67828] don't unswitch on default defs of non-parms
>
> From: Alexandre Oliva 
>
> for  gcc/ChangeLog
>
> PR rtl-optimizatoin/67828
> * tree-ssa-loop-unswitch.c: Include tree-ssa.h.
> (tree_may_unswitch_on): Don't unswitch on expressions
> involving undefined values.
>
> for  gcc/testsuite/ChangeLog
>
> PR rtl-optimization/67828
> * gcc.dg/torture/pr67828.c: New.
> ---
>  gcc/testsuite/gcc.dg/torture/pr67828.c |   43 
> 
>  gcc/tree-ssa-loop-unswitch.c   |5 
>  2 files changed, 48 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr67828.c
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr67828.c 
> b/gcc/testsuite/gcc.dg/torture/pr67828.c
> new file mode 100644
> index 000..c7b6965
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr67828.c
> @@ -0,0 +1,43 @@
> +/* Check that we don't misoptimize the final value of d.  We used to
> +   apply loop unswitching on if(j), introducing undefined behavior
> +   that the original code wouldn't exercise, and this undefined
> +   behavior would get later passes to misoptimize the loop.  */
> +
> +/* { dg-do run } */
> +
> +#include 
> +#include 
> +
> +int x;
> +
> +int __attribute__ ((noinline, noclone))
> +xprintf (int d) {
> +  if (d)
> +{
> +  if (x)
> +   printf ("%d", d);
> +  abort ();
> +}
> +}
> +
> +int a, b;
> +short c;
> +
> +int
> +main ()
> +{
> +  int j, d = 1;
> +  for (; c >= 0; c++)
> +{
> +  a = d;
> +  d = 0;
> +  if (b)
> +   {
> + xprintf (0);
> + if (j)
> +   xprintf (0);
> +   }
> +}
> +  xprintf (d);
> +  exit (0);
> +}
> diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
> index 4328d6a..d6faa37 100644
> --- a/gcc/tree-ssa-loop-unswitch.c
> +++ b/gcc/tree-ssa-loop-unswitch.c
> @@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "internal-fn.h"
>  #include "gimplify.h"
>  #include "tree-cfg.h"
> +#include "tree-ssa.h"
>  #include "tree-ssa-loop-niter.h"
>  #include "tree-ssa-loop.h"
>  #include "tree-into-ssa.h"
> @@ -139,6 +140,10 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop)
>/* Condition must be invariant.  */
>FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
>  {
> +  /* Unswitching on undefined values would introduce undefined
> +behavior that the original program might never exercise.  */
> +  if (ssa_undefined_value_p (use, true))
> +   return NULL_TREE;
>def = SSA_NAME_DEF_STMT (use);
>def_bb = gimple_bb (def);
>if (def_bb
>
>
> --
> Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR67766] reorder return value copying from PARALLELs and CONCATs (was: Re: [PR64164] drop copyrename, integrate into expand)

2015-10-09 Thread Richard Biener
On Fri, Oct 9, 2015 at 7:36 AM, Alexandre Oliva  wrote:
> This fixes fallout from the PR64164 expander revamp.  On alpha, PARALLEL
> hard return values may be modeless, and this confuses the code that
> wants to copy the pseudo/s in the returned value to the return hard
> regs.
>
> It used to work because PARALLELs and CONCATs used to lead to DECL_RTL
> with the same mode, but now we try harder to create a pseudo or MEM with
> a reasonable mode.
>
> The solution was as simple as moving down the code that handled mode
> differences, so that PARALLELs and CONCATs are handled as they should.
> Since AFAICT they don't ever have to deal with mode promotion anyway, we
> should be fine with this simple change, that Uroš kindly tested with an
> alpha-linux-gnu regstrap.  I tested it myself on x86_64-linux-gnu and
> i686-linux-gnu.
>
> Ok to install?

Ok.

Richard.

>
> [PR67766] reorder handling of parallels, concats and promoted values in return
>
> From: Alexandre Oliva 
>
> for  gcc/ChangeLog
>
> PR middle-end/67766
> * function.c (expand_function_end): Move return value
> promotion past the handling of PARALLELs and CONCATs.
> ---
>  gcc/function.c |   24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
>
> diff --git a/gcc/function.c b/gcc/function.c
> index e76ba2b..d16d6d8 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -5446,18 +5446,6 @@ expand_function_end (void)
>   decl_rtl);
>   shift_return_value (GET_MODE (decl_rtl), true, real_decl_rtl);
> }
> - /* If a named return value dumped decl_return to memory, then
> -we may need to re-do the PROMOTE_MODE signed/unsigned
> -extension.  */
> - else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
> -   {
> - int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
> - promote_function_mode (TREE_TYPE (decl_result),
> -GET_MODE (decl_rtl), ,
> -TREE_TYPE (current_function_decl), 1);
> -
> - convert_move (real_decl_rtl, decl_rtl, unsignedp);
> -   }
>   else if (GET_CODE (real_decl_rtl) == PARALLEL)
> {
>   /* If expand_function_start has created a PARALLEL for decl_rtl,
> @@ -5488,6 +5476,18 @@ expand_function_end (void)
>   emit_move_insn (tmp, decl_rtl);
>   emit_move_insn (real_decl_rtl, tmp);
> }
> + /* If a named return value dumped decl_return to memory, then
> +we may need to re-do the PROMOTE_MODE signed/unsigned
> +extension.  */
> + else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
> +   {
> + int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
> + promote_function_mode (TREE_TYPE (decl_result),
> +GET_MODE (decl_rtl), ,
> +TREE_TYPE (current_function_decl), 1);
> +
> + convert_move (real_decl_rtl, decl_rtl, unsignedp);
> +   }
>   else
> emit_move_insn (real_decl_rtl, decl_rtl);
> }
>
>
> --
> Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-10-09 Thread Jeff Law

On 10/07/2015 04:36 PM, Alexandre Oliva wrote:

On Sep 29, 2015, Szabolcs Nagy  wrote:


this commit



commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
Author: aoliva 
Date:   2015-09-27 09:02:00 +



 revert to assign_parms assignments using default defs
 ...



introduced a test failure on arm-none-eabi (using newlib, compiling
with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard ):



FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2


Thanks for the report.

The problem here is that we don't allocate the pseudo assigned to
 to r0.  That's because we coalesce  versions with
another variable that crosses a function call.  We do that because
uncprop brings these unrelated variables, that happen to contain the
same -1 value we want to return, into the PHI node with the final
 value.

We can't coalesce both start and end with , because start and
end conflict, but by chance we try start first, and that succeeds.  If
we tried end first (e.g., by giving it a higher coalesce priority,
because fewer calls are crossed by its value in the path to the relevant
edge), we could have got the coalesced variable assigned to r0, and that
would enable us to optimize out the copy to r0 before return, and so
merge the return-only basic block with other blocks.  But ATM we don't
take the definition point or path to the edge into account when
computing coalesce costs, so we can't deterministically do better for
this testcase, and I'm not sure using these additional information would
make it better overall.

Compiling with -fno-tree-dominator-opts skips uncprop so that we don't
even try to coalesce other variables with , so we get the code
expected by the testcase.  But we obviously don't want to disable this
optimization in general.

Any other thoughts, anyone?
I keep coming back to my idea to avoid uncprop when doing so creates 
conflicts.  See c#22, c#24 & c#28.


It looks like I tossed my WIP around those ideas when you fixed 64164.

Jeff



Re: [PR64164] drop copyrename, integrate into expand

2015-10-08 Thread Richard Biener
On Thu, Oct 8, 2015 at 12:36 AM, Alexandre Oliva  wrote:
> On Sep 29, 2015, Szabolcs Nagy  wrote:
>
>> this commit
>
>> commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
>> Author: aoliva 
>> Date:   2015-09-27 09:02:00 +
>
>> revert to assign_parms assignments using default defs
>> ...
>
>> introduced a test failure on arm-none-eabi (using newlib, compiling
>> with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard ):
>
>> FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2
>
> Thanks for the report.
>
> The problem here is that we don't allocate the pseudo assigned to
>  to r0.  That's because we coalesce  versions with
> another variable that crosses a function call.  We do that because
> uncprop brings these unrelated variables, that happen to contain the
> same -1 value we want to return, into the PHI node with the final
>  value.
>
> We can't coalesce both start and end with , because start and
> end conflict, but by chance we try start first, and that succeeds.  If
> we tried end first (e.g., by giving it a higher coalesce priority,
> because fewer calls are crossed by its value in the path to the relevant
> edge), we could have got the coalesced variable assigned to r0, and that
> would enable us to optimize out the copy to r0 before return, and so
> merge the return-only basic block with other blocks.  But ATM we don't
> take the definition point or path to the edge into account when
> computing coalesce costs, so we can't deterministically do better for
> this testcase, and I'm not sure using these additional information would
> make it better overall.
>
> Compiling with -fno-tree-dominator-opts skips uncprop so that we don't
> even try to coalesce other variables with , so we get the code
> expected by the testcase.  But we obviously don't want to disable this
> optimization in general.
>
> Any other thoughts, anyone?

Bad luck?  Add some heuristics that always help?

Ok, that wasn't really useful :/

Richard.

>
> --
> Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
> You must be the change you wish to see in the world. -- Gandhi
> Be Free! -- http://FSFLA.org/   FSF Latin America board member
> Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[PR67766] reorder return value copying from PARALLELs and CONCATs (was: Re: [PR64164] drop copyrename, integrate into expand)

2015-10-08 Thread Alexandre Oliva
This fixes fallout from the PR64164 expander revamp.  On alpha, PARALLEL
hard return values may be modeless, and this confuses the code that
wants to copy the pseudo/s in the returned value to the return hard
regs.

It used to work because PARALLELs and CONCATs used to lead to DECL_RTL
with the same mode, but now we try harder to create a pseudo or MEM with
a reasonable mode.

The solution was as simple as moving down the code that handled mode
differences, so that PARALLELs and CONCATs are handled as they should.
Since AFAICT they don't ever have to deal with mode promotion anyway, we
should be fine with this simple change, that Uroš kindly tested with an
alpha-linux-gnu regstrap.  I tested it myself on x86_64-linux-gnu and
i686-linux-gnu.

Ok to install?


[PR67766] reorder handling of parallels, concats and promoted values in return

From: Alexandre Oliva 

for  gcc/ChangeLog

PR middle-end/67766
* function.c (expand_function_end): Move return value
promotion past the handling of PARALLELs and CONCATs.
---
 gcc/function.c |   24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index e76ba2b..d16d6d8 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5446,18 +5446,6 @@ expand_function_end (void)
  decl_rtl);
  shift_return_value (GET_MODE (decl_rtl), true, real_decl_rtl);
}
- /* If a named return value dumped decl_return to memory, then
-we may need to re-do the PROMOTE_MODE signed/unsigned
-extension.  */
- else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
-   {
- int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
- promote_function_mode (TREE_TYPE (decl_result),
-GET_MODE (decl_rtl), ,
-TREE_TYPE (current_function_decl), 1);
-
- convert_move (real_decl_rtl, decl_rtl, unsignedp);
-   }
  else if (GET_CODE (real_decl_rtl) == PARALLEL)
{
  /* If expand_function_start has created a PARALLEL for decl_rtl,
@@ -5488,6 +5476,18 @@ expand_function_end (void)
  emit_move_insn (tmp, decl_rtl);
  emit_move_insn (real_decl_rtl, tmp);
}
+ /* If a named return value dumped decl_return to memory, then
+we may need to re-do the PROMOTE_MODE signed/unsigned
+extension.  */
+ else if (GET_MODE (real_decl_rtl) != GET_MODE (decl_rtl))
+   {
+ int unsignedp = TYPE_UNSIGNED (TREE_TYPE (decl_result));
+ promote_function_mode (TREE_TYPE (decl_result),
+GET_MODE (decl_rtl), ,
+TREE_TYPE (current_function_decl), 1);
+
+ convert_move (real_decl_rtl, decl_rtl, unsignedp);
+   }
  else
emit_move_insn (real_decl_rtl, decl_rtl);
}


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


[PR67828] don't unswitch loops on undefined SSA values (was: Re: [PR64164] drop copyrename, integrate into expand)

2015-10-08 Thread Alexandre Oliva
This patch fixes a latent bug in loop unswitching exposed by the PR64164
changes.

We would move a test out of a loop that might never have been executed,
and that accessed an uninitialized variable.  The uninitialized SSA
name, due to uncprop, now gets coalescesd with other SSA names,
expanding the ill effects of the undefined behavior we introduce: in
spite of the zero initialization introduced in later rtl stages for the
uninitialized pseudo, by then we've already expanded a PHI node that
referenced the unitialized variable in the path coming from a path in
which it would necessarily be zero, to a copy from the coalesced pseudo,
that gets modified between the zero-initialization and the copy, so the
copied zero is no longer zero.  Oops.

We might want to be stricter in coalesce conflict detection to avoid
this sort of problem, and perhaps to avoid undefined values in uncprop,
but this would all be attempting to limit the effects of undefined
behavior, which is probably a waste of effort.  As long as we avoid
introducing undefined behavior ourselves, we shouldn't have to do any of
that.  So, this patch fixes loop unswitching so as to not introduce
undefined behavior.

Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?


[PR67828] don't unswitch on default defs of non-parms

From: Alexandre Oliva 

for  gcc/ChangeLog

PR rtl-optimizatoin/67828
* tree-ssa-loop-unswitch.c: Include tree-ssa.h.
(tree_may_unswitch_on): Don't unswitch on expressions
involving undefined values.

for  gcc/testsuite/ChangeLog

PR rtl-optimization/67828
* gcc.dg/torture/pr67828.c: New.
---
 gcc/testsuite/gcc.dg/torture/pr67828.c |   43 
 gcc/tree-ssa-loop-unswitch.c   |5 
 2 files changed, 48 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr67828.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr67828.c 
b/gcc/testsuite/gcc.dg/torture/pr67828.c
new file mode 100644
index 000..c7b6965
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr67828.c
@@ -0,0 +1,43 @@
+/* Check that we don't misoptimize the final value of d.  We used to
+   apply loop unswitching on if(j), introducing undefined behavior
+   that the original code wouldn't exercise, and this undefined
+   behavior would get later passes to misoptimize the loop.  */
+
+/* { dg-do run } */
+
+#include 
+#include 
+
+int x;
+
+int __attribute__ ((noinline, noclone))
+xprintf (int d) {
+  if (d)
+{
+  if (x)
+   printf ("%d", d);
+  abort ();
+}
+}
+
+int a, b;
+short c;
+
+int
+main ()
+{
+  int j, d = 1;
+  for (; c >= 0; c++)
+{
+  a = d;
+  d = 0;
+  if (b)
+   {
+ xprintf (0);
+ if (j)
+   xprintf (0);
+   }
+}
+  xprintf (d);
+  exit (0);
+}
diff --git a/gcc/tree-ssa-loop-unswitch.c b/gcc/tree-ssa-loop-unswitch.c
index 4328d6a..d6faa37 100644
--- a/gcc/tree-ssa-loop-unswitch.c
+++ b/gcc/tree-ssa-loop-unswitch.c
@@ -32,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "gimplify.h"
 #include "tree-cfg.h"
+#include "tree-ssa.h"
 #include "tree-ssa-loop-niter.h"
 #include "tree-ssa-loop.h"
 #include "tree-into-ssa.h"
@@ -139,6 +140,10 @@ tree_may_unswitch_on (basic_block bb, struct loop *loop)
   /* Condition must be invariant.  */
   FOR_EACH_SSA_TREE_OPERAND (use, stmt, iter, SSA_OP_USE)
 {
+  /* Unswitching on undefined values would introduce undefined
+behavior that the original program might never exercise.  */
+  if (ssa_undefined_value_p (use, true))
+   return NULL_TREE;
   def = SSA_NAME_DEF_STMT (use);
   def_bb = gimple_bb (def);
   if (def_bb


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-10-07 Thread Alexandre Oliva
On Sep 29, 2015, Szabolcs Nagy  wrote:

> this commit

> commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
> Author: aoliva 
> Date:   2015-09-27 09:02:00 +

> revert to assign_parms assignments using default defs
> ...

> introduced a test failure on arm-none-eabi (using newlib, compiling
> with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard ):

> FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2

Thanks for the report.

The problem here is that we don't allocate the pseudo assigned to
 to r0.  That's because we coalesce  versions with
another variable that crosses a function call.  We do that because
uncprop brings these unrelated variables, that happen to contain the
same -1 value we want to return, into the PHI node with the final
 value.

We can't coalesce both start and end with , because start and
end conflict, but by chance we try start first, and that succeeds.  If
we tried end first (e.g., by giving it a higher coalesce priority,
because fewer calls are crossed by its value in the path to the relevant
edge), we could have got the coalesced variable assigned to r0, and that
would enable us to optimize out the copy to r0 before return, and so
merge the return-only basic block with other blocks.  But ATM we don't
take the definition point or path to the edge into account when
computing coalesce costs, so we can't deterministically do better for
this testcase, and I'm not sure using these additional information would
make it better overall.

Compiling with -fno-tree-dominator-opts skips uncprop so that we don't
even try to coalesce other variables with , so we get the code
expected by the testcase.  But we obviously don't want to disable this
optimization in general.

Any other thoughts, anyone?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-09-29 Thread Szabolcs Nagy

On 23/09/15 21:07, Alexandre Oliva wrote:

On Sep 18, 2015, Alan Lawrence  wrote:


With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
branch aoliva/pr64164, I am now able to build a cross toolchain for
aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
the branch.




this commit

commit 33cc9081157a8c90460e4c0bdda2ac461a3822cc
Author: aoliva 
Date:   2015-09-27 09:02:00 +

revert to assign_parms assignments using default defs
...

introduced a test failure on arm-none-eabi (using newlib, compiling
with -mthumb -march=armv8-a -mfpu=crypto-neon-fp-armv8 -mfloat-abi=hard ):

FAIL: gcc.target/arm/pr43920-2.c scan-assembler-times pop 2

spawn arm-none-eabi-size pr43920-2.o
   textdata bss dec hex filename
 56   0   0  56  38 pr43920-2.o
text size is 56
FAIL: gcc.target/arm/pr43920-2.c object-size text <= 54

(i haven't looked into the failure, attached asm output before and after).


Thanks for the confirmation.  I've made one further tweak for cris and
lm32, dropping the assert that caused build failures for libstdc++
atomics parms that required more alignment than
MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
with a more recent baseline (r228019), with native regstraps on
x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
powerpc64le-linux-gnu, and cross toolchain builds for the following 73
platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
v850-elf visium-elf xstormy16-elf xtensa-elf.  Not all of them succeeded
in building, but those that didn't failed at the very same spots before
and after this patch.


This patch doesn't really add much functionality.  It rather
reimplements a lot of the ugly and fragile stuff I put in in the
previous big patchset in a far more robust and pleasant way.  It fixes a
number of regressions in the process, mainly because, instead of
modifying assign_parms so as to let cfgexpand do part of its job, it
reverts all of the RTL assignment for parameters and results to
assign_parms.  cfgexpand now leaves the RTL assignment of partitions
containing default defs or parms and results to assign_parms, and
assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
assignment for the partition containing the default def of each
parameter.

This required introducing default defs for all parms and results, even
if unused; we could refrain from creating them, and refrain from
initializing those parameters (at least when optimizing), but that would
require messing with the fragile bits in assign_parms again, and it
would bring little benefit, since RTL optimization will likely notice
the initialization is unused and drop it anyway.  Besides, adding the
default defs was actually needed to fix a regression in the previous
patch, and even with the current patch it helps make sure we don't
assign more than one default def to the same SSA partition (the previous
patch attempted to do that, but there was a bug, fixed in the current
patch).  Having unused default defs makes it easier for us to decide
whether to use an entry_value rtx for the initial debug insn of a parm.
We track partitions holding default defs for parms and results with a
bitmap; we used to have a bitmap that tracked partitions holding default
defs, but it was unused!  I just renamed it and repurposed it.

I've also added checking asserts to set_rtl, to verify that, when we
expect a REG, we get a REG, and that it has the expected mode.  set_rtl
was also adjusted to record anonymous SSA names or their base types in
attrs of REGs or MEMs, respectively, so that code that relied on the
attrs to detect properties of the decl types no longer regress just
because we no longer generate decls for anonymous SSA names.  Since
there were prior uses of types in MEM attrs, that was expected to go
smoothly, but I was surprised at how smoothly adding SSA names to REG
attrs went.  No adjustments required!

I also tightened a bit the conditions for coalescing: we used to require
the same canonical type; I've added tests for same alignment
requirements, and for same 

Re: [PR64164] drop copyrename, integrate into expand

2015-09-25 Thread Richard Biener
On Wed, Sep 23, 2015 at 10:07 PM, Alexandre Oliva  wrote:
> On Sep 18, 2015, Alan Lawrence  wrote:
>
>> With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
>> branch aoliva/pr64164, I am now able to build a cross toolchain for
>> aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
>> the branch.
>
> Thanks for the confirmation.  I've made one further tweak for cris and
> lm32, dropping the assert that caused build failures for libstdc++
> atomics parms that required more alignment than
> MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
> with a more recent baseline (r228019), with native regstraps on
> x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
> powerpc64le-linux-gnu, and cross toolchain builds for the following 73
> platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
> arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
> epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
> ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
> mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
> mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
> mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
> mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
> nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
> powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
> powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
> ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
> sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
> v850-elf visium-elf xstormy16-elf xtensa-elf.  Not all of them succeeded
> in building, but those that didn't failed at the very same spots before
> and after this patch.
>
>
> This patch doesn't really add much functionality.  It rather
> reimplements a lot of the ugly and fragile stuff I put in in the
> previous big patchset in a far more robust and pleasant way.  It fixes a
> number of regressions in the process, mainly because, instead of
> modifying assign_parms so as to let cfgexpand do part of its job, it
> reverts all of the RTL assignment for parameters and results to
> assign_parms.  cfgexpand now leaves the RTL assignment of partitions
> containing default defs or parms and results to assign_parms, and
> assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
> assignment for the partition containing the default def of each
> parameter.
>
> This required introducing default defs for all parms and results, even
> if unused; we could refrain from creating them, and refrain from
> initializing those parameters (at least when optimizing), but that would
> require messing with the fragile bits in assign_parms again, and it
> would bring little benefit, since RTL optimization will likely notice
> the initialization is unused and drop it anyway.  Besides, adding the
> default defs was actually needed to fix a regression in the previous
> patch, and even with the current patch it helps make sure we don't
> assign more than one default def to the same SSA partition (the previous
> patch attempted to do that, but there was a bug, fixed in the current
> patch).  Having unused default defs makes it easier for us to decide
> whether to use an entry_value rtx for the initial debug insn of a parm.
> We track partitions holding default defs for parms and results with a
> bitmap; we used to have a bitmap that tracked partitions holding default
> defs, but it was unused!  I just renamed it and repurposed it.
>
> I've also added checking asserts to set_rtl, to verify that, when we
> expect a REG, we get a REG, and that it has the expected mode.  set_rtl
> was also adjusted to record anonymous SSA names or their base types in
> attrs of REGs or MEMs, respectively, so that code that relied on the
> attrs to detect properties of the decl types no longer regress just
> because we no longer generate decls for anonymous SSA names.  Since
> there were prior uses of types in MEM attrs, that was expected to go
> smoothly, but I was surprised at how smoothly adding SSA names to REG
> attrs went.  No adjustments required!
>
> I also tightened a bit the conditions for coalescing: we used to require
> the same canonical type; I've added tests for same alignment
> requirements, and for same signedness.  OTOH, I've added a few more
> coalesce candidates for RESULT_DECLs and the newly-added default defs of
> parms and results.
>
> Other relevant changes were in mode promotion.  TYPE_MODE would often
> return BLKmode for some vector types, which was fine for some return
> decl RTL with PARALLEL, but that didn't quite work for SSA partitions.
> There were other cases of mode promotion of result decls that failed the
> asserts in set_rtl, that revealed promote_decl_mode didn't call
> promote_function_mode as expected for results.
>
> 

Re: [PR64164] drop copyrename, integrate into expand

2015-09-23 Thread Alexandre Oliva
On Sep 18, 2015, Alan Lawrence  wrote:

> With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on
> branch aoliva/pr64164, I am now able to build a cross toolchain for
> aarch64 and aarch64_be, and can confirm the ABI failure is fixed on
> the branch.

Thanks for the confirmation.  I've made one further tweak for cris and
lm32, dropping the assert that caused build failures for libstdc++
atomics parms that required more alignment than
MAX_SUPPORTED_STACK_ALIGNMENT, consolidated the patchset and retested it
with a more recent baseline (r228019), with native regstraps on
x86_64-linux-gnu, i686-linux-gnu, powerpc64-linux-gnu,
powerpc64le-linux-gnu, and cross toolchain builds for the following 73
platforms: aarch64_be-elf aarch64-elf arm-eabi armeb-eabihf
arm-symbianelf avr-elf bfin-elf c6x-elf cr16-elf cris-elf crisv32-elf
epiphany-elf fido-elf fr30-elf frv-elf ft32-elf h8300-elf i686-elf
ia64-elf iq2000-elf lm32-elf m32c-elf m32r-elf m32rle-elf m68k-elf
mcore-elf mep-elf microblaze-elf mips64el-elf mips64-elf mips64orion-elf
mips64vr-elf mipsel-elf mipsisa32-elfoabi mipsisa64-elfoabi
mipsisa64r2el-elf mipsisa64r2-sde-elf mipsisa64sb1-elf
mipsisa64sr71k-elf mipstx39-elf mn10300-elf moxie-elf msp430-elf
nds32be-elf nds32le-elf nios2-elf pdp11-aout powerpc-eabialtivec
powerpc-eabi powerpc-eabisimaltivec powerpc-eabisim powerpc-eabispe
powerpcle-eabi powerpcle-eabisim powerpcle-elf powerpc-xilinx-eabi
ppc64-eabi ppc-eabi ppc-elf rl78-elf rx-elf sh64-elf sh-elf
sh-superh-elf sparc64-elf sparc-elf sparc-leon-elf spu-elf v850e-elf
v850-elf visium-elf xstormy16-elf xtensa-elf.  Not all of them succeeded
in building, but those that didn't failed at the very same spots before
and after this patch.


This patch doesn't really add much functionality.  It rather
reimplements a lot of the ugly and fragile stuff I put in in the
previous big patchset in a far more robust and pleasant way.  It fixes a
number of regressions in the process, mainly because, instead of
modifying assign_parms so as to let cfgexpand do part of its job, it
reverts all of the RTL assignment for parameters and results to
assign_parms.  cfgexpand now leaves the RTL assignment of partitions
containing default defs or parms and results to assign_parms, and
assign_parms uses a single callback, set_parm_rtl, to tell cfgexpand the
assignment for the partition containing the default def of each
parameter.

This required introducing default defs for all parms and results, even
if unused; we could refrain from creating them, and refrain from
initializing those parameters (at least when optimizing), but that would
require messing with the fragile bits in assign_parms again, and it
would bring little benefit, since RTL optimization will likely notice
the initialization is unused and drop it anyway.  Besides, adding the
default defs was actually needed to fix a regression in the previous
patch, and even with the current patch it helps make sure we don't
assign more than one default def to the same SSA partition (the previous
patch attempted to do that, but there was a bug, fixed in the current
patch).  Having unused default defs makes it easier for us to decide
whether to use an entry_value rtx for the initial debug insn of a parm.
We track partitions holding default defs for parms and results with a
bitmap; we used to have a bitmap that tracked partitions holding default
defs, but it was unused!  I just renamed it and repurposed it.

I've also added checking asserts to set_rtl, to verify that, when we
expect a REG, we get a REG, and that it has the expected mode.  set_rtl
was also adjusted to record anonymous SSA names or their base types in
attrs of REGs or MEMs, respectively, so that code that relied on the
attrs to detect properties of the decl types no longer regress just
because we no longer generate decls for anonymous SSA names.  Since
there were prior uses of types in MEM attrs, that was expected to go
smoothly, but I was surprised at how smoothly adding SSA names to REG
attrs went.  No adjustments required!

I also tightened a bit the conditions for coalescing: we used to require
the same canonical type; I've added tests for same alignment
requirements, and for same signedness.  OTOH, I've added a few more
coalesce candidates for RESULT_DECLs and the newly-added default defs of
parms and results.

Other relevant changes were in mode promotion.  TYPE_MODE would often
return BLKmode for some vector types, which was fine for some return
decl RTL with PARALLEL, but that didn't quite work for SSA partitions.
There were other cases of mode promotion of result decls that failed the
asserts in set_rtl, that revealed promote_decl_mode didn't call
promote_function_mode as expected for results.

The new assers brought additional requirements: promoting the mode of
the RTL generated for the static chain, arranging for result decls to be
assigned to a pseudo where it would formerly have got a BLKmode PARALLEL
(as mentioned 

Re: [PR64164] drop copyrename, integrate into expand

2015-09-18 Thread Alan Lawrence

On 02/09/15 23:12, Alexandre Oliva wrote:

On Sep  2, 2015, Alan Lawrence  wrote:


One more failure to report, I'm afraid. On AArch64 Bigendian,
aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from
r227348):


Thanks.  The failure mode was different in the current, revamped git
branch aoliva/pr64164, but I've just fixed it there.

I'm almost ready to post a new patch, with a new, simpler, less fragile
and more maintainable approach to integrate cfgexpand and assign_parms'
RTL assignment, so if you could give it a spin on big and little endian
aarch64 natives, that would be very much appreciated!



On trunk, aarch64_be is still ICEing in gcc.target/aarch64/aapcs64/func-ret-4.c 
(complex numbers).


With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on branch 
aoliva/pr64164, I am now able to build a cross toolchain for aarch64 and 
aarch64_be, and can confirm the ABI failure is fixed on the branch.


HTH,
Alan



Re: [PR64164] drop copyrename, integrate into expand

2015-09-03 Thread Alan Lawrence

On 02/09/15 23:12, Alexandre Oliva wrote:

On Sep  2, 2015, Alan Lawrence  wrote:


One more failure to report, I'm afraid. On AArch64 Bigendian,
aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from
r227348):


Thanks.  The failure mode was different in the current, revamped git
branch aoliva/pr64164, but I've just fixed it there.

I'm almost ready to post a new patch, with a new, simpler, less fragile
and more maintainable approach to integrate cfgexpand and assign_parms'
RTL assignment, so if you could give it a spin on big and little endian
aarch64 natives, that would be very much appreciated!



On aarch64_be, that branch fixes the ICE - but func-ret-4.c fails on execution, 
and now func-ret-3.c does too! Also it causes a bunch of errors building newlib 
using cross-built binutils, which I haven't tracked down yet:


/work/alalaw01/src2/binutils-gdb/newlib/libc/locale/locale.c: In function 
'__get_locale_env':
/work/alalaw01/src2/binutils-gdb/newlib/libc/locale/locale.c:911:1: internal 
compiler error: in insert_value_copy_on_edge, at tree-outof-ssa.c:308

 __get_locale_env(struct _reent *p, int category)
 ^
0xb4ecc4 insert_value_copy_on_edge
/work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:307
0xb4ecc4 eliminate_phi
/work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:780
0xb4ecc4 expand_phi_nodes(ssaexpand*)
/work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:943
0x6e74a6 execute
/work/alalaw01/src2/gcc/gcc/cfgexpand.c:6242
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make[7]: *** [lib_a-locale.o] Error 1

--Alan



Re: [PR64164] drop copyrename, integrate into expand

2015-09-02 Thread Alan Lawrence

On 14/08/15 19:57, Alexandre Oliva wrote:


I'm glad it appears to be working to everyone's
satisfaction now.  I've just committed it as r226901, with only a
context adjustment to account for a change in use_register_for_decl in
function.c.  /me crosses fingers :-)

Here's the patch as checked in:


One more failure to report, I'm afraid. On AArch64 Bigendian, 
aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348):


In file included from /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aa
pcs64/func-ret-4.c:14:0:
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c: In
 function 'func_return_val_10':
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:12:2
4: internal compiler error: in simplify_subreg, at simplify-rtx.c:5808
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:13:4
0: note: in definition of macro 'FUNC_NAME_COMBINE'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:2
7: note: in expansion of macro 'FUNC_NAME_1'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:3
9: note: in expansion of macro 'FUNC_BASE_NAME'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:69:3
3: note: in expansion of macro 'FUNC_NAME'
/work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c:23:
1: note: in expansion of macro 'FUNC_VAL_CHECK'
0xa7ba44 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
/work/alalaw01/src/gcc/gcc/simplify-rtx.c:5808
0xa7c4ef simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
/work/alalaw01/src/gcc/gcc/simplify-rtx.c:6031
0x7ad097 operand_subword(rtx_def*, unsigned int, int, machine_mode)
/work/alalaw01/src/gcc/gcc/emit-rtl.c:1611
0x7def4e move_block_from_reg(int, rtx_def*, int)
/work/alalaw01/src/gcc/gcc/expr.c:1536
0x83a494 assign_parm_setup_block
/work/alalaw01/src/gcc/gcc/function.c:3117
0x841a43 assign_parms
/work/alalaw01/src/gcc/gcc/function.c:3857
0x842ffa expand_function_start(tree_node*)
/work/alalaw01/src/gcc/gcc/function.c:5286
0x6e7496 execute
/work/alalaw01/src/gcc/gcc/cfgexpand.c:6203
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c compilation,  -O1  (internal 
compiler error)


Also at -O2, -O3 -g, -Og -g, -Os. -O0 is OK.

simplify_subreg is called with outermode=DImode, op=

(concat:CHI (reg:HI 76 [ t ])
(reg:HI 77 [ t+2 ]))

innermode = BLKmode (which violates the assertion), byte=0.

move_block_from_reg (in expr.c) calls operand_subword(x, i, 1, BLKmode), here 
i=0 and x is the concat:CHI above, and operand_subword doesn't handle that case 
(well, it passes it onto simplify_subreg).


In assign_parm_setup_block, I see 'mem = validize_mem (copy_rtx (stack_parm))' 
where stack_parm is again the same concat:CHI.


This should be easily reproducible with a stage 1 compiler 
(aarch64_be-none-elf).

--Alan



Re: [PR64164] drop copyrename, integrate into expand

2015-09-02 Thread Alexandre Oliva
On Sep  2, 2015, Alan Lawrence  wrote:

> One more failure to report, I'm afraid. On AArch64 Bigendian,
> aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from
> r227348):

Thanks.  The failure mode was different in the current, revamped git
branch aoliva/pr64164, but I've just fixed it there.

I'm almost ready to post a new patch, with a new, simpler, less fragile
and more maintainable approach to integrate cfgexpand and assign_parms'
RTL assignment, so if you could give it a spin on big and little endian
aarch64 natives, that would be very much appreciated!

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-21 Thread Andreas Schwab
Alexandre Oliva aol...@redhat.com writes:

   PR rtl-optimization/64164
   PR rtl-optimization/67227
   * alias.c (memrefs_conflict_p): Handle VALUEs in PLUS better.
   (nonoverlapping_memrefs_p): Test offsets and sizes when given
   identical gimple_reg exprs.

I can confirm that this fixes the bootstrap.

Thanks, Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [PR64164] drop copyrename, integrate into expand

2015-08-21 Thread Richard Biener
On Fri, Aug 21, 2015 at 9:57 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:

 I have verified in the expand dumps that both the gimple and the rtl
 representation in the relevant parts of the code are identical, except
 for the presence of debug stmts and insns.

 While comparing the dumps, I noticed -fdump-unnumbered-links no longer
 worked like it did back when I introduced it, with the very purpose of
 making it easier to compare dumps with and without debug insns.

 When the insn_uid was moved out of the u[] array, the indices that
 print-rtl tested to tell whether to omit the ids of the prev and next
 insns got off by one.

 This patch updates the test to match the current indices.

 Bootstrapping on ia64-linux-gnu.  Ok to install?

Ok.

Thanks,
Richard.

 fix -fdump-unnumbered-links

 From: Alexandre Oliva aol...@redhat.com

 for  gcc/ChangeLog

 * print-rtl.c (print_rtx): Check the correct range for
 flag_dump_unnumbered_links to behave as documented.
 ---
  gcc/print-rtl.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
 index aacadbb..b541d83 100644
 --- a/gcc/print-rtl.c
 +++ b/gcc/print-rtl.c
 @@ -550,7 +550,7 @@ print_rtx (const_rtx in_rtx)
   }

 if (flag_dump_unnumbered
 -   || (flag_dump_unnumbered_links  (i == 1 || i == 2)
 +   || (flag_dump_unnumbered_links  i = 1
  (INSN_P (in_rtx) || NOTE_P (in_rtx)
 || LABEL_P (in_rtx) || BARRIER_P (in_rtx
   fputs ( #, outfile);


 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-21 Thread Alexandre Oliva
On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:

 I have verified in the expand dumps that both the gimple and the rtl
 representation in the relevant parts of the code are identical, except
 for the presence of debug stmts and insns.

While comparing the dumps, I noticed -fdump-unnumbered-links no longer
worked like it did back when I introduced it, with the very purpose of
making it easier to compare dumps with and without debug insns.

When the insn_uid was moved out of the u[] array, the indices that
print-rtl tested to tell whether to omit the ids of the prev and next
insns got off by one.

This patch updates the test to match the current indices.

Bootstrapping on ia64-linux-gnu.  Ok to install?

fix -fdump-unnumbered-links

From: Alexandre Oliva aol...@redhat.com

for  gcc/ChangeLog

* print-rtl.c (print_rtx): Check the correct range for
flag_dump_unnumbered_links to behave as documented.
---
 gcc/print-rtl.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/print-rtl.c b/gcc/print-rtl.c
index aacadbb..b541d83 100644
--- a/gcc/print-rtl.c
+++ b/gcc/print-rtl.c
@@ -550,7 +550,7 @@ print_rtx (const_rtx in_rtx)
  }
 
if (flag_dump_unnumbered
-   || (flag_dump_unnumbered_links  (i == 1 || i == 2)
+   || (flag_dump_unnumbered_links  i = 1
 (INSN_P (in_rtx) || NOTE_P (in_rtx)
|| LABEL_P (in_rtx) || BARRIER_P (in_rtx
  fputs ( #, outfile);


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-21 Thread Richard Biener
On Fri, Aug 21, 2015 at 9:46 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:
 I'm having some difficulty getting access to an ia64 box ATM, and for
 ada bootstraps, a cross won't do, so...  if you still have that build
 tree around, any chance you could recompile par.o with both stage1 and
 stage2, with -fdump-rtl-expand-details, and email me the compiler dump
 files?

 Thanks!

 In the mean time, I have been able to duplicate the problem myself.  As
 you say, it is triggered by -gtoggle.  However, it has nothing
 whatsoever to do with the recent patches I installed.  At most they
 expose some latent problem in the scheduler.

 The above was more likely wrong than right.  There may have been a
 latent problem in the scheduler indeed, but the patch actually made it
 worse, or even introduced it.

 The scheduler relies on alias analysis to tell whether a given pair of
 insns that read or modify memory should have a dependence set between
 them.  It looks both at the RTL proper (including cselib values) and at
 the MEM attrs.

 The problem was that, for ada/par.o, we computed different dependencies,
 and thus different sched priorities, for a pair of insns.  Specifically,
 one wrote to a stack spill slot, and another read from a neighbor spill
 slot.  Both had MEM_EXPRs pointing at a %sfp decl, and different
 offsets.  In the stage3 (-g) compilation, there were debug insns between
 them.

 They caused additional equivalent expressions to be added to some
 values, which in turn caused memrefs_conflict_p to return different
 values.

 In the stage2 compilation, both VALUEs resolved to PLUSes of two VALUEs,
 the first of each resolved to a constant, while the latter of each
 resolved to %sfp.  When the second operands of PLUSes match, we recurse
 and compare the first operands, resolving both to CONST_INTs and, in
 this case, concluding that there's no possible overlap.

 In the stage3 compilation, one VALUE resolved to a PLUS of a VALUE and a
 CONST_INT, whereas the other resolved to a PLUS of two VALUEs.  Without
 canonicalization of VALUE order in PLUSes, it just so happened that the
 VALUE that appeared as the second operand in the second PLUS was moved
 to the first operand in the first PLUS, and so memrefs_conflict_p
 couldn't tell whether or not there was an overlap.

 Before the initial pr64164 patch, we had another chance to detect the
 non-overlap analyzing the MEM attrs in nonoverlapping_memrefs_p: given
 the same MEM_EXPR, but different offsets, we used to conclude there was
 no overlap, so this got true_dependence to return the same value in both
 compilations.

 The pr64164 patch introduced an early exit from nonoverlapping_memrefs_p
 when either operand is a gimple_reg, because some of these wouldn't have
 a DECL_RTL set, and creating RTL for them at such points would not be
 appropriate.  The problem is that the early exit would only return false
 if the exprs were different.  If they were the same, we'd conclude an
 overlap was possible, even if offsets were enough to tell otherwise.

 My thought back then was that such exprs were not addressable anyway, so
 we'd always access the entire object, so offsets couldn't possible be
 different.  Right?  Well, no!  Think spill slots: %sfp (a gimple_reg
 decl) + constant offset!  Same base gimple_reg, non-overlapping memory
 addresses!


 This patch improves memrefs_conflict_p so as to handle more combinations
 of VALUEs in PLUSes: if both incoming addresses are PLUSes, check one
 operand of one against the other operand of the other; if one address is
 a PLUS and the other isn't, test the other against both operands of the
 PLUS.  This causes memrefs_conflict_p to return consistent results for
 that given pair of insns in both stage2 and stage3 compilation.

 Additionally, it fixes the regression in nonoverlapping_memrefs_p,
 adding code to check for non-overlapping offsets when the base expr is
 the same (as long as offsets and sizes are known for both MEMs).

 Either one would suffice to fix this particular case.  The latter would
 fix the regression proper, but the former is sufficiently lightweight
 (since comparing pointers is enough) that it's probably worth adding to
 get more accurate and consistent results earlier.

 I'm bootstrapping this on ia64-linux-gnu.  Ok to install?

Looks ok to me.

Thanks,
Richard.


 fix sched compare regression

 From: Alexandre Oliva aol...@redhat.com

 for  gcc/ChangeLog

 PR rtl-optimization/64164
 PR rtl-optimization/67227
 * alias.c (memrefs_conflict_p): Handle VALUEs in PLUS better.
 (nonoverlapping_memrefs_p): Test offsets and sizes when given
 identical gimple_reg exprs.
 ---
  gcc/alias.c |   23 ++-
  1 file changed, 22 insertions(+), 1 deletion(-)

 diff --git a/gcc/alias.c b/gcc/alias.c
 index 4681e3f..f12d9d1 100644
 --- 

Re: [PR64164] drop copyrename, integrate into expand

2015-08-21 Thread Alexandre Oliva
On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:
 I'm having some difficulty getting access to an ia64 box ATM, and for
 ada bootstraps, a cross won't do, so...  if you still have that build
 tree around, any chance you could recompile par.o with both stage1 and
 stage2, with -fdump-rtl-expand-details, and email me the compiler dump
 files?

 Thanks!

 In the mean time, I have been able to duplicate the problem myself.  As
 you say, it is triggered by -gtoggle.  However, it has nothing
 whatsoever to do with the recent patches I installed.  At most they
 expose some latent problem in the scheduler.

The above was more likely wrong than right.  There may have been a
latent problem in the scheduler indeed, but the patch actually made it
worse, or even introduced it.

The scheduler relies on alias analysis to tell whether a given pair of
insns that read or modify memory should have a dependence set between
them.  It looks both at the RTL proper (including cselib values) and at
the MEM attrs.

The problem was that, for ada/par.o, we computed different dependencies,
and thus different sched priorities, for a pair of insns.  Specifically,
one wrote to a stack spill slot, and another read from a neighbor spill
slot.  Both had MEM_EXPRs pointing at a %sfp decl, and different
offsets.  In the stage3 (-g) compilation, there were debug insns between
them.

They caused additional equivalent expressions to be added to some
values, which in turn caused memrefs_conflict_p to return different
values.

In the stage2 compilation, both VALUEs resolved to PLUSes of two VALUEs,
the first of each resolved to a constant, while the latter of each
resolved to %sfp.  When the second operands of PLUSes match, we recurse
and compare the first operands, resolving both to CONST_INTs and, in
this case, concluding that there's no possible overlap.

In the stage3 compilation, one VALUE resolved to a PLUS of a VALUE and a
CONST_INT, whereas the other resolved to a PLUS of two VALUEs.  Without
canonicalization of VALUE order in PLUSes, it just so happened that the
VALUE that appeared as the second operand in the second PLUS was moved
to the first operand in the first PLUS, and so memrefs_conflict_p
couldn't tell whether or not there was an overlap.

Before the initial pr64164 patch, we had another chance to detect the
non-overlap analyzing the MEM attrs in nonoverlapping_memrefs_p: given
the same MEM_EXPR, but different offsets, we used to conclude there was
no overlap, so this got true_dependence to return the same value in both
compilations.

The pr64164 patch introduced an early exit from nonoverlapping_memrefs_p
when either operand is a gimple_reg, because some of these wouldn't have
a DECL_RTL set, and creating RTL for them at such points would not be
appropriate.  The problem is that the early exit would only return false
if the exprs were different.  If they were the same, we'd conclude an
overlap was possible, even if offsets were enough to tell otherwise.

My thought back then was that such exprs were not addressable anyway, so
we'd always access the entire object, so offsets couldn't possible be
different.  Right?  Well, no!  Think spill slots: %sfp (a gimple_reg
decl) + constant offset!  Same base gimple_reg, non-overlapping memory
addresses!


This patch improves memrefs_conflict_p so as to handle more combinations
of VALUEs in PLUSes: if both incoming addresses are PLUSes, check one
operand of one against the other operand of the other; if one address is
a PLUS and the other isn't, test the other against both operands of the
PLUS.  This causes memrefs_conflict_p to return consistent results for
that given pair of insns in both stage2 and stage3 compilation.

Additionally, it fixes the regression in nonoverlapping_memrefs_p,
adding code to check for non-overlapping offsets when the base expr is
the same (as long as offsets and sizes are known for both MEMs).

Either one would suffice to fix this particular case.  The latter would
fix the regression proper, but the former is sufficiently lightweight
(since comparing pointers is enough) that it's probably worth adding to
get more accurate and consistent results earlier.

I'm bootstrapping this on ia64-linux-gnu.  Ok to install?


fix sched compare regression

From: Alexandre Oliva aol...@redhat.com

for  gcc/ChangeLog

PR rtl-optimization/64164
PR rtl-optimization/67227
* alias.c (memrefs_conflict_p): Handle VALUEs in PLUS better.
(nonoverlapping_memrefs_p): Test offsets and sizes when given
identical gimple_reg exprs.
---
 gcc/alias.c |   23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/gcc/alias.c b/gcc/alias.c
index 4681e3f..f12d9d1 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2228,6 +2228,13 @@ memrefs_conflict_p (int xsize, rtx x, int ysize, rtx y, 
HOST_WIDE_INT c)
   rtx x0 = XEXP (x, 0);
   rtx x1 = XEXP (x, 1);
 

Re: [PR64164] drop copyrename, integrate into expand

2015-08-20 Thread Jeff Law

On 08/19/2015 06:00 PM, Alexandre Oliva wrote:

On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:


I'm having some difficulty getting access to an ia64 box ATM, and for
ada bootstraps, a cross won't do, so...  if you still have that build
tree around, any chance you could recompile par.o with both stage1 and
stage2, with -fdump-rtl-expand-details, and email me the compiler dump
files?


Thanks!

In the mean time, I have been able to duplicate the problem myself.  As
you say, it is triggered by -gtoggle.  However, it has nothing
whatsoever to do with the recent patches I installed.  At most they
expose some latent problem in the scheduler.

I have verified in the expand dumps that both the gimple and the rtl
representation in the relevant parts of the code are identical, except
for the presence of debug stmts and insns.  Indeed, compiling with
-fno-schedule-insns{,2}, no differences arise.
We did a couple fixes to this code earlier this year.  Presumably 
there's something still subtly wrong in there that your changes are 
exposing.


See Maxim's changes from Feb.

You might also look and see if any of those insns have SCHED_GROUP_P set.

Jeff


Re: [PR64164] drop copyrename, integrate into expand

2015-08-19 Thread Andreas Schwab
Alexandre Oliva aol...@redhat.com writes:

 [PR64164] fix regressions reported on m68k and armeb

 From: Alexandre Oliva aol...@redhat.com

 Defer stack slot address assignment for all parms that can't live in
 pseudos, and accept pseudos assignments in assign_param_setup_block.

That doesn't fix the ia64 Ada miscompilation though.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [PR64164] drop copyrename, integrate into expand

2015-08-19 Thread Andreas Schwab
Andreas Schwab sch...@linux-m68k.org writes:

 Alexandre Oliva aol...@redhat.com writes:

 [PR64164] fix regressions reported on m68k and armeb

 From: Alexandre Oliva aol...@redhat.com

 Defer stack slot address assignment for all parms that can't live in
 pseudos, and accept pseudos assignments in assign_param_setup_block.

 That doesn't fix the ia64 Ada miscompilation though.

I mean miscomparison, not miscompilation.  The difference is only in the
insn scheduling.

--- x1  2015-08-19 15:26:41.0 +0200
+++ x2  2015-08-19 15:26:46.0 +0200
@@ -1,5 +1,5 @@
 
-stage2-gcc/ada/par.o: file format elf64-ia64-little
+stage3-gcc/ada/par.o: file format elf64-ia64-little
 
 
 Disassembly of section .text:
@@ -29467,25 +29467,25 @@
214b2: PCREL21B atree__new_node
214b6:  00 00 00 02 00 00   nop.i 0x0
214bc:  08 00 00 50 br.call.sptk.many b0=214b0 
par__ch6__p_formal_part.2186+0xa30
-   214c0:  08 78 e0 01 80 24   [MMI]   mov r15=16504
-   214c6:  e0 80 03 00 49 20   mov r14=16496
-   214cc:  00 06 04 92 mov r1=16608
-   214d0:  0a 80 23 00 08 20   [MMI]   addp4 r112=r8,r0;;
-   214d6:  f0 78 30 00 40 c0   add r15=r15,r12
-   214dc:  e1 60 00 80 add r14=r14,r12
-   214e0:  0a 08 04 18 00 20   [MMI]   add r1=r1,r12;;
-   214e6:  f0 00 3c 20 20 00   ld4 r15=[r15]
-   214ec:  00 00 04 00 nop.i 0x0
+   214c0:  08 70 c0 01 80 24   [MMI]   mov r14=16496
+   214c6:  00 00 00 02 00 e0   nop.m 0x0
+   214cc:  81 07 00 92 mov r15=16504
+   214d0:  09 08 80 01 81 24   [MMI]   mov r1=16608
+   214d6:  00 00 00 02 00 00   nop.m 0x0
+   214dc:  8e 00 20 80 addp4 r112=r8,r0;;
+   214e0:  09 70 38 18 00 20   [MMI]   add r14=r14,r12
+   214e6:  f0 78 30 00 40 20   add r15=r15,r12
+   214ec:  10 60 00 80 add r1=r1,r12;;
214f0:  09 00 20 1c 90 11   [MMI]   st4 [r14]=r8
214f6:  10 00 04 30 20 00   ld8 r1=[r1]
214fc:  00 00 04 00 nop.i 0x0;;
-   21500:  01 00 00 00 01 00   [MII]   nop.m 0x0
-   21506:  e0 00 3c 2c 00 e0   sxt4 r14=r15
-   2150c:  01 61 00 84 adds r15=16,r12;;
-   21510:  0b 70 38 00 11 20   [MMI]   shladd r14=r14,2,r0;;
-   21516:  e0 78 38 00 40 00   add r14=r15,r14
+   21500:  02 78 00 1e 10 10   [MII]   ld4 r15=[r15]
+   21506:  00 00 00 02 00 c0   nop.i 0x0;;
+   2150c:  01 78 58 00 sxt4 r14=r15
+   21510:  0b 78 40 18 00 21   [MMI]   adds r15=16,r12;;
+   21516:  e0 70 00 22 40 00   shladd r14=r14,2,r0
2151c:  00 00 04 00 nop.i 0x0;;
-   21520:  09 00 00 00 01 00   [MMI]   nop.m 0x0
+   21520:  0b 70 3c 1c 00 20   [MMI]   add r14=r15,r14;;
21526:  e0 e0 3b 7e 46 00   adds r14=-4,r14
2152c:  00 00 04 00 nop.i 0x0;;
21530:  10 88 03 1c 10 10   [MIB]   ld4 r113=[r14]


Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


Re: [PR64164] drop copyrename, integrate into expand

2015-08-19 Thread Alexandre Oliva
On Aug 19, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 Andreas Schwab sch...@linux-m68k.org writes:
 Alexandre Oliva aol...@redhat.com writes:
 
 [PR64164] fix regressions reported on m68k and armeb
 
 From: Alexandre Oliva aol...@redhat.com
 
 Defer stack slot address assignment for all parms that can't live in
 pseudos, and accept pseudos assignments in assign_param_setup_block.
 
 That doesn't fix the ia64 Ada miscompilation though.

That's not surprising, it's the first I hear of it ;-)

 I mean miscomparison, not miscompilation.  The difference is only in the
 insn scheduling.

Interesting.  I have a hard time figuring out how this could follow from
the patchset at hand, but...  let's try to figure it out.

I'm having some difficulty getting access to an ia64 box ATM, and for
ada bootstraps, a cross won't do, so...  if you still have that build
tree around, any chance you could recompile par.o with both stage1 and
stage2, with -fdump-rtl-expand-details, and email me the compiler dump
files?  Maybe that will suffice to figure out where the difference might
come from.

Thanks in advance,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-19 Thread Alexandre Oliva
On Aug 19, 2015, Alexandre Oliva aol...@redhat.com wrote:

 I'm having some difficulty getting access to an ia64 box ATM, and for
 ada bootstraps, a cross won't do, so...  if you still have that build
 tree around, any chance you could recompile par.o with both stage1 and
 stage2, with -fdump-rtl-expand-details, and email me the compiler dump
 files?

Thanks!

In the mean time, I have been able to duplicate the problem myself.  As
you say, it is triggered by -gtoggle.  However, it has nothing
whatsoever to do with the recent patches I installed.  At most they
expose some latent problem in the scheduler.

I have verified in the expand dumps that both the gimple and the rtl
representation in the relevant parts of the code are identical, except
for the presence of debug stmts and insns.  Indeed, compiling with
-fno-schedule-insns{,2}, no differences arise.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-19 Thread Alexandre Oliva
On Aug 18, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Aug 17, 2015, Christophe Lyon christophe.l...@linaro.org wrote:
 Since this was committed (r226901), I can see that the compiler build
 fails for armeb targets, when building libgcc:

 This patch fixes this particular case.  I'll also add this configuration
 to the cross build tests I'm going to rerun shortly, before submitting a
 followup formally, to see whether other non-MEM mems need to be handled
 explicitly.

On Aug 17, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 Andreas Schwab sch...@linux-m68k.org writes:

 Alexandre Oliva aol...@redhat.com writes:
 
 Would you be so kind as to give it a spin on a m68k native?  TIA,
 
 I tried it on ia64, and it falls flat on the floor.

 It fixes the m68k failures, though.

On Aug 17, 2015, Alexandre Oliva aol...@redhat.com wrote:

 I tried it on ia64, and it falls flat on the floor.

 Doh, I see a logic flaw in the patch I posted.

There were other shortcomings in the snippets I posted before, revealed
by testing on on various other targets: remaining BLKmode asserts,
failure to deal with parms without a default def and split complex args
with an unassigned stack address.  This patch fixes them all.

It was regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu,
ppc64el-linux-gnu, and further tested with a compile-only 'make all' on
a binutils+gcc+newlib tree on all tens of cross targets mentioned
before, plus the armeb configuration Christophe mentioned.

Ok to install?


[PR64164] fix regressions reported on m68k and armeb

From: Alexandre Oliva aol...@redhat.com

Defer stack slot address assignment for all parms that can't live in
pseudos, and accept pseudos assignments in assign_param_setup_block.

for  gcc/ChangeLog

PR rtl-optimization/64164
* cfgexpand.c (parm_maybe_byref_p): Renamed to...
(parm_in_stack_slot_p): ... this.  Disregard mode, what
matters is whether the parm will live in a pseudo or a stack
slot.
(expand_one_ssa_partition): Deal with params without a default
def.  Disregard mode.
* cfgexpand.h: Renamed function declaration.
* tree-ssa-coalesce.c: Adjust.
* function.c (split_complex_args): Allocate stack slot for
unassigned parms before splitting.
(parm_in_unassigned_mem_p): New.  Use it instead of
parm_maybe_byref_p throughout this file.
(assign_parm_setup_block): Use it.  Accept pseudos in the
expand-assigned rtl.
(assign_parm_setup_reg): Drop BLKmode requirement.
(assign_parm_setup_stack): Allocate and fill in the address of
unassigned MEM parms.
---
 gcc/cfgexpand.c |   44 ++--
 gcc/cfgexpand.h |2 +
 gcc/function.c  |   74 ---
 gcc/tree-ssa-coalesce.c |4 +--
 4 files changed, 100 insertions(+), 24 deletions(-)

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 0bc20f6..d567a87 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -172,17 +172,23 @@ leader_merge (tree cur, tree next)
   return cur;
 }
 
-/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
+/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
+   assigned to a stack slot.  We can't have expand_one_ssa_partition
+   choose their address: the pseudo holding the address would be set
+   up too late for assign_params to copy the parameter if needed.
+
Such parameters are likely passed as a pointer to the value, rather
than as a value, and so we must not coalesce them, nor allocate
stack space for them before determining the calling conventions for
-   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
-   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
-   with NULL so as to make sure the MEM is not used before it is
-   adjusted in assign_parm_setup_reg.  */
+   them.
+
+   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
+   with pc_rtx as the address, and then it replaces the pc_rtx with
+   NULL so as to make sure the MEM is not used before it is adjusted
+   in assign_parm_setup_reg.  */
 
 bool
-parm_maybe_byref_p (tree var)
+parm_in_stack_slot_p (tree var)
 {
   if (!var || VAR_P (var))
 return false;
@@ -190,7 +196,7 @@ parm_maybe_byref_p (tree var)
   gcc_assert (TREE_CODE (var) == PARM_DECL
  || TREE_CODE (var) == RESULT_DECL);
 
-  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
+  return !use_register_for_decl (var);
 }
 
 /* Return the partition of the default SSA_DEF for decl VAR.  */
@@ -1343,17 +1349,35 @@ expand_one_ssa_partition (tree var)
 
   if (!use_register_for_decl (var))
 {
-  if (parm_maybe_byref_p (SSA_NAME_VAR (var))
-  ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
+  /* We can't risk having the parm assigned to a MEM location
+whose address references a pseudo, for the pseudo will 

Re: [PR64164] drop copyrename, integrate into expand

2015-08-19 Thread Richard Biener
On Wed, Aug 19, 2015 at 8:45 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Aug 18, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Aug 17, 2015, Christophe Lyon christophe.l...@linaro.org wrote:
 Since this was committed (r226901), I can see that the compiler build
 fails for armeb targets, when building libgcc:

 This patch fixes this particular case.  I'll also add this configuration
 to the cross build tests I'm going to rerun shortly, before submitting a
 followup formally, to see whether other non-MEM mems need to be handled
 explicitly.

 On Aug 17, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 Andreas Schwab sch...@linux-m68k.org writes:

 Alexandre Oliva aol...@redhat.com writes:

 Would you be so kind as to give it a spin on a m68k native?  TIA,

 I tried it on ia64, and it falls flat on the floor.

 It fixes the m68k failures, though.

 On Aug 17, 2015, Alexandre Oliva aol...@redhat.com wrote:

 I tried it on ia64, and it falls flat on the floor.

 Doh, I see a logic flaw in the patch I posted.

 There were other shortcomings in the snippets I posted before, revealed
 by testing on on various other targets: remaining BLKmode asserts,
 failure to deal with parms without a default def and split complex args
 with an unassigned stack address.  This patch fixes them all.

 It was regstrapped on x86_64-linux-gnu, i686-linux-gnu, ppc64-linux-gnu,
 ppc64el-linux-gnu, and further tested with a compile-only 'make all' on
 a binutils+gcc+newlib tree on all tens of cross targets mentioned
 before, plus the armeb configuration Christophe mentioned.

 Ok to install?

Ok.

Thanks,
Richard.


 [PR64164] fix regressions reported on m68k and armeb

 From: Alexandre Oliva aol...@redhat.com

 Defer stack slot address assignment for all parms that can't live in
 pseudos, and accept pseudos assignments in assign_param_setup_block.

 for  gcc/ChangeLog

 PR rtl-optimization/64164
 * cfgexpand.c (parm_maybe_byref_p): Renamed to...
 (parm_in_stack_slot_p): ... this.  Disregard mode, what
 matters is whether the parm will live in a pseudo or a stack
 slot.
 (expand_one_ssa_partition): Deal with params without a default
 def.  Disregard mode.
 * cfgexpand.h: Renamed function declaration.
 * tree-ssa-coalesce.c: Adjust.
 * function.c (split_complex_args): Allocate stack slot for
 unassigned parms before splitting.
 (parm_in_unassigned_mem_p): New.  Use it instead of
 parm_maybe_byref_p throughout this file.
 (assign_parm_setup_block): Use it.  Accept pseudos in the
 expand-assigned rtl.
 (assign_parm_setup_reg): Drop BLKmode requirement.
 (assign_parm_setup_stack): Allocate and fill in the address of
 unassigned MEM parms.
 ---
  gcc/cfgexpand.c |   44 ++--
  gcc/cfgexpand.h |2 +
  gcc/function.c  |   74 
 ---
  gcc/tree-ssa-coalesce.c |4 +--
  4 files changed, 100 insertions(+), 24 deletions(-)

 diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
 index 0bc20f6..d567a87 100644
 --- a/gcc/cfgexpand.c
 +++ b/gcc/cfgexpand.c
 @@ -172,17 +172,23 @@ leader_merge (tree cur, tree next)
return cur;
  }

 -/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
 +/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
 +   assigned to a stack slot.  We can't have expand_one_ssa_partition
 +   choose their address: the pseudo holding the address would be set
 +   up too late for assign_params to copy the parameter if needed.
 +
 Such parameters are likely passed as a pointer to the value, rather
 than as a value, and so we must not coalesce them, nor allocate
 stack space for them before determining the calling conventions for
 -   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
 -   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
 -   with NULL so as to make sure the MEM is not used before it is
 -   adjusted in assign_parm_setup_reg.  */
 +   them.
 +
 +   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
 +   with pc_rtx as the address, and then it replaces the pc_rtx with
 +   NULL so as to make sure the MEM is not used before it is adjusted
 +   in assign_parm_setup_reg.  */

  bool
 -parm_maybe_byref_p (tree var)
 +parm_in_stack_slot_p (tree var)
  {
if (!var || VAR_P (var))
  return false;
 @@ -190,7 +196,7 @@ parm_maybe_byref_p (tree var)
gcc_assert (TREE_CODE (var) == PARM_DECL
   || TREE_CODE (var) == RESULT_DECL);

 -  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
 +  return !use_register_for_decl (var);
  }

  /* Return the partition of the default SSA_DEF for decl VAR.  */
 @@ -1343,17 +1349,35 @@ expand_one_ssa_partition (tree var)

if (!use_register_for_decl (var))
  {
 -  if (parm_maybe_byref_p (SSA_NAME_VAR (var))
 -  

Re: [PR64164] drop copyrename, integrate into expand

2015-08-18 Thread Kyrill Tkachov


On 17/08/15 03:56, Alexandre Oliva wrote:

On Aug 16, 2015, Andreas Schwab sch...@linux-m68k.org wrote:


Alexandre Oliva aol...@redhat.com writes:

On Aug 15, 2015, Andreas Schwab sch...@linux-m68k.org wrote:


FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)
In file included from
/opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:

Are you sure this is a regression introduced by my patch?

Yes, it reintroduces the ICE.

Ugh.  I see this testcase was introduced very recently, so presumably it
wasn't present in the tree that James Greenhalgh tested and confirmed
there were no regressions.

The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
The assert is doing just what it was supposed to do.  The only surprise
to me is that it didn't catch this unexpected and unsupported change
before.

Presumably if we just dropped the assert in expand_expr_real_1, this
case would work just fine, although the unsignedp bit would be
meaningless and thus confusing, since the subreg isn't about a
promotion, but about reflecting the mode change that was made from under
us.

May I suggest that you guys find (or introduce) other means to change
the layout and mode of the decl *before* RTL is assigned to the params?


Hmm, if in TARGET_SET_CURRENT_FUNCTION, which is called fairly
early on to set up cfun I do the relaying of the param decls
then it seems to work. Will do some more testing...



I think this would save us a ton of trouble down the road.  Just think
how much trouble you'd get if the different modes had different calling
conventions, alignment requirements, valid register assignments, or
anything that might make coalescing their SSA names with those of other
variables invalid.





Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Alexandre Oliva
On Aug 17, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 Alexandre Oliva aol...@redhat.com writes:
 Would you be so kind as to give it a spin on a m68k native?  TIA,

 I tried it on ia64, and it falls flat on the floor.

Doh, I see a logic flaw in the patch I posted.  The hunk in
assign_parm_setup_stack that looked like this:

+ if (from_expand)
+   gcc_assert (GET_MODE (from_expand) == GET_MODE (data-entry_parm));
+ else if (!parm_in_unassigned_mem_p (parm, from_expand))
+   data-stack_parm = from_expand;

should look like this:

+ if (from_expand)
+   gcc_assert (GET_MODE (from_expand) == GET_MODE (data-entry_parm));
+ if (from_expand  !parm_in_unassigned_mem_p (parm, from_expand))
+   data-stack_parm = from_expand;

I'll give it some more testing before submitting a formal patch.

Meanwhile, thanks for confirming the m68k issues are fixed by that one;
this one shouldn't regress them; it would only fix the unintended crashes.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Alexandre Oliva
On Aug 17, 2015, Christophe Lyon christophe.l...@linaro.org wrote:

 Since this was committed (r226901), I can see that the compiler build
 fails for armeb targets, when building libgcc:

Any chance you could get me a preprocessed testcase for this failure, please?

Thanks in advance,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Kyrill Tkachov

Hi Alexandre,

On 17/08/15 03:56, Alexandre Oliva wrote:

On Aug 16, 2015, Andreas Schwab sch...@linux-m68k.org wrote:


Alexandre Oliva aol...@redhat.com writes:

On Aug 15, 2015, Andreas Schwab sch...@linux-m68k.org wrote:


FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)
In file included from
/opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:

Are you sure this is a regression introduced by my patch?

Yes, it reintroduces the ICE.

Ugh.  I see this testcase was introduced very recently, so presumably it
wasn't present in the tree that James Greenhalgh tested and confirmed
there were no regressions.


Yeah, I introduced it as part of the SWITCHABLE_TARGET
work for aarch64. A bit of a mid-air collision :(


The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
The assert is doing just what it was supposed to do.  The only surprise
to me is that it didn't catch this unexpected and unsupported change
before.

Presumably if we just dropped the assert in expand_expr_real_1, this
case would work just fine, although the unsignedp bit would be
meaningless and thus confusing, since the subreg isn't about a
promotion, but about reflecting the mode change that was made from under
us.

May I suggest that you guys find (or introduce) other means to change
the layout and mode of the decl *before* RTL is assigned to the params?
I think this would save us a ton of trouble down the road.  Just think
how much trouble you'd get if the different modes had different calling
conventions, alignment requirements, valid register assignments, or
anything that might make coalescing their SSA names with those of other
variables invalid.


I'm not familiar with the intricacies in this area but
I'll have a look.
Perhaps we can somehow re-layout the SIMD types when
switching from a non-simd to a simd target...
Can you, or Andreas please file a PR so we don't forget?

Thanks,
Kyrill


Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Alexandre Oliva
On Aug 17, 2015, Christophe Lyon christophe.l...@linaro.org wrote:

 On 17 August 2015 at 13:58, Alexandre Oliva aol...@redhat.com wrote:
 On Aug 17, 2015, Christophe Lyon christophe.l...@linaro.org wrote:
 
 Since this was committed (r226901), I can see that the compiler build
 fails for armeb targets, when building libgcc:
 
 Any chance you could get me a preprocessed testcase for this failure, please?
 
 Yes, here it is, attached.

Thanks.

This patch fixes this particular case.  I'll also add this configuration
to the cross build tests I'm going to rerun shortly, before submitting a
followup formally, to see whether other non-MEM mems need to be handled
explicitly.


--- a/gcc/function.c
+++ b/gcc/function.c
@@ -3017,6 +3017,11 @@ assign_parm_setup_block (struct assign_parm_data_all 
*all,
   else if (size == 0)
;
 
+  /* MEM may be a REG if coalescing assigns the param's partition
+to a pseudo.  */
+  else if (REG_P (mem))
+   emit_move_insn (mem, entry_parm);
+
   /* If SIZE is that of a mode no bigger than a word, just use
 that mode's store operation.  */
   else if (size = UNITS_PER_WORD)


-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Christophe Lyon
On 17 August 2015 at 13:58, Alexandre Oliva aol...@redhat.com wrote:
 On Aug 17, 2015, Christophe Lyon christophe.l...@linaro.org wrote:

 Since this was committed (r226901), I can see that the compiler build
 fails for armeb targets, when building libgcc:

 Any chance you could get me a preprocessed testcase for this failure, please?

Yes, here it is, attached.

My gcc is configured with:
--target=armeb-linux-gnueabihf--with-mode=arm --with-cpu=cortex-a9
--with-fpu=neon

Thanks,

Christophe.

 Thanks in advance,

 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


fixed-bit.i.xz
Description: application/force-download


Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Andrew Pinski
On Mon, Aug 17, 2015 at 5:20 PM, Kyrill Tkachov
kyrylo.tkac...@foss.arm.com wrote:
 Hi Alexandre,

 On 17/08/15 03:56, Alexandre Oliva wrote:

 On Aug 16, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 Alexandre Oliva aol...@redhat.com writes:

 On Aug 15, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler
 error)
 In file included from

 /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:

 Are you sure this is a regression introduced by my patch?

 Yes, it reintroduces the ICE.

 Ugh.  I see this testcase was introduced very recently, so presumably it
 wasn't present in the tree that James Greenhalgh tested and confirmed
 there were no regressions.


 Yeah, I introduced it as part of the SWITCHABLE_TARGET
 work for aarch64. A bit of a mid-air collision :(

 The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
 decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
 The assert is doing just what it was supposed to do.  The only surprise
 to me is that it didn't catch this unexpected and unsupported change
 before.

 Presumably if we just dropped the assert in expand_expr_real_1, this
 case would work just fine, although the unsignedp bit would be
 meaningless and thus confusing, since the subreg isn't about a
 promotion, but about reflecting the mode change that was made from under
 us.

 May I suggest that you guys find (or introduce) other means to change
 the layout and mode of the decl *before* RTL is assigned to the params?
 I think this would save us a ton of trouble down the road.  Just think
 how much trouble you'd get if the different modes had different calling
 conventions, alignment requirements, valid register assignments, or
 anything that might make coalescing their SSA names with those of other
 variables invalid.

 I'm not familiar with the intricacies in this area but
 I'll have a look.
 Perhaps we can somehow re-layout the SIMD types when
 switching from a non-simd to a simd target...
 Can you, or Andreas please file a PR so we don't forget?

How does x86 handle this case?  Because it should be handling this case somehow.

Thanks,
Andrew



 Thanks,
 Kyrill


Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Christophe Lyon
On 14 August 2015 at 20:57, Alexandre Oliva aol...@redhat.com wrote:
 On Aug 11, 2015, Patrick Marlier patrick.marl...@gmail.com wrote:

 On Mon, Aug 10, 2015 at 5:14 PM, Jeff Law l...@redhat.com wrote:
 On 08/10/2015 02:23 AM, James Greenhalgh wrote:

 For what it is worth, I bootstrapped and tested the consolidated patch
 on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
 r226516 over the weekend, and didn't see any new issues.

 Thanks!

 Especially as the bug reporter, I am impressed how a slight problem
 can lead to such a patch! ;)
 Thanks a lot Alexandre!

 You're welcome.  I'm glad it appears to be working to everyone's
 satisfaction now.  I've just committed it as r226901, with only a
 context adjustment to account for a change in use_register_for_decl in
 function.c.  /me crosses fingers :-)

 Here's the patch as checked in:


Hi,

Since this was committed (r226901), I can see that the compiler build
fails for armeb targets, when building libgcc:
In file included from
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:55:0:
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:
In function '__gnu_addha3':
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:450:31:
internal compiler error: in simplify_subreg, at simplify-rtx.c:5790
 #define FIXED_OP(OP,MODE,NUM) __gnu_ ## OP ## MODE ## NUM
   ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:460:30:
note: in expansion of macro 'FIXED_OP'
 #define FIXED_ADD_TEMP(NAME) FIXED_OP(add,NAME,3)
  ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:492:19:
note: in expansion of macro 'FIXED_ADD_TEMP'
 #define FIXED_ADD FIXED_ADD_TEMP(MODE_NAME_S)
   ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:59:1:
note: in expansion of macro 'FIXED_ADD'
 FIXED_ADD (FIXED_C_TYPE a, FIXED_C_TYPE b)
 ^
0xa4bbc3 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:5790
0xa4bbc3 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:5790
0xa4ce2d simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:6013
0xa4ce2d simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:6013
0x784385 move_block_from_reg(int, rtx_def*, int)
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1536
0x784385 move_block_from_reg(int, rtx_def*, int)
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1536
0x7e165d assign_parm_setup_block

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3076
0x7e165d assign_parm_setup_block

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3076
0x7e813a assign_parms

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3805
0x7e813a assign_parms

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3805
0x7e8f2e expand_function_start(tree_node*)

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:5234
0x7e8f2e expand_function_start(tree_node*)

/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:5234

Christophe.

 for  gcc/ChangeLog

 PR rtl-optimization/64164
 PR bootstrap/66978
 PR middle-end/66983
 PR rtl-optimization/67000
 PR middle-end/67034
 PR middle-end/67035
 * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
 * tree-ssa-copyrename.c: Removed.
 * opts.c (default_options_table): Drop -ftree-copyrename.  Add
 -ftree-coalesce-vars.
 * passes.def: Drop all occurrences of pass_rename_ssa_copies.
 * common.opt (ftree-copyrename): Ignore.
 (ftree-coalesce-inlined-vars): Likewise.
 * doc/invoke.texi: Remove the ignored options above.
 * gimple-expr.h (gimple_can_coalesce_p): Move declaration
 * tree-ssa-coalesce.h: ... here.
 * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
 headers required by it.
 * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
 across variables when flag_tree_coalesce_vars.  Check register
 use and promoted modes to allow coalescing.  Do not coalesce
 maybe-byref parms with SSA_NAMEs of other variables, or
 anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
 * tree-ssa-live.c (struct tree_int_map_hasher): Move along
 with its member functions to tree-ssa-coalesce.c.
 

Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Andreas Schwab
Alexandre Oliva aol...@redhat.com writes:

 Would you be so kind as to give it a spin on a m68k native?  TIA,

I tried it on ia64, and it falls flat on the floor.

../../../libgcc/config/ia64/unwind-ia64.c: In function ‘_Unwind_SetGR’:
../../../libgcc/config/ia64/unwind-ia64.c:1683:1: internal compiler error: 
Segmentation fault
 _Unwind_SetGR (struct _Unwind_Context *context, int index, _Unwind_Word val)
 ^
0x41807edf crash_signal
../../gcc/toplev.c:352
0x40d0ed60 parm_in_unassigned_mem_p
../../gcc/function.c:2940
0x40d23e8f assign_parm_setup_stack
../../gcc/function.c:3473
0x40d2b43f assign_parms
../../gcc/function.c:3830
0x40d2e24f expand_function_start(tree_node*)
../../gcc/function.c:5254
0x407bdabf execute
../../gcc/cfgexpand.c:6187

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [PR64164] drop copyrename, integrate into expand

2015-08-17 Thread Andreas Schwab
Andreas Schwab sch...@linux-m68k.org writes:

 Alexandre Oliva aol...@redhat.com writes:

 Would you be so kind as to give it a spin on a m68k native?  TIA,

 I tried it on ia64, and it falls flat on the floor.

It fixes the m68k failures, though.

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
And now for something completely different.


Re: [PR64164] drop copyrename, integrate into expand

2015-08-16 Thread Alexandre Oliva
On Aug 15, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)

 In file included from
 /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:

Are you sure this is a regression introduced by my patch?  The comments
at the top of this file seem to indicate it is a known problem in the
expansion of the crypto builtin, which is precisely what we see in the
backtrace?

If it is indeed a regression, would you please provide me with a
preprocessed testcase so that I can look into it without a native
environment?

Thanks in advance,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-16 Thread Andreas Schwab
On m68k:

FAIL: gcc.c-torture/execute/20050316-1.c   -O0  execution test
FAIL: gcc.c-torture/execute/20050316-2.c   -O0  execution test
FAIL: gcc.c-torture/execute/20050316-3.c   -O0  execution test
FAIL: gcc.c-torture/execute/simd-4.c   -O0  execution test
FAIL: gcc.c-torture/execute/simd-6.c   -O0  execution test
FAIL: gcc.dg/compat/vector-1 c_compat_x_tst.o-c_compat_y_tst.o execute

--- 20050316-1.s-good
+++ 20050316-1.s-bad
@@ -15,8 +15,17 @@
.type   test2, @function
 test2:
link.w %fp,#0
-   move.l 8(%fp),%d0
-   move.l 12(%fp),%d1
+   move.l 8(%fp),(%a0)
+   move.l 12(%fp),4(%a0)
+   lea (-16,%sp),%sp
+   move.l %sp,%d0
+   addq.l #7,%d0
+   lsr.l #3,%d0
+   move.l %d0,%d1
+   lsl.l #3,%d1
+   move.l %d1,%a0
+   move.l (%a0),%d0
+   move.l 4(%a0),%d1
move.l %d1,%d0
unlk %fp
rts
@@ -37,8 +46,9 @@
.globl  test4
.type   test4, @function
 test4:
-   link.w %fp,#0
-   move.l 8(%fp),%d0
+   link.w %fp,#-4
+   move.l 8(%fp),-4(%fp)
+   move.l -4(%fp),%d0
move.l %d0,%d1
smi %d0
extb.l %d0
@@ -54,8 +64,17 @@
.type   test5, @function
 test5:
link.w %fp,#0
-   move.l 8(%fp),%a0
-   move.l 12(%fp),%a1
+   move.l 8(%fp),(%a0)
+   move.l 12(%fp),4(%a0)
+   lea (-16,%sp),%sp
+   move.l %sp,%d0
+   addq.l #7,%d0
+   lsr.l #3,%d0
+   move.l %d0,%d1
+   lsl.l #3,%d1
+   move.l %d1,%a0
+   move.l 4(%a0),%a1
+   move.l (%a0),%a0
move.l %a0,%d0
move.l %a1,%d1
unlk %fp

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [PR64164] drop copyrename, integrate into expand

2015-08-16 Thread Alexandre Oliva
On Aug 16, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 Alexandre Oliva aol...@redhat.com writes:
 On Aug 15, 2015, Andreas Schwab sch...@linux-m68k.org wrote:
 
 FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler 
 error)
 
 In file included from
 /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
 
 Are you sure this is a regression introduced by my patch?

 Yes, it reintroduces the ICE.

Ugh.  I see this testcase was introduced very recently, so presumably it
wasn't present in the tree that James Greenhalgh tested and confirmed
there were no regressions.

The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of a
decl after RTL is assigned to it (or to its SSA partitions) seems fishy.
The assert is doing just what it was supposed to do.  The only surprise
to me is that it didn't catch this unexpected and unsupported change
before.

Presumably if we just dropped the assert in expand_expr_real_1, this
case would work just fine, although the unsignedp bit would be
meaningless and thus confusing, since the subreg isn't about a
promotion, but about reflecting the mode change that was made from under
us.

May I suggest that you guys find (or introduce) other means to change
the layout and mode of the decl *before* RTL is assigned to the params?
I think this would save us a ton of trouble down the road.  Just think
how much trouble you'd get if the different modes had different calling
conventions, alignment requirements, valid register assignments, or
anything that might make coalescing their SSA names with those of other
variables invalid.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-16 Thread Alexandre Oliva
On Aug 16, 2015, Andreas Schwab sch...@linux-m68k.org wrote:

 On m68k:
 FAIL: gcc.c-torture/execute/20050316-1.c   -O0  execution test
 FAIL: gcc.c-torture/execute/20050316-2.c   -O0  execution test
 FAIL: gcc.c-torture/execute/20050316-3.c   -O0  execution test
 FAIL: gcc.c-torture/execute/simd-4.c   -O0  execution test
 FAIL: gcc.c-torture/execute/simd-6.c   -O0  execution test
 FAIL: gcc.dg/compat/vector-1 c_compat_x_tst.o-c_compat_y_tst.o execute

Thanks.  Interesting.  This exposes a more general situation than the
one I covered with the byref params: the general case does not require
the params to be passed by reference, but rather that the params require
a stack address that, if determined by cfgexpand, will cause them to be
computed too late for assign_parms' use.  The following patch appears to
fix the problem, applying the same logic of limited coalescing and
deferred address assignment to all params that can't live in pseudos,
and extending assign_parms' remaining case of copying incoming params to
new stack slots to fill in the blank address with that of the
newly-allocated stack slot.

Would you be so kind as to give it a spin on a m68k native?  TIA,


diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 0bc20f6..56571ce 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -172,17 +172,23 @@ leader_merge (tree cur, tree next)
   return cur;
 }
 
-/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode.
+/* Return true if VAR is a PARM_DECL or a RESULT_DECL that ought to be
+   assigned to a stack slot.  We can't have expand_one_ssa_partition
+   choose their address: the pseudo holding the address would be set
+   up too late for assign_params to copy the parameter if needed.
+
Such parameters are likely passed as a pointer to the value, rather
than as a value, and so we must not coalesce them, nor allocate
stack space for them before determining the calling conventions for
-   them.  For their SSA_NAMEs, expand_one_ssa_partition emits RTL as
-   MEMs with pc_rtx as the address, and then it replaces the pc_rtx
-   with NULL so as to make sure the MEM is not used before it is
-   adjusted in assign_parm_setup_reg.  */
+   them.
+
+   For their SSA_NAMEs, expand_one_ssa_partition emits RTL as MEMs
+   with pc_rtx as the address, and then it replaces the pc_rtx with
+   NULL so as to make sure the MEM is not used before it is adjusted
+   in assign_parm_setup_reg.  */
 
 bool
-parm_maybe_byref_p (tree var)
+parm_in_stack_slot_p (tree var)
 {
   if (!var || VAR_P (var))
 return false;
@@ -190,7 +196,7 @@ parm_maybe_byref_p (tree var)
   gcc_assert (TREE_CODE (var) == PARM_DECL
  || TREE_CODE (var) == RESULT_DECL);
 
-  return TYPE_MODE (TREE_TYPE (var)) == BLKmode;
+  return !use_register_for_decl (var);
 }
 
 /* Return the partition of the default SSA_DEF for decl VAR.  */
@@ -1343,13 +1349,15 @@ expand_one_ssa_partition (tree var)
 
   if (!use_register_for_decl (var))
 {
-  if (parm_maybe_byref_p (SSA_NAME_VAR (var))
+  /* We can't risk having the parm assigned to a MEM location
+whose address references a pseudo, for the pseudo will only
+be set up after arguments are copied to the stack slot.  */
+  if (parm_in_stack_slot_p (SSA_NAME_VAR (var))
   ssa_default_def_partition (SSA_NAME_VAR (var)) == part)
{
  expand_one_stack_var_at (var, pc_rtx, 0, 0);
  rtx x = SA.partition_to_pseudo[part];
  gcc_assert (GET_CODE (x) == MEM);
- gcc_assert (GET_MODE (x) == BLKmode);
  gcc_assert (XEXP (x, 0) == pc_rtx);
  /* Reset the address, so that any attempt to use it will
 ICE.  It will be adjusted in assign_parm_setup_reg.  */
diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h
index 987cf356..d168672 100644
--- a/gcc/cfgexpand.h
+++ b/gcc/cfgexpand.h
@@ -22,7 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 
 extern tree gimple_assign_rhs_to_tree (gimple);
 extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *);
-extern bool parm_maybe_byref_p (tree);
+extern bool parm_in_stack_slot_p (tree);
 extern rtx get_rtl_for_parm_ssa_default_def (tree var);
 
 
diff --git a/gcc/function.c b/gcc/function.c
index 715c19f..eccd8c6 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2934,6 +2934,16 @@ assign_parm_setup_block_p (struct assign_parm_data_one 
*data)
   return false;
 }
 
+static bool
+parm_in_unassigned_mem_p (tree decl, rtx from_expand)
+{
+  bool result = MEM_P (from_expand)  !XEXP (from_expand, 0);
+
+  gcc_assert (result == parm_in_stack_slot_p (decl));
+
+  return result;
+}
+
 /* A subroutine of assign_parms.  Arrange for the parameter to be
present and valid in DATA-STACK_RTL.  */
 
@@ -2956,8 +2966,7 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
 {
   DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD);
   rtx from_expand = rtl_for_parm (all, parm);
-  if (from_expand  

Re: [PR64164] drop copyrename, integrate into expand

2015-08-15 Thread Andreas Schwab
FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)

In file included from 
/opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
/opt/gcc/gcc-20150815/Build/gcc/include/arm_neon.h: In function 
'test_vsha1cq_u32':
/opt/gcc/gcc-20150815/Build/gcc/include/arm_neon.h:21076:10: internal compiler 
error: in expand_expr_real_1, at expr.c:9532
0x7f060b expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:9532
0xdb1027 expand_normal
../../gcc/expr.h:261
0xdb1027 aarch64_simd_expand_args
../../gcc/config/aarch64/aarch64-builtins.c:944
0xdb1027 aarch64_simd_expand_builtin(int, tree_node*, rtx_def*)
../../gcc/config/aarch64/aarch64-builtins.c:1118
0x6cc667 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
../../gcc/builtins.c:5931
0x7ecab7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, 
expand_modifier, rtx_def**, bool)
../../gcc/expr.c:10360
0x7f8547 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, tree_node*)
../../gcc/expr.c:5398
0x7fa9d3 expand_assignment(tree_node*, tree_node*, bool)
../../gcc/expr.c:5170
0x6f435f expand_call_stmt
../../gcc/cfgexpand.c:2621
0x6f435f expand_gimple_stmt_1
../../gcc/cfgexpand.c:3510
0x6f435f expand_gimple_stmt
../../gcc/cfgexpand.c:3671
0x6f69c7 expand_gimple_tailcall
../../gcc/cfgexpand.c:3718
0x6f69c7 expand_gimple_basic_block
../../gcc/cfgexpand.c:5651
0x6fc777 execute
../../gcc/cfgexpand.c:6260

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: [PR64164] drop copyrename, integrate into expand

2015-08-14 Thread Alexandre Oliva
On Aug 11, 2015, Patrick Marlier patrick.marl...@gmail.com wrote:

 On Mon, Aug 10, 2015 at 5:14 PM, Jeff Law l...@redhat.com wrote:
 On 08/10/2015 02:23 AM, James Greenhalgh wrote:

 For what it is worth, I bootstrapped and tested the consolidated patch
 on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
 r226516 over the weekend, and didn't see any new issues.

Thanks!

 Especially as the bug reporter, I am impressed how a slight problem
 can lead to such a patch! ;)
 Thanks a lot Alexandre!

You're welcome.  I'm glad it appears to be working to everyone's
satisfaction now.  I've just committed it as r226901, with only a
context adjustment to account for a change in use_register_for_decl in
function.c.  /me crosses fingers :-)

Here's the patch as checked in:

for  gcc/ChangeLog

PR rtl-optimization/64164
PR bootstrap/66978
PR middle-end/66983
PR rtl-optimization/67000
PR middle-end/67034
PR middle-end/67035
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename.  Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars.  Check register
use and promoted modes to allow coalescing.  Do not coalesce
maybe-byref parms with SSA_NAMEs of other variables, or
anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise.  Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases.  Adjust.
* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
* cfgexpand.c (leader_merge, parm_maybe_byref_p): New.
(ssa_default_def_partition): New.
(get_rtl_for_parm_ssa_default_def): New.
(align_local_variable, add_stack_var): Support anonymous SSA
names.
(defer_stack_allocation): Likewise.  Declare earlier.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
Do no record deferred-allocation marker in
SA.partition_to_pseudo.
(expand_stack_vars): Adjust check for the marker in it.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
from...
(expand_one_stack_var): ... this.  New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names.  Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c: Include stor-layout.h.
(set_reg_attrs_for_parm): Handle NULL decl.
(set_reg_attrs_for_decl_rtl): Take mode from expression if
it's not a DECL.
* stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
rather than its possibly-NULL DECL.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
(read_complex_part): Export.
* 

Re: [PR64164] drop copyrename, integrate into expand

2015-08-10 Thread Patrick Marlier
On Mon, Aug 10, 2015 at 5:14 PM, Jeff Law l...@redhat.com wrote:
 On 08/10/2015 02:23 AM, James Greenhalgh wrote:

 On Tue, Aug 04, 2015 at 12:45:28AM +0100, Alexandre Oliva wrote:

 On Jul 30, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 aoliva/pr64164  is fine on x32.


 Thanks.  I have made a large number of changes since you tested it,
 fixing all the reported issues and then some.  Now, x86_64-linux-gnu
 (-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
 powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
 targets I cross-tested still get the same 'make all' errors that the
 pristine tree did.


 For what it is worth, I bootstrapped and tested the consolidated patch
 on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
 r226516 over the weekend, and didn't see any new issues.

 Thanks -- I know it's been a long road on this patch.  I don't think anyone
 would have ever guessed fixing 64164 would be so complex.

Especially as the bug reporter, I am impressed how a slight problem
can lead to such a patch! ;)
Thanks a lot Alexandre!

I feel like I owe you something for this hard work!
Feel free to ping if I can help you with something or I owe you at
least a beer when you will be around Switzerland. :)
--
Pat


Re: [PR64164] drop copyrename, integrate into expand

2015-08-10 Thread Jeff Law

On 08/10/2015 02:23 AM, James Greenhalgh wrote:

On Tue, Aug 04, 2015 at 12:45:28AM +0100, Alexandre Oliva wrote:

On Jul 30, 2015, H.J. Lu hjl.to...@gmail.com wrote:


aoliva/pr64164  is fine on x32.


Thanks.  I have made a large number of changes since you tested it,
fixing all the reported issues and then some.  Now, x86_64-linux-gnu
(-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
targets I cross-tested still get the same 'make all' errors that the
pristine tree did.


For what it is worth, I bootstrapped and tested the consolidated patch
on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
r226516 over the weekend, and didn't see any new issues.
Thanks -- I know it's been a long road on this patch.  I don't think 
anyone would have ever guessed fixing 64164 would be so complex.


jeff


Re: [PR64164] drop copyrename, integrate into expand

2015-08-10 Thread James Greenhalgh
On Tue, Aug 04, 2015 at 12:45:28AM +0100, Alexandre Oliva wrote:
 On Jul 30, 2015, H.J. Lu hjl.to...@gmail.com wrote:
 
  aoliva/pr64164  is fine on x32.
 
 Thanks.  I have made a large number of changes since you tested it,
 fixing all the reported issues and then some.  Now, x86_64-linux-gnu
 (-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
 powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
 targets I cross-tested still get the same 'make all' errors that the
 pristine tree did.

For what it is worth, I bootstrapped and tested the consolidated patch
on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
r226516 over the weekend, and didn't see any new issues.

Thanks,
James



Re: [PR64164] drop copyrename, integrate into expand

2015-08-05 Thread Richard Biener
On Wed, Aug 5, 2015 at 2:38 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Aug  4, 2015, Richard Biener richard.guent...@gmail.com wrote:

 Though I wonder on whether splitting the patch into a first one with 
 disabling
 coalescing of parms (their default defs(?)) and a followup implementing the
 support for that.

 We can't disable coalescing of parms altogether.  With -O0, we must
 coalesce all SSA_NAMEs referencing each parm to a single partition.
 With optimization, we could coalesce parms in general, just not these
 special cases in which the parm is to live in a caller-supplied memory
 block.

 Now, it's not coalescing parms proper that brought so much risk to the
 patch, it is assigning rtl to SSA partitions, and having assign_parms*
 use that assignment.  Considering that sometimes a single param
 necessarily ends up in more than one partition, requiring two
 assignments, and that assign_parms* can't deal with that, I don't see
 how to easily disable the cfgexpand logic when it comes to parms, so as
 to be able to leave assign_parms alone.

 How about, if further problems arise that justify reverting the patch
 one more time, I'll look into splitting the patch as you suggested, but
 otherwise, I'll save myself the trouble, ok?

Sure.

 So - is my observation correct that this is only about coalescing of the
 default defs of parameters, not other SSA names based on parameter decls?

 It's more like the opposite, i.e., we *refrain* from coalescing other
 SSA_NAMEs related with byref params, so that we can easily tell when a
 partition references a byref param and whether that partition holds its
 default def.  We could have coalesced any other names that ended up in
 different partitions, and even the partition holding the default def, if
 we had other means to identify partitions with default defs of byref
 params.  For example, we could create a bitmap of byref param default
 def versions, and then, after partitioning, map those to the partitions
 they were assigned to.  In fact, I might do that as a followup.

 Do you think this splitting is feasible and my concern about the
 code-gen issues warranted?

 It is feasible but not exactly easy.

 As for codegen, I hope to have covered all cases now, but should we find
 out I haven't, I'll try the split and see what that gets us.  Did you
 have any special cases in mind that it looks like I may have missed?

It was just a hunch when you talked about BLKmode and params in memory.
As coalescing is about SSA name (thus register) coalescing I was thinking
that if you coalesce a register with incoming memory you'll end up with
more memory accesses?  But maybe I'm completely off here.

I also thought of the RTL expansion thing we do with at first copying
the hardreg incoming args to pseudos and how that interacts with
coalescing.

But I guess you have eyed code-gen changes a bit anyway.

Thanks,
Richard.

 Thanks,

 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-05 Thread Alexandre Oliva
On Aug  5, 2015, Richard Biener richard.guent...@gmail.com wrote:

 It was just a hunch when you talked about BLKmode and params in memory.
 As coalescing is about SSA name (thus register) coalescing I was thinking
 that if you coalesce a register with incoming memory you'll end up with
 more memory accesses?

Since we only coalesce variables whose promoted mode is the same, if one
of them gets BLKmode and has to live in memory, so would all the others
it might coalesce with.  So, even though we have gimple_regs, we can't
have pseudos.  This was observed with vector types for which no native
vector mode is available.

It would still make sense to share them when possible, to reduce the
number of mem-to-mem copies.  And we don't want to copy incoming BLKmode
parms to *another* memory location if we can help it.

Now, maybe you're concerned about incoming parms passed by reference
that *can* be held in pseudos.  For those, we will perform a load from
memory to a pseudo and use that, even if the pseudo ends up allocated in
memory.

 I also thought of the RTL expansion thing we do with at first copying
 the hardreg incoming args to pseudos and how that interacts with
 coalescing.

Most of what changed now is who gets to choose the pseudo; it used to be
assign_parms, now it's cfgexpand.  The other significant change is that
now, when cfgexpand detects a BLKmode parm, it will choose MEM, but it
won't set up the address, so that assign_parms still does what it used
to, namely, copy the incoming hard reg to a pseudo, and then use the
pseudo as the MEM address.

 But I guess you have eyed code-gen changes a bit anyway.

Yeah.  Not much has changed in the before parm_birth area; expected
changes have to do with the pseudo numbering.  IIRC, anything else would
be unexpected.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-04 Thread Alexandre Oliva
On Aug  4, 2015, Richard Biener richard.guent...@gmail.com wrote:

 Though I wonder on whether splitting the patch into a first one with disabling
 coalescing of parms (their default defs(?)) and a followup implementing the
 support for that.

We can't disable coalescing of parms altogether.  With -O0, we must
coalesce all SSA_NAMEs referencing each parm to a single partition.
With optimization, we could coalesce parms in general, just not these
special cases in which the parm is to live in a caller-supplied memory
block.

Now, it's not coalescing parms proper that brought so much risk to the
patch, it is assigning rtl to SSA partitions, and having assign_parms*
use that assignment.  Considering that sometimes a single param
necessarily ends up in more than one partition, requiring two
assignments, and that assign_parms* can't deal with that, I don't see
how to easily disable the cfgexpand logic when it comes to parms, so as
to be able to leave assign_parms alone.

How about, if further problems arise that justify reverting the patch
one more time, I'll look into splitting the patch as you suggested, but
otherwise, I'll save myself the trouble, ok?

 So - is my observation correct that this is only about coalescing of the
 default defs of parameters, not other SSA names based on parameter decls?

It's more like the opposite, i.e., we *refrain* from coalescing other
SSA_NAMEs related with byref params, so that we can easily tell when a
partition references a byref param and whether that partition holds its
default def.  We could have coalesced any other names that ended up in
different partitions, and even the partition holding the default def, if
we had other means to identify partitions with default defs of byref
params.  For example, we could create a bitmap of byref param default
def versions, and then, after partitioning, map those to the partitions
they were assigned to.  In fact, I might do that as a followup.

 Do you think this splitting is feasible and my concern about the
 code-gen issues warranted?

It is feasible but not exactly easy.

As for codegen, I hope to have covered all cases now, but should we find
out I haven't, I'll try the split and see what that gets us.  Did you
have any special cases in mind that it looks like I may have missed?

Thanks,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-08-03 Thread Alexandre Oliva
On Jul 30, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 aoliva/pr64164  is fine on x32.

Thanks.  I have made a large number of changes since you tested it,
fixing all the reported issues and then some.  Now, x86_64-linux-gnu
(-m64 and -m32), i686-pc-linux-gnu, powerpc64-linux-gnu and
powerpc64el-linux-gnu pass regstrap (r226317), and the many tens of
targets I cross-tested still get the same 'make all' errors that the
pristine tree did.

The bulk of the incremental changes had to do with handling splitting
and unsplitting of complex args, and BLKmode types passed by reference.

For the former, I had naively assumed complex args would always be
represented as CONCATs.  I now use read_complex_part to split the
expand-assigned parm rtl into components, and I use it again at unsplit
time to make sure the expand-assigned parm rtl matches that of the split
components.

The latter, in turn, almost required me to give up the entire notion of
coalescing parms.  The problem is that, for arguments passed as a
BLKmode pointer, copying the argument to an expand-assigned stack slot
is not only wasteful, it doesn't really work: we'd expand the copy in
assign_parms* and insert it before the stack allocation performed by
expand_user_vars, so we'd initialize the pseudo holding the address of
the stack slot only after its first use.

The solution I came up with was to detect BLKmode parms and NOT allow
them to coalesce with other variables, so that we can easily detect
partitions that need special handling.  The special handling amounts to
not allocating a stack slot for the partition holding the param default
def, and leaving it for assign_parms to do so.  We do, however, allocate
a MEM, in theory assigned to all partition members (though they're all
the same parm ATM, but not necessarily all SSA_NAMEs of the same parm,
since optimization causes different versions to conflict).  We leave the
address of that MEM unset, so that assign_parm knows it is to fill it in
with a pseudo holding a copy of the incoming parm address, or with the
address of the local stack slot created to hold a copy of the parameter.

It took me several rounds of trial and error to get these to pass all
complex and vector tests on x86 and ppc.  The last remaining failure was
a regression in gcc.target/powerpc/pr16458-4, caused by our inability to
hold SSA_NAMEs as REG_EXPRs in pseudos.  emit_case_decision_tree
attempted to preserve the decl as the REG_EXPR of a pseudo holding a
copy of the switch expr, and its type appears to be used to decide
whether to emit signed or unsigned compares, even though we explicitly
pass mode and unsignedp down to the cmp_and_jump expanders.  I figured
there was no good reason to prevent SSA_NAMEs in REG_EXPRs, just like
MEM_EXPRs, so I went ahead and adjusted the DECL_MODE that prevented it,
and now we expand the case decision tree as intended.

Here's a consolidated patch, followed by the consolidated incremental
patch.  I don't intend to install it this week, even if approved,
because I'm going to be away Aug 5-10, and I would like to be around
should any further problems arise.  So, ok to install when I return?

[PR64164] Drop copyrename, use coalescible partition as base when optimizing.

From: Alexandre Oliva aol...@redhat.com

for  gcc/ChangeLog

PR rtl-optimization/64164
PR bootstrap/66978
PR middle-end/66983
PR rtl-optimization/67000
PR middle-end/67034
PR middle-end/67035
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename.  Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars.  Check register
use and promoted modes to allow coalescing.  Do not coalesce
maybe-byref parms with SSA_NAMEs of other variables, or
anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise.  Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
default defs at the entry point.

Re: [PR64164] drop copyrename, integrate into expand

2015-07-30 Thread H.J. Lu
On Wed, Jul 29, 2015 at 1:23 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Wed, Jul 29, 2015 at 1:13 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 Thanks, both of these are also fixed (I merged your patch for x32, and I
 verified manually that another fix I just wrote fixes all the -m32
 -msse2 regressions) in the git branch aoliva/pr64164, but I'm going to
 investigate a few more issues affecting other targets before I start
 full regression tests all over the build farm ;-)


 I am building x32 on aoliva/pr64164 now.

aoliva/pr64164  is fine on x32.

Thanks.

-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-29 Thread Alexandre Oliva
On Jul 23, 2015, Segher Boessenkool seg...@kernel.crashing.org wrote:

 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).

 Some representative backtraces:

Thanks, both of these are now fixed (at least in that they don't ICE any
more) in the git branch aoliva/pr64164, but I'm going to investigate a
few more issues before I start a regression test on gcc110 and gcc112.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-29 Thread Alexandre Oliva
On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

Thanks, both of these are also fixed (I merged your patch for x32, and I
verified manually that another fix I just wrote fixes all the -m32
-msse2 regressions) in the git branch aoliva/pr64164, but I'm going to
investigate a few more issues affecting other targets before I start
full regression tests all over the build farm ;-)

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-29 Thread H.J. Lu
On Wed, Jul 29, 2015 at 1:13 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 Thanks, both of these are also fixed (I merged your patch for x32, and I
 verified manually that another fix I just wrote fixes all the -m32
 -msse2 regressions) in the git branch aoliva/pr64164, but I'm going to
 investigate a few more issues affecting other targets before I start
 full regression tests all over the build farm ;-)


I am building x32 on aoliva/pr64164 now.

-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-27 Thread Alexandre Oliva
On Jul 24, 2015, David Edelsohn dje@gmail.com wrote:

 On Fri, Jul 24, 2015 at 4:02 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, David Edelsohn dje@gmail.com wrote:
 
 I request that this patch be reverted (again).
 
 Might I kindly ask you to please do so for me.  I've just found out
 that, after yesterday's memory upgrade on my local build machine, the
 filesystem that I normally use for GCC development got corrupted, and I
 don't want to mess with it before running an fsck which will take me a
 while.

 I have reverted the patch.

Thank you very much.  Long story short, the filesystem got corrupted
beyond repair before I realized something was wrong, so I spend my
weekend backing up the bits I still could and recreating it all from
scratch.  *fun* :-/

I even ran memtest before booting up, but everything was fine in the
single-threaded tests it runs by default.  It was only with all cores
actively using memory intensely that something overheated (memory
modules?  chipset?  cpu?  no clue) and started randomly corrupting bits.
So, I'm now back at lower memory clock speeds, and everything appears to
be rock solid again.  Phew!  So, I'm back to debugging the
newly-reported problems and thinking how much further I should extend
testing coverage so that the next round doesn't have to be reverted
again ;-)

Thanks again,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-27 Thread H.J. Lu
On Mon, Jul 27, 2015 at 2:22 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 24, 2015, David Edelsohn dje@gmail.com wrote:

 On Fri, Jul 24, 2015 at 4:02 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, David Edelsohn dje@gmail.com wrote:

 I request that this patch be reverted (again).

 Might I kindly ask you to please do so for me.  I've just found out
 that, after yesterday's memory upgrade on my local build machine, the
 filesystem that I normally use for GCC development got corrupted, and I
 don't want to mess with it before running an fsck which will take me a
 while.

 I have reverted the patch.

 Thank you very much.  Long story short, the filesystem got corrupted
 beyond repair before I realized something was wrong, so I spend my
 weekend backing up the bits I still could and recreating it all from
 scratch.  *fun* :-/

 I even ran memtest before booting up, but everything was fine in the
 single-threaded tests it runs by default.  It was only with all cores
 actively using memory intensely that something overheated (memory
 modules?  chipset?  cpu?  no clue) and started randomly corrupting bits.
 So, I'm now back at lower memory clock speeds, and everything appears to
 be rock solid again.  Phew!  So, I'm back to debugging the

The exactly same thing happened to my machine.  It took me
several weeks before I lowered memory clock.  My machine has
been running fine for over a year under very heavy load.

BTW, this is what I use to test ia32 on Intel64:

PATH=/usr/local32/bin:/bin:/usr/bin
.../configure --prefix=/usr/6.0.0 --enable-clocale=gnu
--with-system-zlib --enable-shared --with-demangler-in-ld
--enable-libmpx i686-linux --with-fpmath=sse
--enable-languages=c,c++,fortran,java,lto,objc

where /usr/local32/bin has ia32 binutils.

-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-25 Thread Richard Biener
On July 24, 2015 10:47:37 PM GMT+02:00, H.J. Lu hjl.to...@gmail.com wrote:
On Fri, Jul 24, 2015 at 1:36 PM, Alexandre Oliva aol...@redhat.com
wrote:
 On Jul 24, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva
aol...@redhat.com wrote:
 On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 My test logs with the patch have:

 PASS: c-c++-common/dfp/func-vararg-dfp.c execution test

 so I think this one is not caused by the PR64164 patch.

 I double checked.  r226113 failed and r226112 passed.

 Weird.  Here's the asm I got, after re-running the command used to
 compiled this test with -save-temps.  I checked that the produced
 executable is an ELF32 executable, and that it completes execution
 successfully.

 How does it compare to yours?

Please add -msse2:

Yes, the fails appear with -m32 multilib testing on x86_64.

Richard.

[hjl@gnu-ivb-1 gcc]$
/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
-march=i686  -msse2
[hjl@gnu-ivb-1 gcc]$ ./a.out
Segmentation fault (core dumped)
[hjl@gnu-ivb-1 gcc]$
/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
-march=i686
[hjl@gnu-ivb-1 gcc]$ ./a.out
[hjl@gnu-ivb-1 gcc]$




Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread Richard Biener
On Fri, Jul 24, 2015 at 1:19 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 4:14 PM, David Edelsohn dje@gmail.com wrote:
 On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).


 It also caused:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978


 and maybe:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 I request that this patch be reverted (again).

 And I request to test any new patches under x32 before checking in.
 You can use Ubuntu 14 to test x32.

x32 is neither primary nor secondary arch.

Richard.

 Thanks.

 --
 H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread Alexandre Oliva
On Jul 24, 2015, David Edelsohn dje@gmail.com wrote:

 Did you commit the final, complete version of the patches?

Yes, I have double-checked that reverting the patch I posted on the
r225979 tree I used for testing reverts it to the base revision, except
for unrelated libgfortran configury changes that I used to avoid halting
configure in cross builds, to increase the amount of code build during
the cross testing.

 Did you test the version of the patches that you committed?

Yes, I have just double-checked that the version I tested is identical
to the change introduced by commit 226113, except for the ChangeLogs and
the libgfortran configury change.

It could be that some other change between 225979 and 226112, combined
with 226113, caused the regression.  Or it could be that my
i686-pc-linux-gnu native testing on x86_64-pc-linux-gnu is somehow
faulty.  I'll look into it and try to find out what the source of the
difference in test results H.J. and I get could be.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread Alexandre Oliva
On Jul 24, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:
 
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983
 
 My test logs with the patch have:
 
 PASS: c-c++-common/dfp/func-vararg-dfp.c execution test
 
 so I think this one is not caused by the PR64164 patch.

 I double checked.  r226113 failed and r226112 passed.

Weird.  Here's the asm I got, after re-running the command used to
compiled this test with -save-temps.  I checked that the produced
executable is an ELF32 executable, and that it completes execution
successfully.

How does it compare to yours?



func-vararg-dfp.s.gz
Description: application/gzip

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread H.J. Lu
On Fri, Jul 24, 2015 at 1:36 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 24, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 My test logs with the patch have:

 PASS: c-c++-common/dfp/func-vararg-dfp.c execution test

 so I think this one is not caused by the PR64164 patch.

 I double checked.  r226113 failed and r226112 passed.

 Weird.  Here's the asm I got, after re-running the command used to
 compiled this test with -save-temps.  I checked that the produced
 executable is an ELF32 executable, and that it completes execution
 successfully.

 How does it compare to yours?

Please add -msse2:

[hjl@gnu-ivb-1 gcc]$
/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
-march=i686  -msse2
[hjl@gnu-ivb-1 gcc]$ ./a.out
Segmentation fault (core dumped)
[hjl@gnu-ivb-1 gcc]$
/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/xgcc
-B/export/gnu/import/git/gcc-regression/master/226113/bld/gcc/
/export/gnu/import/git/gcc-regression/gcc/gcc/testsuite/c-c++-common/dfp/func-vararg-dfp.c
-m32 -fno-diagnostics-show-caret -fdiagnostics-color=never -std=gnu99
-march=i686
[hjl@gnu-ivb-1 gcc]$ ./a.out
[hjl@gnu-ivb-1 gcc]$


-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread Alexandre Oliva
On Jul 23, 2015, David Edelsohn dje@gmail.com wrote:

 I request that this patch be reverted (again).

Might I kindly ask you to please do so for me.  I've just found out
that, after yesterday's memory upgrade on my local build machine, the
filesystem that I normally use for GCC development got corrupted, and I
don't want to mess with it before running an fsck which will take me a
while.

Apologies for the breakage.  I'll add ppc64-linux-gnu regression testing
to the test list for this patchset.  Thanks,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread Alexandre Oliva
On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

My test logs with the patch have:

PASS: c-c++-common/dfp/func-vararg-dfp.c execution test

so I think this one is not caused by the PR64164 patch.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread H.J. Lu
On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 My test logs with the patch have:

 PASS: c-c++-common/dfp/func-vararg-dfp.c execution test

 so I think this one is not caused by the PR64164 patch.

I double checked.  r226113 failed and r226112 passed.

-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread David Edelsohn
On Fri, Jul 24, 2015 at 3:10 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Fri, Jul 24, 2015 at 11:43 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, H.J. Lu hjl.to...@gmail.com wrote:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 My test logs with the patch have:

 PASS: c-c++-common/dfp/func-vararg-dfp.c execution test

 so I think this one is not caused by the PR64164 patch.

 I double checked.  r226113 failed and r226112 passed.

Alexandre,

Did you commit the final, complete version of the patches?

Did you test the version of the patches that you committed?

Thanks, David


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread Alexandre Oliva
On Jul 23, 2015, Segher Boessenkool seg...@kernel.crashing.org wrote:

 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).

Yeah, the backtrace suggests very strongly that it's my patch.
Apologies for the breakage.  I'm looking into it right now.

 I have the full testsuite logs if you want them.

Preprocessed testcases would probably help more than the logs, but at
least one of the testcases you mentioned doesn't require any libraries,
so I'm going to start with them.  Then, I'll find some ppc64-linux-gnu
box in the build farm and run some bootstrap and regression testing
there.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread David Edelsohn
On Fri, Jul 24, 2015 at 4:02 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 23, 2015, David Edelsohn dje@gmail.com wrote:

 I request that this patch be reverted (again).

 Might I kindly ask you to please do so for me.  I've just found out
 that, after yesterday's memory upgrade on my local build machine, the
 filesystem that I normally use for GCC development got corrupted, and I
 don't want to mess with it before running an fsck which will take me a
 while.

I have reverted the patch.

- David


Re: [PR64164] drop copyrename, integrate into expand

2015-07-24 Thread H.J. Lu
On Fri, Jul 24, 2015 at 2:22 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Fri, Jul 24, 2015 at 1:19 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 4:14 PM, David Edelsohn dje@gmail.com wrote:
 On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).


 It also caused:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978


 and maybe:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 I request that this patch be reverted (again).

 And I request to test any new patches under x32 before checking in.
 You can use Ubuntu 14 to test x32.

 x32 is neither primary nor secondary arch.


I suggested a way to reproduce the problem.  I checked in this testcase so
that the problem will show up on Linux/x86-64.

-- 
H.J.
---
Index: ChangeLog
===
--- ChangeLog (revision 226149)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2015-07-24  H.J. Lu  hongjiu...@intel.com
+
+ PR bootstrap/66978
+ * gcc.target/i386/pr66978.c: New test.
+
 2015-07-24  Andreas Krebbel  kreb...@linux.vnet.ibm.com

  * gcc.target/s390/gpr2fprsavecfi.c: New test.
Index: gcc.target/i386/pr66978.c
===
--- gcc.target/i386/pr66978.c (revision 0)
+++ gcc.target/i386/pr66978.c (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-require-effective-target maybe_x32 } */
+/* { dg-options -O2 -mx32 -maddress-mode=short } */
+
+extern int foo (int *);
+int
+bar (int *p)
+{
+  __attribute__ ((noinline, noclone))
+  int hack_digit (void)
+{
+  return foo (p);
+}
+  return hack_digit ();
+}


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread David Edelsohn
On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).


 It also caused:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978


 and maybe:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

I request that this patch be reverted (again).

Thanks, David


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread H.J. Lu
On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
seg...@kernel.crashing.org wrote:
 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).


It also caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978

-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread H.J. Lu
On Thu, Jul 23, 2015 at 4:14 PM, David Edelsohn dje@gmail.com wrote:
 On Thu, Jul 23, 2015 at 5:59 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).


 It also caused:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978


 and maybe:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

 I request that this patch be reverted (again).

And I request to test any new patches under x32 before checking in.
You can use Ubuntu 14 to test x32.

Thanks.

-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread Segher Boessenkool
On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

Unfortunately it causes about a thousand test fails on powerpc64-linux
(at least, it seems to be this patch, I haven't actually checked).

Some representative backtraces:


/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c: In 
function 'f1':
/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr54713-1.c:13:1: 
internal compiler error: in expand_one_stack_var_1, at cfgexpand.c:1221
0x1030eae7 expand_one_stack_var_1
/home/segher/src/gcc/gcc/cfgexpand.c:1221
0x10320a23 expand_one_ssa_partition
/home/segher/src/gcc/gcc/cfgexpand.c:1295
0x10320a23 expand_used_vars
/home/segher/src/gcc/gcc/cfgexpand.c:1940
0x10322ea3 execute
/home/segher/src/gcc/gcc/cfgexpand.c:6084


/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr39928-1.c: In 
function 'vq_nbest':
/home/segher/src/gcc/gcc/testsuite/gcc.c-torture/compile/pr39928-1.c:6:1: 
internal compiler error: in emit_move_insn, at expr.c:3552
0x1046f587 emit_move_insn(rtx_def*, rtx_def*)
/home/segher/src/gcc/gcc/expr.c:3551
0x104daa67 assign_parm_setup_reg
/home/segher/src/gcc/gcc/function.c:3322
0x104dd063 assign_parms
/home/segher/src/gcc/gcc/function.c:3766
0x104e0aa7 expand_function_start(tree_node*)
/home/segher/src/gcc/gcc/function.c:5192
0x10322f07 execute
/home/segher/src/gcc/gcc/cfgexpand.c:6105


I have the full testsuite logs if you want them.


Segher


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread H.J. Lu
On Thu, Jul 23, 2015 at 1:57 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Thu, Jul 23, 2015 at 1:31 PM, Segher Boessenkool
 seg...@kernel.crashing.org wrote:
 On Thu, Jul 23, 2015 at 12:29:14PM -0300, Alexandre Oliva wrote:
 Yeah.  Thanks, I've tested it with this change, and I'm now checking
 this in (full patch first; adjusted incremental patch at the end):

 Unfortunately it causes about a thousand test fails on powerpc64-linux
 (at least, it seems to be this patch, I haven't actually checked).


 It also caused:

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66978


and maybe:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66983

-- 
H.J.


Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread Alexandre Oliva
On Jul 23, 2015, Richard Biener richard.guent...@gmail.com wrote:

 Hmm, ok.  Does using

if (currently_expanding_to_rtl)

 work?  I think it's slightly more descriptive.

Yeah.  Thanks, I've tested it with this change, and I'm now checking
this in (full patch first; adjusted incremental patch at the end):

[PR64164] Drop copyrename, use coalescible partition as base when optimizing.

From: Alexandre Oliva aol...@redhat.com

for  gcc/ChangeLog

PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename.  Add
-ftree-coalesce-vars.
* passes.def: Drop all occurrences of pass_rename_ssa_copies.
* common.opt (ftree-copyrename): Ignore.
(ftree-coalesce-inlined-vars): Likewise.
* doc/invoke.texi: Remove the ignored options above.
* gimple-expr.h (gimple_can_coalesce_p): Move declaration
* tree-ssa-coalesce.h: ... here.
* tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
headers required by it.
* gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
across variables when flag_tree_coalesce_vars.  Check register
use and promoted modes to allow coalescing.  Moved to
tree-ssa-coalesce.c.
* tree-ssa-live.c (struct tree_int_map_hasher): Move along
with its member functions to tree-ssa-coalesce.c.
(var_map_base_init): Likewise.  Renamed to
compute_samebase_partition_bases.
(partition_view_normal): Drop want_bases parameter.
(partition_view_bitmap): Likewise.
* tree-ssa-live.h: Adjust declarations.
* tree-ssa-coalesce.c: Include explow.h.
(build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
default defs at the entry point.
(dump_part_var_map): New.
(compute_optimized_partition_bases): New, called by...
(coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
of compute_samebase_partition_bases.  Adjust.
* alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
* cfgexpand.c (leader_merge): New.
(get_rtl_for_parm_ssa_default_def): New.
(set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
(expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
redundant MEM attr setting.
(expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
from...
(expand_one_stack_var): ... this.  New wrapper to check and
skip already expanded SSA partitions.
(record_alignment_for_reg_var): New, factored out of...
(expand_one_var): ... this.
(expand_one_ssa_partition): New.
(adjust_one_expanded_partition_var): New.
(expand_one_register_var): Check and skip already expanded SSA
partitions.
(expand_used_vars): Don't create DECLs for anonymous SSA
names.  Expand all SSA partitions, then adjust all SSA names.
(pass::execute): Replace the loops that set
SA.partition_to_pseudo from partition leaders and cleared
DECL_RTL for multi-location variables, and that which used to
rename vars and set attrs, with one that clears DECL_RTL and
checks that PARMs and RESULTs default_defs match DECL_RTL.
* cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
* emit-rtl.c (set_reg_attrs_for_parm): Handle NULL decl.
* explow.c (promote_ssa_mode): New.
* explow.h (promote_ssa_mode): Declare.
* expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
* function.c: Include cfgexpand.h.
(use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
(use_register_for_parm_decl): Wrapper for the above to
special-case the result_ptr.
(rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
(split_complex_args): Take assign_parm_data_all argument.
Pass it to rtl_for_parm.  Set up rtl and context for split
args.
(assign_parms_augmented_arg_list): Adjust.
(maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
multiple locations.  Recognize split complex args.
(assign_parm_adjust_stack_rtl): Add all and parm arguments,
for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
(assign_parm_setup_block): Prefer SSA-assigned location.
(assign_parm_setup_reg): Likewise.  Use entry_parm for equiv
if stack_parm is NULL.
(assign_parm_setup_stack): Prefer SSA-assigned location.
(assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
rtl before testing for pointer bounds.  Special-case result_ptr.
(expand_function_start): Maybe reset DECL_RTL of result.
Prefer SSA-assigned location for result and static chain.
Factor out 

Re: [PR64164] drop copyrename, integrate into expand

2015-07-23 Thread Richard Biener
On Wed, Jul 22, 2015 at 7:33 PM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 21, 2015, Richard Biener richard.guent...@gmail.com wrote:

 On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva aol...@redhat.com wrote:
 + if (cfun-gimple_df)

 If the cfun-gimple_df check is to decide whether this is a call or a 
 function
 then no, this can't work reliably.  What is this test for else?

 It turns out it's not call or function, as I thought at first, but
 gimplifying or expanding the function.  split_complex_args is not used
 for calls.  So the above might actually work (minus the misleading
 comments I wrote), and I think it's cleaner than adding a bool
 expanding_p arg to split_complex_args and
 assign_parms_augmented_arg_list, called from gimplify_parameters (during
 gimplification of a function) and assign_parms (during its expansion).
 Do you agree, or would you prefer the explicit argument?

Hmm, ok.  Does using

   if (currently_expanding_to_rtl)

work?  I think it's slightly more descriptive.

Ok with that change.

Thanks,
Richard.

 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-22 Thread Alexandre Oliva
On Jul 21, 2015, Richard Biener richard.guent...@gmail.com wrote:

 On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 16, 2015, Alexandre Oliva aol...@redhat.com wrote:
 + /* If we are assigning parameters for a function, rather
 +than for a call, propagate the RTL of the complex parm to
 +the split declarations, and set their contexts so that
 +maybe_reset_rtl_for_parm can recognize them and refrain
 +from resetting their RTL.  */
 + if (cfun-gimple_df)

 If the cfun-gimple_df check is to decide whether this is a call or a function
 then no, this can't work reliably.  What is this test for else?

That was the reason: call or function.

 You pass another argument to split_complex_arg, so why not pass in a bool
 on whether we split it for this or the other case?

There's only one call to split_complex_args.  I'll try to figure out
where the paths converge and see if it's reasonable to pass an argument
all the way to tell the two cases apart.

Thanks for the suggestion,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-22 Thread Alexandre Oliva
On Jul 21, 2015, Richard Biener richard.guent...@gmail.com wrote:

 On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva aol...@redhat.com wrote:
 + if (cfun-gimple_df)

 If the cfun-gimple_df check is to decide whether this is a call or a function
 then no, this can't work reliably.  What is this test for else?

It turns out it's not call or function, as I thought at first, but
gimplifying or expanding the function.  split_complex_args is not used
for calls.  So the above might actually work (minus the misleading
comments I wrote), and I think it's cleaner than adding a bool
expanding_p arg to split_complex_args and
assign_parms_augmented_arg_list, called from gimplify_parameters (during
gimplification of a function) and assign_parms (during its expansion).
Do you agree, or would you prefer the explicit argument?

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-07-21 Thread Richard Biener
On Sat, Jul 18, 2015 at 9:37 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jul 16, 2015, Alexandre Oliva aol...@redhat.com wrote:

 So, I decided to run a ppc64le-linux-gnu bootstrap, just in case, and
 there are issues with split complex parms that caused go and fortran
 libs to fail the build.

 This incremental patch, along with the previously-posted patches, fix
 split complex args handling with preassigned args RTL, and enables
 ppc64le-linux-gnu bootstrap to succeed.

 I'm not particularly happy with the abuse of DECL_CONTEXT to recognize
 split complex args and leave their RTL alone, but that was the best that
 occurred to me.  Any other suggestions?

 Is the combined patch ok, assuming further (re)testing of embedded
 targets passes?

 for  gcc/ChangeLog (to be integrated with the approved patches)

 * function.c (split_complex_args): Take assign_parm_data_all
 argument.  Pass it to rtl_for_parm.  Set up rtl and context
 for split args.
 (assign_parms_augmented_arg_list): Adjust.
 (maybe_reset_rtl_for_parm): Recognize split complex args.
 * stor-layout.c (layout_decl): Don't set mem attributes of
 non-MEMs.
 ---
  gcc/function.c|   39 +--
  gcc/stor-layout.c |3 ++-
  2 files changed, 39 insertions(+), 3 deletions(-)

 diff --git a/gcc/function.c b/gcc/function.c
 index 753d889..6fba001 100644
 --- a/gcc/function.c
 +++ b/gcc/function.c
 @@ -151,6 +151,8 @@ static bool contains (const_rtx, 
 hash_tableinsn_cache_hasher *);
  static void prepare_function_start (void);
  static void do_clobber_return_reg (rtx, void *);
  static void do_use_return_reg (rtx, void *);
 +static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
 +

  /* Stack of nested functions.  */
  /* Keep track of the cfun stack.  */
 @@ -2267,7 +2269,7 @@ assign_parms_initialize_all (struct 
 assign_parm_data_all *all)
 needed, else the old list.  */

  static void
 -split_complex_args (vectree *args)
 +split_complex_args (struct assign_parm_data_all *all, vectree *args)
  {
unsigned i;
tree p;
 @@ -2278,6 +2280,7 @@ split_complex_args (vectree *args)
if (TREE_CODE (type) == COMPLEX_TYPE
targetm.calls.split_complex_arg (type))
 {
 + tree cparm = p;
   tree decl;
   tree subtype = TREE_TYPE (type);
   bool addressable = TREE_ADDRESSABLE (p);
 @@ -2296,6 +2299,9 @@ split_complex_args (vectree *args)
   DECL_ARTIFICIAL (p) = addressable;
   DECL_IGNORED_P (p) = addressable;
   TREE_ADDRESSABLE (p) = 0;
 + /* Reset the RTL before layout_decl, or it may change the
 +mode of the RTL of the original argument copied to P.  */
 + SET_DECL_RTL (p, NULL_RTX);
   layout_decl (p, 0);
   (*args)[i] = p;

 @@ -2307,6 +2313,25 @@ split_complex_args (vectree *args)
   DECL_IGNORED_P (decl) = addressable;
   layout_decl (decl, 0);
   args-safe_insert (++i, decl);
 +
 + /* If we are assigning parameters for a function, rather
 +than for a call, propagate the RTL of the complex parm to
 +the split declarations, and set their contexts so that
 +maybe_reset_rtl_for_parm can recognize them and refrain
 +from resetting their RTL.  */
 + if (cfun-gimple_df)

If the cfun-gimple_df check is to decide whether this is a call or a function
then no, this can't work reliably.  What is this test for else?

You pass another argument to split_complex_arg, so why not pass in a bool
on whether we split it for this or the other case?

Richard.

 +   {
 + rtx rtl = rtl_for_parm (all, cparm);
 + gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
 + if (rtl)
 +   {
 + SET_DECL_RTL (p, XEXP (rtl, 0));
 + SET_DECL_RTL (decl, XEXP (rtl, 1));
 +
 + DECL_CONTEXT (p) = cparm;
 + DECL_CONTEXT (decl) = cparm;
 +   }
 +   }
 }
  }
  }
 @@ -2369,7 +2394,7 @@ assign_parms_augmented_arg_list (struct 
 assign_parm_data_all *all)

/* If the target wants to split complex arguments into scalars, do so.  */
if (targetm.calls.split_complex_arg)
 -split_complex_args (fnargs);
 +split_complex_args (all, fnargs);

return fnargs;
  }
 @@ -2823,6 +2848,16 @@ maybe_reset_rtl_for_parm (tree parm)
  {
gcc_assert (TREE_CODE (parm) == PARM_DECL
   || TREE_CODE (parm) == RESULT_DECL);
 +
 +  /* This is a split complex parameter, and its context was set to its
 + original PARM_DECL in split_complex_args so that we could
 + recognize it here and not reset its RTL.  */
 +  if (DECL_CONTEXT (parm)  TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
 +{
 +  DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
 +  return;
 +}
 +
if ((flag_tree_coalesce_vars
 || 

Re: [PR64164] drop copyrename, integrate into expand

2015-07-18 Thread Alexandre Oliva
On Jul 16, 2015, Alexandre Oliva aol...@redhat.com wrote:

 So, I decided to run a ppc64le-linux-gnu bootstrap, just in case, and
 there are issues with split complex parms that caused go and fortran
 libs to fail the build.

This incremental patch, along with the previously-posted patches, fix
split complex args handling with preassigned args RTL, and enables
ppc64le-linux-gnu bootstrap to succeed.

I'm not particularly happy with the abuse of DECL_CONTEXT to recognize
split complex args and leave their RTL alone, but that was the best that
occurred to me.  Any other suggestions?

Is the combined patch ok, assuming further (re)testing of embedded
targets passes?

for  gcc/ChangeLog (to be integrated with the approved patches)

* function.c (split_complex_args): Take assign_parm_data_all
argument.  Pass it to rtl_for_parm.  Set up rtl and context
for split args.
(assign_parms_augmented_arg_list): Adjust.
(maybe_reset_rtl_for_parm): Recognize split complex args.
* stor-layout.c (layout_decl): Don't set mem attributes of
non-MEMs.
---
 gcc/function.c|   39 +--
 gcc/stor-layout.c |3 ++-
 2 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/gcc/function.c b/gcc/function.c
index 753d889..6fba001 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -151,6 +151,8 @@ static bool contains (const_rtx, 
hash_tableinsn_cache_hasher *);
 static void prepare_function_start (void);
 static void do_clobber_return_reg (rtx, void *);
 static void do_use_return_reg (rtx, void *);
+static rtx rtl_for_parm (struct assign_parm_data_all *, tree);
+
 
 /* Stack of nested functions.  */
 /* Keep track of the cfun stack.  */
@@ -2267,7 +2269,7 @@ assign_parms_initialize_all (struct assign_parm_data_all 
*all)
needed, else the old list.  */
 
 static void
-split_complex_args (vectree *args)
+split_complex_args (struct assign_parm_data_all *all, vectree *args)
 {
   unsigned i;
   tree p;
@@ -2278,6 +2280,7 @@ split_complex_args (vectree *args)
   if (TREE_CODE (type) == COMPLEX_TYPE
   targetm.calls.split_complex_arg (type))
{
+ tree cparm = p;
  tree decl;
  tree subtype = TREE_TYPE (type);
  bool addressable = TREE_ADDRESSABLE (p);
@@ -2296,6 +2299,9 @@ split_complex_args (vectree *args)
  DECL_ARTIFICIAL (p) = addressable;
  DECL_IGNORED_P (p) = addressable;
  TREE_ADDRESSABLE (p) = 0;
+ /* Reset the RTL before layout_decl, or it may change the
+mode of the RTL of the original argument copied to P.  */
+ SET_DECL_RTL (p, NULL_RTX);
  layout_decl (p, 0);
  (*args)[i] = p;
 
@@ -2307,6 +2313,25 @@ split_complex_args (vectree *args)
  DECL_IGNORED_P (decl) = addressable;
  layout_decl (decl, 0);
  args-safe_insert (++i, decl);
+
+ /* If we are assigning parameters for a function, rather
+than for a call, propagate the RTL of the complex parm to
+the split declarations, and set their contexts so that
+maybe_reset_rtl_for_parm can recognize them and refrain
+from resetting their RTL.  */
+ if (cfun-gimple_df)
+   {
+ rtx rtl = rtl_for_parm (all, cparm);
+ gcc_assert (!rtl || GET_CODE (rtl) == CONCAT);
+ if (rtl)
+   {
+ SET_DECL_RTL (p, XEXP (rtl, 0));
+ SET_DECL_RTL (decl, XEXP (rtl, 1));
+
+ DECL_CONTEXT (p) = cparm;
+ DECL_CONTEXT (decl) = cparm;
+   }
+   }
}
 }
 }
@@ -2369,7 +2394,7 @@ assign_parms_augmented_arg_list (struct 
assign_parm_data_all *all)
 
   /* If the target wants to split complex arguments into scalars, do so.  */
   if (targetm.calls.split_complex_arg)
-split_complex_args (fnargs);
+split_complex_args (all, fnargs);
 
   return fnargs;
 }
@@ -2823,6 +2848,16 @@ maybe_reset_rtl_for_parm (tree parm)
 {
   gcc_assert (TREE_CODE (parm) == PARM_DECL
  || TREE_CODE (parm) == RESULT_DECL);
+
+  /* This is a split complex parameter, and its context was set to its
+ original PARM_DECL in split_complex_args so that we could
+ recognize it here and not reset its RTL.  */
+  if (DECL_CONTEXT (parm)  TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL)
+{
+  DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm));
+  return;
+}
+
   if ((flag_tree_coalesce_vars
|| (DECL_RTL_SET_P (parm)  DECL_RTL (parm) == pc_rtx))
is_gimple_reg (parm))
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 0d4f4a4..288227a 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -794,7 +794,8 @@ layout_decl (tree decl, unsigned int known_align)
 {
   PUT_MODE (rtl, DECL_MODE (decl));
   SET_DECL_RTL (decl, 0);
-  set_mem_attributes (rtl, decl, 1);
+  if (MEM_P (rtl))
+   set_mem_attributes 

Re: [PR64164] drop copyrename, integrate into expand

2015-07-16 Thread Alexandre Oliva
On Jun 10, 2015, Richard Biener richard.guent...@gmail.com wrote:

 On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva aol...@redhat.com wrote:
 This caused the sparc regression reported by Eric in
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37

 We need to match the mode of the rtl created for the partition and the
 promoted mode expected for the parm.  I recall working to make parm and
 result decls the partition leaders, so that promote_ssa_mode would DTRT,
 but this escaped my mind when revisiting the patch after some time on
 another project.

FWIW, during the development of this improvement, I dropped the notion
of making parm and result decls partition leaders, and instead only
considered eligible for coalescing into the same partition SSA_NAMEs
that promoted to the same mode.

 Alternatively not coalesce SSA names when promote_decl_mode gives
 different answers (for their underlying decl)?  It sounds wrong to do that
 (if that is really what happens).

Exactly.  I've now restored the promote_decl_mode behavior to
promote_ssa_mode for PARM_ and RESULT_DECLs, so that the strategy
described above works again.  This fixed the sparc regression.

On Jun  9, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Jun  9, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Jun  9, 2015, David Edelsohn dje@gmail.com wrote:
 This also broke bootstrap on PPC64 LE Linux with the same error.

 Thanks for your reports.  I'm looking into the problem.

 I'd appreciate a preprocessed testcase from either of you to confirm the
 fix, if not to help debug it.

 The first potential source for this problem that jumped at me would be
 silenced with this change:

 diff --git a/gcc/function.c b/gcc/function.c
 index 8bcc352..9201ed9 100644
 --- a/gcc/function.c
 +++ b/gcc/function.c
 @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all 
 *all,
   stack_parm = copy_rtx (stack_parm);
if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
   PUT_MODE (stack_parm, GET_MODE (entry_parm));
 -  set_mem_attributes (stack_parm, parm, 1);
 +  if (GET_CODE (stack_parm) == MEM)
 + set_mem_attributes (stack_parm, parm, 1);
  }
 
/* If a BLKmode arrives in registers, copy it to a stack slot.  Handle

I ended up fixing this in a slightly different way, running the original
code above, from assign_stack_local to set_mem_attributes, only when
rtl_for_parm does not obtain an assignment set up by out-of-ssa.

 but I suspect there might be other similar issues lurking in function.c
 after my attempt to turn parm assignment upside down ;-)

There weren't, after all.

On Jun  9, 2015, David Edelsohn dje@gmail.com wrote:

 This patch clearly should have been tested on more
 architectures than x86 before being approved and merged.

The following patch was regstrapped on x86_64-linux-gnu and
i686-pc-linux-gnu.  I've also cross-built all-target successfully for
targets aarch64-elf, arm-eabi, arm-symbianelf, avr-elf, bfin-elf,
cr16-elf, cris-elf, crisv32-elf, epiphany-elf, fido-elf, fr30-elf,
frv-elf, i686-elf, lm32-elf, m68k-elf, mcore-elf, microblaze-elf,
mips64el-elf, mips64-elf, mips64orion-elf, mipsel-elf,
mipsisa32-elfoabi, mipsisa64-elfoabi, mipsisa64r2el-elf,
mipsisa64r2-sde-elf, mipsisa64sb1-elf, mipstx39-elf, mn10300-elf,
moxie-elf, nds32be-elf, nds32le-elf, nios2-elf, powerpc-eabialtivec,
powerpc-eabisimaltivec, powerpc-eabisim, powerpc-eabispe, powerpc-eabi,
powerpcle-eabisim, powerpcle-eabi, powerpcle-elf, ppc-eabi, ppc-elf,
rx-elf, sh-elf, sh-superh-elf, sparc64-elf, sparc-elf, spu-elf, and
visium-elf, and got the same build failures before and after the patch
with targets c6x-elf, ft32-elf, h8300-elf, ia64-elf, iq2000-elf,
m32c-elf, m32r-elf, m32rle-elf, mep-elf, mips64vr-elf
(mips64vr-elf/mips16/newlib/libm/math/lib_a_e_hypot.o failed to build
with the patch and passed without it, but there were other invalid
operand failures for lwu insns without the patch, so I'm counting the
e_hypot failure as present but latent before), mipsisa64sr71k-elf,
msp430-elf, pdp11-aout, powerpc-xilinx-eabi, ppc64-eabi, rl78-elf,
sh64-elf, sparc-leon-elf, v850e-elf, v850-elf, xstormy16-elf, and
xtensa-elf.

This patch differs from the previous one in that I dropped the hunk I
had put in loop_exits_before_overflow, already noticed and fixed
independently (PR66638); I updated tree_int_map_hasher, that was updated
in the trunk in tree-ssa-live.c, but that the patch moved to
tree-ssa-coalesce.c; I resolved other conflicts in files that had
#includes added by the patch and by other changes; and I put in the two
fixes mentioned above.  After the full updated patch, I enclose a diff
with these two additional fixes, to ease the review.

Is this ok to install?


for  gcc/ChangeLog

PR rtl-optimization/64164
* Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
* tree-ssa-copyrename.c: Removed.
* opts.c (default_options_table): Drop -ftree-copyrename.  Add

Re: [PR64164] drop copyrename, integrate into expand

2015-07-16 Thread Richard Biener
On Thu, Jul 16, 2015 at 9:29 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jun 10, 2015, Richard Biener richard.guent...@gmail.com wrote:

 On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva aol...@redhat.com wrote:
 This caused the sparc regression reported by Eric in
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37

 We need to match the mode of the rtl created for the partition and the
 promoted mode expected for the parm.  I recall working to make parm and
 result decls the partition leaders, so that promote_ssa_mode would DTRT,
 but this escaped my mind when revisiting the patch after some time on
 another project.

 FWIW, during the development of this improvement, I dropped the notion
 of making parm and result decls partition leaders, and instead only
 considered eligible for coalescing into the same partition SSA_NAMEs
 that promoted to the same mode.

 Alternatively not coalesce SSA names when promote_decl_mode gives
 different answers (for their underlying decl)?  It sounds wrong to do that
 (if that is really what happens).

 Exactly.  I've now restored the promote_decl_mode behavior to
 promote_ssa_mode for PARM_ and RESULT_DECLs, so that the strategy
 described above works again.  This fixed the sparc regression.

 On Jun  9, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Jun  9, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Jun  9, 2015, David Edelsohn dje@gmail.com wrote:
 This also broke bootstrap on PPC64 LE Linux with the same error.

 Thanks for your reports.  I'm looking into the problem.

 I'd appreciate a preprocessed testcase from either of you to confirm the
 fix, if not to help debug it.

 The first potential source for this problem that jumped at me would be
 silenced with this change:

 diff --git a/gcc/function.c b/gcc/function.c
 index 8bcc352..9201ed9 100644
 --- a/gcc/function.c
 +++ b/gcc/function.c
 @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all 
 *all,
   stack_parm = copy_rtx (stack_parm);
if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
   PUT_MODE (stack_parm, GET_MODE (entry_parm));
 -  set_mem_attributes (stack_parm, parm, 1);
 +  if (GET_CODE (stack_parm) == MEM)
 + set_mem_attributes (stack_parm, parm, 1);
  }

/* If a BLKmode arrives in registers, copy it to a stack slot.  Handle

 I ended up fixing this in a slightly different way, running the original
 code above, from assign_stack_local to set_mem_attributes, only when
 rtl_for_parm does not obtain an assignment set up by out-of-ssa.

 but I suspect there might be other similar issues lurking in function.c
 after my attempt to turn parm assignment upside down ;-)

 There weren't, after all.

 On Jun  9, 2015, David Edelsohn dje@gmail.com wrote:

 This patch clearly should have been tested on more
 architectures than x86 before being approved and merged.

 The following patch was regstrapped on x86_64-linux-gnu and
 i686-pc-linux-gnu.  I've also cross-built all-target successfully for
 targets aarch64-elf, arm-eabi, arm-symbianelf, avr-elf, bfin-elf,
 cr16-elf, cris-elf, crisv32-elf, epiphany-elf, fido-elf, fr30-elf,
 frv-elf, i686-elf, lm32-elf, m68k-elf, mcore-elf, microblaze-elf,
 mips64el-elf, mips64-elf, mips64orion-elf, mipsel-elf,
 mipsisa32-elfoabi, mipsisa64-elfoabi, mipsisa64r2el-elf,
 mipsisa64r2-sde-elf, mipsisa64sb1-elf, mipstx39-elf, mn10300-elf,
 moxie-elf, nds32be-elf, nds32le-elf, nios2-elf, powerpc-eabialtivec,
 powerpc-eabisimaltivec, powerpc-eabisim, powerpc-eabispe, powerpc-eabi,
 powerpcle-eabisim, powerpcle-eabi, powerpcle-elf, ppc-eabi, ppc-elf,
 rx-elf, sh-elf, sh-superh-elf, sparc64-elf, sparc-elf, spu-elf, and
 visium-elf, and got the same build failures before and after the patch
 with targets c6x-elf, ft32-elf, h8300-elf, ia64-elf, iq2000-elf,
 m32c-elf, m32r-elf, m32rle-elf, mep-elf, mips64vr-elf
 (mips64vr-elf/mips16/newlib/libm/math/lib_a_e_hypot.o failed to build
 with the patch and passed without it, but there were other invalid
 operand failures for lwu insns without the patch, so I'm counting the
 e_hypot failure as present but latent before), mipsisa64sr71k-elf,
 msp430-elf, pdp11-aout, powerpc-xilinx-eabi, ppc64-eabi, rl78-elf,
 sh64-elf, sparc-leon-elf, v850e-elf, v850-elf, xstormy16-elf, and
 xtensa-elf.

 This patch differs from the previous one in that I dropped the hunk I
 had put in loop_exits_before_overflow, already noticed and fixed
 independently (PR66638); I updated tree_int_map_hasher, that was updated
 in the trunk in tree-ssa-live.c, but that the patch moved to
 tree-ssa-coalesce.c; I resolved other conflicts in files that had
 #includes added by the patch and by other changes; and I put in the two
 fixes mentioned above.  After the full updated patch, I enclose a diff
 with these two additional fixes, to ease the review.

 Is this ok to install?

Yes.

Thanks again for taking care of this!

Richard.


 for  gcc/ChangeLog

 PR rtl-optimization/64164
 * 

Re: [PR64164] drop copyrename, integrate into expand

2015-07-16 Thread Alexandre Oliva
On Jul 16, 2015, Richard Biener richard.guent...@gmail.com wrote:

 Is this ok to install?

 Yes.

So, I decided to run a ppc64le-linux-gnu bootstrap, just in case, and
there are issues with split complex parms that caused go and fortran
libs to fail the build.

I will refrain from installing this for now, and I'll post a followup as
soon as I sort that out.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-06-10 Thread Richard Biener
On Wed, Jun 10, 2015 at 2:24 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Jun  5, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Apr 27, 2015, Richard Biener richard.guent...@gmail.com wrote:

 +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
 +   is the same as promote_decl_mode.  Otherwise, it is the promoted
 +   mode of a temp decl of same type as the SSA_NAME, if we had created
 +   one.  */
 +
 +machine_mode
 +promote_ssa_mode (const_tree name, int *punsignedp)
 +{
 +  gcc_assert (TREE_CODE (name) == SSA_NAME);
 +
 +  if (SSA_NAME_VAR (name))
 +return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);

 As above I'd rather not have different paths for anonymous vs. non-anonymous
 vars (so just delete the above two lines).

 Check

 This caused the sparc regression reported by Eric in
 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37

 We need to match the mode of the rtl created for the partition and the
 promoted mode expected for the parm.  I recall working to make parm and
 result decls the partition leaders, so that promote_ssa_mode would DTRT,
 but this escaped my mind when revisiting the patch after some time on
 another project.

 So we either restore promote_ssa_mode's check for an underlying decl, at
 least for PARM_ and RESULT_DECLs, or further massage function.c to deal
 with the mode difference.  Any preference?

Alternatively not coalesce SSA names when promote_decl_mode gives
different answers (for their underlying decl)?  It sounds wrong to do that
(if that is really what happens).

Richard.

 I'm reverting the patch for now, so that we don't have to rush to a fix
 on this, and I can have more time to test and fix other arches.  It was
 a terrible mistake to not do so before submitting the final version of
 the patch, or at least before installing it.  I apologize for that.

 --
 Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
 You must be the change you wish to see in the world. -- Gandhi
 Be Free! -- http://FSFLA.org/   FSF Latin America board member
 Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread Christophe Lyon
On 8 June 2015 at 10:14, Richard Biener richard.guent...@gmail.com wrote:
 On Sat, Jun 6, 2015 at 3:14 AM, Alexandre Oliva aol...@redhat.com wrote:
 On Apr 27, 2015, Richard Biener richard.guent...@gmail.com wrote:

 This should also mention that is_gimple_reg vars do not have their
 address taken.

 check

 +static tree
 +leader_merge (tree cur, tree next)

 Ick - presumably you can't use sth better than a TREE_LIST here?

 The list was an experiment that never really worked, and when I tried to
 make it work after the patch, it proved to be unworkable, so I dropped
 it, and rewrote leader_merge to choose either of the params, preferring
 anonymous over ignored over named, so as to reduce the likelihood of
 misreading of debug dumps, since that's all they're used for.

 static void
 -expand_one_stack_var (tree var)
 +expand_one_stack_var_1 (tree var)
 {
 HOST_WIDE_INT size, offset;
 unsigned byte_align;

 -  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
 -  byte_align = align_local_variable (SSAVAR (var));
 +  if (TREE_CODE (var) != SSA_NAME || SSA_NAME_VAR (var))
 +{
 +  size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var)));
 +  byte_align = align_local_variable (SSAVAR (var));
 +}
 +  else

 I'd go here for all TREE_CODE (var) == SSA_NAME

 Check

 (and get rid of the SSAVAR macro?)

 There are remaining uses that don't seem worth dropping it for.

 +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
 +   is the same as promote_decl_mode.  Otherwise, it is the promoted
 +   mode of a temp decl of same type as the SSA_NAME, if we had created
 +   one.  */
 +
 +machine_mode
 +promote_ssa_mode (const_tree name, int *punsignedp)
 +{
 +  gcc_assert (TREE_CODE (name) == SSA_NAME);
 +
 +  if (SSA_NAME_VAR (name))
 +return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);

 As above I'd rather not have different paths for anonymous vs. non-anonymous
 vars (so just delete the above two lines).

 Check

 @@ -9668,6 +9678,11 @@ expand_expr_real_1 (tree exp, rtx target, 
 machine_mode tmode,
 pmode = promote_function_mode (type, mode, unsignedp,
 gimple_call_fntype (g),
 2);
 + else if (!exp)
 +   {
 + gcc_assert (code == SSA_NAME);

 promote_ssa_mode should assert this.

 + pmode = promote_ssa_mode (ssa_name, unsignedp);

 It does, so...  check.


 @@ -2121,6 +2122,15 @@ aggregate_value_p (const_tree exp, const_tree 
 fntype)
 bool
 use_register_for_decl (const_tree decl)
 {
 +  if (TREE_CODE (decl) == SSA_NAME)
 +{
 +  if (!SSA_NAME_VAR (decl))
 +   return TYPE_MODE (TREE_TYPE (decl)) != BLKmode
 +  !(flag_float_store  FLOAT_TYPE_P (TREE_TYPE (decl)));
 +
 +  decl = SSA_NAME_VAR (decl);

 See above.  Please drop the SSA_NAME_VAR != NULL path.

 Check, then taken back, after a bootstrap failure and some debugging
 made me realize this would be wrong.  Here are the nearly-added comments
 that explain why:

   /* We often try to use the SSA_NAME, instead of its underlying
  decl, to get type information and guide decisions, to avoid
  differences of behavior between anonymous and named
  variables, but in this one case we have to go for the actual
  variable if there is one.  The main reason is that, at least
  at -O0, we want to place user variables on the stack, but we
  don't mind using pseudos for anonymous or ignored temps.
  Should we take the SSA_NAME, we'd conclude all SSA_NAMEs
  should go in pseudos, whereas their corresponding variables
  might have to go on the stack.  So, disregarding the decl
  here would negatively impact debug info at -O0, enable
  coalescing between SSA_NAMEs that ought to get different
  stack/pseudo assignments, and get the incoming argument
  processing thoroughly confused by PARM_DECLs expected to live
  in stack slots but assigned to pseudos.  */


 +++ b/gcc/gimple-expr.h
 +/* Defined in tree-ssa-coalesce.c.   */
 +extern bool gimple_can_coalesce_p (tree, tree);

 Err, put it to tree-ssa-coalesce.h?

 Check.  Lots of additional headers required to be able to include
 tree-ssa-coalesce.h, though.


 -  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var)));
 +  gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var ? var : name)));

 The TREE_TYPE of name and its SSA_NAME_VAR are always the same.  So just
 use TREE_TYPE (name) here.

 Check

 gcc_assert (!REG_P (dest_rtx)
 - || dest_mode == promote_decl_mode (var, unsignedp));
 + || dest_mode == promote_ssa_mode (name, unsignedp));

 if (src_mode != dest_mode)
 {
 @@ -714,12 +715,12 @@ static rtx
 get_temp_reg (tree name)
 {
 tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name;
 -  tree type = TREE_TYPE (var);
 +  tree type = var ? TREE_TYPE (var) : TREE_TYPE (name);

 See above.

 Check


 Here's the revised patch, regstrapped on x86_64-linux-gnu and
 i686-linux-gnu.  

Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread David Edelsohn
This also broke bootstrap on PPC64 LE Linux with the same error.

- David


Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread Alexandre Oliva
On Jun  9, 2015, David Edelsohn dje@gmail.com wrote:

 This also broke bootstrap on PPC64 LE Linux with the same error.

Thanks for your reports.  I'm looking into the problem.

I'd appreciate a preprocessed testcase from either of you to confirm the
fix, if not to help debug it.

Thanks in advance,

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread Jakub Jelinek
On Tue, Jun 09, 2015 at 05:11:45PM -0300, Alexandre Oliva wrote:
 On Jun  9, 2015, Alexandre Oliva aol...@redhat.com wrote:
 
  On Jun  9, 2015, David Edelsohn dje@gmail.com wrote:
  This also broke bootstrap on PPC64 LE Linux with the same error.
 
  Thanks for your reports.  I'm looking into the problem.
 
  I'd appreciate a preprocessed testcase from either of you to confirm the
  fix, if not to help debug it.
 
 The first potential source for this problem that jumped at me would be
 silenced with this change:
 
 diff --git a/gcc/function.c b/gcc/function.c
 index 8bcc352..9201ed9 100644
 --- a/gcc/function.c
 +++ b/gcc/function.c
 @@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all 
 *all,
   stack_parm = copy_rtx (stack_parm);
if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
   PUT_MODE (stack_parm, GET_MODE (entry_parm));
 -  set_mem_attributes (stack_parm, parm, 1);
 +  if (GET_CODE (stack_parm) == MEM)

FYI, this is preferrably if (MEM_P (stack_parm)) these days.

 + set_mem_attributes (stack_parm, parm, 1);
  }

Jakub


Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread Alexandre Oliva
On Jun  9, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Jun  9, 2015, David Edelsohn dje@gmail.com wrote:
 This also broke bootstrap on PPC64 LE Linux with the same error.

 Thanks for your reports.  I'm looking into the problem.

 I'd appreciate a preprocessed testcase from either of you to confirm the
 fix, if not to help debug it.

The first potential source for this problem that jumped at me would be
silenced with this change:

diff --git a/gcc/function.c b/gcc/function.c
index 8bcc352..9201ed9 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2974,7 +2974,8 @@ assign_parm_setup_block (struct assign_parm_data_all *all,
stack_parm = copy_rtx (stack_parm);
   if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size)
PUT_MODE (stack_parm, GET_MODE (entry_parm));
-  set_mem_attributes (stack_parm, parm, 1);
+  if (GET_CODE (stack_parm) == MEM)
+   set_mem_attributes (stack_parm, parm, 1);
 }
 
   /* If a BLKmode arrives in registers, copy it to a stack slot.  Handle

but I suspect there might be other similar issues lurking in function.c
after my attempt to turn parm assignment upside down ;-)

(namely, it used to assume it could pick stack slots and pseudos in a
whim, but after this change it must give way to out-of-SSA's partition
assignments.)

I'll look into cross-building some embedded targets and see if any
further issues surface.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread David Edelsohn
Alex, I sent you a pre-processed file off-list.  You could try
bootstrapping on PPC64 on the GCC Compile Farm.

The SPARC failure reports a different error than PPC and ARM.  The PPC
and ARM failures are the same message, but seem to be on different
files.

After the breakage from Aldy's patch all weekend, this failure is very
frustrating.  If this is not fixed within 24 hours, your patch must be
reverted.  This patch clearly should have been tested on more
architectures than x86 before being approved and merged.

Thanks, David


On Tue, Jun 9, 2015 at 5:27 PM, Eric Botcazou ebotca...@adacore.com wrote:
 I'll look into cross-building some embedded targets and see if any
 further issues surface.

 SPARC is also broken, see my message and the tescase under the PR.

 --
 Eric Botcazou


Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread Eric Botcazou
 I'll look into cross-building some embedded targets and see if any
 further issues surface.

SPARC is also broken, see my message and the tescase under the PR.

-- 
Eric Botcazou


Re: [PR64164] drop copyrename, integrate into expand

2015-06-09 Thread Alexandre Oliva
On Jun  5, 2015, Alexandre Oliva aol...@redhat.com wrote:

 On Apr 27, 2015, Richard Biener richard.guent...@gmail.com wrote:

 +/* Return the promoted mode for name.  If it is a named SSA_NAME, it
 +   is the same as promote_decl_mode.  Otherwise, it is the promoted
 +   mode of a temp decl of same type as the SSA_NAME, if we had created
 +   one.  */
 +
 +machine_mode
 +promote_ssa_mode (const_tree name, int *punsignedp)
 +{
 +  gcc_assert (TREE_CODE (name) == SSA_NAME);
 +
 +  if (SSA_NAME_VAR (name))
 +return promote_decl_mode (SSA_NAME_VAR (name), punsignedp);

 As above I'd rather not have different paths for anonymous vs. non-anonymous
 vars (so just delete the above two lines).

 Check

This caused the sparc regression reported by Eric in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64164#c37

We need to match the mode of the rtl created for the partition and the
promoted mode expected for the parm.  I recall working to make parm and
result decls the partition leaders, so that promote_ssa_mode would DTRT,
but this escaped my mind when revisiting the patch after some time on
another project.

So we either restore promote_ssa_mode's check for an underlying decl, at
least for PARM_ and RESULT_DECLs, or further massage function.c to deal
with the mode difference.  Any preference?


I'm reverting the patch for now, so that we don't have to rush to a fix
on this, and I can have more time to test and fix other arches.  It was
a terrible mistake to not do so before submitting the final version of
the patch, or at least before installing it.  I apologize for that.

-- 
Alexandre Oliva, freedom fighterhttp://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


  1   2   >