Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-14 Thread Christophe Lyon
On Tue, 13 Nov 2018 at 23:18, Peter Bergner  wrote:
>
> On 11/13/18 4:09 PM, Vladimir Makarov wrote:
> > On 11/13/2018 10:53 AM, Peter Bergner wrote:
> >> I think with the above results, I think the patch is ready for review.
> >> I'm attaching the latest updated patch below.
> >>
> >> Again, this passed bootstrap and regtesting on powerpc64le-linux with
> >> no regressions.  Ok for mainline?
> >>
> > Ok, Peter.
>
> Ok, this is committed now.  Thanks for the review and thank you
> to everyone who helped debug and test the plethora of patches!!!
>

Thanks, on our side I can confirm the problems with the ARM bootstrap
is now fixed by your patch.

Christophe

> Peter
>
>
>


Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-13 Thread Peter Bergner
On 11/13/18 4:09 PM, Vladimir Makarov wrote:
> On 11/13/2018 10:53 AM, Peter Bergner wrote:
>> I think with the above results, I think the patch is ready for review.
>> I'm attaching the latest updated patch below.
>>
>> Again, this passed bootstrap and regtesting on powerpc64le-linux with
>> no regressions.  Ok for mainline?
>>
> Ok, Peter.

Ok, this is committed now.  Thanks for the review and thank you
to everyone who helped debug and test the plethora of patches!!!

Peter





Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-13 Thread Vladimir Makarov

On 11/13/2018 10:53 AM, Peter Bergner wrote:

On 11/13/18 9:01 AM, Renlin Li wrote:

I could verify that, your patch fixes all the ICEs I saw with 
arm-linux-gnueabihf toolchain!
There are some differences on the test results, because I compare the latest 
results with something which is old.

I haven't test it on bare-metal toolchain yet. But will do to ensure all 
related issues are fixed.

Hi Renlin,

That's excellent news!  My guess on the testsuite results changes is that
they're probably caused by the combine changes/fixes that went in around
the same time as my patches.

If you want to disable the special copy handling, which should only help
things since it gives RA more freedom, you can apply the patch I mentioned
here:

   https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00379.html

which allows you to turn on and off the optimization with an option.


Jeff and Vlad,

I think with the above results, I think the patch is ready for review.
I'm attaching the latest updated patch below.

Again, this passed bootstrap and regtesting on powerpc64le-linux with
no regressions.  Ok for mainline?


Ok, Peter.


Additional point generation here is not important.  It is more important 
for IRA.  Therefore more efforts were spent to reduce their numbers and 
spans in IRA than in LRA.


Thanks for working on the PR.  LRA/reload patches need a few iteration 
as a rule.  So it is a normal situation.


gcc/
PR rtl-optimization/87899
* lra-lives.c (start_living): Update white space in comment.
(enum point_type): New.
(sparseset_contains_pseudos_p): New function.
(update_pseudo_point): Likewise.
(make_hard_regno_live): Use HARD_REGISTER_NUM_P macro.
(make_hard_regno_dead): Likewise.  Remove ignore_reg_for_conflicts
handling.  Move early exit after adding conflicts.
(mark_pseudo_live): Use HARD_REGISTER_NUM_P macro.  Add early exit
if regno is already live.  Remove all handling of program points.
(mark_pseudo_dead): Use HARD_REGISTER_NUM_P macro.  Add early exit
after adding conflicts.  Remove all handling of program points and
ignore_reg_for_conflicts.
(mark_regno_live): Use HARD_REGISTER_NUM_P macro.  Remove return value
and do not guard call to mark_pseudo_live.
(mark_regno_dead): Use HARD_REGISTER_NUM_P macro.  Remove return value
and do not guard call to mark_pseudo_dead.
(check_pseudos_live_through_calls): Use HARD_REGISTER_NUM_P macro.
(process_bb_lives): Use HARD_REGISTER_NUM_P and HARD_REGISTER_P macros.
Use new function update_pseudo_point.  Handle register copies by
removing the source register from the live set.  Handle INOUT operands.
Update to the next program point using the unused_set, dead_set and
start_dying sets.
(lra_create_live_ranges_1): Use HARD_REGISTER_NUM_P macro.





Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-13 Thread Peter Bergner
On 11/13/18 9:01 AM, Renlin Li wrote:
> I could verify that, your patch fixes all the ICEs I saw with 
> arm-linux-gnueabihf toolchain!
> There are some differences on the test results, because I compare the latest 
> results with something which is old.
> 
> I haven't test it on bare-metal toolchain yet. But will do to ensure all 
> related issues are fixed.

Hi Renlin,

That's excellent news!  My guess on the testsuite results changes is that
they're probably caused by the combine changes/fixes that went in around
the same time as my patches.

If you want to disable the special copy handling, which should only help
things since it gives RA more freedom, you can apply the patch I mentioned
here:

  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00379.html

which allows you to turn on and off the optimization with an option.


Jeff and Vlad,

I think with the above results, I think the patch is ready for review.
I'm attaching the latest updated patch below.

Again, this passed bootstrap and regtesting on powerpc64le-linux with
no regressions.  Ok for mainline?

Peter


gcc/
PR rtl-optimization/87899
* lra-lives.c (start_living): Update white space in comment.
(enum point_type): New.
(sparseset_contains_pseudos_p): New function.
(update_pseudo_point): Likewise.
(make_hard_regno_live): Use HARD_REGISTER_NUM_P macro.
(make_hard_regno_dead): Likewise.  Remove ignore_reg_for_conflicts
handling.  Move early exit after adding conflicts.
(mark_pseudo_live): Use HARD_REGISTER_NUM_P macro.  Add early exit
if regno is already live.  Remove all handling of program points.
(mark_pseudo_dead): Use HARD_REGISTER_NUM_P macro.  Add early exit
after adding conflicts.  Remove all handling of program points and
ignore_reg_for_conflicts.
(mark_regno_live): Use HARD_REGISTER_NUM_P macro.  Remove return value
and do not guard call to mark_pseudo_live.
(mark_regno_dead): Use HARD_REGISTER_NUM_P macro.  Remove return value
and do not guard call to mark_pseudo_dead.
(check_pseudos_live_through_calls): Use HARD_REGISTER_NUM_P macro.
(process_bb_lives): Use HARD_REGISTER_NUM_P and HARD_REGISTER_P macros.
Use new function update_pseudo_point.  Handle register copies by
removing the source register from the live set.  Handle INOUT operands.
Update to the next program point using the unused_set, dead_set and
start_dying sets.
(lra_create_live_ranges_1): Use HARD_REGISTER_NUM_P macro.

Index: gcc/lra-lives.c
===
--- gcc/lra-lives.c (revision 265971)
+++ gcc/lra-lives.c (working copy)
@@ -83,7 +83,7 @@ static HARD_REG_SET hard_regs_live;
 
 /* Set of pseudos and hard registers start living/dying in the current
insn.  These sets are used to update REG_DEAD and REG_UNUSED notes
-   in the insn. */
+   in the insn.  */
 static sparseset start_living, start_dying;
 
 /* Set of pseudos and hard regs dead and unused in the current
@@ -96,10 +96,6 @@ static bitmap_head temp_bitmap;
 /* Pool for pseudo live ranges. */
 static object_allocator lra_live_range_pool ("live ranges");
 
-/* If non-NULL, the source operand of a register to register copy for which
-   we should not add a conflict with the copy's destination operand.  */
-static rtx ignore_reg_for_conflicts;
-
 /* Free live range list LR.  */
 static void
 free_live_range_list (lra_live_range_t lr)
@@ -224,6 +220,57 @@ lra_intersected_live_ranges_p (lra_live_
   return false;
 }
 
+enum point_type {
+  DEF_POINT,
+  USE_POINT
+};
+
+/* Return TRUE if set A contains a pseudo register, otherwise, return FALSE.  
*/
+static bool
+sparseset_contains_pseudos_p (sparseset a)
+{
+  int regno;
+  EXECUTE_IF_SET_IN_SPARSESET (a, regno)
+if (!HARD_REGISTER_NUM_P (regno))
+  return true;
+  return false;
+}
+
+/* Mark pseudo REGNO as living or dying at program point POINT, depending on
+   whether TYPE is a definition or a use.  If this is the first reference to
+   REGNO that we've encountered, then create a new live range for it.  */
+
+static void
+update_pseudo_point (int regno, int point, enum point_type type)
+{
+  lra_live_range_t p;
+
+  /* Don't compute points for hard registers.  */
+  if (HARD_REGISTER_NUM_P (regno))
+return;
+
+  if (complete_info_p || lra_get_regno_hard_regno (regno) < 0)
+{
+  if (type == DEF_POINT)
+   {
+ if (sparseset_bit_p (pseudos_live, regno))
+   {
+ p = lra_reg_info[regno].live_ranges;
+ lra_assert (p != NULL);
+ p->finish = point;
+   }
+   }
+  else /* USE_POINT */
+   {
+ if (!sparseset_bit_p (pseudos_live, regno)
+ && ((p = lra_reg_info[regno].live_ranges) == NULL
+ || (p->finish != point && p->finish + 1 != point)))
+   

Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-13 Thread Renlin Li

Hi Peter,

I could verify that, your patch fixes all the ICEs I saw with 
arm-linux-gnueabihf toolchain!
There are some differences on the test results, because I compare the latest 
results with something which is old.

I haven't test it on bare-metal toolchain yet. But will do to ensure all 
related issues are fixed.

Thanks for fixing it!

Regards,
Renlin



On 11/12/2018 08:25 PM, Peter Bergner wrote:

On 11/12/18 6:25 AM, Renlin Li wrote:

I tried to build a native arm-linuxeabihf toolchain with the patch.
But I got the following ICE:


Ok, the issue was a problem in handling the src reg from a register copy.
I thought I could just remove it from the dead_set, but forgot that the
updating of the program points looks at whether the pseudo is live or
not.  The change below on top of the previous patch fixes the ICE for me.
I now add the src reg back into pseudos_live before we process the insn's
input operands so it doesn't trigger a new program point being added.

Renlin and Jeff, can you apply this patch on top of the previous one
and see whether that is better?

Thanks.

Peter


--- gcc/lra-lives.c.orig2018-11-12 14:15:18.257657911 -0600
+++ gcc/lra-lives.c 2018-11-12 14:08:55.978795092 -0600
@@ -934,6 +934,18 @@
  || sparseset_contains_pseudos_p (start_dying))
next_program_point (curr_point, freq);
  
+  /* If we removed the source reg from a simple register copy from the

+live set above, then add it back now so we don't accidentally add
+it to the start_living set below.  */
+  if (ignore_reg != NULL_RTX)
+   {
+ int ignore_regno = REGNO (ignore_reg);
+ if (HARD_REGISTER_NUM_P (ignore_regno))
+   SET_HARD_REG_BIT (hard_regs_live, ignore_regno);
+ else
+   sparseset_set_bit (pseudos_live, ignore_regno);
+   }
+
sparseset_clear (start_living);
  
/* Mark each used value as live.	*/

@@ -959,11 +971,6 @@
  
sparseset_and_compl (dead_set, start_living, start_dying);
  
-  /* If we removed the source reg from a simple register copy from the

-live set, then it will appear to be dead, but it really isn't.  */
-  if (ignore_reg != NULL_RTX)
-   sparseset_clear_bit (dead_set, REGNO (ignore_reg));
-
sparseset_clear (start_dying);
  
/* Mark early clobber outputs dead.  */




Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-12 Thread Peter Bergner
On 11/12/18 6:25 AM, Renlin Li wrote:
> I tried to build a native arm-linuxeabihf toolchain with the patch.
> But I got the following ICE:

Ok, the issue was a problem in handling the src reg from a register copy.
I thought I could just remove it from the dead_set, but forgot that the
updating of the program points looks at whether the pseudo is live or
not.  The change below on top of the previous patch fixes the ICE for me.
I now add the src reg back into pseudos_live before we process the insn's
input operands so it doesn't trigger a new program point being added.

Renlin and Jeff, can you apply this patch on top of the previous one
and see whether that is better?

Thanks.

Peter


--- gcc/lra-lives.c.orig2018-11-12 14:15:18.257657911 -0600
+++ gcc/lra-lives.c 2018-11-12 14:08:55.978795092 -0600
@@ -934,6 +934,18 @@
  || sparseset_contains_pseudos_p (start_dying))
next_program_point (curr_point, freq);
 
+  /* If we removed the source reg from a simple register copy from the
+live set above, then add it back now so we don't accidentally add
+it to the start_living set below.  */
+  if (ignore_reg != NULL_RTX)
+   {
+ int ignore_regno = REGNO (ignore_reg);
+ if (HARD_REGISTER_NUM_P (ignore_regno))
+   SET_HARD_REG_BIT (hard_regs_live, ignore_regno);
+ else
+   sparseset_set_bit (pseudos_live, ignore_regno);
+   }
+
   sparseset_clear (start_living);
 
   /* Mark each used value as live. */
@@ -959,11 +971,6 @@
 
   sparseset_and_compl (dead_set, start_living, start_dying);
 
-  /* If we removed the source reg from a simple register copy from the
-live set, then it will appear to be dead, but it really isn't.  */
-  if (ignore_reg != NULL_RTX)
-   sparseset_clear_bit (dead_set, REGNO (ignore_reg));
-
   sparseset_clear (start_dying);
 
   /* Mark early clobber outputs dead.  */



Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-12 Thread Peter Bergner
On 11/12/18 6:25 AM, Renlin Li wrote:
> I tried to build a native arm-linuxeabihf toolchain with the patch.
> But I got the following ICE:

Why can't things ever be easy? :-)  I think we're getting closer though.

Anyway, can you please recompile the failing file but using -save-temps
and send me the resulting preprocessed source file?  Also, can you give
me the gcc configure options you used to build your GCC?  That should
give me enough info to debug this one.  Thanks.

Peter



Re: [PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-12 Thread Renlin Li

Hi Peter,

Thanks for the patch! It makes much more sense to me to split those functions, 
and use them separately.

I tried to build a native arm-linuxeabihf toolchain with the patch. But I got 
the following ICE:

/home/renlin/try-new/./gcc/xgcc -B/home/renlin/try-new/./gcc/ -B/usr/local/arm-none-linux-gnueabihf/bin/ -B/usr/local/arm-none-linux-gnueabihf/lib/ 
-isystem /usr/local/arm-none-linux-gnueabihf/include -isystem /usr/local/arm-none-linux-gnueabihf/sys-include   -fno-checking -O2 -g -O0 -O2  -O2 -g 
-O0 -DIN_GCC-W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-format -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition 
-isystem ./include   -fPIC -fno-inline -g -DIN_LIBGCC2 -fbuilding-libgcc -fno-stack-protector   -fPIC -fno-inline -I. -I. -I../.././gcc 
-I../../../gcc/libgcc -I../../../gcc/libgcc/. -I../../../gcc/libgcc/../gcc -I../../../gcc/libgcc/../include  -DHAVE_CC_TLS  -o _negvdi2_s.o -MT 
_negvdi2_s.o -MD -MP -MF _negvdi2_s.dep -DSHARED -DL_negvdi2 -c ../../../gcc/libgcc/libgcc2.c

0x807eb3 lra(_IO_FILE*)
../../gcc/gcc/lra.c:2497
0x7c2755 do_reload
../../gcc/gcc/ira.c:5469
0x7c2c11 execute
../../gcc/gcc/ira.c:5653
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
make[3]: *** [Makefile:916: _gcov_merge_icall_topn.o] Error 1
make[3]: *** Waiting for unfinished jobs
make[3]: *** [Makefile:916: _gcov_merge_single.o] Error 1
during RTL pass: reload
../../../gcc/libgcc/libgcov-driver.c: In function 
‘gcov_sort_icall_topn_counter’:
../../../gcc/libgcc/libgcov-driver.c:436:1: internal compiler error: in 
remove_some_program_points_and_update_live_ranges, at lra-lives.c:1172
436 | }
| ^
0x829189 remove_some_program_points_and_update_live_ranges
../../gcc/gcc/lra-lives.c:1172
0x829683 compress_live_ranges
../../gcc/gcc/lra-lives.c:1301
0x829d45 lra_create_live_ranges_1
../../gcc/gcc/lra-lives.c:1454
0x829d7d lra_create_live_ranges(bool, bool)
../../gcc/gcc/lra-lives.c:1466
0x807eb3 lra(_IO_FILE*)
../../gcc/gcc/lra.c:2497
0x7c2755 do_reload
../../gcc/gcc/ira.c:5469
0x7c2c11 execute
../../gcc/gcc/ira.c:5653
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.


Regards,
Renlin

On 11/12/2018 04:34 AM, Peter Bergner wrote:

Renlin, Jeff and Vlad: requests and questions for you below...

PR87899 shows another latent LRA bug exposed by my r264897 commit.
In the bugzilla report, we have the following rtl in LRA:

   (insn 1 (set (reg:SI 1 r1) (reg/f:SI 2040)))
...
   (insn 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 1 r1)
   (plus:SI (reg:SI 1 r1)
(const_int 12
(reg:SI 1048))
   (expr_list:REG_INC (reg:SI 1 r1)))
...
   

My earlier patch now sees the reg copy in insn "1" and correctly skips
adding a conflict between r1 and r2040 due to the copy.  However, insn "2"
updates r1 and r2040 is live across that update and so we should create
a conflict between them, but we currently do not and that leads to us
assigning r1 to one of r2040's reload pseudos which gets clobbered by
the r1 update in insn "2".

The reason a conflict was never added between r1 and r2040 is that LRA
skips INOUT operands when computing conflicts and so misses the definition
of r1 in insn "2" and so never adds conflicts for it.  The reason the code
skips the INOUT operands is that LRA doesn't want to create new program
points for INOUT operands, since unnecessary program points can slow down
remove_some_program_points_and_update_live_ranges.  This was all fine
before when we had conservative conflict info, but now we cannot ignore
INOUT operands.

The heart of the problem is that the {make,mark}_*_{live,dead} routines
update the liveness, conflict and program point information for operands.
My solution to the problem was to pull out the updating of the program point
info from {make,mark}_*_{live,dead} and have them only update liveness and
conflict information.  I then created a separate function that is used for
updating an operand's program points.  This allowed me to modify the insn
operand scanning to handle all operand types (IN, OUT and INOUT) and always
call the {make,mark}_*_{live,dead} functions for all operand types, while
only calling the new program point update function for IN and OUT operands.

This change then allowed me to remove the hacky handling of conflicts for
reg copies and instead use the more common method of removing the src reg
of a copy from the live set before handling the copy's definition, thereby
skipping the unwanted conflict.  Bonus! :-)

This passes bootstrap and regtesting on powerpc64le-linux with no regressions.


[PATCH][LRA] Fix PR87899: r264897 cause mis-compiled native arm-linux-gnueabihf toolchain

2018-11-11 Thread Peter Bergner
Renlin, Jeff and Vlad: requests and questions for you below...

PR87899 shows another latent LRA bug exposed by my r264897 commit.
In the bugzilla report, we have the following rtl in LRA:

  (insn 1 (set (reg:SI 1 r1) (reg/f:SI 2040)))
...
  (insn 2 (set (mem/f/c:SI (pre_modify:SI (reg:SI 1 r1)
  (plus:SI (reg:SI 1 r1)
   (const_int 12
   (reg:SI 1048))
  (expr_list:REG_INC (reg:SI 1 r1)))
...
  

My earlier patch now sees the reg copy in insn "1" and correctly skips
adding a conflict between r1 and r2040 due to the copy.  However, insn "2"
updates r1 and r2040 is live across that update and so we should create
a conflict between them, but we currently do not and that leads to us
assigning r1 to one of r2040's reload pseudos which gets clobbered by
the r1 update in insn "2".

The reason a conflict was never added between r1 and r2040 is that LRA
skips INOUT operands when computing conflicts and so misses the definition
of r1 in insn "2" and so never adds conflicts for it.  The reason the code
skips the INOUT operands is that LRA doesn't want to create new program
points for INOUT operands, since unnecessary program points can slow down
remove_some_program_points_and_update_live_ranges.  This was all fine
before when we had conservative conflict info, but now we cannot ignore
INOUT operands.

The heart of the problem is that the {make,mark}_*_{live,dead} routines
update the liveness, conflict and program point information for operands.
My solution to the problem was to pull out the updating of the program point
info from {make,mark}_*_{live,dead} and have them only update liveness and
conflict information.  I then created a separate function that is used for
updating an operand's program points.  This allowed me to modify the insn
operand scanning to handle all operand types (IN, OUT and INOUT) and always
call the {make,mark}_*_{live,dead} functions for all operand types, while
only calling the new program point update function for IN and OUT operands.

This change then allowed me to remove the hacky handling of conflicts for
reg copies and instead use the more common method of removing the src reg
of a copy from the live set before handling the copy's definition, thereby
skipping the unwanted conflict.  Bonus! :-)

This passes bootstrap and regtesting on powerpc64le-linux with no regressions.

Renlin, can you please verify that this patch fixes the issue you ran into?

Jeff, could you please throw this on your test builders to see whether
it exposes any problems with the patch I didn't see?

Otherwise, Vlad and Jeff, what do you think of this solution?

I'll note that I have compiled quite a few largish test cases and so far
I am seeing the same exact program points being used in the LRA rtl dump
before and after my patch.  I'm continuing to do that with more tests
to make sure I have everything working correctly.

I'll also note I took the opportunity to convert lra-lives.c over to using
the HARD_REGISTER_P and HARD_REGISTER_NUM_P convenience macros.

Peter


gcc/
PR rtl-optimization/87899
* lra-lives.c (start_living): Update white space in comment.
(enum point_type): New.
(sparseset_contains_pseudos_p): New function.
(update_pseudo_point): Likewise.
(make_hard_regno_live): Use HARD_REGISTER_NUM_P macro.
(make_hard_regno_dead): Likewise.  Remove ignore_reg_for_conflicts
handling.  Move early exit after adding conflicts.
(mark_pseudo_live): Use HARD_REGISTER_NUM_P macro.  Add early exit
if regno is already live.  Remove all handling of program points.
(mark_pseudo_dead): Use HARD_REGISTER_NUM_P macro.  Add early exit
after adding conflicts.  Remove all handling of program points and
ignore_reg_for_conflicts.
(mark_regno_live): Use HARD_REGISTER_NUM_P macro.  Remove return value
and do not guard call to mark_pseudo_live.
(mark_regno_dead): Use HARD_REGISTER_NUM_P macro.  Remove return value
and do not guard call to mark_pseudo_dead.
(check_pseudos_live_through_calls): Use HARD_REGISTER_NUM_P macro.
(process_bb_lives): Use HARD_REGISTER_NUM_P and HARD_REGISTER_P macros.
Use new function update_pseudo_point.  Handle register copies by
removing the source register from the live set.  Handle INOUT operands.
Update to the next program point using the unused_set, dead_set and
start_dying sets.
(lra_create_live_ranges_1): Use HARD_REGISTER_NUM_P macro.

Index: gcc/lra-lives.c
===
--- gcc/lra-lives.c (revision 265971)
+++ gcc/lra-lives.c (working copy)
@@ -83,7 +83,7 @@ static HARD_REG_SET hard_regs_live;
 
 /* Set of pseudos and hard registers start living/dying in the current
insn.  These sets are used to update REG_DEAD and REG_UNUSED