Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-05 Thread Jan Hubicka
> > I had a brief look at find_modifiable_mems, which seems to be the only kind 
> > of
> > schedule-time dependency breaking that we do.  I started writing some new 
> > code
> > that would handle REG_ARGS_SIZE, but then considered that wasn't terribly
> > appropriate for stage4.
> > 
> > So I've left off with just the dependency links between the insns.  We'd 
> > need
> > these for the schedule-time adjustments anyway, and with them in place the
> > scheduler does not re-order the insns in a way that causes problems.
> > 
> > Testing for x86_64 has finished; i686 is nearly done.  I suppose the only
> > question I have is if anyone disagrees that OUTPUT is incorrect as the
> > dependency type.

>From i386  perspective, this seems like resonable compromise for now.
I did some re-testing of scheduler recently for generic.
Call sequences is one of few places where scheduling matters (in addition to
tight loops), but I do not think we will lose measurable % of perofmrance by
disabling scheduling here.

Thanks!
Honza
> > 
> > 
> > r~
> > 
> 

>   * combine-stack-adj.c: Revert r206943.
>   * sched-int.h (struct deps_desc): Add last_args_size.
>   * sched-deps.c (init_deps): Initialize it.
>   (sched_analyze_insn): Add OUTPUT dependencies between insns that
>   contain REG_ARGS_SIZE notes.
> 
> 
> diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c
> index c591c60..69fd5ea 100644
> --- a/gcc/combine-stack-adj.c
> +++ b/gcc/combine-stack-adj.c
> @@ -567,7 +567,6 @@ combine_stack_adjustments_for_block (basic_block bb)
> && try_apply_stack_adjustment (insn, reflist, 0,
>-last_sp_adjust))
>   {
> -   rtx note;
> if (last2_sp_set)
>   maybe_move_args_size_note (last2_sp_set, last_sp_set, false);
> else
> @@ -577,11 +576,6 @@ combine_stack_adjustments_for_block (basic_block bb)
> reflist = NULL;
> last_sp_set = NULL_RTX;
> last_sp_adjust = 0;
> -   /* We no longer adjust stack size.  Whoever adjusted it earlier
> -  hopefully got the note right.  */
> -   note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
> -   if (note)
> - remove_note (insn, note);
> continue;
>   }
>   }
> diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
> index 7efc937..efc4223 100644
> --- a/gcc/sched-deps.c
> +++ b/gcc/sched-deps.c
> @@ -3470,6 +3470,15 @@ sched_analyze_insn (struct deps_desc *deps, rtx x, rtx 
> insn)
>  change_spec_dep_to_hard (sd_it);
>  }
>  }
> +
> +  /* We do not yet have code to adjust REG_ARGS_SIZE, therefore we must
> + honor their original ordering.  */
> +  if (find_reg_note (insn, REG_ARGS_SIZE, NULL))
> +{
> +  if (deps->last_args_size)
> + add_dependence (insn, deps->last_args_size, REG_DEP_OUTPUT);
> +  deps->last_args_size = insn;
> +}
>  }
>  
>  /* Return TRUE if INSN might not always return normally (e.g. call exit,
> @@ -3876,6 +3885,7 @@ init_deps (struct deps_desc *deps, bool lazy_reg_last)
>deps->sched_before_next_jump = 0;
>deps->in_post_call_group_p = not_post_call;
>deps->last_debug_insn = 0;
> +  deps->last_args_size = 0;
>deps->last_reg_pending_barrier = NOT_A_BARRIER;
>deps->readonly = 0;
>  }
> diff --git a/gcc/sched-int.h b/gcc/sched-int.h
> index 3b1106f..2cec624 100644
> --- a/gcc/sched-int.h
> +++ b/gcc/sched-int.h
> @@ -539,6 +539,9 @@ struct deps_desc
>/* The last debug insn we've seen.  */
>rtx last_debug_insn;
>  
> +  /* The last insn bearing REG_ARGS_SIZE that we've seen.  */
> +  rtx last_args_size;
> +
>/* The maximum register number for the following arrays.  Before reload
>   this is max_reg_num; after reload it is FIRST_PSEUDO_REGISTER.  */
>int max_reg;



Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-05 Thread Richard Henderson
This time with the patch...

On 02/05/2014 07:40 AM, Richard Henderson wrote:
> On 02/04/2014 09:35 AM, Jan Hubicka wrote:
>>> On 02/04/2014 08:48 AM, Jan Hubicka wrote:
 How things are supposed to work in this case? So perhaps we need scheduler 
 to
 understand and move around the ARGS_SIZE note?
>>>
>>> I believe we do need to have the scheduler move the notes around.
>>
>> If we need notes on non-stack adjusting insns as you seem to show in your 
>> testcase,
>> I guess it is the only way around.  Still combine stack adjust may be smart 
>> enough
>> to not produce redundant note in the case of go's ICE.
>>
>> I am not terribly familiar with the code, will you look into it or shall I 
>> give it a try?
> 
> I had a brief look at find_modifiable_mems, which seems to be the only kind of
> schedule-time dependency breaking that we do.  I started writing some new code
> that would handle REG_ARGS_SIZE, but then considered that wasn't terribly
> appropriate for stage4.
> 
> So I've left off with just the dependency links between the insns.  We'd need
> these for the schedule-time adjustments anyway, and with them in place the
> scheduler does not re-order the insns in a way that causes problems.
> 
> Testing for x86_64 has finished; i686 is nearly done.  I suppose the only
> question I have is if anyone disagrees that OUTPUT is incorrect as the
> dependency type.
> 
> 
> r~
> 

* combine-stack-adj.c: Revert r206943.
* sched-int.h (struct deps_desc): Add last_args_size.
* sched-deps.c (init_deps): Initialize it.
(sched_analyze_insn): Add OUTPUT dependencies between insns that
contain REG_ARGS_SIZE notes.


diff --git a/gcc/combine-stack-adj.c b/gcc/combine-stack-adj.c
index c591c60..69fd5ea 100644
--- a/gcc/combine-stack-adj.c
+++ b/gcc/combine-stack-adj.c
@@ -567,7 +567,6 @@ combine_stack_adjustments_for_block (basic_block bb)
  && try_apply_stack_adjustment (insn, reflist, 0,
 -last_sp_adjust))
{
- rtx note;
  if (last2_sp_set)
maybe_move_args_size_note (last2_sp_set, last_sp_set, false);
  else
@@ -577,11 +576,6 @@ combine_stack_adjustments_for_block (basic_block bb)
  reflist = NULL;
  last_sp_set = NULL_RTX;
  last_sp_adjust = 0;
- /* We no longer adjust stack size.  Whoever adjusted it earlier
-hopefully got the note right.  */
- note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
- if (note)
-   remove_note (insn, note);
  continue;
}
}
diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
index 7efc937..efc4223 100644
--- a/gcc/sched-deps.c
+++ b/gcc/sched-deps.c
@@ -3470,6 +3470,15 @@ sched_analyze_insn (struct deps_desc *deps, rtx x, rtx 
insn)
 change_spec_dep_to_hard (sd_it);
 }
 }
+
+  /* We do not yet have code to adjust REG_ARGS_SIZE, therefore we must
+ honor their original ordering.  */
+  if (find_reg_note (insn, REG_ARGS_SIZE, NULL))
+{
+  if (deps->last_args_size)
+   add_dependence (insn, deps->last_args_size, REG_DEP_OUTPUT);
+  deps->last_args_size = insn;
+}
 }
 
 /* Return TRUE if INSN might not always return normally (e.g. call exit,
@@ -3876,6 +3885,7 @@ init_deps (struct deps_desc *deps, bool lazy_reg_last)
   deps->sched_before_next_jump = 0;
   deps->in_post_call_group_p = not_post_call;
   deps->last_debug_insn = 0;
+  deps->last_args_size = 0;
   deps->last_reg_pending_barrier = NOT_A_BARRIER;
   deps->readonly = 0;
 }
diff --git a/gcc/sched-int.h b/gcc/sched-int.h
index 3b1106f..2cec624 100644
--- a/gcc/sched-int.h
+++ b/gcc/sched-int.h
@@ -539,6 +539,9 @@ struct deps_desc
   /* The last debug insn we've seen.  */
   rtx last_debug_insn;
 
+  /* The last insn bearing REG_ARGS_SIZE that we've seen.  */
+  rtx last_args_size;
+
   /* The maximum register number for the following arrays.  Before reload
  this is max_reg_num; after reload it is FIRST_PSEUDO_REGISTER.  */
   int max_reg;


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-05 Thread Richard Henderson
On 02/04/2014 09:35 AM, Jan Hubicka wrote:
>> On 02/04/2014 08:48 AM, Jan Hubicka wrote:
>>> How things are supposed to work in this case? So perhaps we need scheduler 
>>> to
>>> understand and move around the ARGS_SIZE note?
>>
>> I believe we do need to have the scheduler move the notes around.
> 
> If we need notes on non-stack adjusting insns as you seem to show in your 
> testcase,
> I guess it is the only way around.  Still combine stack adjust may be smart 
> enough
> to not produce redundant note in the case of go's ICE.
> 
> I am not terribly familiar with the code, will you look into it or shall I 
> give it a try?

I had a brief look at find_modifiable_mems, which seems to be the only kind of
schedule-time dependency breaking that we do.  I started writing some new code
that would handle REG_ARGS_SIZE, but then considered that wasn't terribly
appropriate for stage4.

So I've left off with just the dependency links between the insns.  We'd need
these for the schedule-time adjustments anyway, and with them in place the
scheduler does not re-order the insns in a way that causes problems.

Testing for x86_64 has finished; i686 is nearly done.  I suppose the only
question I have is if anyone disagrees that OUTPUT is incorrect as the
dependency type.


r~



Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-04 Thread Richard Henderson
On 02/04/2014 09:35 AM, Jan Hubicka wrote:
> I am not terribly familiar with the code, will you look into it or shall I 
> give it a try?

I'm looking at it.  Was there a PR for the Go bootstrap failure?


r~


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-04 Thread Jan Hubicka
> On 02/04/2014 08:48 AM, Jan Hubicka wrote:
> > How things are supposed to work in this case? So perhaps we need scheduler 
> > to
> > understand and move around the ARGS_SIZE note?
> 
> I believe we do need to have the scheduler move the notes around.

If we need notes on non-stack adjusting insns as you seem to show in your 
testcase,
I guess it is the only way around.  Still combine stack adjust may be smart 
enough
to not produce redundant note in the case of go's ICE.

I am not terribly familiar with the code, will you look into it or shall I give 
it a try?

Thanks!
Honza
> 
> 
> r~


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-04 Thread Richard Henderson
On 02/04/2014 08:48 AM, Jan Hubicka wrote:
> How things are supposed to work in this case? So perhaps we need scheduler to
> understand and move around the ARGS_SIZE note?

I believe we do need to have the scheduler move the notes around.


r~


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-04 Thread Jan Hubicka
> On 01/01/2014 06:23 AM, Jan Hubicka wrote:
> >   last_sp_adjust = 0;
> > + /* We no longer adjust stack size.  Whoever adjusted it earlier
> > +hopefully got the note right.  */
> > + note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
> > + if (note)
> > +   remove_note (insn, note);
> >   continue;
> 
> This is a totally unfounded hope, by the way.  I'm certain that I tried this

Sorry to hear that - that assumption seemed right in the testcase I loked into.

> the first time around, with similarly poor results.
> 
> See the g++.dg/opt/pr52727.C test case that fails with this patch.  I can only
> assume you didn't examine i686 test results when you committed this?
> 
> The problem is that the last adjustment _did_ apply the correct note
> adjustment, for the stack at that time.  But neither moving the note nor
> dropping the note is correct.
> 
> Consider
> 
>   // args_size 4
>   add $4, %esp// args_size 0
>   push%eax// no note; fpu, not call related
> 
> transformed by csa to
> 
>   mov %eax, (%esp)
> 
> If we move the note from the add to the mov, as we did prior to your patch, we
> retain the essence of the pop, that we're no longer queueing arguments.  But 
> if
> we drop the note, as we do with your patch, then we still believe that we have
> 4 bytes of arguments on the stack.
> 
> You say a bootstrap with Go was the testcase that sent you in this direction?
> I'll have a look...

Yes, if you remove the patch you should get ICE during go x86_64 bootstrap with
generic tuning.

I tried to describe the situation in original mail and the PR - the sequence was
same as you describe here, but comming from two consecutive calls, so args_size
was 8, then reduced to 0 by add and then increased to 8 by push.

I see in the testcase the psh comes from splitter - that is something I did
not consider indded. I am sorry for that.
How things are supposed to work in this case? So perhaps we need scheduler to
understand and move around the ARGS_SIZE note?

Honza


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-02-04 Thread Richard Henderson
On 01/01/2014 06:23 AM, Jan Hubicka wrote:
> last_sp_adjust = 0;
> +   /* We no longer adjust stack size.  Whoever adjusted it earlier
> +  hopefully got the note right.  */
> +   note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
> +   if (note)
> + remove_note (insn, note);
> continue;

This is a totally unfounded hope, by the way.  I'm certain that I tried this
the first time around, with similarly poor results.

See the g++.dg/opt/pr52727.C test case that fails with this patch.  I can only
assume you didn't examine i686 test results when you committed this?

The problem is that the last adjustment _did_ apply the correct note
adjustment, for the stack at that time.  But neither moving the note nor
dropping the note is correct.

Consider

// args_size 4
add $4, %esp// args_size 0
push%eax// no note; fpu, not call related

transformed by csa to

mov %eax, (%esp)

If we move the note from the add to the mov, as we did prior to your patch, we
retain the essence of the pop, that we're no longer queueing arguments.  But if
we drop the note, as we do with your patch, then we still believe that we have
4 bytes of arguments on the stack.

You say a bootstrap with Go was the testcase that sent you in this direction?
I'll have a look...


r~


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-23 Thread Jan Hubicka
> FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer
> -funroll-loops  line 21 i == v + 1
> FAIL: gcc.target/i386/pr35767-5.c scan-assembler-not movups
> 
> Should we fix
> 
> FAIL: gcc.target/i386/pr35767-5.c scan-assembler-not movups

I will look into this one.  It did not failed for me previously.
I wonder if the guality case is broken unwind info - will try to check.

Honza
> 
> 
> -- 
> H.J.


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-22 Thread H.J. Lu
On Tue, Jan 21, 2014 at 3:55 PM, Jan Hubicka  wrote:
>> On 01/17/14 14:32, Jan Hubicka wrote:
>> >>* combine-stack-adj.c (combine_stack_adjustments_for_block): Remove
>> >>ARG_SIZE note when adjustment was eliminated.
>> >
>> >Ping...  This patch prevents me from switching the accumulate-args default
>> >for generic and I am waiting for that witht he inliner tunning, so there is
>> >quite a dependency chain.
>> This is fine.
>>
>> Can you add a testcase for this?
>
> It occurs during bootstrap compiling go, so I think we should have this 
> covered.
> I will try to construct some artificial testcase in C, but it seems a bit
> tricky.
>

Enable X86_TUNE_ACCUMULATE_OUTGOING_ARGS for generic
caused:

FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-1.c  -O2  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -fomit-frame-pointer  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-1.c  -O3 -g  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O2 -flto -fno-use-linker-plugin
-flto-partition=none  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O2  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -fomit-frame-pointer  line 23 z == 8
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 x == 36
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 y == 25
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 20 z == 6
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 x == 98
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 y == 117
FAIL: gcc.dg/guality/pr54519-3.c  -O3 -g  line 23 z == 8
FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions  line 21 i == v + 1
FAIL: gcc.dg/guality/pr54693-2.c  -O3 -fomit-frame-pointer
-funroll-loops  line 21 i == v + 1
FAIL: gcc.target/i386/pr35767-5.c scan-assembler-not movups

Should we fix

FAIL: gcc.target/i386/pr35767-5.c scan-assembler-not movups


-- 
H.J.


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-21 Thread Jan Hubicka
> On 01/17/14 14:32, Jan Hubicka wrote:
> >>* combine-stack-adj.c (combine_stack_adjustments_for_block): Remove
> >>ARG_SIZE note when adjustment was eliminated.
> >
> >Ping...  This patch prevents me from switching the accumulate-args default
> >for generic and I am waiting for that witht he inliner tunning, so there is
> >quite a dependency chain.
> This is fine.
> 
> Can you add a testcase for this?

It occurs during bootstrap compiling go, so I think we should have this covered.
I will try to construct some artificial testcase in C, but it seems a bit
tricky.

Thank you!
Jan
> 
> jeff


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-21 Thread Jeff Law

On 01/17/14 14:32, Jan Hubicka wrote:

* combine-stack-adj.c (combine_stack_adjustments_for_block): Remove
ARG_SIZE note when adjustment was eliminated.


Ping...  This patch prevents me from switching the accumulate-args default
for generic and I am waiting for that witht he inliner tunning, so there is
quite a dependency chain.

This is fine.

Can you add a testcase for this?

jeff



Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-17 Thread Jan Hubicka
>   * combine-stack-adj.c (combine_stack_adjustments_for_block): Remove
>   ARG_SIZE note when adjustment was eliminated.

Ping...  This patch prevents me from switching the accumulate-args default
for generic and I am waiting for that witht he inliner tunning, so there is
quite a dependency chain.

Thanks,
HOnza


Re: Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-02 Thread Ryan Mansfield

On 14-01-01 09:23 AM, Jan Hubicka wrote:

Here insn 765 gets adjusted for insn766 wihtout updating REG_ARGS_SIZE.
This is IMO not bug in scheduler but bug in fact that insn 775 should not have 
the note on it
when it is not adjusting stack pointer. This is because originally it was push 
instruction,
but combine-stack-adjustments turned it into move.
I think it should remove the ARG_SIZE note when doing so, since argument block 
size should
be corrently annotated by preceeding stack adjustment insn.

Bootstrapped/regtested x86_64-linux. Did not double check it is solves the ARM 
issue
reported in PR, too.


FWIW, I tested your patch and it doesn't address the ICE on ARM.

Regards,

Ryan Mansfield


Fix bootstrap with -mno-accumulate-outgoing-args

2014-01-01 Thread Jan Hubicka
Hi,
this patch fixes ICE seen with -mno-accumulate-outgoing-args bootstrap building 
go runtime.
The ICE is in dwarf2cfi.c while checking that on all paths into given basic 
block stack
frames are same.  It goes away either with disabling crossjumping or sched2 but 
the problem
IMO really originates in combine-stack-adj.
Crossjumping combines two call seqences and breaks basic blocks in half of 
argument setups.

OK, sched2 reorders:

(insn 762 761 765 50 (set (mem/c:SI (plus:SI (reg/f:SI 7 sp)
(const_int 112 [0x70])) [17 MEM[(struct  *)&bs + 8B]+0 S4 A64])
(const_int 8 [0x8])) ../../../../libgo/go/encoding/binary/binary.go:273 
90 {*movsi_internal}
 (nil))

(insn 765 762 766 50 (set (mem:HI (reg/f:SI 7 sp) [49  S2 A16])
(reg/v:HI 1 dx [orig:232 v ] [232])) 
../../../../libgo/go/encoding/binary/binary.go:273 91 {*movhi_internal}
 (expr_list:REG_DEAD (reg/v:HI 1 dx [orig:232 v ] [232])
(expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil
(insn 766 765 2326 50 (parallel [
(set (reg/f:SI 7 sp)
(plus:SI (reg/f:SI 7 sp)
(const_int -12 [0xfff4])))
(clobber (reg:CC 17 flags))
]) ../../../../libgo/go/encoding/binary/binary.go:273 265 {*addsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags)
(expr_list:REG_ARGS_SIZE (const_int 28 [0x1c])
(nil

to
(insn 766 761 762 54 (parallel [
(set (reg/f:SI 7 sp)
(plus:SI (reg/f:SI 7 sp)
(const_int -12 [0xfff4])))
(clobber (reg:CC 17 flags))
]) ../../../../libgo/go/encoding/binary/binary.go:273 265 {*addsi_1}
 (expr_list:REG_UNUSED (reg:CC 17 flags) 
(expr_list:REG_ARGS_SIZE (const_int 28 [0x1c])
(nil
(insn:TI 762 766 765 54 (set (mem/c:SI (plus:SI (reg/f:SI 7 sp)
(const_int 124 [0x7c])) [17 MEM[(struct  *)&bs + 8B]+0 S4 A64])
(const_int 8 [0x8])) ../../../../libgo/go/encoding/binary/binary.go:273 
90 {*movsi_internal}
 (nil)) 
(insn 765 762 770 54 (set (mem:HI (plus:SI (reg/f:SI 7 sp)
(const_int 12 [0xc])) [49  S2 A16])
(reg/v:HI 1 dx [orig:232 v ] [232])) 
../../../../libgo/go/encoding/binary/binary.go:273 91 {*movhi_internal}
 (expr_list:REG_DEAD (reg/v:HI 1 dx [orig:232 v ] [232])
(expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
(nil

Here insn 765 gets adjusted for insn766 wihtout updating REG_ARGS_SIZE.  
This is IMO not bug in scheduler but bug in fact that insn 775 should not have 
the note on it
when it is not adjusting stack pointer. This is because originally it was push 
instruction,
but combine-stack-adjustments turned it into move.
I think it should remove the ARG_SIZE note when doing so, since argument block 
size should
be corrently annotated by preceeding stack adjustment insn.

Bootstrapped/regtested x86_64-linux. Did not double check it is solves the ARM 
issue
reported in PR, too.
OK?

* combine-stack-adj.c (combine_stack_adjustments_for_block): Remove
ARG_SIZE note when adjustment was eliminated.
Index: combine-stack-adj.c
===
--- combine-stack-adj.c (revision 206233)
+++ combine-stack-adj.c (working copy)
@@ -567,6 +567,7 @@ combine_stack_adjustments_for_block (bas
  && try_apply_stack_adjustment (insn, reflist, 0,
 -last_sp_adjust))
{
+ rtx note;
  if (last2_sp_set)
maybe_move_args_size_note (last2_sp_set, last_sp_set, false);
  else
@@ -576,6 +577,11 @@ combine_stack_adjustments_for_block (bas
  reflist = NULL;
  last_sp_set = NULL_RTX;
  last_sp_adjust = 0;
+ /* We no longer adjust stack size.  Whoever adjusted it earlier
+hopefully got the note right.  */
+ note = find_reg_note (insn, REG_ARGS_SIZE, NULL_RTX);
+ if (note)
+   remove_note (insn, note);
  continue;
}
}